
Most systems we build today are delivered as containers, and container registries and their associated technologies are an important cog in this ecosystem. As the container ecosystem matures, there is a growing need to consume associated artefacts such as Helm packages, software bills of materials (SBOMs), evidence of provenance and machine learning data sets from the same storage. There are even emerging use cases, like WebAssembly libraries, that need a home. Container registries have outgrown their original purpose.

The OCI Reference Types Working Group is planning changes to the OCI spec to support these scenarios. In this post we will look at how we got here, how projects like ORAS are driving innovation in artefact storage, and how they are redefining what a container registry is.

Note: There have been some recent updates to the OCI image spec and ORAS (August 2023) and they are covered here.

Intro to OCI

You have no doubt heard of Docker and containers. Since Docker donated its technology to the open source community, a large community, including tech giants, has come together to make containers the de facto unit of software delivery.

The Open Container Initiative (OCI) was launched in 2015 by Docker and other industry leaders as an open governance structure. Over the years Docker has continued to donate more of its technology to the open source community.

But OCI is not a replacement for Docker. Docker is a platform while OCI exists with the sole purpose of creating open industry standards around container formats and runtimes.

From the OCI website: https://opencontainers.org/about/overview/

The OCI currently contains three specifications: the Runtime Specification (runtime-spec), the Image Specification (image-spec) and the Distribution Specification (distribution-spec).

Over the years the OCI has defined its own specifications and standards to support various technical and business needs.

Comparing Docker Image v2 schema 2 vs OCI 1.0 Image schema

[Figure: Docker vs OCI image manifest]

As you can see, the key differences are in the mediaType fields: where Docker uses application/vnd.docker.*, the OCI spec uses application/vnd.oci.*. The OCI spec also supports annotations.
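To make the comparison concrete, here is a small shell sketch that writes a trimmed Docker v2 schema 2 manifest and rewrites its media types to the OCI equivalents. The sizes and digests are placeholders made up for illustration, and renaming media types with sed is only a way to see the difference, not a supported way to convert images.

```shell
# Work in a scratch directory so nothing is left behind in the current folder.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# A trimmed Docker v2 schema 2 manifest. Sizes and digests are placeholders.
cat > docker-manifest.json <<'EOF'
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 7023,
    "digest": "sha256:<config-digest>"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 32654,
      "digest": "sha256:<layer-digest>"
    }
  ]
}
EOF

# Rewrite each Docker media type to its OCI counterpart.
sed -e 's#application/vnd.docker.distribution.manifest.v2+json#application/vnd.oci.image.manifest.v1+json#' \
    -e 's#application/vnd.docker.container.image.v1+json#application/vnd.oci.image.config.v1+json#' \
    -e 's#application/vnd.docker.image.rootfs.diff.tar.gzip#application/vnd.oci.image.layer.v1.tar+gzip#' \
    docker-manifest.json > oci-manifest.json

# Show the lines that changed (diff exits non-zero when files differ, hence "|| true").
diff docker-manifest.json oci-manifest.json || true
```

Only the three mediaType lines show up in the diff; the structure of the manifest is otherwise the same.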

Same story with the Index Manifest

The image index (fat manifest) is a higher-level manifest that points to specific image manifests, each ideal for one or more platforms. This is useful when storing multi-architecture images.

I won’t do a side-by-side comparison here, but you will see the same mediaType differences there as well.
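As a rough illustration of the shape of an OCI image index, the sketch below writes a two-platform index to a file and lists the platforms it covers. The sizes and digests are placeholders, not real images.

```shell
# Work in a scratch directory so nothing is left behind in the current folder.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# An image index pointing at one image manifest per platform (placeholder digests).
cat > index.json <<'EOF'
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 7143,
      "digest": "sha256:<amd64-manifest-digest>",
      "platform": { "architecture": "amd64", "os": "linux" }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 7682,
      "digest": "sha256:<arm64-manifest-digest>",
      "platform": { "architecture": "arm64", "os": "linux" }
    }
  ]
}
EOF

# Count and list the platform entries a client would choose between.
platform_count="$(grep -c '"architecture"' index.json)"
echo "platforms in index: ${platform_count}"
grep -o '"architecture": "[^"]*"' index.json
```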

That’s great for images, but what about other artefacts?

We live in a container world; in fact, we live in a Kubernetes world. Container registries have therefore become central to this ecosystem.

But your software system might not be composed of just container images. What about things like Helm charts? You may also have files or other supply chain assets such as SBOMs.

If you needed those files inside your k8s cluster, you used to have two options.

  • Store the file in some blob storage and allow the cluster to pull it down as required. But what about versioning, replication, edge and disconnected scenarios etc?
  • Store your file inside a container image and store it in a container registry. At least this way the dependencies are in the same place as the container image. But this feels like cheating.

As the world kept moving more and more workloads to k8s, the industry realized it needed a way to store more than container images in container registries, and to support that as a first-class concept.

Think about it: the container registry is the best place to store these artefacts. They can be versioned, and because a registry already stores manifests and blob content separately, it is well suited to the job.

Container registries needed to metamorphose into artefact registries.

Steve Lasker makes this argument more eloquently than I did.

Enter OCI v1.1 Specification

With the OCI v1.1 spec we finally got support for artefacts as a first-class concept.

Content other than OCI container images MAY be packaged using the image manifest. When this is done, the config.mediaType value MUST be set to a value specific to the artifact type or the empty value. If the config.mediaType is set to the empty value, the artifactType MUST be defined. If the artifact does not need layers, a single layer SHOULD be included with a non-zero size. The suggested content for an unused layers array is the empty descriptor.

  • An [image].artifactType field was also introduced.

    This OPTIONAL property contains the type of an artifact when the manifest is used for an artifact. This MUST be set when config.mediaType is set to the empty value. If defined, the value MUST comply with RFC 6838, including the naming requirements in its section 4.2, and MAY be registered with IANA. Implementations storing or copying image manifests MUST NOT error on encountering an artifactType that is unknown to the implementation.

  • This meant artefact authors could now leverage the existing image manifest to store artefacts in a way that works with the Content Addressable Storage (CAS) capabilities of OCI Distribution.

  • The OCI image manifest 1.1 spec also introduced the subject field.

    This OPTIONAL property specifies a descriptor of another manifest. This value, used by the referrers API, indicates a relationship to the specified manifest.

    This allows artefacts/manifests to be linked. For example, an SBOM can be attached to the container image it describes.

  • The OCI distribution spec 1.1 introduced the Referrers API. This allowed clients to query for related artefacts.
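The packaging rules above can be sketched with nothing more than coreutils. This is a minimal sketch, assuming sha256sum is available: it builds an artefact manifest that uses the empty descriptor for config, sets artifactType, and carries a single text layer. The artifactType and layer media type are borrowed from the oras push example later in this post; the file names are arbitrary. Nothing is pushed to a registry.

```shell
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# The blob we want to store, and the 2-byte empty JSON object used as the "empty descriptor".
echo "hello world" > artifact.txt
printf '{}' > empty.json

# Blobs are addressed by the SHA-256 of their bytes, plus their size in bytes.
blob_digest="sha256:$(sha256sum artifact.txt | cut -d' ' -f1)"
blob_size="$(wc -c < artifact.txt | tr -d ' ')"
empty_digest="sha256:$(sha256sum empty.json | cut -d' ' -f1)"
empty_size="$(wc -c < empty.json | tr -d ' ')"

# An image manifest reused for an artefact: empty config, artifactType set at the top level.
cat > manifest.json <<EOF
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.acme.rocket.config",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "${empty_digest}",
    "size": ${empty_size}
  },
  "layers": [
    {
      "mediaType": "text/plain",
      "digest": "${blob_digest}",
      "size": ${blob_size}
    }
  ]
}
EOF
cat manifest.json
```

Because every blob is referenced by digest, the registry can store and deduplicate the content exactly as it does for image layers, which is the CAS property mentioned above.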

Not All Good News Though

  • The use of config.mediaType was not ideal. The ideal field would have been [image].mediaType (top-level), but it could not be used for backwards compatibility reasons. More about that in this post by Dan Lorenc here.

  • This resulted in a lot of artefact implementations simply leaving [image].mediaType empty and relying on the config blob being set to a custom type. Not all registries supported this, and some limited which values were accepted.

Pushing This Further With ORAS

The ORAS (OCI Registry As Storage) project aims to “Distribute Artifacts Across OCI Registries With Ease”.

ORAS extends the OCI 1.1 specification and allows artefacts to be used in an easily discoverable way. This is done by storing independent but softly linked artefacts without making any changes to the existing image manifest. This makes it ideal for supply chain scenarios where many artefacts accompany a container image.

The object graph below shows such a scenario: a container image, an SBOM, and the signatures that verify their provenance. The SBOM and signatures are associated with the container image using the subject field.

Artefact association
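To show what such a link looks like on disk, here is a minimal sketch that builds a manifest for an SBOM whose subject descriptor points at an image manifest. The image manifest bytes, the SBOM payload and the artifactType application/vnd.example.sbom are all hypothetical stand-ins; only the overall manifest shape follows the OCI 1.1 image spec.

```shell
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in bytes for the image manifest being described (not a complete manifest).
printf '{"schemaVersion":2,"mediaType":"application/vnd.oci.image.manifest.v1+json"}' > image-manifest.json
# Hypothetical SBOM payload and the 2-byte empty config blob.
printf '{"packages":[]}' > sbom.json
printf '{}' > empty.json

# Print an OCI digest ("sha256:<hex>") and the byte size of a file.
digest_of() { printf 'sha256:%s' "$(sha256sum "$1" | cut -d' ' -f1)"; }
size_of() { wc -c < "$1" | tr -d ' '; }

# The SBOM manifest: its "subject" is a descriptor of the image manifest it describes.
cat > sbom-manifest.json <<EOF
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.example.sbom",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "$(digest_of empty.json)",
    "size": $(size_of empty.json)
  },
  "layers": [
    {
      "mediaType": "application/json",
      "digest": "$(digest_of sbom.json)",
      "size": $(size_of sbom.json)
    }
  ],
  "subject": {
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "digest": "$(digest_of image-manifest.json)",
    "size": $(size_of image-manifest.json)
  }
}
EOF
cat sbom-manifest.json
```

The image manifest itself is never modified. The link lives entirely in the SBOM's manifest, which is why new artefacts can be attached to an image after it has been published.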

How Does ORAS Extend The OCI 1.1 Spec?

The following is from the “Comparing the ORAS Artifact Manifest and OCI Image Manifest” section.

OCI Artifacts defines how to implement stand-alone artifacts that can fit within the constraints of the image-spec. OCI Artifacts uses the manifest.config.mediaType to identify the artifact is something other than a container image. While this validated the ability to generalize the Content Addressable Storage (CAS) capabilities of OCI Distribution, a new set of artifacts require additional capabilities that aren’t constrained to the image-spec. ORAS Artifacts provide a more generic means to store a wider range of artifact types, including references between artifacts.

The addition of a new manifest does not change, nor impact the image.manifest. By defining the artifact.manifest and the referrers/ api, registries and clients opt-into new capabilities, without breaking existing registry and client behaviour.

The high-level differences between the oci.image.manifest and the oras.artifact.manifest:

| OCI Image Manifest | ORAS Artifacts Manifest |
| --- | --- |
| config REQUIRED | config OPTIONAL as it’s just another entry in the blobs collection with a config mediaType |
| layers REQUIRED | blobs are OPTIONAL, which were renamed from layers to reflect general usage |
| layers ORDINAL | blobs are defined by the specific artifact spec. For example, Helm utilizes two independent, non-ordinal blobs, while other artifact types like container images may require blobs to be ordinal |
| manifest.config.mediaType used to uniquely identify artifact types. | manifest.artifactType added to lift the workaround for using manifest.config.mediaType on a REQUIRED, but not always used config property. Decoupling config.mediaType from artifactType enables artifacts to OPTIONALLY share config schemas. |
| | subject OPTIONAL, enabling an artifact to extend another artifact (SBOM, Signatures, Nydus, Scan Results) |
| | /referrers api for discovering referenced artifacts, with the ability to filter by artifactType |
| | Lifecycle management defined, starting to provide standard expectations for how users can manage their content |

For more info, see:

ORAS Artefact Manifest

The ORAS Artifact manifest is similar to the OCI image manifest, but removes constraints defined on the image-manifest such as a required config object and required & ordinal layers.

The ORAS artefact manifest introduced its own mediaType value: application/vnd.cncf.oras.artifact.manifest.v1+json

Full spec can be found here.

ORAS Artefact Spec Future

There are no future releases or work items planned.

The output of this project has been proposed to the OCI Reference Types Working Group. Future discussions about artifacts in OCI registries should happen in the OCI distribution-spec & image-spec repositories.

The idea is to get the proposed changes adopted upstream in the OCI specs, so that artefact support becomes consistent across all registries and clients.

Update: 04-Aug-2023

The OCI working group has made an announcement about which proposals from ORAS it has incorporated.

These include

  • artifactType as a top level field. Preferred over config.mediaType for new artefacts.
  • subject field, used to establish relationships between manifests.
  • /v2/<name>/referrers/<digest> referrers API endpoint to query relationships based on the subject descriptor.

I have created a pull request for the OCI image spec repo to update its artefact usage guidance.

Update: 12-Aug-2023

  • My changes from the above PR have been incorporated into a new PR which can be found here.
  • The ORAS project is also updating its guidance based on that. The PR for that is here.

This was my first time contributing to the OCI (opencontainers) project and ORAS and I enjoyed the conversation and process of PR review very much.

If you see a gap in the guidance or spec, please feel free to create an issue or a PR to fix it. The folks over there are a good bunch of people to work with.

What does this mean for ORAS?

This means the ORAS artefact manifest spec is now considered deprecated. You can start using the OCI 1.1 image spec to store artefacts. The intention of the project has been satisfied by getting the OCI image spec to adopt some of the ORAS artefact spec's recommendations.

You can keep using the ORAS CLI and SDK tools to interact with OCI 1.1 registries. In fact, this is preferable to writing your own logic against the image and distribution specs, because the ORAS SDK handles those details for you.

ORAS Use Cases And Adopters

A full list can be found here.

Supply Chain Artefacts

There are some examples below on how to use ORAS to store supply chain artefacts and sign them using Notation.

Using ORAS CLI

To install ORAS CLI on Linux:

VERSION="1.0.0"
curl -LO "https://github.com/oras-project/oras/releases/download/v${VERSION}/oras_${VERSION}_linux_amd64.tar.gz"
mkdir -p oras-install/
tar -zxf oras_${VERSION}_*.tar.gz -C oras-install/
sudo mv oras-install/oras /usr/local/bin/
rm -rf oras_${VERSION}_*.tar.gz oras-install/

Other platforms are listed here.

You will need a compatible registry like Zot. A list of supported registries is available here.

To run Zot:

docker run -d -p 5000:5000 --name oras-quickstart ghcr.io/project-zot/zot-linux-amd64:latest

Create a sample file:

echo "hello world" > artifact.txt

Push the artefact:

oras push --plain-http localhost:5000/hello-artifact:v1 \
    --artifact-type application/vnd.acme.rocket.config \
    artifact.txt:text/plain

Uploading a948904f2f0f artifact.txt
Uploaded  a948904f2f0f artifact.txt
Pushed [registry] localhost:5000/hello-artifact:v1
Digest: sha256:bcdd6799fed0fca0eaedfc1c642f3d1dd7b8e78b43986a89935d6fe217a09cee    
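The 12-character ID in the upload lines is the start of the blob's SHA-256 digest, while the Digest line refers to the manifest that was pushed. A quick sketch to confirm the blob ID locally, assuming sha256sum from GNU coreutils is available:

```shell
# Work in a scratch directory so the real artifact.txt is left alone.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Recreate the sample file and hash it the same way a registry addresses blobs.
echo "hello world" > artifact.txt
full_digest="$(sha256sum artifact.txt | cut -d' ' -f1)"
short_id="$(printf '%s' "$full_digest" | cut -c1-12)"

echo "sha256:${full_digest}"
echo "short id: ${short_id}"
```

The short id should match the a948904f2f0f shown next to artifact.txt in the push output.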

Attach an artefact:

echo "hello world" > hi.txt
oras attach --artifact-type doc/example localhost:5000/hello-artifact:v1 hi.txt

Pull an artefact:

oras pull localhost:5000/hello-artifact:v1

Downloading a948904f2f0f artifact.txt
Downloaded  a948904f2f0f artifact.txt
Pulled [registry] localhost:5000/hello-artifact:v1
Digest: sha256:19e1b5170646a1500a1ac56bad28675ab72dc49038e69ba56eb7556ec478859f

Discover the referrers:

oras discover localhost:5000/hello-artifact:v1

Discovered 1 artifact referencing v1
Digest: sha256:327db68f73d0ed53d528d927a6703c00739d7c1076e50762c3f6641b51b76fdc

Artifact Type   Digest
doc/example     sha256:bcdd6799fed0fca0eaedfc1c642f3d1dd7b8e78b43986a89935d6fe217a09cee
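On registries that implement the OCI 1.1 Referrers API, the same information is available over plain HTTP. The sketch below only builds the request URLs. The curl call sits behind a made-up RUN_QUERY switch because it needs the local Zot registry from this post to be running. The digest is the one printed in the discover output above, so substitute your own.

```shell
registry="localhost:5000"
repo="hello-artifact"
# Digest of the subject manifest, copied from the discover output in this post.
subject_digest="sha256:327db68f73d0ed53d528d927a6703c00739d7c1076e50762c3f6641b51b76fdc"
artifact_type="doc/example"

# Percent-encode the characters in a media type that are unsafe in a query string.
encoded_type="$(printf '%s' "$artifact_type" | sed -e 's#/#%2F#g' -e 's#+#%2B#g')"

# GET /v2/<name>/referrers/<digest>, optionally filtered by artifactType.
referrers_url="http://${registry}/v2/${repo}/referrers/${subject_digest}"
filtered_url="${referrers_url}?artifactType=${encoded_type}"

echo "$referrers_url"
echo "$filtered_url"

# Only contact the registry when explicitly asked to (RUN_QUERY is a hypothetical switch).
if [ "${RUN_QUERY:-0}" = "1" ]; then
  curl -s "$filtered_url"
fi
```

Per the distribution spec, the response is an OCI image index whose manifests array lists the descriptors of the artefacts that name this digest as their subject.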

Closing

I hope this post gave you a deeper understanding of the state of artefacts in container registries, and of how the OCI 1.1 spec and projects like ORAS are pushing the industry towards standardised registries and clients.

If you have any feedback or questions, please reach out to me on Twitter @dasiths or post them here.

Happy coding.
