Speeding Up Pulling Container Images/CRI-O Additional Storage Support #110809
Open
mburke5678 wants to merge 9 commits into openshift:main from mburke5678:nodes-crio-addt-storage
9 commits
2f7dce4 Speeding Up Pulling Container Images/CRI-O Additional Storage Support
b444a93 Fix vale
4fdcc57 proofread
4f9d1d6 proofread
4f3efff proofread
6a8c2f2 edits per saschagrunert
5cc3a56 edits per saschagrunert
c722f92 edits per saschagrunert
cd3e5a3 Fix link
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,96 @@ | ||
| // Module included in the following assemblies: | ||
| // | ||
| // * nodes/nodes/nodes-nodes-additional-crio-storage.adoc | ||
|
|
||
| :_mod-docs-content-type: CONCEPT | ||
| [id="nodes-nodes-additional-crio-storage-about_{context}"] | ||
| = About additional storage locations for CRI-O | ||
|
|
||
| [role="_abstract"] | ||
| You can configure additional storage locations for the CRI-O container engine that give you control over where CRI-O stores and retrieves OCI artifacts, complete container images, and container image layers. Using dedicated storage locations for these CRI-O objects can reduce startup time and make your applications run more efficiently through dedicated solid-state drive (SSD) storage, shared image caches, or lazy pulling. | ||
|
|
||
| By default, CRI-O stores all container data under a single root directory, `/var/lib/containers/storage`. This works well for typical workloads, but can create problems in clusters that use large images or artifacts, such as artificial intelligence and machine learning (AI/ML) workloads. | ||
|
|
||
| For example, large OCI artifacts, such as machine learning models, are stored in the default location, where they consume space on the root file system and cannot benefit from faster dedicated storage. By configuring the `additionalArtifactStores` parameter, you can store large AI/ML models on high-performance storage, such as an SSD, separate from the root file system. As a result, your workloads can experience faster start times and your clusters can use storage more efficiently. | ||
|
|
||
| Also, you could use the `additionalImageStores` parameter to mount an NFS share with pre-populated images across all worker nodes. Nodes read from the shared cache instead of pulling from an external registry. This is useful in disconnected environments or when many nodes run the same workloads. | ||
|
|
||
| With the `additionalLayerStores` parameter, you could enable lazy pulling through a third-party storage plugin, such as stargz-store. With lazy pulling, containers start after downloading only the required file chunks. The remaining data is fetched during runtime. | ||
|
|
||
| After you configure any of these new storage locations, the Machine Config Operator (MCO) reboots the affected nodes with the new configuration. After the reboot, CRI-O begins resolving storage from the additional locations. | ||
|
|
||
| Additional storage for OCI artifacts:: | ||
| Use the `additionalArtifactStores` field in a container runtime config to specify read-only locations where CRI-O resolves OCI artifacts, such as machine learning models pulled as OCI volume images. CRI-O checks these locations in order before falling back to the default storage location. CRI-O requires an `artifacts/` subdirectory within each configured path. For example, if the path is `/mnt/ssd-artifacts`, place the artifacts in the `/mnt/ssd-artifacts/artifacts/` directory. | ||
| + | ||
| The following example container runtime config configures storage for OCI artifacts. | ||
| + | ||
| [source,yaml] | ||
| ---- | ||
| apiVersion: machineconfiguration.openshift.io/v1 | ||
| kind: ContainerRuntimeConfig | ||
| metadata: | ||
| name: ssd-artifact-stores | ||
| spec: | ||
| machineConfigPoolSelector: | ||
| matchLabels: | ||
| pools.operator.machineconfiguration.openshift.io/worker: "" | ||
| containerRuntimeConfig: | ||
| additionalArtifactStores: | ||
| - path: /mnt/ssd-artifacts | ||
| - path: /mnt/nfs-shared-artifacts | ||
| ---- | ||
| + | ||
| When you create the container runtime config, the Machine Config Operator (MCO) writes the configuration into the `/etc/crio/crio.conf.d/01-ctrcfg-additionalArtifactStores` file on the target nodes. | ||
|
|
||
| Additional storage for container images:: | ||
| Use the `additionalImageStores` field to specify read-only container image caches on shared or high-performance storage. When CRI-O needs an image, it checks the additional image stores first. If the image exists there, no registry pull happens. | ||
| + | ||
| The following example container runtime config configures storage for container images. | ||
| + | ||
| [source,yaml] | ||
| ---- | ||
| apiVersion: machineconfiguration.openshift.io/v1 | ||
| kind: ContainerRuntimeConfig | ||
| metadata: | ||
| name: shared-image-cache | ||
| spec: | ||
| machineConfigPoolSelector: | ||
| matchLabels: | ||
| pools.operator.machineconfiguration.openshift.io/worker: "" | ||
| containerRuntimeConfig: | ||
| additionalImageStores: | ||
| - path: /mnt/nfs-image-cache | ||
| - path: /mnt/ssd-images | ||
| ---- | ||
| + | ||
| When you create the container runtime config, the Machine Config Operator (MCO) writes the configuration into the `/etc/containers/storage.conf` file on the target nodes. | ||
|
|
||
| Additional container image layers for lazy pulling:: | ||
| Use the `additionalLayerStores` field to enable lazy pulling through a third-party storage plugin. | ||
| + | ||
| Note that CRI-O falls back to a standard image pull in the following cases: | ||
| + | ||
| -- | ||
| * The registry does not support HTTP range requests. | ||
| * The image is in standard OCI format, not a lazy-pull-compatible format such as eStargz or Nydus. | ||
| * The storage plugin is not running. | ||
| -- | ||
| + | ||
| The following example container runtime config configures container image layers for lazy pulling. | ||
| + | ||
| [source,yaml] | ||
| ---- | ||
| apiVersion: machineconfiguration.openshift.io/v1 | ||
| kind: ContainerRuntimeConfig | ||
| metadata: | ||
| name: lazy-pulling | ||
| spec: | ||
| machineConfigPoolSelector: | ||
| matchLabels: | ||
| pools.operator.machineconfiguration.openshift.io/worker: "" | ||
| containerRuntimeConfig: | ||
| additionalLayerStores: | ||
| - path: /var/lib/stargz-store | ||
| ---- | ||
| + | ||
| When you create the container runtime config, the Machine Config Operator (MCO) writes the configuration into the `/etc/containers/storage.conf` file on the target nodes. |
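The module above states that CRI-O requires an `artifacts/` subdirectory within each configured artifact store path. As a minimal sketch of how that layout could be checked on a node before applying the config, the helper below tests for the subdirectory. The function name `has_artifacts_subdir` is hypothetical and is not part of CRI-O, `oc`, or the MCO; it only inspects the local file system.

```shell
# Hypothetical helper (an assumption for illustration, not a CRI-O or oc command).
# Succeeds only if the given store path contains the artifacts/ subdirectory
# that the module says CRI-O reads artifacts from.
has_artifacts_subdir() {
  store_path=$1
  [ -d "${store_path}/artifacts" ]
}
```

For example, `has_artifacts_subdir /mnt/ssd-artifacts || echo "missing artifacts/ subdirectory"` would flag a store path that was mounted without the expected layout.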
167 changes: 167 additions & 0 deletions
modules/nodes-nodes-additional-crio-storage-configuring.adoc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,167 @@ | ||
| // Module included in the following assemblies: | ||
| // | ||
| // * nodes/nodes/nodes-nodes-additional-crio-storage.adoc | ||
|
|
||
| :_mod-docs-content-type: PROCEDURE | ||
| [id="nodes-nodes-additional-crio-storage-configuring_{context}"] | ||
| = Configuring additional storage locations for CRI-O | ||
|
|
||
| [role="_abstract"] | ||
| You can configure additional storage locations for the CRI-O container engine for OCI artifacts, container images, or container image layers by using the `ContainerRuntimeConfig` custom resource (CR). | ||
|
|
||
| Use the `additionalArtifactStores`, `additionalImageStores`, and `additionalLayerStores` fields to specify read-only locations from which CRI-O resolves these objects. CRI-O checks these locations in order before falling back to the default storage location. | ||
|
|
||
| [IMPORTANT] | ||
| ==== | ||
| When using multiple `ContainerRuntimeConfig` resources, merge all additional storage configurations into a single `ContainerRuntimeConfig` for each machine config pool. Multiple `ContainerRuntimeConfig` resources affecting the same configuration file might result in only a subset of the changes taking effect. | ||
| ==== | ||
|
|
||
| .Prerequisites | ||
|
|
||
| * You enabled the required Technology Preview features for your cluster by adding the `TechPreviewNoUpgrade` feature set to the `FeatureGate` CR named `cluster`. For information about enabling feature gates, see "Enabling features using feature gates". | ||
| + | ||
| [WARNING] | ||
| ==== | ||
| Enabling the `TechPreviewNoUpgrade` feature set on your cluster cannot be undone and prevents minor version updates. This feature set allows you to enable these Technology Preview features on test clusters, where you can fully test them. Do not enable this feature set on production clusters. | ||
| ==== | ||
|
|
||
| * If you are configuring the `additionalImageStores` or `additionalLayerStores` parameter, the target storage paths must exist and be accessible on the nodes, and the container images or layers must be present in those directories. | ||
|
|
||
| * If you are configuring the `additionalLayerStores` parameter, you must meet the following additional prerequisites: | ||
|
|
||
| ** A supported storage plugin binary must be installed on each node, such as Stargz Store or Nydus Storage Plugin. See "Additional resources" for more information. You must have installed the plugin by using one of the following methods: | ||
| *** Use a daemon set to run the plugin as a privileged container. | ||
| *** Use a machine config to install the binary and configure it as a systemd service. | ||
| *** Use Image mode for OpenShift to install the plugin in a custom {op-system} image. | ||
|
|
||
| ** You converted the container images to a lazy-pull-compatible format, such as eStargz or Nydus. | ||
| ** Your container registry must support HTTP range requests. | ||
|
|
||
| .Procedure | ||
|
|
||
| . Create a YAML file for the `ContainerRuntimeConfig` CR: | ||
| + | ||
| [source,yaml] | ||
| ---- | ||
| apiVersion: machineconfiguration.openshift.io/v1 | ||
| kind: ContainerRuntimeConfig | ||
| metadata: | ||
| name: crio-additional-stores | ||
| spec: | ||
| machineConfigPoolSelector: | ||
| matchLabels: | ||
| pools.operator.machineconfiguration.openshift.io/worker: "" | ||
| containerRuntimeConfig: | ||
| additionalArtifactStores: | ||
| - path: /mnt/ssd-artifacts | ||
| - path: /mnt/nfs-shared-artifacts | ||
| additionalImageStores: | ||
| - path: /mnt/nfs-image-cache | ||
| - path: /mnt/ssd-images | ||
| additionalLayerStores: | ||
| - path: /var/lib/stargz-store | ||
| ---- | ||
| + | ||
| where: | ||
| + | ||
| -- | ||
| `spec.machineConfigPoolSelector`:: Specifies a label associated with the nodes that you want to update. | ||
| `spec.containerRuntimeConfig.additionalArtifactStores.path`:: Optional: Specifies the path to the directory that contains OCI artifacts. CRI-O searches for content in an `artifacts/` subdirectory within this path. You can specify up to 10 directories. | ||
| `spec.containerRuntimeConfig.additionalImageStores.path`:: Optional: Specifies the path to an NFS share or other location that contains pre-populated container images. You can specify up to 10 directories. | ||
| `spec.containerRuntimeConfig.additionalLayerStores.path`:: Optional: Specifies the path to the directory that contains lazy-pull-compatible-formatted container image layers. You can specify up to 5 directories. | ||
| -- | ||
| + | ||
| You can configure any combination of these three additional CRI-O storage locations. | ||
| + | ||
| The specified path must meet the following criteria: | ||
| + | ||
| -- | ||
| * Contains between 1 and 256 characters | ||
| * Is an absolute path, starting with the `/` character | ||
| * Contains only the following characters: `a-z`, `A-Z`, `0-9`, `/`, `.`, `_`, and `-` | ||
| * Cannot contain consecutive forward slashes | ||
| -- | ||
| + | ||
| For a layer store, the MCO automatically appends the `:ref` suffix to the path when writing to the `storage.conf` file. This suffix switches the container storage library from storing actual image layers (blobs) to storing references (pointers) to where those layers can be found, which is required for the lazy-pulling plugins. You do not need to include the suffix in the `ContainerRuntimeConfig` path. | ||
| + | ||
| [NOTE] | ||
| ==== | ||
| If a path does not exist or is inaccessible at runtime, CRI-O generates a warning and continues with the remaining stores. The default storage location is always used as a fallback. | ||
| ==== | ||
|
|
||
| . Create the `ContainerRuntimeConfig` CR by running the following command: | ||
| + | ||
| [source,terminal] | ||
| ---- | ||
| $ oc create -f <container-runtime-config>.yaml | ||
| ---- | ||
| + | ||
| Replace `<container-runtime-config>` with the name of the YAML file. | ||
| + | ||
| After you configure any of these new storage locations, the Machine Config Operator (MCO) reboots the affected nodes with the new configuration. | ||
|
|
||
| .Verification | ||
|
|
||
| . After the nodes have returned to the `Ready` status, check that the new stores have been added to the node configuration: | ||
|
|
||
| .. Start a debug pod by running the following command: | ||
| + | ||
| [source,terminal] | ||
| ---- | ||
| $ oc debug node/<node_name> | ||
| ---- | ||
| + | ||
| where: | ||
|
|
||
| `<node_name>`:: Specifies the name of one of the nodes in the affected machine config pool. | ||
|
|
||
| .. Set `/host` as the root directory within the debug shell: | ||
| + | ||
| [source,terminal] | ||
| ---- | ||
| sh-5.1# chroot /host | ||
| ---- | ||
|
|
||
| ** For an artifact store, review the contents of the `/etc/crio/crio.conf.d/01-ctrcfg-additionalArtifactStores` file by using the following command: | ||
| + | ||
| [source,terminal] | ||
| ---- | ||
| sh-5.1# cat /etc/crio/crio.conf.d/01-ctrcfg-additionalArtifactStores | ||
| ---- | ||
| + | ||
| .Example output | ||
|
|
||
| [source,terminal] | ||
| ---- | ||
| [crio] | ||
| [crio.runtime] | ||
| additional_artifact_stores = ["/mnt/ssd-artifacts", "/mnt/nfs-shared-artifacts"] | ||
| ---- | ||
|
|
||
| ** For an image store, review the contents of the `/etc/containers/storage.conf` file by using the following command: | ||
| + | ||
| [source,terminal] | ||
| ---- | ||
| sh-5.1# cat /etc/containers/storage.conf | ||
| ---- | ||
| + | ||
| .Example output | ||
|
|
||
| [source,terminal] | ||
| ---- | ||
| [storage] | ||
| [storage.options] | ||
| additionalimagestores = ["/mnt/nfs-image-cache", "/mnt/ssd-images"] | ||
| ---- | ||
|
|
||
| ** For a layer store, review the contents of the `/etc/containers/storage.conf` file by using the following command: | ||
| + | ||
| [source,terminal] | ||
| ---- | ||
| sh-5.1# cat /etc/containers/storage.conf | ||
| ---- | ||
| + | ||
| .Example output | ||
| [source,terminal] | ||
| ---- | ||
| [storage] | ||
| [storage.options] | ||
| additionallayerstores = ["/var/lib/stargz-store:ref"] | ||
| ---- | ||
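The procedure above lists four criteria for each store path: 1 to 256 characters, absolute, limited to `a-z`, `A-Z`, `0-9`, `/`, `.`, `_`, and `-`, and no consecutive forward slashes. The following is a minimal sketch of a local pre-check that mirrors those criteria as written. The function name `is_valid_store_path` is hypothetical, and the sketch does not claim to reproduce the validation that the API server or the MCO actually performs.

```shell
# Hypothetical pre-check (an assumption for illustration, not an oc or MCO feature).
# Returns 0 if the path meets the criteria listed in the procedure, 1 otherwise.
# Written for POSIX sh, so variables are not function-local.
is_valid_store_path() {
  path=$1
  len=${#path}

  # Contains between 1 and 256 characters.
  [ "$len" -ge 1 ] && [ "$len" -le 256 ] || return 1

  # Is an absolute path, starting with the / character.
  case $path in
    /*) ;;
    *) return 1 ;;
  esac

  # Cannot contain consecutive forward slashes.
  case $path in
    *//*) return 1 ;;
  esac

  # Contains only a-z, A-Z, 0-9, /, ., _, and -.
  case $path in
    *[!A-Za-z0-9/._-]*) return 1 ;;
  esac

  return 0
}
```

A possible use is to loop over the paths planned for the `ContainerRuntimeConfig` CR and stop before running `oc create` if any path fails, for example `is_valid_store_path /mnt/ssd-artifacts || echo "invalid path"`.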
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| :_mod-docs-content-type: ASSEMBLY | ||
| [id="nodes-nodes-additional-crio-storage"] | ||
| = Additional CRI-O storage locations for faster container startup | ||
| include::_attributes/common-attributes.adoc[] | ||
| :context: nodes-nodes-additional-crio-storage | ||
|
|
||
|
|
||
| toc::[] | ||
|
|
||
| [role="_abstract"] | ||
| You can configure additional storage locations for the CRI-O container engine by using the `ContainerRuntimeConfig` custom resource (CR). Three new fields let you specify where CRI-O stores and resolves container image layers, complete container images, and OCI artifacts. By specifying locations other than the default, you can reduce application startup time, make your applications run more efficiently, and configure lazy pulling. | ||
|
|
||
| include::modules/nodes-nodes-additional-crio-storage-about.adoc[leveloffset=+1] | ||
| include::modules/nodes-nodes-additional-crio-storage-configuring.adoc[leveloffset=+1] | ||
|
|
||
| == Additional resources | ||
|
|
||
| * link:https://github.com/containerd/stargz-snapshotter[Stargz Store plugin] | ||
| * link:https://github.com/containerd/stargz-snapshotter/blob/main/docs/INSTALL.md[Install Stargz Snapshotter and Stargz Store] | ||
| * link:https://github.com/containers/nydus-storage-plugin[Nydus Storage Plugin] | ||
| * link:https://github.com/containerd/stargz-snapshotter/blob/main/docs/estargz.md[eStargz format] | ||
| * link:https://nydus.dev/[Nydus format] | ||
| * xref:../../nodes/jobs/nodes-pods-daemonsets.adoc#nodes-pods-daemonsets[Running background tasks on nodes automatically with daemon sets] | ||
| * xref:../../machine_configuration/machine-configs-configure.adoc#machine-configs-configure[Using machine config objects to configure nodes] | ||
| * xref:../../machine_configuration/mco-coreos-layering.adoc#mco-coreos-layering[Image mode for OpenShift] | ||
|
|
||