Commit 6a7691b

committed
docker governance
1 parent f1185d4 commit 6a7691b

1 file changed

Lines changed: 97 additions & 53 deletions

content/cumulus-linux-517/System-Configuration/Docker-with-Cumulus-Linux.md

@@ -444,92 +444,136 @@ log-level json-file
444444

445445
## Docker Resource Tiering System
446446

447-
Cumulus Linux includes a built-in Docker resource tiering system that acts as an automated governance layer for system resources. In a shared environment where users or automated scripts can launch arbitrary Docker containers, there is a significant risk that high-load compilation, a memory leak, or a stress test can consume 100 percent of the CPU or RAM, causing critical network services or the host OS itself to become unresponsive. The resource tiering system prevents resource exhaustion (Denial of Service) from untrusted workloads while guaranteeing performance for critical system applications.
447+
Cumulus Linux includes a Docker Resource Tiering Governance system that protects the switch from noisy-neighbor CPU and memory exhaustion by classifying container workloads into governance tiers and placing them into the correct `systemd` slice at container creation time.
448448

449-
The built-in Docker resource tiering system is enabled by default and active after installation. The policy agent enforces runtime policy entirely in the background; no configuration is required.
449+
The governance system is not a long-running background agent. Instead, the packaged Docker wrapper applies governance when a container is created. The `cumulus-docker-resource-tier-governance.service` `systemd` service programs the tier slice limits and reconciles existing governance-managed containers when you start, reload, restart, or stop the service.
450450

451-
Cumulus Linux uses slices to apply shared resource limits to a group of processes:
452-
- `docker-restricted.slice` is the default tier for third-party containers that are not whitelisted. This tier uses 10 percent host CPU and 10 percent host memory.
453-
- `docker-limited.slice` is the intermediate tier for selected workloads that need more headroom than restricted but still need to be capped. This tier uses 50 percent host CPU and 50 percent host memory.
454-
- `docker.slice` is the trusted tier for trusted NVIDIA containers and other explicitly trusted workloads. There is no host CPU or host memory cap.
451+
When governance is enabled, containers use one of these tiers:
452+
- `docker.slice` is the Trusted tier. Containers in this tier are not subject to CPU or memory caps.
453+
- `docker-limited.slice` is the Limited tier. This tier is for workloads that need more headroom than the default tier but must still remain bounded.
454+
- `docker-restricted.slice` is the Restricted tier. This is the default tier for images that are not whitelisted.
455455

456-
To disable the resource tiering system, stop, then disable the cumulus Docker policy agent:
456+
By default, the Restricted and Limited tiers use aggregate limits of 10% and 50%, respectively, of total host CPU and total host memory. These are shared per-tier limits, not per-container reservations. If multiple containers run in the same tier, they share the tier budget.
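As a worked example of the shared per-tier budget, the following sketch converts a tier percentage into an aggregate CPU and memory ceiling. The function name `tier_budget` and the 8-CPU, 16 GiB host are hypothetical, and the sketch assumes the percentage is taken of the total CPU count and total memory; the exact calculation the service performs is not described here.

```python
def tier_budget(total_cpus: int, total_mem_bytes: int, percent: int) -> tuple:
    """Return (cpu_cores, memory_bytes) available to a whole tier.

    Assumption: the percentage applies to total host CPUs and total host
    memory, and every container in the tier draws from this one budget.
    """
    cpu_cores = total_cpus * percent / 100
    mem_bytes = total_mem_bytes * percent // 100
    return cpu_cores, mem_bytes


# Hypothetical host: 8 CPUs and 16 GiB of memory.
GIB = 1024 ** 3
print("restricted:", tier_budget(8, 16 * GIB, 10))
print("limited:   ", tier_budget(8, 16 * GIB, 50))
```

On this hypothetical host, all Restricted containers together share 0.8 CPU cores and roughly 1.6 GiB of memory. Starting a second container in the tier does not raise that ceiling.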
457457

458-
```
459-
cumulus@switch:~$ systemctl stop cumulus-docker-policy-agent.service
460-
cumulus@switch:~$ systemctl disable cumulus-docker-policy-agent.service
461-
```
458+
The packaged Docker wrapper classifies supported Docker container creation commands and places new containers directly into the correct parent slice with the appropriate `cgroup-parent`.
462459

463-
To re-enable the resource tiering system, enable, then start the cumulus Docker policy agent:
460+
The `cumulus-docker-resource-tier-governance.service` service reads the current policy, programs the Restricted and Limited slice limits, and reconciles existing governance-managed containers when you apply policy. If you install the Docker Compose plugin, Docker Compose workloads are also governed at creation time.
464461

465-
```
466-
cumulus@switch:~$ systemctl enable cumulus-docker-policy-agent.service
467-
cumulus@switch:~$ systemctl start cumulus-docker-policy-agent.service
468-
```
462+
Governance does not move running container processes between control groups. If an existing governance-managed container needs to change tiers, the service recreates it in the correct slice. If the container is supervised by `systemd`, the service restarts the owning unit so the container is recreated through its normal service lifecycle.
463+
464+
### Governance Configuration
469465

470-
<!--
471-
### Container Resources
466+
Governance uses these configuration files:
472467

473-
You can customize these values by editing the `/etc/cumulus/docker/resources.conf` file.
468+
- `/etc/cumulus/docker/resource-tier-governance/whitelist.json` controls which images are classified as Trusted or Limited.
469+
- `/etc/cumulus/docker/resource-tier-governance/quotas.conf` controls the aggregate CPU and memory percentages for the Restricted and Limited tiers.
470+
471+
Images that are not listed in either the Trusted or Limited tier default to the Restricted tier.
472+
473+
The following example shows an `/etc/cumulus/docker/resource-tier-governance/whitelist.json` file:
474474

475475
```
476-
cumulus@switch: sudo nano /etc/cumulus/docker/resources.conf
477-
RESTRICTED_PERCENT=10 # Tighter jail for unknown apps
478-
LIMITED_PERCENT=60 # Slightly more room for limited apps
476+
{
477+
"trusted_images": [],
478+
"limited_images": [
479+
"nv-gnmi",
480+
"nv-umf",
481+
"nv-grpctunnel",
482+
"nv-otel-collector",
483+
"docker-wjh"
484+
]
485+
}
479486
```
480487

481-
After editing the `/etc/cumulus/docker/resources.conf` file, you must restart `cumulus-docker-resource-limit-calculator.service`.
488+
{{%notice note%}}
489+
- Whitelist entries can match repository names, tags, or digests as seen by Docker on the local system. Matching is case-insensitive.
490+
- Do not list the same image in both the Trusted and Limited tiers. If an image appears in both places, the Trusted tier takes precedence.
491+
{{%/notice%}}
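The classification rules above (Trusted takes precedence, matching is case-insensitive, unlisted images default to Restricted) can be sketched as follows. This is an illustration only: `classify_image` is a hypothetical helper, and the sketch assumes an entry matches when it equals either the full image reference or the repository name with any tag or digest removed. The wrapper's actual matching rules might differ.

```python
def classify_image(image: str, whitelist: dict) -> str:
    """Return the slice name an image would land in under the rules above."""
    ref = image.lower()
    # Drop a digest suffix such as "@sha256:...".
    repo = ref.split("@", 1)[0]
    # Drop a tag only if the colon is in the last path component,
    # so a registry port such as "registry:5000/app" is left alone.
    if ":" in repo.rsplit("/", 1)[-1]:
        repo = repo[: repo.rfind(":")]
    candidates = {ref, repo}

    trusted = {e.lower() for e in whitelist.get("trusted_images", [])}
    limited = {e.lower() for e in whitelist.get("limited_images", [])}

    if candidates & trusted:  # Trusted wins if an image is listed in both tiers
        return "docker.slice"
    if candidates & limited:
        return "docker-limited.slice"
    return "docker-restricted.slice"  # default for images that are not listed


whitelist = {"trusted_images": [], "limited_images": ["nv-gnmi", "docker-wjh"]}
print(classify_image("NV-GNMI:1.0", whitelist))
print(classify_image("busybox:latest", whitelist))
```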
492+
493+
The following example shows an `/etc/cumulus/docker/resource-tier-governance/quotas.conf` file:
482494

483495
```
484-
cumulus@switch: systemctl restart cumulus-docker-resource-limit-calculator.service
496+
RESTRICTED_DOCKER_RESOURCE_QUOTA=10
497+
LIMITED_DOCKER_RESOURCE_QUOTA=50
485498
```
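Before reloading the service, it can help to sanity-check an edited quota file. The sketch below parses the `KEY=VALUE` format shown above. The helper name `parse_quotas`, the treatment of `#` lines as comments, and the 1 to 100 range check are assumptions for illustration, not documented behavior of the service.

```python
def parse_quotas(text: str) -> dict:
    """Parse KEY=VALUE lines into a dict of integer percentages."""
    quotas = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comment lines (assumed to start with '#'),
        # and anything that is not a KEY=VALUE pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        percent = int(value.strip())  # raises ValueError on non-numeric input
        if not 1 <= percent <= 100:   # assumed valid range
            raise ValueError(f"{key.strip()} must be between 1 and 100, got {percent}")
        quotas[key.strip()] = percent
    return quotas


sample = """RESTRICTED_DOCKER_RESOURCE_QUOTA=10
LIMITED_DOCKER_RESOURCE_QUOTA=50
"""
print(parse_quotas(sample))
```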
486499

487-
The docker image whitelist maintains the list of trusted and limited images and is located in the `/etc/cumulus/docker/whitelist.json` file.
500+
### Enable and Disable Governance
488501

489-
By default, the `/etc/cumulus/docker/whitelist.json` file ships with the following content.
502+
The Docker Resource Tiering Governance system is installed and enabled by default. To disable governance immediately and prevent it from being enabled again at boot, stop and disable the service:
490503

491504
```
492-
cumulus@switch: sudo cat /etc/cumulus/docker/whitelist.json
493-
{
494-
"trusted_images": [ ],
495-
"limited_images": ["docker-wjh"]
496-
}
505+
cumulus@switch:~$ sudo systemctl stop cumulus-docker-resource-tier-governance.service
506+
cumulus@switch:~$ sudo systemctl disable cumulus-docker-resource-tier-governance.service
497507
```
498508

499-
You can edit this file to add trusted and limited images.
509+
When you stop the service, Cumulus Linux:
510+
- Disables governance for newly created containers.
511+
- Promotes existing governance-managed containers to `docker.slice`.
512+
- Clears the Restricted and Limited slice caps.
513+
- Leaves containers with a custom parent control group (cgroup) unchanged.
514+
- Leaves unmanaged containers that do not carry governance metadata unchanged.
515+
516+
When you disable governance, new containers default to the Trusted tier unless you select another parent slice.
517+
518+
To re-enable governance:
500519

501520
```
502-
cumulus@switch: sudo nano /etc/cumulus/docker/whitelist.json
503-
{
504-
"trusted_images": [
505-
"internal-app",
506-
"postgres"
507-
],
508-
"limited_images": [
509-
"jenkins-agent",
510-
"python-worker"
511-
]
512-
}
521+
cumulus@switch:~$ sudo systemctl enable cumulus-docker-resource-tier-governance.service
522+
cumulus@switch:~$ sudo systemctl start cumulus-docker-resource-tier-governance.service
513523
```
514524

515-
To show memory resource usage for containers, run the Linux `sudo cat /sys/fs/cgroup/cumulus-docker-trusted/memory.current` command.
525+
{{%notice note%}}
526+
- `systemctl enable` and `systemctl disable` control boot-time activation only. Runtime policy changes occur when you start, reload, restart, or stop the service.
527+
- If the service is already active, `systemctl start` does not reapply policy. Use `systemctl reload` to apply updated governance policy to a running system.
528+
{{%/notice%}}
516529

517-
```
518-
cumulus@switch: sudo cat /sys/fs/cgroup/cumulus-docker-trusted/memory.current
530+
### Promote, Demote, and Whitelist Images
519531

520-
```
532+
To move an image between tiers, edit the `/etc/cumulus/docker/resource-tier-governance/whitelist.json` file:
533+
- To promote an image to Trusted, add it to `trusted_images` and remove it from `limited_images`.
534+
- To move an image to Limited, add it to `limited_images` and remove it from `trusted_images`.
535+
- To demote an image back to Restricted, remove it from both `limited_images` and `trusted_images`.
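The three moves above all amount to removing the image from both lists, then adding it to at most one. A minimal sketch, assuming the two-key JSON layout shown earlier (`set_tier` is a hypothetical helper, not part of Cumulus Linux):

```python
def set_tier(whitelist: dict, image: str, tier: str) -> dict:
    """Return a copy of the whitelist with `image` in exactly one tier.

    `tier` is "trusted", "limited", or "restricted". Restricted means the
    image is listed in neither array.
    """
    if tier not in ("trusted", "limited", "restricted"):
        raise ValueError(f"unknown tier: {tier}")
    updated = dict(whitelist)  # shallow copy; the two lists are rebuilt below
    for key in ("trusted_images", "limited_images"):
        updated[key] = [i for i in whitelist.get(key, []) if i != image]
    if tier != "restricted":
        updated[f"{tier}_images"].append(image)
    return updated


whitelist = {"trusted_images": [], "limited_images": ["nv-gnmi", "docker-wjh"]}
print(set_tier(whitelist, "nv-gnmi", "trusted"))       # promote to Trusted
print(set_tier(whitelist, "docker-wjh", "restricted"))  # demote to Restricted
```

Because the helper always clears both lists first, it cannot leave an image listed in both tiers. Writing the result back to the whitelist file changes nothing on a running system until governance is applied again.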
521536

522-
To show CPU resource usage for containers, run the Linux `sudo cat sys/fs/cgroup/cumulus-docker-trusted/cpu.stat` command and `sudo cat /sys/fs/cgroup/cumulus-docker-limited/cpu.stat` command.
537+
After updating the whitelist, apply the change with the `sudo systemctl reload cumulus-docker-resource-tier-governance.service` command.
523538

524-
```
525-
cumulus@switch: sudo cat /sys/fs/cgroup/cumulus-docker-limited/cpu.stat
539+
Use `systemctl reload` instead of `systemctl restart` for normal policy updates.
540+
- `systemctl reload` re-reads the whitelist and quota files, reprograms the Restricted and Limited slice limits, and reconciles existing governance-managed containers against the updated policy. This avoids the extra churn caused by a full stop-then-start cycle.
541+
- `systemctl restart` is more disruptive because it executes a full stop followed by a start. The stop path disables governance and promotes governance-managed containers to `docker.slice`. The start path enables governance again and might recreate or restart the containers a second time to place them back into Limited or Restricted.
542+
543+
{{%notice note%}}
544+
- `systemctl restart` might cause unnecessary container churn during normal whitelist updates. Use `systemctl reload` when governance is already active. If the service is currently inactive, run the `sudo systemctl start cumulus-docker-resource-tier-governance.service` command.
545+
- If you update the `/etc/cumulus/docker/resource-tier-governance/quotas.conf` file, apply the change with the `sudo systemctl reload cumulus-docker-resource-tier-governance.service` command. For quota-only changes, `reload` updates the slice limits live and does not require recreating containers just to program the new slice cap values.
546+
{{%/notice%}}
526547

548+
### Considerations
549+
550+
- Existing containers are reconsidered only when you apply governance with a service action such as `start`, `reload`, `restart`, or `stop`.
551+
- Editing the `/etc/cumulus/docker/resource-tier-governance/whitelist.json` file or the `/etc/cumulus/docker/resource-tier-governance/quotas.conf` file does not retier existing containers until you apply governance again.
552+
- Containers created before you install the governance wrapper, or other containers without governance metadata, are treated as unmanaged and are left untouched. To bring these containers under governance, stop and remove them, then recreate them.
553+
- Containers created with an explicit Docker `--cgroup-parent` or Compose `cgroup_parent` remain under user control and are not re-tiered by governance.
554+
- Retiering an existing container can require a stop, remove, and recreate operation or, for workloads that `systemd` supervises, a restart of the owning unit. Plan policy changes accordingly for stateful or production workloads.
555+
- Governance enforces aggregate ceilings per tier. It does not provide guaranteed minimum CPU or memory reservations for each container inside a tier.
556+
557+
### Verify Container Placement and Tier Limits
558+
559+
To check the parent slice configured for a container:
560+
561+
```
562+
cumulus@switch:~$ docker inspect -f '{{.HostConfig.CgroupParent}}' <container_name>
527563
```
528564

529-
To show which container processes are trusted and which are limited, run the `sudo cat /sys/fs/cgroup/cumulus-docker-trusted/cgroup.procs` command or the `sudo cat /sys/fs/cgroup/cumulus-docker-limited/cgroup.procs` command:
565+
To verify that the running process is actually under the expected slice:
530566

531567
```
532-
cumulus@switch: sudo cat /sys/fs/cgroup/cumulus-docker-limited/cgroup.procs
568+
cumulus@switch:~$ PID=$(docker inspect -f '{{.State.Pid}}' <container_name>)
569+
cumulus@switch:~$ awk -F: '$1=="0" {print $3}' /proc/$PID/cgroup
570+
```
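The `awk` filter above selects the cgroup v2 (unified hierarchy) entry, which has hierarchy ID `0`, and prints its path. The same check can be sketched in Python for use in a script. The helper names and the sample path are hypothetical; the actual scope name under the slice depends on the container ID.

```python
from typing import Optional


def unified_cgroup_path(proc_cgroup_text: str) -> Optional[str]:
    """Return the cgroup v2 path from the contents of /proc/<pid>/cgroup."""
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)  # hierarchy-ID:controllers:path
        if len(parts) == 3 and parts[0] == "0":
            return parts[2]
    return None


def in_slice(cgroup_path: str, slice_name: str) -> bool:
    """True if `slice_name` is one of the path components."""
    return slice_name in cgroup_path.strip("/").split("/")


# Hypothetical /proc/<pid>/cgroup contents for a Limited-tier container.
sample = "0::/docker-limited.slice/docker-abc123.scope\n"
path = unified_cgroup_path(sample)
print(path)
print(in_slice(path, "docker-limited.slice"))
```

Comparing whole path components, rather than using a substring match, avoids confusing `docker.slice` with `docker-limited.slice`.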
571+
572+
To inspect the slice limits programmed by governance:
533573

534574
```
535-
-->
575+
cumulus@switch:~$ systemctl show docker-restricted.slice -p CPUQuotaPerSecUSec -p MemoryMax
576+
cumulus@switch:~$ systemctl show docker-limited.slice -p CPUQuotaPerSecUSec -p MemoryMax
577+
```
578+
579+
The authoritative CPU and memory limits are applied to the slice, not to the individual container scope. Verify both container placement and slice properties when checking governance behavior.
