Skip to content

Add support for cgroup-only device nodes#314

Draft
elezar wants to merge 3 commits into
cncf-tags:mainfrom
elezar:add-support-for-wildcard-devices
Draft

Add support for cgroup-only device nodes#314
elezar wants to merge 3 commits into
cncf-tags:mainfrom
elezar:add-support-for-wildcard-devices

Conversation

@elezar

@elezar elezar commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

This change proposes a mechanism to allow for the specification of cgroup-only device nodes (See #302). This is represented by a device node with a path equal to *. For such a device node, the device node information is NOT injected into the OCI runtime spec, but a cgroup rule is added for the major and minor number of the device. This is similar to the Docker --device-cgroup-rule argument and means that the device node NEED not exist on the host.

In addition, a mojor or minor number of -1 as specified in opencontainers/cgroup https://github.com/opencontainers/cgroups/blob/d492e44fbd24c97075c46c33e67ae43b5c7fa0ac/devices/config/device.go#L10 and used in runc is treated as a "wildcard" device matching all major (or minor) numbers.

Note that runc skips device nodes with an empty path: https://github.com/opencontainers/runc/blob/edbed618bff99edaaa9d622f9f8687004fe5e50a/libcontainer/rootfs_linux.go#L984-L988 assuming they are only provided for cgroup access.

elezar added 2 commits April 17, 2026 16:46
This change moves adding device nodes to the OCI spec to a new function
so as to make future extensions easier to implement.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar elezar force-pushed the add-support-for-wildcard-devices branch from 2dcb801 to cde499e Compare April 17, 2026 14:47
Comment on lines +414 to +421
var major *int64
if d.Major != -1 {
major = &d.Major
}
var minor *int64
if d.Minor != -1 {
minor = &d.Minor
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels as if this is the CORE change here since this SHOULD allow us to inject the cgroup rule even without performing additional checks as long as SOME non-zero device major is specified. This works because we explicitly ignore errors getting device node information if at least the Major and Type is is specified. What we need is the -1 to nil mapping since nil is interpreted as * by the OCI runtime spec.

@klihub klihub Apr 20, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a mental note... It's a pity we don't have Major and Minor as *int64 instead, so we could differentiate between zero- and omitted values.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually wonder, do we need "safety net' to prevent accidental "allow all devices" setting?

Comment thread pkg/cdi/container-edits.go Outdated
Comment thread pkg/cdi/container-edits.go Outdated
Comment thread pkg/cdi/container-edits.go Outdated
Comment thread pkg/cdi/container-edits.go Outdated
func (d *DeviceNode) isWildcardDevice() bool {
if d.Minor == -1 || d.Major == -1 {
return true
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot we (and if we can, shouldn't we) always require that the device path is always set to "*" for such a device ? I think it would be less confusing if we didn't allow it, and doing it otherwise also might give the false impression that something might be filled in from the same device on the host...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My one thought here was whether we wanted to use the path for naming (from a user perspective), but maybe always using * is better. That is to say that a wildcard device ALWAYS has a path that is exactly equal to "*" and MAY include specific major and minor numbers or -1 as a wildcard major or minor number.

@klihub klihub requested a review from bart0sh April 17, 2026 20:04
@klihub

klihub commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

@elezar This looks fine to me as a feature to have, but I have a few questions about some implementation/design details.

@elezar elezar force-pushed the add-support-for-wildcard-devices branch from cde499e to bdc6b5a Compare April 20, 2026 11:58

@klihub klihub left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM now.

@klihub klihub self-requested a review April 20, 2026 14:22

@klihub klihub left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM now. We'll need to update documentation and tests, though.

@elezar

elezar commented Apr 20, 2026

Copy link
Copy Markdown
Contributor Author

This LGTM now. We'll need to update documentation and tests, though.

Yes, I just wanted to get some early feedback before adding more of those.

Another question. I suppose that this would justify a spec bump, or would we consider back-porting this to older versions? If we do bump the spec, then we would have to update the minimum version checks.

@klihub

klihub commented Apr 20, 2026

Copy link
Copy Markdown
Contributor

This LGTM now. We'll need to update documentation and tests, though.

Yes, I just wanted to get some early feedback before adding more of those.

Another question. I suppose that this would justify a spec bump, or would we consider back-porting this to older versions? If we do bump the spec, then we would have to update the minimum version checks.

Yes, I think this probably warrants a spec version bump. In principle it changes how we semantically interpret the spec. Although now that we require the device name for a wildcard injection to be a literal "*", the practical difference is reduced to a "it would have failed earlier but now it won't and will have a special meaning", and it hard to imagine that someone could possibly rely on the old failing behavior. But better be safe.

About backporting (without bumping the Spec version), I think that would not help those who need the new behavior, if/when they ran with an already released runtime version. That would just unexpectedly fail.

@elezar elezar force-pushed the add-support-for-wildcard-devices branch from bdc6b5a to 07292c1 Compare April 21, 2026 11:58
Comment thread pkg/cdi/container-edits.go Outdated
@elezar elezar force-pushed the add-support-for-wildcard-devices branch 3 times, most recently from 4be965e to 680d858 Compare April 21, 2026 14:35
This change allows the following device node CDI spec to be considered
valid:

path: "*"
type: "c"
major: 195
minor: -1
permissions: "rwm"

When this is included in the spec, no device nodes are added to the OCI
runtime spec, but the linux resources (cgroups) are updated to allow
access to all devices in the same way as the Docker `--device-cgroup-rule`
flag.

Note that a major or minor number of -1 is equivalent to a * cgroup rule.

This is useful when starting a container that needs to be able to create
device nodes that do not exist when it is started.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar elezar force-pushed the add-support-for-wildcard-devices branch from 680d858 to 7646b1b Compare April 21, 2026 14:37
@elezar elezar changed the title Add support for wildcard devices Add support for cgroup-only device nodes Apr 21, 2026
@elezar elezar added this to the CDI Specification v1.2.0 milestone Apr 21, 2026
@elezar elezar marked this pull request as ready for review April 21, 2026 14:44
@elezar elezar marked this pull request as draft April 21, 2026 14:44
Comment thread specs-go/version.go
v080: requiresV080,
v100: requiresV100,
v110: requiresV110,
v120: requiresV120,

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self. I still need to add tests for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants