Skip to content

Add GPU/Device Support and Fix Symlink Deduplication Issues#176

Open
akshaver wants to merge 19 commits intomintoolkit:masterfrom
akshaver:ashaver/nv_runtime
Open

Add GPU/Device Support and Fix Symlink Deduplication Issues#176
akshaver wants to merge 19 commits intomintoolkit:masterfrom
akshaver:ashaver/nv_runtime

Conversation

@akshaver
Copy link
Copy Markdown

@akshaver akshaver commented Feb 23, 2026

Fixes-102, Fixes-306, Fixes-401, Fixes-579

What

This MR introduces four patches that add GPU support to docker-slim and fix critical symlink processing bugs:

  1. Implement docker device request in CLI - Adds --cro-device-request and --cro-runtime CLI flags to enable GPU access during container profiling. Allows passing device requests as JSON (e.g., --cro-device-request '{"Count":-1, "Capabilities":[["gpu"]]}' --cro-runtime nvidia).

  2. Fix issues with duplicate symlink processing - Resolves a bug where files accessed through multiple symlinked paths (e.g., /usr/local/cuda/ vs /usr/local/cuda-12.9/) would be copied multiple times, with later copies randomly overwriting with 0-byte content. Implements inode-based deduplication to ensure only one canonical copy is kept.

  3. Example use case with nvidia runtime - Adds documentation and example scripts demonstrating GPU workloads:

    • test_nvidia_smi.sh - Slims ubuntu to run nvidia-smi
    • test_nvidia_pytorch.sh - Slims nvidia-pytorch to run CUDA tests
  4. Non-trivial example of slimming vllm - Adds a comprehensive example for slimming VLLM (LLM inference) containers with full API test suite validation.

  5. Fixes PTRACE handling bugs Exits an infinite loop that occurs when syscall had a NULL path
    argument. Don't drop parent paths when ENOENT-only child paths.

Why

  • GPU Support (Print os.Args prior to exiting. #102, #401): Many modern workloads require GPU access, especially in ML/AI contexts. Without --cro-device-request and --cro-runtime, it was impossible to profile containers that required GPU access during execution, making docker-slim unusable for CUDA-based images.

  • Symlink Bug (#306, #579): The symlink deduplication bug caused slimmed images to have corrupted (0-byte) library files, particularly in NVIDIA CUDA containers where /usr/local/cuda symlinks to versioned directories like /usr/local/cuda-12.9. This made slimmed images non-functional, as critical .so files would be empty.

How Tested

  • Unit Tests: Added comprehensive test coverage for:

    • Device request JSON parsing and CLI flag handling (clifvgetter_test.go, config_test.go, device_request_test.go)
    • Symlink deduplication logic (dedup_test.go)
    • Ptrace return status handling (ptrace_test.go)
    • File copy operations (fsutil_test.go)
  • Integration Tests:

    • test_nvidia_smi.sh - Verifies slimmed ubuntu image can run nvidia-smi with identical output to original
    • test_nvidia_pytorch.sh - Verifies slimmed pytorch image passes CUDA tests
    • test_nvidia_vllm.sh - Comprehensive VLLM API test suite comparing original vs slimmed image behavior (15 endpoint tests)
  • Manual Testing: Successfully slimmed VLLM containers with GPU workloads and verified functional parity.

Now one can pass in the cro runtime and GPU like
  --cro-device-request '{"Count":-1, \
    "Capabilities":[["gpu"]]}' \
  --cro-runtime nvidia \
  ...
Correct issues with duplicate paths from symlink
processing, when they both point to the same
inode. This fixes a behavior where slimtookit
would randomly generate o-byte files for the
actual file referenced by the symlink.
Infinite loop in getStringParam: when a syscall had a NULL path
argument, PtracePeekData returned EIO with count=0. The function
silenced EIO, never advanced the pointer, and had no exit condition,
burning 100% CPU until exit. Fix: return early
on NULL pointer and on any PtracePeekData error.
…ubdir false positive

OKReturnStatus for checkFile syscalls intentionally accepts ENOENT (-2)
to track Python import search paths. When Python imports a namespace
package (a directory without __init__.py), it probes for several
__init__.* variants -- all returning ENOENT -- before stat'ing the
directory itself (success). The radix tree walk in FileActivity() then
sees the directory as a prefix of these ghost child paths and marks it
IsSubdir=true, excluding it from the slim image. Since the ghost
children don't exist on disk, neither the directory nor any files are
preserved.

Add HasSuccessfulAccess to FSActivityInfo, set it only when retVal==0,
and require it in the IsSubdir determination so that ENOENT-only child
paths cannot cause a parent directory to be dropped.
@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot bot commented Feb 23, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (3 files)
  • pkg/monitor/ptrace/ptrace.go
  • pkg/monitor/ptrace/ptrace_test.go
  • pkg/report/container_report.go

Reviewed by grok-code-fast-1 · 119,163 tokens

@dosubot
Copy link
Copy Markdown

dosubot bot commented Feb 23, 2026

Related Documentation

Checked 2 published document(s) in 1 knowledge base(s). No updates required.

How did I do? Any feedback?  Join Discord

@kcq
Copy link
Copy Markdown
Contributor

kcq commented Feb 23, 2026

thank you @akshaver for creating the PR here!

@kcq
Copy link
Copy Markdown
Contributor

kcq commented Feb 23, 2026

auggie review

@augmentcode
Copy link
Copy Markdown

augmentcode bot commented Feb 23, 2026

🤖 Augment PR Summary

Summary: This PR adds GPU/device support for profiling runs and fixes several correctness issues in file/symlink processing and ptrace-based monitoring.

Changes:

  • Added `--cro-device-request` CLI flag and plumbed it through `ContainerRunOptions` into the container inspector to populate Docker `HostConfig.DeviceRequests` for GPU access.
  • Hardened seccomp/profile generation and sensor artifact processing to handle missing/disabled PT monitor reports safely.
  • Implemented inode-based deduplication and symlink-aware copy behavior in the sensor artifact store to prevent duplicate copies and 0-byte overwrites (notably for CUDA symlink trees).
  • Improved ptrace monitor robustness (NULL path pointers, ENOENT/ENOTDIR handling) and added a success marker used to avoid dropping parent paths due to “ghost” children.
  • Added unit tests for CLI flag extraction, config struct behavior, device-request JSON parsing, ptrace return-status logic, and fsutil copy behavior.
  • Added NVIDIA runtime examples and scripts (nvidia-smi, PyTorch, VLLM) plus a VLLM API test harness for end-to-end validation.

Technical Notes: Device requests are passed as JSON matching Docker’s DeviceRequest schema; artifact dedup relies on filesystem metadata to select a single canonical copy when multiple paths reference the same underlying file.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 4 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

@akshaver
Copy link
Copy Markdown
Author

Will try to address these comments today or tomorrow, so the PR doesn't lose momentum.

akshaver and others added 2 commits February 27, 2026 21:01
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com>
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com>
@akshaver
Copy link
Copy Markdown
Author

@kcq let me know if you need anything else with the PR.

@akshaver
Copy link
Copy Markdown
Author

akshaver commented Mar 2, 2026

The recent kilo-code changes are breaking the vllm integration test. Tracking down why.

…Status

The kilo-code-bot suggested changing the HasSuccessfulAccess condition from
`e.retVal == 0 || p.SyscallType() == OpenFileType` to `p.OKReturnStatus(e.retVal)`,
arguing that the former incorrectly treated all OpenFileType syscalls as successful.

However, the original expression was correct:

- For OpenFileType (open, openat, readlink): the outer condition on line 276-278
  already filters to successful calls via `p.OKReturnStatus(e.retVal)`, which
  requires fd >= 0. The `|| p.SyscallType() == OpenFileType` clause is redundant,
  not wrong -- it simply ensures HasSuccessfulAccess=true for events that already
  passed the success filter.

- For CheckFileType (stat, access, lstat): `e.retVal == 0` correctly marks only
  files that actually exist on disk. The replacement `p.OKReturnStatus(e.retVal)`
  also accepts ENOENT (-2) and ENOTDIR (-20), causing non-existent "ghost" paths
  to be marked as HasSuccessfulAccess=true.

This broke the namespace package fix (da8eb2c): when Python probes for
__init__.cpython-311.so, __init__.so, __init__.py in a directory that has none
of them (namespace package), those ENOENT stat results created ghost children
with HasSuccessfulAccess=true. The FileActivity() radix tree walk then marked
the parent directory as IsSubdir and excluded it from the slim image. Since the
ghost children don't exist on disk, neither the directory nor its contents were
preserved, causing libraries to go missing.
@akshaver akshaver force-pushed the ashaver/nv_runtime branch from 8c0c5e6 to a2ad80e Compare March 2, 2026 18:41
@akshaver
Copy link
Copy Markdown
Author

akshaver commented Mar 2, 2026

Reverted back to pre kilo-code-bot suggestions.

The kilo-code-bot suggested changing the HasSuccessfulAccess condition from
e.retVal == 0 || p.SyscallType() == OpenFileType to p.OKReturnStatus(e.retVal),
arguing that the former incorrectly treated all OpenFileType syscalls as successful.

Original expression was correct:

  • For OpenFileType (open, openat, readlink): the outer condition on line 276-278
    already filters to successful calls via p.OKReturnStatus(e.retVal), which
    requires fd >= 0. The || p.SyscallType() == OpenFileType clause is redundant,
    not wrong -- it simply ensures HasSuccessfulAccess=true for events that already
    passed the success filter.

  • For CheckFileType (stat, access, lstat): e.retVal == 0 correctly marks only
    files that actually exist on disk. The replacement p.OKReturnStatus(e.retVal)
    also accepts ENOENT (-2) and ENOTDIR (-20), causing non-existent "ghost" paths
    to be marked as HasSuccessfulAccess=true.

This broke the namespace package fix (da8eb2c): when Python probes for
init.cpython-311.so, init.so, init.py in a directory that has none
of them (namespace package), those ENOENT stat results created ghost children
with HasSuccessfulAccess=true. The FileActivity() radix tree walk then marked
the parent directory as IsSubdir and excluded it from the slim image. Since the
ghost children don't exist on disk, neither the directory nor its contents were
preserved, causing libraries to go missing.

@akshaver
Copy link
Copy Markdown
Author

akshaver commented Mar 3, 2026

The recent kilo-code changes are breaking the vllm integration test. Tracking down why.

Resolved. kilo-code-bot was wrong.

@kcq
Copy link
Copy Markdown
Contributor

kcq commented Mar 5, 2026

Thank you @akshaver for investigating and addressing all of those! Really appreciate it!

@akshaver
Copy link
Copy Markdown
Author

Gently checking in. Is there anything else I can help with on this?

@kcq
Copy link
Copy Markdown
Contributor

kcq commented Mar 28, 2026

Gently checking in. Is there anything else I can help with on this?

Still need to review the side effects of this change (where not found paths are also saved) for the stat calls in PT because it's the opposite of the intended behavior (it was meant to select only successful calls for existing paths):

func (ref *checkFileSyscallProcessor) OKReturnStatus(retVal uint64) bool {
	// Accept successful stat calls (0) and also failed attempts that indicate
	// the application was looking for the file. This is important for Python
	// imports which check multiple locations before finding the right file.
	// Track ENOENT (file not found) and ENOTDIR (not a directory) in addition to success.
	intRetVal := getIntVal(retVal)
	return intRetVal == 0 || intRetVal == -2 || intRetVal == -20 // 0=success, -2=ENOENT, -20=ENOTDIR
}

@akshaver
Copy link
Copy Markdown
Author

akshaver commented Apr 2, 2026

Working through an understanding that on my end. I also passed the comment and code through claude. I'm guessing I need to work through the possible open bugs, maybe consider renaming for clarity.

--------------claude -------------

TL;DR — What Changed, What Was Intended, and What May Be Broken

Original architecture intent

The ptrace monitor was designed to track only successful file-system calls —
files that actually exist on disk. For stat-type syscalls (stat, lstat,
access, newfstatat, etc.), success means retVal == 0. The resulting
fsActivity map was meant to be a faithful picture of what the application
found, not what it looked for. Downstream consumers (FileActivity() radix
tree walk, prepareArtifact(), MonitorDataEvent publishing) all assumed every
entry in fsActivity corresponded to a real on-disk path.

What the PR changed and why

OKReturnStatus for checkFileSyscallProcessor was broadened to also accept
ENOENT (-2) and ENOTDIR (-20), so that Python import search paths would be tracked.
Python probes multiple locations (__init__.cpython-311.so, __init__.so,
__init__.py) before finding the right file; without tracking the failed probes,
the slim image could miss paths the runtime needs to search. This means
fsActivity now contains "ghost" paths — entries for files that don't exist.

A second-level guard (HasSuccessfulAccess) was introduced to distinguish real
paths from ghosts, and FileActivity() was updated to check it before marking
parent directories as IsSubdir.

Possible open bugs in this MR

  1. Ghost leaf paths leak into the final report. FileActivity() only filters
    paths marked IsSubdir=true. Ghost leaf paths (no children) have
    HasSuccessfulAccess=false but pass through unfiltered. They appear in
    Report.FSActivity and get processed by prepareArtifact() for files that
    don't exist. In a Python app probing hundreds of import locations, this inflates
    the report and produces "missing in rawNames" log noise. (See side effect Bump actions/download-artifact from 4.1.2 to 4.1.4 #2, Bump test/bats from ab2a86e to 36451c9 #3
    below.)

  2. Ghost paths are published as MDETypeArtifact events. Every path that
    passes the OKReturnStatus gate gets published via app.del.Publish(),
    regardless of whether the file exists. Event stream consumers may receive
    artifact events for non-existent files. (See side effect Bump test/bats from 36451c9 to db381c7 #4 below.)

  3. The OKReturnStatus / FailedReturnStatus naming is misleading. ENOENT
    returns true for both methods simultaneously. This already led to a
    plausible-looking but semantically wrong bot fix (3f8dcb4) that collapsed the
    gate and flag into the same condition, breaking namespace packages. The confusing
    naming makes future misuse likely. (See the inconsistency table below.)


Summary

checkFileSyscallProcessor.OKReturnStatus accepts ENOENT (-2) and ENOTDIR (-20)
in addition to success (0). This was intentional — Python imports probe multiple
paths before finding the right file. However, it means non-existent "ghost" paths
enter fsActivity
, which has downstream consequences beyond the already-mitigated
IsSubdir false positive.

Architecture: The Two-Level Design

Level 1 — The Gate (processFileActivity, line 276-278)

if (p.SyscallType() == CheckFileType || p.SyscallType() == OpenFileType) &&
    p.OKReturnStatus(e.retVal) {

This decides whether a path enters fsActivity at all. Because
checkFileSyscallProcessor.OKReturnStatus accepts ENOENT/ENOTDIR, stat calls for
non-existent files pass this gate and get recorded.

For openFileSyscallProcessor, OKReturnStatus requires fd >= 0 (true success),
so only actually-opened files pass the gate.

Level 2 — The Flag (HasSuccessfulAccess, line 289-303)

HasSuccessfulAccess: e.retVal == 0 || p.SyscallType() == OpenFileType

This marks whether the path actually exists on disk. For CheckFileType, only
retVal == 0 sets it. For OpenFileType, the outer gate already ensures success
(fd >= 0), so || p.SyscallType() == OpenFileType is redundant but correct.

Used downstream in FileActivity() (line 520) to prevent ghost children from
causing IsSubdir = true on parent directories.

The OKCall / OKReturnStatus Inconsistency

These four methods on checkFileSyscallProcessor are not complementary pairs:

retVal OKCall OKReturnStatus FailedCall FailedReturnStatus
0 (success) true true false false
-2 (ENOENT) false true true true
-20 (ENOTDIR) false true true true
-1 (EPERM) false false true true

ENOENT is simultaneously "OK" (via OKReturnStatus) and "Failed" (via
FailedReturnStatus). This is by design — it means "the call failed, but the
path is interesting and should be tracked" — but the naming doesn't convey this
dual meaning. A name like TrackableReturnStatus or InterestingReturnStatus
would be clearer.

Identified Side Effects

1. IsSubdir False Positive — MITIGATED ✓

Scenario: Python probes /usr/lib/foo/__init__.cpython-311.so,
__init__.so, __init__.py (all ENOENT), then stats /usr/lib/foo/ (success).

Before the fix (da8eb2c), the radix tree walk saw ghost children under
/usr/lib/foo/ and marked it IsSubdir=true, excluding it from the slim image.

Mitigation: HasSuccessfulAccess check in FileActivity() (line 520) ensures
only children that actually exist on disk can trigger IsSubdir.

2. Ghost Paths in FSActivity Report — NOT MITIGATED

FileActivity() filters out paths marked IsSubdir=true, but ghost leaf paths
(with no children of their own) pass through to the final Report.FSActivity.

Concrete scenario: Python stats /usr/lib/foo/__init__.cpython-311.so → ENOENT.
This path enters fsActivity with HasSuccessfulAccess=false. Since it has no
child paths, FileActivity() doesn't filter it. It appears in the report as a
tracked artifact, even though the file doesn't exist.

Impact: Report bloat. In a Python app that probes hundreds of import paths across
many packages, this can significantly inflate the FSActivity map. However, these
ghost entries have HasSuccessfulAccess=false, so downstream consumers that check
this flag are not affected.

3. Ghost Paths Trigger prepareArtifact() — NOT MITIGATED

In artifact.go (line 851-868), every path in Report.FSActivity is looked up in
rawNames and processed via prepareArtifact(). Ghost paths won't be found in
rawNames (since they don't exist on disk), producing "missing in rawNames"
debug-level log messages but otherwise failing silently.

Impact: Minor — debug log noise and wasted lookups, but no incorrect behavior.

4. Ghost Paths Published as MonitorDataEvents — NOT MITIGATED

In processFileActivity (line 314-337), every path that passes the gate gets
published as an MDETypeArtifact event, regardless of HasSuccessfulAccess.

Impact: Event stream consumers see artifact events for non-existent files.
Whether this matters depends on how those events are consumed downstream.

The HasSuccessfulAccess Revert History

Commit HasSuccessfulAccess expression Status
da8eb2c e.retVal == 0 Correct for CheckFileType, wrong for OpenFileType
77da28d e.retVal == 0 || p.SyscallType() == OpenFileType Correct for both
3f8dcb4 p.OKReturnStatus(e.retVal) (kilo-code-bot) BROKEN — ENOENT ghosts get HasSuccessfulAccess=true
a2ad80e e.retVal == 0 || p.SyscallType() == OpenFileType Reverted to correct

The kilo-code-bot change was plausible-looking but semantically wrong: it collapsed
Level 1 (gate) and Level 2 (flag) into the same condition, defeating the purpose
of HasSuccessfulAccess.

Possible Improvements (Not Prescriptive)

  1. Filter ghost paths in FileActivity(): After the radix walk, also exclude
    paths with HasSuccessfulAccess=false. This would eliminate side effects 2-4
    and reduce report size. However, it would lose the signal that the application
    tried to access those paths, which might be useful for debugging.

  2. Rename OKReturnStatus to TrackableReturnStatus or
    ShouldTrackPath to make the dual semantics explicit.

  3. Add HasSuccessfulAccess check to event publishing (line 314-337)
    to avoid publishing ghost artifacts.

  4. Split the gate: Instead of a permissive OKReturnStatus, have
    processFileActivity call a separate ShouldTrackPath method, keeping
    OKReturnStatus for its original meaning (success only). This would make the
    code match the naming and avoid future confusion.
    -----------------/claude----------------

akshaver added 2 commits April 2, 2026 20:46
OKReturnStatus for checkFileSyscallProcessor was broadened in 9aca3e9 to
accept ENOENT (-2) and ENOTDIR (-20) for Python import tracking. However,
ghost paths (non-existent files) that entered fsActivity via this gate
were never used by the artifact pipeline — prepareArtifact() discards them
at os.Lstat(). Meanwhile they caused three side effects:

1. Ghost leaf paths leaked into Report.FSActivity and triggered
   prepareArtifact() calls for non-existent files
2. Ghost paths were published as MDETypeArtifact events
3. The permissive OKReturnStatus semantics caused a naming trap that led
   to a regression (3f8dcb4) where HasSuccessfulAccess was set using
   OKReturnStatus, breaking namespace package directories

This reverts OKReturnStatus to the upstream original (retVal == 0),
removes the HasSuccessfulAccess field and its plumbing (no longer needed
since only real paths enter fsActivity), and simplifies the FileActivity()
radix tree walk.

Tests updated to verify: ENOENT paths are rejected, no ghost paths in
reports, namespace package directories are preserved, IsSubdir works
correctly for real children.
@akshaver
Copy link
Copy Markdown
Author

akshaver commented Apr 2, 2026

Added 3 tests for those cases, then removed the corresponding code that caused the issue. Then I re-ran the original test_vllm.sh test, which still works.

@kcq Let me know if that fixes the issue that concerned you, or if there's still something else I can help address.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants