Commit 950f4b4
Add GPU/Device Support and Fix Symlink Deduplication Issues (#176)
* Implement docker device request in CLI
The container runtime and a GPU device request can now be passed on the command line, for example:
--cro-device-request '{"Count":-1, \
"Capabilities":[["gpu"]]}' \
--cro-runtime nvidia \
...
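A minimal sketch of how a `--cro-device-request` value like the one above could be decoded, returning an error on malformed JSON rather than dropping the config. The `DeviceRequest` struct and `parseDeviceRequest` helper are hypothetical local stand-ins written for this illustration; they mirror the JSON field names in the example and are not the code from this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// DeviceRequest is a hypothetical local type that mirrors the JSON shape
// shown in the flag example. It is not the Docker SDK type.
type DeviceRequest struct {
	Driver       string            `json:"Driver,omitempty"`
	Count        int               `json:"Count"`
	DeviceIDs    []string          `json:"DeviceIDs,omitempty"`
	Capabilities [][]string        `json:"Capabilities"`
	Options      map[string]string `json:"Options,omitempty"`
}

// parseDeviceRequest decodes the raw flag value. Any JSON error is
// returned to the caller so the command can stop immediately.
func parseDeviceRequest(raw string) (*DeviceRequest, error) {
	var req DeviceRequest
	if err := json.Unmarshal([]byte(raw), &req); err != nil {
		return nil, fmt.Errorf("invalid --cro-device-request value: %w", err)
	}
	return &req, nil
}

func main() {
	req, err := parseDeviceRequest(`{"Count":-1, "Capabilities":[["gpu"]]}`)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("count=%d capabilities=%v\n", req.Count, req.Capabilities)
}
```

In Docker's API a `Count` of -1 requests all available devices, which is why the example uses it together with the `gpu` capability.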
* Fix issues with duplicate symlink processing
Correct issues with duplicate paths from symlink
processing when both paths point to the same
inode. This fixes a behavior where slimtoolkit
would randomly generate 0-byte files for the
actual file referenced by the symlink.
* fix: resolve ptrace monitor bugs causing hangs with multiprocess apps
Infinite loop in getStringParam: when a syscall had a NULL path
argument, PtracePeekData returned EIO with count=0. The function
silenced EIO, never advanced the pointer, and had no exit condition,
burning 100% CPU until exit. Fix: return early
on NULL pointer and on any PtracePeekData error.
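The loop fix described above can be sketched as follows. This is an illustration under stated assumptions, not the actual `getStringParam` code: `peekFunc` is a hypothetical stand-in for `PtracePeekData` so the example runs without a tracee, and `readCString` and `fakeMemory` are names invented here. The extra guard on an empty read is an addition of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// peekFunc is a hypothetical stand-in for PtracePeekData: it copies memory
// at addr into out and returns the number of bytes read.
type peekFunc func(addr uintptr, out []byte) (int, error)

// readCString reads a NUL-terminated string starting at addr.
func readCString(peek peekFunc, addr uintptr) string {
	if addr == 0 {
		return "" // NULL path argument: return early, nothing to read
	}
	var out []byte
	buf := make([]byte, 8) // illustrative chunk size
	for {
		n, err := peek(addr, buf)
		if err != nil || n <= 0 {
			break // stop on any error or empty read instead of spinning
		}
		for _, b := range buf[:n] {
			if b == 0 {
				return string(out)
			}
			out = append(out, b)
		}
		addr += uintptr(n) // always advance so the loop makes progress
	}
	return string(out)
}

// fakeMemory simulates a tracee address space that starts at base.
func fakeMemory(base uintptr, mem []byte) peekFunc {
	return func(addr uintptr, out []byte) (int, error) {
		if addr < base || addr >= base+uintptr(len(mem)) {
			return 0, errors.New("EIO: address not mapped")
		}
		return copy(out, mem[addr-base:]), nil
	}
}

func main() {
	peek := fakeMemory(0x1000, []byte("/usr/lib/libfoo.so\x00garbage"))
	fmt.Printf("%q\n", readCString(peek, 0x1000))
	fmt.Printf("%q\n", readCString(peek, 0)) // NULL pointer case
}
```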
* fix null ref crash when ptrace disabled
* fix: prevent namespace package directories from being excluded by IsSubdir false positive
OKReturnStatus for checkFile syscalls intentionally accepts ENOENT (-2)
to track Python import search paths. When Python imports a namespace
package (a directory without __init__.py), it probes for several
__init__.* variants -- all returning ENOENT -- before stat'ing the
directory itself (success). The radix tree walk in FileActivity() then
sees the directory as a prefix of these ghost child paths and marks it
IsSubdir=true, excluding it from the slim image. Since the ghost
children don't exist on disk, neither the directory nor any files are
preserved.
Add HasSuccessfulAccess to FSActivityInfo, set it only when retVal==0,
and require it in the IsSubdir determination so that ENOENT-only child
paths cannot cause a parent directory to be dropped.
* example use case with nvidia runtime
* Fix HasSuccessfulAccess for open-type syscalls where success is fd >= 0, not retVal == 0
* Key deduplicateFileMap by (dev, inode) instead of inode alone to avoid false dedup across mounts
* Fail fast on invalid --cro-device-request JSON instead of silently dropping the device config
* Remove unused SIGNAL_PIPE named pipe from test_vllm.sh
* Update pkg/monitor/ptrace/ptrace.go
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com>
* Update pkg/monitor/ptrace/ptrace.go
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com>
* patch up kilo-code-bot fix
* correct indentation
* Revert HasSuccessfulAccess to use retVal==0 check instead of OKReturnStatus
The kilo-code-bot suggested changing the HasSuccessfulAccess condition from
`e.retVal == 0 || p.SyscallType() == OpenFileType` to `p.OKReturnStatus(e.retVal)`,
arguing that the former incorrectly treated all OpenFileType syscalls as successful.
However, the original expression was correct:
- For OpenFileType (open, openat, readlink): the outer condition on lines 276-278
already filters to successful calls via `p.OKReturnStatus(e.retVal)`, which
requires fd >= 0. The `|| p.SyscallType() == OpenFileType` clause is redundant,
not wrong -- it simply ensures HasSuccessfulAccess=true for events that already
passed the success filter.
- For CheckFileType (stat, access, lstat): `e.retVal == 0` correctly marks only
files that actually exist on disk. The replacement `p.OKReturnStatus(e.retVal)`
also accepts ENOENT (-2) and ENOTDIR (-20), causing non-existent "ghost" paths
to be marked as HasSuccessfulAccess=true.
This broke the namespace package fix (da8eb2c): when Python probes for
__init__.cpython-311.so, __init__.so, __init__.py in a directory that has none
of them (namespace package), those ENOENT stat results created ghost children
with HasSuccessfulAccess=true. The FileActivity() radix tree walk then marked
the parent directory as IsSubdir and excluded it from the slim image. Since the
ghost children don't exist on disk, neither the directory nor its contents were
preserved, causing libraries to go missing.
* chore: moved to vLLM v0.17.1-cu130 for testing
* lint: fixed indenting in ptrace.go with gofmt
* add unit tests for bugs related to ghost paths
* Revert OKReturnStatus to success-only and remove HasSuccessfulAccess
OKReturnStatus for checkFileSyscallProcessor was broadened in 9aca3e9 to
accept ENOENT (-2) and ENOTDIR (-20) for Python import tracking. However,
ghost paths (non-existent files) that entered fsActivity via this gate
were never used by the artifact pipeline — prepareArtifact() discards them
at os.Lstat(). Meanwhile they caused three side effects:
1. Ghost leaf paths leaked into Report.FSActivity and triggered
prepareArtifact() calls for non-existent files
2. Ghost paths were published as MDETypeArtifact events
3. The permissive OKReturnStatus semantics caused a naming trap that led
to a regression (3f8dcb4) where HasSuccessfulAccess was set using
OKReturnStatus, breaking namespace package directories
This reverts OKReturnStatus to the upstream original (retVal == 0),
removes the HasSuccessfulAccess field and its plumbing (no longer needed
since only real paths enter fsActivity), and simplifies the FileActivity()
radix tree walk.
Tests updated to verify: ENOENT paths are rejected, no ghost paths in
reports, namespace package directories are preserved, IsSubdir works
correctly for real children.
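The end state described above can be sketched as follows: with a success-only status check, ENOENT (-2) probes never enter the activity map, so a namespace package directory has no ghost children and is not treated as a parent to be skipped. All names here (`event`, `okReturnStatus`, `recordActivity`, `subdirs`) are hypothetical, and a sorted prefix scan is swapped in for the radix tree walk in `FileActivity()`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// event is a hypothetical record of one stat-like syscall.
type event struct {
	path   string
	retVal int
}

// okReturnStatus is the success-only check: only retVal == 0 passes.
func okReturnStatus(retVal int) bool { return retVal == 0 }

// recordActivity keeps only paths whose syscall succeeded.
func recordActivity(events []event) map[string]bool {
	seen := make(map[string]bool)
	for _, e := range events {
		if okReturnStatus(e.retVal) {
			seen[e.path] = true
		}
	}
	return seen
}

// subdirs returns recorded paths that are a parent of another recorded
// path. A child always sorts after its parent, so scanning the later
// entries is enough.
func subdirs(seen map[string]bool) []string {
	paths := make([]string, 0, len(seen))
	for p := range seen {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	var out []string
	for i, p := range paths {
		prefix := strings.TrimSuffix(p, "/") + "/"
		for _, q := range paths[i+1:] {
			if strings.HasPrefix(q, prefix) {
				out = append(out, p)
				break
			}
		}
	}
	return out
}

func main() {
	events := []event{
		{"/app/ns/__init__.cpython-311.so", -2}, // ENOENT probe
		{"/app/ns/__init__.so", -2},             // ENOENT probe
		{"/app/ns/__init__.py", -2},             // ENOENT probe
		{"/app/ns", 0},                          // namespace package dir
		{"/app/pkg", 0},
		{"/app/pkg/mod.py", 0}, // real child
	}
	seen := recordActivity(events)
	fmt.Println("recorded:", len(seen))
	fmt.Println("subdirs:", subdirs(seen))
}
```

In this sketch `/app/pkg` is reported as a parent because it has a real child, while `/app/ns` is recorded as an ordinary entry.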
---------
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com>
1 parent 5117670 commit 950f4b4
23 files changed
Lines changed: 2778 additions & 36 deletions
File tree
- examples/nvidia_runtime
- pkg
- app
- master
- command
- build
- config
- inspectors/container
- security/seccomp
- sensor
- artifact
- monitor
- monitor/ptrace
- util/fsutil