30 changes: 30 additions & 0 deletions .claude/skills/debug/SKILL.md
@@ -0,0 +1,30 @@
---
name: debug
description: Run commands inside a remote Docker container via the file-based command relay (tools/debugger). Use when the user says "run in Docker", "run on GPU", "debug remotely", "run test in container", "check nvidia-smi", "run pytest in Docker", or needs to execute any command inside a Docker container that shares the repo filesystem. Requires the user to have started server.sh inside the container first.
---

# Remote Docker Debugger

Execute commands inside a Docker container from the host using the file-based command relay.

**Read `tools/debugger/CLAUDE.md` for full usage details** — it has the protocol, examples, and troubleshooting.

## Quick Reference

```bash
# Check connection
bash tools/debugger/client.sh status

# Connect to server (user must start server.sh in Docker first)
bash tools/debugger/client.sh handshake

# Run a command
bash tools/debugger/client.sh run "<command>"

# Long-running command (default timeout is 600s)
bash tools/debugger/client.sh --timeout 1800 run "<command>"

# Reconnect after server restart
bash tools/debugger/client.sh flush
bash tools/debugger/client.sh handshake
```
2 changes: 2 additions & 0 deletions tools/debugger/client.sh
@@ -91,6 +91,8 @@ case "$SUBCOMMAND" in
         # Generate a unique command ID (timestamp + PID to avoid collisions)
         cmd_id="$(date +%s%N)_$$"
 
+        echo "[client] Running: $*"
+
         # Write the command file atomically (tmp + mv)
         echo "$*" > "$CMD_DIR/$cmd_id.sh.tmp"
         mv "$CMD_DIR/$cmd_id.sh.tmp" "$CMD_DIR/$cmd_id.sh"
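
The tmp + mv pattern in this hunk matters because the server polls with a `*.sh` glob: a file named `*.sh.tmp` never matches it, and a rename within the same filesystem is atomic, so the poller should never pick up a half-written command. A minimal sketch of the idea, assuming GNU `date` (for `%N`) and using a throwaway temporary directory in place of the real relay directory; the `submit_command` helper is hypothetical and not part of `client.sh`:

```bash
#!/usr/bin/env bash
set -eu

# Throwaway directory standing in for the relay's command directory (assumption).
CMD_DIR="$(mktemp -d)"

# Hypothetical helper: write a command file so that a poller globbing *.sh
# only ever sees complete files. Prints the generated command ID.
submit_command() {
    local cmd_id
    cmd_id="$(date +%s%N)_$$"
    # 1. Write under a name that the poller's *.sh glob does not match.
    printf '%s\n' "$*" > "$CMD_DIR/$cmd_id.sh.tmp"
    # 2. Rename within the same directory; the file appears under its
    #    final name only once its content is complete.
    mv "$CMD_DIR/$cmd_id.sh.tmp" "$CMD_DIR/$cmd_id.sh"
    printf '%s\n' "$cmd_id"
}

cmd_id="$(submit_command echo hello)"
```

After the call, `$CMD_DIR` holds exactly one file, `$cmd_id.sh`, containing the line `echo hello`.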
30 changes: 21 additions & 9 deletions tools/debugger/server.sh
@@ -87,17 +87,28 @@ fi
 rm -rf "$RELAY_DIR"
 mkdir -p "$CMD_DIR" "$RESULT_DIR"
 
-# Install modelopt in editable mode (skip if already editable-installed from WORKDIR)
-if python -c "
-import modelopt, os
-assert os.path.realpath(modelopt.__path__[0]).startswith(os.path.realpath('$WORKDIR'))
-" 2>/dev/null; then
+# Ensure modelopt is editable-installed from WORKDIR
+check_modelopt_local() {
+    python -c "
+import modelopt, os, sys
+actual = os.path.realpath(modelopt.__path__[0])
+expected = os.path.realpath('$WORKDIR')
+if not actual.startswith(expected):
+    print(f'modelopt loaded from {actual}, expected under {expected}', file=sys.stderr)
+    sys.exit(1)
+" 2>&1
+}
+
+if check_modelopt_local >/dev/null 2>&1; then
     echo "[server] modelopt already editable-installed from $WORKDIR, skipping pip install."
 else
     echo "[server] Installing modelopt (pip install -e .[dev]) ..."
-    (cd "$WORKDIR" && pip install -e ".[dev]") || {
-        echo "[server] WARNING: pip install failed (exit=$?), continuing anyway."
-    }
+    (cd "$WORKDIR" && pip install -e ".[dev]")
+    if ! check_modelopt_local; then
+        echo "[server] ERROR: modelopt is not running from the local folder ($WORKDIR)."
+        echo "[server] Try: pip install -e '.[dev]' inside the container, then restart the server."
+        exit 1
+    fi
     echo "[server] Install done."
 fi
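
`check_modelopt_local` compares the two resolved paths with a plain string-prefix test. A prefix test cannot tell a subdirectory from a sibling directory that shares the same prefix, such as `/workspace/repo` and `/workspace/repo-old`. The sketch below shows a boundary-aware comparison as a hypothetical shell helper, `is_under`, which is not part of `server.sh`; it assumes both paths are already resolved, and the example paths are illustrative only:

```bash
# Hypothetical helper: succeed only if path $1 equals root $2 or lies inside it.
# Both arguments are assumed to be already-resolved absolute paths.
is_under() {
    local path="${1%/}" root="${2%/}"   # drop one trailing slash, if any
    case "$path" in
        "$root" | "$root"/*) return 0 ;;   # exact match, or root followed by "/"
        *) return 1 ;;
    esac
}

is_under /workspace/repo/modelopt /workspace/repo && echo "inside"
is_under /workspace/repo-old/modelopt /workspace/repo || echo "outside"
```

The first call prints `inside` and the second prints `outside`, whereas a bare prefix comparison would accept both paths.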

@@ -130,7 +141,8 @@ while true; do
 
     for cmd_file in "$CMD_DIR"/*.sh; do
         cmd_id="$(basename "$cmd_file" .sh)"
-        echo "[server] Executing command $cmd_id..."
+        cmd_content=$(cat "$cmd_file")
+        echo "[server] Executing command $cmd_id: $cmd_content"
 
         # Execute the command, tee stdout+stderr to console and result file
         (cd "$WORKDIR" && bash "$cmd_file" 2>&1) | tee "$RESULT_DIR/$cmd_id.log" || true
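
On the execution line in this hunk, `|| true` discards the pipeline status, so the command's own exit code does not survive that line. The rest of `server.sh` is not shown here and may record it another way. A minimal sketch of a single polling pass that keeps the exit code through bash's `PIPESTATUS` array follows. It uses throwaway directories, and the `.exit` file name and the removal of processed command files are assumptions for illustration, not the relay's documented protocol:

```bash
#!/usr/bin/env bash
set -u

# Throwaway directories standing in for the relay's layout (assumption).
RELAY_DIR="$(mktemp -d)"
CMD_DIR="$RELAY_DIR/cmd"
RESULT_DIR="$RELAY_DIR/result"
mkdir -p "$CMD_DIR" "$RESULT_DIR"

# Queue two commands, one file per command.
echo 'echo ok' > "$CMD_DIR/a.sh"
echo 'echo boom >&2; exit 3' > "$CMD_DIR/b.sh"

# One polling pass (a long-running server would wrap this in a loop).
for cmd_file in "$CMD_DIR"/*.sh; do
    [ -e "$cmd_file" ] || continue   # the glob matched nothing
    cmd_id="$(basename "$cmd_file" .sh)"

    # Run the command, mirroring combined stdout+stderr to console and log.
    bash "$cmd_file" 2>&1 | tee "$RESULT_DIR/$cmd_id.log"
    # PIPESTATUS[0] is the exit code of the command itself, not of tee.
    exit_code="${PIPESTATUS[0]}"

    # Hypothetical: record the exit code next to the log.
    echo "$exit_code" > "$RESULT_DIR/$cmd_id.exit"
    # Assumption: remove the processed file so it is not run again.
    rm -f "$cmd_file"
done
```

`PIPESTATUS` has to be read immediately after the pipeline, because the next command overwrites it.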