
docs(reference): document Docker-driver compute path alongside legacy k3s #3433

Open
latenighthackathon wants to merge 2 commits into NVIDIA:main from latenighthackathon:docs/architecture-docker-driver

Conversation

Contributor

@latenighthackathon latenighthackathon commented May 13, 2026

Summary

Refresh docs/reference/architecture.md and docs/get-started/prerequisites.md so they reflect the dual-path compute model that landed in 0.0.39 via #3001 (Linux) and #3383 (macOS Apple Silicon). Both pages currently describe the OpenShell sandbox exclusively as a Kubernetes pod inside a gateway-embedded k3s cluster, which is no longer accurate for the platforms most NemoClaw users run on.

Problem

Per src/lib/onboard.ts:3411-3416, isLinuxDockerDriverGatewayEnabled returns true for platform === "linux" and for platform === "darwin" && arch === "arm64", so the Docker-driver path is the default with no opt-in on those platforms. The legacy k3s path still applies to macOS Intel, Windows, and WSL2.
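
For illustration only, the platform gate described above could be sketched as follows. This is not the code from src/lib/onboard.ts: the names `ComputePath`, `HostInfo`, and `selectComputePath` are hypothetical, and the separate `isWsl` flag is an assumption about how WSL2 ends up on the legacy path, since Node.js reports `linux` as the platform inside WSL2.

```typescript
// Hypothetical sketch: none of these names are taken from src/lib/onboard.ts.
type ComputePath = "docker-driver" | "legacy-k3s";

interface HostInfo {
  platform: string; // for example, the value of process.platform
  arch: string;     // for example, the value of process.arch
  isWsl: boolean;   // assumption: the caller detects WSL2 separately
}

function selectComputePath(host: HostInfo): ComputePath {
  // Assumed ordering: WSL2 reports platform "linux", so exclude it first.
  if (host.isWsl) return "legacy-k3s";
  // Linux and macOS Apple Silicon default to the Docker-driver path.
  if (host.platform === "linux") return "docker-driver";
  if (host.platform === "darwin" && host.arch === "arm64") return "docker-driver";
  // macOS Intel and Windows stay on the legacy k3s path.
  return "legacy-k3s";
}
```

Under these assumptions, `selectComputePath({ platform: "darwin", arch: "x64", isWsl: false })` yields `"legacy-k3s"`, matching the macOS Intel case named above.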

The release notes for 0.0.39 in docs/about/release-notes.md call out the Docker-driver migration, but architecture.md and prerequisites.md were not refreshed in #3375 (release-prep PR) and still tell readers the sandbox is always a Kubernetes pod. New users reading the architecture page form the wrong mental model on day one.

See #3432 for the full gap analysis with verified-against-code reproduction steps.

Changes

docs/reference/architecture.md:

  • Replace the universal "embeds a k3s cluster" prose at the top of Deployment Topology with a paragraph that names both compute paths and a small platform table mapping each platform to its compute path and sandbox shape.
  • Identify the existing Mermaid diagram as the legacy k3s path and note that the Docker-driver path replaces the embedded k3s cluster + sandbox pod with a single sibling Docker container under the same Landlock + seccomp + netns confinement.
  • Update the layering table: Gateway container row now distinguishes "On the legacy k3s path, also hosts the embedded k3s control plane"; the k3s row is labelled "(legacy path only)"; the Sandbox row splits into two ("Sandbox container (Docker driver path)" and "Sandbox pod (legacy k3s path)").

docs/get-started/prerequisites.md:

  • RAM-pressure sentence: drop the standalone "k3s" listing. k3s is an internal detail of the gateway container image, not a separate top-level process users budget RAM for. The new wording lists "the Docker daemon, the OpenShell gateway container, and the sandbox runtime" so it remains accurate on both compute paths.
  • fuse-overlayfs note: reword "kernel-level nested-overlay limitation in k3s" to "kernel-level nested-overlay limitation in the OpenShell gateway image". The autofix gates on platform === "linux" && !isWslHost && runtime === "docker" && dockerStorageDriver === "overlayfs" && dockerUsesContainerdSnapshotter (see src/lib/onboard/preflight.ts:478), so referring to the gateway image rather than k3s directly is more accurate now that Linux uses the Docker driver.
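
As a rough sketch of the gate quoted in the fuse-overlayfs bullet above, the condition can be read as a single predicate. The field names mirror the quoted condition; the `PreflightFacts` interface and the function name `shouldApplyFuseOverlayfsAutofix` are hypothetical and are not copied from src/lib/onboard/preflight.ts.

```typescript
// Hypothetical wrapper around the condition quoted in the PR description.
interface PreflightFacts {
  platform: string;
  isWslHost: boolean;
  runtime: string;
  dockerStorageDriver: string;
  dockerUsesContainerdSnapshotter: boolean;
}

function shouldApplyFuseOverlayfsAutofix(facts: PreflightFacts): boolean {
  // Every clause must hold; a WSL host or a non-overlayfs driver skips the autofix.
  return (
    facts.platform === "linux" &&
    !facts.isWslHost &&
    facts.runtime === "docker" &&
    facts.dockerStorageDriver === "overlayfs" &&
    facts.dockerUsesContainerdSnapshotter
  );
}
```

Read this way, the gate only fires on native Linux hosts, which is why wording tied to the gateway image stays accurate regardless of whether k3s is involved.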

No diagram is added or replaced. The existing legacy-k3s diagram is preserved verbatim and re-labelled by the surrounding prose.

Test plan

  • npx prek run --files docs/get-started/prerequisites.md docs/reference/architecture.md passes (gitleaks, markdownlint, docs-to-skills dry-run, skills YAML)
  • commitlint passes
  • make docs (no Sphinx in container; rendered preview will appear via NemoClaw's docs build job)

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Refs #3432.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>

Summary by CodeRabbit

  • Documentation
    • Clarified hardware prerequisites to explain which runtime components run concurrently during sandbox image export and reiterated guidance for systems with <8 GB RAM.
    • Updated deployment architecture docs to describe the split runtime topology, distinguishing the sandbox-as-container versus sandbox-as-pod configurations and clarifying the gateway control-plane behavior across platform paths.

… k3s

Clarifies that the OpenShell sandbox runs as a sibling Docker container on
Linux and macOS Apple Silicon (Docker-driver path, default since 0.0.39 via
NVIDIA#3001 and NVIDIA#3383) or as a pod in the embedded k3s cluster on macOS Intel,
Windows, and WSL2 (legacy path).

architecture.md gets a new platform/compute-path table at the top of
Deployment Topology, the existing Mermaid diagram is identified as the
legacy k3s path, and the layering table now lists both Sandbox container
and Sandbox pod variants. prerequisites.md drops k3s from the RAM-pressure
sentence (k3s is an internal detail of the gateway container, not a
separate process the user budgets for) and rewords the fuse-overlayfs note
to refer to the OpenShell gateway image rather than k3s directly.

Refs NVIDIA#3432.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>

copy-pr-bot Bot commented May 13, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Contributor

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b1085ce0-95da-4306-a298-94b5fb15d909

📥 Commits

Reviewing files that changed from the base of the PR and between 9f381df and 24ab014.

📒 Files selected for processing (1)
  • docs/get-started/prerequisites.md
✅ Files skipped from review due to trivial changes (1)
  • docs/get-started/prerequisites.md

📝 Walkthrough

Walkthrough

Documentation updated to reflect a split sandbox runtime topology: the Docker-driver path runs the sandbox as a sibling Docker container, while the legacy k3s path runs the sandbox as a pod inside the gateway-embedded cluster. Prerequisites updated to reference the OpenShell gateway and sandbox runtime.

Changes

Split Sandbox Topology Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Architecture topology and layering table (docs/reference/architecture.md) | Architecture section refactored with a new platform/compute-path table distinguishing Docker-driver and legacy k3s deployment paths; "Layering from top to bottom" updated to separate sandbox as container (Docker-driver) vs. pod (legacy k3s), and gateway container description scoped to legacy path. |
| Prerequisites component references (docs/get-started/prerequisites.md) | Hardware prerequisites and Docker storage driver notes updated to reference the Docker daemon, OpenShell gateway container, and sandbox runtime during sandbox image push, and to state that nemoclaw onboard builds a fuse-overlayfs-enabled cluster image to bypass a nested-overlay limitation in the OpenShell gateway image. |

🎯 2 (Simple) | ⏱️ ~10 minutes

🐰 The docs hopped through the code-lined bog,
Gateway and sandbox now run side-by-side, not in a log.
Docker-driver sibling or k3s pod by lore,
I nibble the changes and bounce for more. 🥕📘

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: documenting the Docker-driver compute path alongside the legacy k3s path in the reference documentation, which directly aligns with the primary objective of updating docs to reflect the dual-path compute model. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
docs/reference/architecture.md (3)

156-159: ⚡ Quick win

Apply code formatting to technical terms in table.

Technical identifiers like k3s, Landlock, seccomp, and netns should use inline code formatting even within table cells for consistency.

Suggested fix
-| Gateway container | Docker container | Hosts the credential store and the L7 proxy. On the legacy k3s path, also hosts the embedded k3s control plane. |
-| k3s (legacy path only) | Process tree inside the gateway container | Kubernetes control plane that schedules the sandbox pod. |
-| Sandbox container (Docker driver path) | Sibling Docker container managed by the gateway | Runs the OpenClaw agent and the NemoClaw plugin under Landlock + seccomp + netns. |
-| Sandbox pod (legacy k3s path) | Pod in the embedded k3s cluster | Runs the OpenClaw agent and the NemoClaw plugin under Landlock + seccomp + netns. |
+| Gateway container | Docker container | Hosts the credential store and the L7 proxy. On the legacy `k3s` path, also hosts the embedded `k3s` control plane. |
+| `k3s` (legacy path only) | Process tree inside the gateway container | Kubernetes control plane that schedules the sandbox pod. |
+| Sandbox container (Docker driver path) | Sibling Docker container managed by the gateway | Runs the OpenClaw agent and the NemoClaw plugin under `Landlock` + `seccomp` + `netns`. |
+| Sandbox pod (legacy `k3s` path) | Pod in the embedded `k3s` cluster | Runs the OpenClaw agent and the NemoClaw plugin under `Landlock` + `seccomp` + `netns`. |

As per coding guidelines, CLI commands, file paths, flags, parameter names, and values must use inline code formatting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/architecture.md` around lines 156 - 159, Update the table rows
so all technical identifiers are formatted as inline code: wrap terms like
`Gateway container` entries' `k3s` (in "k3s (legacy path only)"), `Landlock`,
`seccomp`, and `netns` in backticks within their respective table cells (also
apply to any CLI commands, file paths, flags, parameter names, or values present
in the same rows), e.g., modify the "k3s (legacy path only)" row and both
"Sandbox container (Docker driver path)" and "Sandbox pod (legacy k3s path)"
descriptions to use inline code formatting for those terms to meet the coding
guidelines.

93-93: ⚡ Quick win

Apply code formatting to technical term.

The term k3s should use inline code formatting for consistency with other technical identifiers in the documentation.

Suggested fix
-The sandbox runs as a sibling Docker container or as a pod inside an embedded k3s cluster, depending on the host platform.
+The sandbox runs as a sibling Docker container or as a pod inside an embedded `k3s` cluster, depending on the host platform.

As per coding guidelines, CLI commands, file paths, flags, parameter names, and values must use inline code formatting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/architecture.md` at line 93, Replace the plain text occurrence
of the technical term k3s in the sentence inside the architecture doc with
inline code formatting (e.g., `k3s`) so it matches the project's guideline for
CLI/technical identifiers; update the phrase "embedded k3s cluster" to "embedded
`k3s` cluster" where it appears.

101-102: ⚡ Quick win

Fix passive voice and apply code formatting to technical terms.

Line 102 uses passive voice ("are replaced by"), which violates the active voice requirement. Additionally, k3s and the technical security terms should use inline code formatting.

Suggested fix
-The diagram below shows the legacy k3s path.
-On the Docker driver path, the embedded k3s cluster and the sandbox pod are replaced by a sibling Docker container that hosts the OpenClaw agent under the same Landlock, seccomp, and netns confinement.
+The diagram below shows the legacy `k3s` path.
+On the Docker driver path, a sibling Docker container replaces the embedded `k3s` cluster and sandbox pod, hosting the OpenClaw agent under the same `Landlock`, `seccomp`, and `netns` confinement.

As per coding guidelines, active voice is required, and CLI commands, file paths, flags, parameter names, and values must use inline code formatting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/architecture.md` around lines 101 - 102, Rewrite the sentence
to use active voice and apply inline code formatting to technical terms: change
"On the Docker driver path, the embedded k3s cluster and the sandbox pod are
replaced by a sibling Docker container that hosts the OpenClaw agent under the
same Landlock, seccomp, and netns confinement." to an active construction that
explicitly names the actor (e.g., "The Docker driver replaces the embedded `k3s`
cluster and the sandbox pod with a sibling Docker container that hosts the
`OpenClaw` agent under the same `Landlock`, `seccomp`, and `netns`
confinement.")—ensure `k3s`, `OpenClaw`, `Landlock`, `seccomp`, and `netns` are
formatted as inline code and prefer "replaces" (active verb) instead of passive
phrasing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/get-started/prerequisites.md`:
- Line 35: The paragraph on sandbox image memory should be rewritten so each
sentence is on its own line: split the existing six-sentence block into six
separate lines, preserving the original wording and order (sentences about image
size, concurrent processes: Docker daemon/OpenShell gateway/sandbox runtime,
pipeline buffering decompressed layers in memory, OOM killer on machines <8 GB,
recommendation to configure at least 8 GB swap if memory cannot be increased,
and note about slower performance), ensuring one sentence per line to satisfy
the one-sentence-per-line style requirement.
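
A minimal sketch of the one-sentence-per-line reflow requested above, assuming a deliberately naive rule: a sentence ends at `.`, `!`, or `?` followed by whitespace and an uppercase letter. The helper name `oneSentencePerLine` is hypothetical and is not part of the repository's tooling; abbreviations such as "e.g. Docker" would be split incorrectly, so the result still needs a manual check.

```typescript
// Hypothetical helper: naive reflow of a paragraph to one sentence per line.
function oneSentencePerLine(paragraph: string): string {
  return paragraph
    .replace(/\s+/g, " ") // collapse existing line breaks and runs of spaces
    .trim()
    // Break after sentence-ending punctuation when an uppercase letter follows.
    .replace(/([.!?])\s+(?=[A-Z])/g, "$1\n");
}
```

The wording and order of the sentences are left untouched; only the whitespace between them changes, which is what the review comment asks for.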


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 302cd111-35e8-4909-8787-e9636926e59e

📥 Commits

Reviewing files that changed from the base of the PR and between 904ea58 and 9f381df.

📒 Files selected for processing (2)
  • docs/get-started/prerequisites.md
  • docs/reference/architecture.md

Comment thread docs/get-started/prerequisites.md Outdated