Skip to content

fix: configure OpenShell gateway bind address and port#3425

Merged
ericksoa merged 16 commits into
mainfrom
fix/gateway-bind-port-config
May 13, 2026
Merged

fix: configure OpenShell gateway bind address and port#3425
ericksoa merged 16 commits into
mainfrom
fix/gateway-bind-port-config

Conversation

@ericksoa
Copy link
Copy Markdown
Contributor

@ericksoa ericksoa commented May 12, 2026

Summary

Closes #2779.

This adds bounded OpenShell gateway network configuration for onboarding:

  • keeps the default gateway behavior on 127.0.0.1:8080
  • allows NEMOCLAW_GATEWAY_BIND_ADDRESS=127.0.0.1 or 0.0.0.0
  • keeps local registration, readiness probes, SSH gateway host, and sandbox callback endpoints on concrete connect hosts instead of advertising 0.0.0.0
  • rejects NEMOCLAW_GATEWAY_PORT values that overlap the dashboard allocation range, configured NemoClaw service ports, or the default inference/proxy ports
  • documents the bind-address security tradeoff

Validation

  • npm run build:cli
  • npx vitest run src/lib/core/ports.test.ts src/lib/core/gateway-address.test.ts src/lib/onboard/preflight.test.ts src/lib/onboard/gateway-tcp-readiness.test.ts test/gateway-start-wait.test.ts test/gateway-http-reuse-wait.test.ts test/check-env-var-docs.test.ts test/onboard.test.ts
  • npm run checks
  • git diff --check

Summary by CodeRabbit

  • New Features

    • Configure gateway bind address (127.0.0.1 or 0.0.0.0); onboarding, TCP/HTTP readiness probes, and advertised endpoints respect bind/connect host; wildcard bind emits a warning.
  • Behavior Changes

    • Stricter gateway port validation (disallowed ranges/ports and no overlaps with dashboard/vLLM/Ollama/proxy); improved probing, retries and error messages.
  • Documentation

    • Expanded CLI reference, env-var docs, security guidance, troubleshooting, and console examples for bind/port settings.
  • Tests

    • Expanded coverage for bind parsing, endpoint mapping, port validation, onboarding readiness, Docker-driver env generation, and env-file handling.

Review Change Stack

@ericksoa ericksoa added enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. configuration VDR Linked to VDR finding v0.0.40 Release target labels May 12, 2026
@ericksoa ericksoa self-assigned this May 12, 2026
@ericksoa ericksoa added enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. configuration VDR Linked to VDR finding v0.0.40 Release target labels May 12, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fd44f2a3-098b-4d46-b370-ba2beb10fa7b

📥 Commits

Reviewing files that changed from the base of the PR and between 8f923e6 and 994e923.

📒 Files selected for processing (1)
  • test/onboard-selection.test.ts

📝 Walkthrough

Walkthrough

Adds bind-address and gateway-port parsing/validation, host-aware port probing and endpoint helpers, centralizes Docker-driver gateway env handling, integrates helpers into onboarding/readiness, normalizes subprocess envs, updates tests/harness, and updates docs.

Changes

Gateway port and bind-address configuration

Layer / File(s) Summary
Gateway bind-address core abstraction
src/lib/core/gateway-address.ts, src/lib/core/gateway-address.test.ts
Exports default and wildcard bind-address constants and type, parseGatewayBindAddress(), GATEWAY_BIND_ADDRESS, getGatewayConnectHost(), getGatewayHttpEndpoint(), getGatewayHttpsEndpoint(); tests validate parsing and advertised endpoint normalization.
Gateway port validation and parsing
src/lib/core/ports.ts, src/lib/core/ports.test.ts
Adds GatewayPortValidationOptions, validateGatewayPort(), and parseGatewayPort() to enforce no overlap with dashboard auto-allocation range or reserved inference/proxy ports; rewires GATEWAY_PORT; tests cover conflict and overlap matrices.
Docker-driver gateway env helpers & persistence
src/lib/onboard/docker-driver-gateway-env.ts, src/lib/onboard/docker-driver-gateway-env.test.ts
New module centralizes managed runtime env keys, builders for gateway start network env and Docker-driver endpoint, port-check options, wildcard-bind warning, env-file formatting/validation, and conditional Debian env override writer with restrictive permissions; tests validate env-file rewriting and write behavior.
Onboarding flow integration
src/lib/onboard.ts
Onboarding delegates env construction/writing to the new helper, uses helper-derived endpoints and port-check options, sanitizes gateway CLI args, re-probes using helper options, warns on wildcard binds, and updates messages to use helper-derived endpoints and args.
Preflight host-aware probing & readiness probes
src/lib/onboard/preflight.ts, src/lib/onboard/preflight.test.ts, src/lib/onboard/gateway-tcp-readiness.ts, src/lib/onboard/gateway-tcp-readiness.test.ts, src/lib/onboard/gateway-http-readiness.ts
CheckPortOpts gains optional host; probeImpl now receives (port, host); probePortAvailability() and fallback net probe use chosen host; isGatewayTcpReady() defaults host to getGatewayConnectHost() and isGatewayHttpReady() uses getGatewayHttpEndpoint() instead of hardcoded localhost; tests assert host propagation and real-net behavior on 0.0.0.0.
Subprocess env normalization
src/lib/runner.ts
Adds internal buildRunnerEnv() delegating to buildSubprocessEnv; runner spawn/capture paths now use buildRunnerEnv(...) for subprocess env.
Tests & harness updates
test/gateway-start-wait.test.ts, test/onboard.test.ts, other tests
Test harness clears gateway-address from require cache; tests add getGatewayStartEnv() helper validation; new cases assert wildcard bind affects bind/env generation and drift detection while advertised local endpoint remains loopback; various tests updated for sanitized args, timeout adjustments, mock restoration, and managed-router logging changes.
User-facing documentation
docs/reference/commands.md, docs/reference/troubleshooting.md, docs/security/best-practices.md
Docs updated to clarify gateway HTTP probing during reuse; document NEMOCLAW_GATEWAY_PORT and NEMOCLAW_GATEWAY_BIND_ADDRESS with validation/overlap rules and allowed values; add console examples and security notes about 0.0.0.0.
E2E script comments
test/e2e/test-openshell-gateway-upgrade.sh
Added inert comment lines to fake openshell stubs used in macOS regression tests; no logic change.

Estimated code review effort:
🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs:

  • NVIDIA/NemoClaw#3378: Overlapping changes to TCP readiness probe host defaults and readiness behavior.

Suggested labels: fix, OpenShell, security, NemoClaw CLI

Suggested reviewers:

  • cv
  • jyaunches

"🐰 I parsed the ports and hosts with care,
Wildcards mapped so clients still share,
Docs clarified, tests in tow,
Configure your gateway, set it to go,
Tiny hops onward — secure and aware."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 17.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: configure OpenShell gateway bind address and port' directly summarizes the main change—adding configuration support for both gateway bind address and port.
Linked Issues check ✅ Passed The PR addresses all coding requirements from issue #2779: configurable gateway port via NEMOCLAW_GATEWAY_PORT, bind address support via NEMOCLAW_GATEWAY_BIND_ADDRESS (127.0.0.1 or 0.0.0.0), port overlap validation, and security documentation.
Out of Scope Changes check ✅ Passed All changes are scoped to gateway configuration: new gateway-address.ts module, port validation enhancements, documentation updates, onboarding integration, and supporting tests. No unrelated refactoring or scope creep detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/gateway-bind-port-config

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
test/onboard.test.ts (1)

430-467: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make this test hermetic against ambient gateway env vars.

At Line 451+, only NEMOCLAW_DISABLE_OVERLAY_FIX is controlled. If NEMOCLAW_GATEWAY_PORT or NEMOCLAW_GATEWAY_BIND_ADDRESS is set in the parent shell/CI, these fixed 127.0.0.1:8080 assertions can fail nondeterministically.

Suggested patch
+    const originalGatewayBindAddress = process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS;
+    const originalGatewayPort = process.env.NEMOCLAW_GATEWAY_PORT;
     const originalOverlayFix = process.env.NEMOCLAW_DISABLE_OVERLAY_FIX;
+    delete process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS;
+    delete process.env.NEMOCLAW_GATEWAY_PORT;
     process.env.NEMOCLAW_DISABLE_OVERLAY_FIX = "1";
     try {
       expect(getGatewayStartEnv()).toMatchObject({
         OPENSHELL_BIND_ADDRESS: "127.0.0.1",
         OPENSHELL_SERVER_PORT: "8080",
         OPENSHELL_SSH_GATEWAY_HOST: "127.0.0.1",
         OPENSHELL_SSH_GATEWAY_PORT: "8080",
       });
     } finally {
+      if (originalGatewayBindAddress === undefined) {
+        delete process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS;
+      } else {
+        process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS = originalGatewayBindAddress;
+      }
+      if (originalGatewayPort === undefined) {
+        delete process.env.NEMOCLAW_GATEWAY_PORT;
+      } else {
+        process.env.NEMOCLAW_GATEWAY_PORT = originalGatewayPort;
+      }
       if (originalOverlayFix === undefined) {
         delete process.env.NEMOCLAW_DISABLE_OVERLAY_FIX;
       } else {
         process.env.NEMOCLAW_DISABLE_OVERLAY_FIX = originalOverlayFix;
       }
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/onboard.test.ts` around lines 430 - 467, The test isn't hermetic because
it only controls NEMOCLAW_DISABLE_OVERLAY_FIX but not ambient
NEMOCLAW_GATEWAY_PORT or NEMOCLAW_GATEWAY_BIND_ADDRESS, causing the assertions
on 127.0.0.1:8080 to flake; fix getGatewayStartEnv test by saving original
values of process.env.NEMOCLAW_GATEWAY_PORT and
process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS, set them to "8080" and "127.0.0.1"
(respectively) alongside NEMOCLAW_DISABLE_OVERLAY_FIX before the assertion, then
restore all three originals in the finally block; locate this change around the
getGatewayStartEnv call in the test (references: getGatewayStartEnv,
NEMOCLAW_DISABLE_OVERLAY_FIX).
src/lib/onboard.ts (1)

3811-3823: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist the new gateway networking env vars in the DEB override path.

getDockerDriverGatewayEnv() now makes bind address, port, gRPC endpoint, and SSH gateway host/port part of the runtime contract, but writeDockerGatewayDebEnvOverride() still only writes the old driver/supervisor keys. On distro installs, a service restart will come back on the stale/default networking config, and the next drift check will immediately treat it as stale again.

Suggested follow-up
 function writeDockerGatewayDebEnvOverride(): void {
   const servicePath = "/usr/lib/systemd/user/openshell-gateway.service";
   const legacyServicePath = "/lib/systemd/user/openshell-gateway.service";
@@
   const preserved = existing
     .split("\n")
     .filter(
       (line: string) =>
         line.trim() &&
-        !/^OPENSHELL_(DRIVERS|DOCKER_SUPERVISOR_IMAGE|DOCKER_SUPERVISOR_BIN)=/.test(line),
+        !/^OPENSHELL_(DRIVERS|BIND_ADDRESS|SERVER_PORT|GRPC_ENDPOINT|SSH_GATEWAY_HOST|SSH_GATEWAY_PORT|DOCKER_SUPERVISOR_IMAGE|DOCKER_SUPERVISOR_BIN)=/.test(line),
     );
   const override = getDockerDriverGatewayEnv();
   const next = [
     ...preserved,
     `OPENSHELL_DRIVERS=${override.OPENSHELL_DRIVERS}`,
+    `OPENSHELL_BIND_ADDRESS=${override.OPENSHELL_BIND_ADDRESS}`,
+    `OPENSHELL_SERVER_PORT=${override.OPENSHELL_SERVER_PORT}`,
+    `OPENSHELL_GRPC_ENDPOINT=${override.OPENSHELL_GRPC_ENDPOINT}`,
+    `OPENSHELL_SSH_GATEWAY_HOST=${override.OPENSHELL_SSH_GATEWAY_HOST}`,
+    `OPENSHELL_SSH_GATEWAY_PORT=${override.OPENSHELL_SSH_GATEWAY_PORT}`,
     `OPENSHELL_DOCKER_SUPERVISOR_IMAGE=${override.OPENSHELL_DOCKER_SUPERVISOR_IMAGE}`,
     ...(override.OPENSHELL_DOCKER_SUPERVISOR_BIN
       ? [`OPENSHELL_DOCKER_SUPERVISOR_BIN=${override.OPENSHELL_DOCKER_SUPERVISOR_BIN}`]
       : []),
   ].join("\n");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 3811 - 3823, The DEB override writer
writeDockerGatewayDebEnvOverride() currently only persists the old
driver/supervisor keys so new networking env vars from
getDockerDriverGatewayEnv() (OPENSHELL_BIND_ADDRESS, OPENSHELL_SERVER_PORT,
OPENSHELL_GRPC_ENDPOINT, OPENSHELL_SSH_GATEWAY_HOST, OPENSHELL_SSH_GATEWAY_PORT)
are not saved; update writeDockerGatewayDebEnvOverride() to include these keys
and their computed values (from getDockerDriverGatewayEnv() or the same
computation used when building env in onboard.ts) when writing the DEB override
file so distro installs will restart with the correct bind address/port, gRPC
endpoint and SSH gateway host/port and avoid immediate drift detection.
🧹 Nitpick comments (1)
docs/reference/commands.md (1)

188-188: ⚡ Quick win

Use active voice in the gateway-reuse sentence.

Line 188 starts with passive phrasing (“is detected”). Prefer an active subject, e.g., “When NemoClaw detects an existing gateway for reuse, …”.

As per coding guidelines: "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/commands.md` at line 188, Replace the passive sentence "When
an existing gateway is detected for reuse, NemoClaw probes the host gateway HTTP
endpoint..." with an active-voice construction starting with the agent (e.g.,
"When NemoClaw detects an existing gateway for reuse, NemoClaw probes the host
gateway HTTP endpoint before declaring it reusable...") so the subject
"NemoClaw" performs the action; ensure the rest of the sentence (probing the
host gateway HTTP endpoint and rebuilding if upstream is warming up) remains
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/core/ports.ts`:
- Around line 103-112: parseGatewayPort currently calls validateGatewayPort only
when hasExplicitEnvValue(envVar) is true, so implicitly-resolved gateway ports
can collide with other services; always run validation after resolving the port.
Change parseGatewayPort to call parsePort(envVar, fallback) and then
unconditionally call validateGatewayPort(envVar, port, options) (remove the
hasExplicitEnvValue guard) so validateGatewayPort always checks the resolved
port; preserve return of port.

In `@src/lib/onboard.ts`:
- Around line 3476-3478: Extract the bind-aware gateway port probe into a single
helper (e.g., isGatewayPortAvailable or probeGatewayPort) that encapsulates
calling checkPortAvailable(GATEWAY_PORT, { host: GATEWAY_BIND_ADDRESS }) and
returning the same result/shape; then replace the three in-file call sites that
currently inline checkPortAvailable(..., { host: GATEWAY_BIND_ADDRESS }) (the
reuse-detection, preflight, and startup usages in src/lib/onboard.ts) to call
this new helper. Ensure the helper is exported/defined near other onboarding
utilities and preserves error/return semantics so existing callers (reuse
detection, preflight, startup) need only swap the call without changing
downstream logic.

In `@src/lib/onboard/gateway-tcp-readiness.test.ts`:
- Around line 97-100: The test currently only checks the return type, which
doesn't verify that isGatewayTcpReady(undefined, 200) actually uses the default
GATEWAY_PORT; update the test to deterministically assert default-port wiring by
spying/mocking the TCP connect call (or the internal helper that opens sockets)
and asserting it was invoked with GATEWAY_PORT when the function is called with
undefined, while still asserting the resolved boolean result; reference the
isGatewayTcpReady function and the GATEWAY_PORT constant when implementing the
spy/mock to validate the port argument.

---

Outside diff comments:
In `@src/lib/onboard.ts`:
- Around line 3811-3823: The DEB override writer
writeDockerGatewayDebEnvOverride() currently only persists the old
driver/supervisor keys so new networking env vars from
getDockerDriverGatewayEnv() (OPENSHELL_BIND_ADDRESS, OPENSHELL_SERVER_PORT,
OPENSHELL_GRPC_ENDPOINT, OPENSHELL_SSH_GATEWAY_HOST, OPENSHELL_SSH_GATEWAY_PORT)
are not saved; update writeDockerGatewayDebEnvOverride() to include these keys
and their computed values (from getDockerDriverGatewayEnv() or the same
computation used when building env in onboard.ts) when writing the DEB override
file so distro installs will restart with the correct bind address/port, gRPC
endpoint and SSH gateway host/port and avoid immediate drift detection.

In `@test/onboard.test.ts`:
- Around line 430-467: The test isn't hermetic because it only controls
NEMOCLAW_DISABLE_OVERLAY_FIX but not ambient NEMOCLAW_GATEWAY_PORT or
NEMOCLAW_GATEWAY_BIND_ADDRESS, causing the assertions on 127.0.0.1:8080 to
flake; fix getGatewayStartEnv test by saving original values of
process.env.NEMOCLAW_GATEWAY_PORT and process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS,
set them to "8080" and "127.0.0.1" (respectively) alongside
NEMOCLAW_DISABLE_OVERLAY_FIX before the assertion, then restore all three
originals in the finally block; locate this change around the getGatewayStartEnv
call in the test (references: getGatewayStartEnv, NEMOCLAW_DISABLE_OVERLAY_FIX).

---

Nitpick comments:
In `@docs/reference/commands.md`:
- Line 188: Replace the passive sentence "When an existing gateway is detected
for reuse, NemoClaw probes the host gateway HTTP endpoint..." with an
active-voice construction starting with the agent (e.g., "When NemoClaw detects
an existing gateway for reuse, NemoClaw probes the host gateway HTTP endpoint
before declaring it reusable...") so the subject "NemoClaw" performs the action;
ensure the rest of the sentence (probing the host gateway HTTP endpoint and
rebuilding if upstream is warming up) remains unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c7a11e0c-5d3a-48bb-aa11-8deaefdbc69a

📥 Commits

Reviewing files that changed from the base of the PR and between fc4a227 and 8bd6484.

📒 Files selected for processing (15)
  • docs/reference/commands.md
  • docs/reference/troubleshooting.md
  • docs/security/best-practices.md
  • src/lib/core/gateway-address.test.ts
  • src/lib/core/gateway-address.ts
  • src/lib/core/ports.test.ts
  • src/lib/core/ports.ts
  • src/lib/onboard.ts
  • src/lib/onboard/gateway-http-readiness.ts
  • src/lib/onboard/gateway-tcp-readiness.test.ts
  • src/lib/onboard/gateway-tcp-readiness.ts
  • src/lib/onboard/preflight.test.ts
  • src/lib/onboard/preflight.ts
  • test/gateway-start-wait.test.ts
  • test/onboard.test.ts

Comment thread src/lib/core/ports.ts
Comment thread src/lib/onboard.ts Outdated
Comment thread src/lib/onboard/gateway-tcp-readiness.test.ts
@ericksoa ericksoa changed the title Configure OpenShell gateway bind address and port fix: configure OpenShell gateway bind address and port May 12, 2026
Comment thread src/lib/onboard/docker-driver-gateway-env.ts Fixed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard.ts`:
- Line 4730: Replace uses of getDockerDriverGatewayEndpoint() inside the
non-Docker diagnostics branches with the legacy helper getGatewayLocalEndpoint()
so the diagnostic messages point to the correct host/scheme when the
Docker-driver gateway is disabled; specifically update the interpolated message
string that currently reads "Gateway container is running but
${getDockerDriverGatewayEndpoint()}/ is not responding. Recreating..." and any
other similar diagnostics that call getDockerDriverGatewayEndpoint() (also
present at the other noted occurrences) to use getGatewayLocalEndpoint()
instead, leaving the surrounding logic in functions handling gateway diagnostics
unchanged.

In `@src/lib/onboard/docker-driver-gateway-env.ts`:
- Around line 132-135: The current write uses fs.writeFileSync(envFile,
buildDockerGatewayDebEnvFile(existing, override), { mode: 0o600 }) but mode only
applies on create, leaving an existing gateway.env with permissive bits
unchanged; after calling fs.writeFileSync (or before), call
fs.chmodSync(envFile, 0o600) to enforce restrictive permissions for the existing
file (use the same envFile variable and buildDockerGatewayDebEnvFile call), and
ensure you handle/propagate any errors from chmod appropriately.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b8496acd-cddf-41c2-bd99-39aace6299f2

📥 Commits

Reviewing files that changed from the base of the PR and between 8bd6484 and b0f8aed.

📒 Files selected for processing (8)
  • docs/reference/commands.md
  • src/lib/core/ports.test.ts
  • src/lib/core/ports.ts
  • src/lib/onboard.ts
  • src/lib/onboard/docker-driver-gateway-env.test.ts
  • src/lib/onboard/docker-driver-gateway-env.ts
  • src/lib/onboard/gateway-tcp-readiness.test.ts
  • test/gateway-start-wait.test.ts
✅ Files skipped from review due to trivial changes (2)
  • src/lib/onboard/docker-driver-gateway-env.test.ts
  • docs/reference/commands.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lib/core/ports.test.ts
  • test/gateway-start-wait.test.ts

Comment thread src/lib/onboard.ts Outdated
Comment thread src/lib/onboard/docker-driver-gateway-env.ts
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/lib/onboard/docker-driver-gateway-env.ts (1)

154-159: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce permissions for pre-existing gateway env dir/file.

On Line 154 and Line 156, mode is applied at creation time; existing ~/.config/openshell / gateway.env can remain overly permissive. Please explicitly chmod both after creation/write.

Suggested minimal patch
-  const envFile = path.join(os.homedir(), ".config", "openshell", "gateway.env");
-  fs.mkdirSync(path.dirname(envFile), { recursive: true, mode: 0o700 });
+  const envDir = path.join(os.homedir(), ".config", "openshell");
+  const envFile = path.join(envDir, "gateway.env");
+  fs.mkdirSync(envDir, { recursive: true, mode: 0o700 });
+  fs.chmodSync(envDir, 0o700);
   const existing = readTextFileIfPresent(envFile);
   fs.writeFileSync(envFile, buildDockerGatewayDebEnvFile(existing, override), {
     encoding: "utf-8",
     mode: 0o600,
   });
+  fs.chmodSync(envFile, 0o600);
In Node.js fs.writeFileSync, does the `mode` option only affect newly created files, and should fs.chmodSync be used to enforce permissions on existing files?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard/docker-driver-gateway-env.ts` around lines 154 - 159, The
directory/file permission flags passed to fs.mkdirSync and fs.writeFileSync only
affect newly created items; enforce permissions for existing ones by calling
fs.chmodSync after creation/write: after fs.mkdirSync(...) ensure the directory
at path.dirname(envFile) is chmod'd to 0o700 with fs.chmodSync, and after
writing the envFile with buildDockerGatewayDebEnvFile(...) via
fs.writeFileSync(...) call fs.chmodSync(envFile, 0o600); locate usages around
the envFile variable and functions readTextFileIfPresent and
buildDockerGatewayDebEnvFile to add these explicit chmod calls.
src/lib/onboard.ts (1)

4749-4749: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the legacy gateway endpoint in these non-Docker diagnostics.

These branches only execute when the Docker-driver gateway is disabled, but the messages still interpolate getDockerDriverGatewayEndpoint(). That points users at the wrong endpoint while debugging legacy gateway failures. Use getGatewayLocalEndpoint() consistently here.

Suggested fix
-        `  Gateway container is running but ${getDockerDriverGatewayEndpoint()}/ is not responding. Recreating...`,
+        `  Gateway container is running but ${getGatewayLocalEndpoint()}/ is not responding. Recreating...`,

-      `  Gateway metadata reports healthy but ${getDockerDriverGatewayEndpoint()}/ is not responding. Starting a fresh gateway...`,
+      `  Gateway metadata reports healthy but ${getGatewayLocalEndpoint()}/ is not responding. Starting a fresh gateway...`,

-            `  Error: could not verify gateway container state and ${getDockerDriverGatewayEndpoint()}/ is not responding.`,
+            `  Error: could not verify gateway container state and ${getGatewayLocalEndpoint()}/ is not responding.`,

-          `  Gateway container is running but ${getDockerDriverGatewayEndpoint()}/ is not responding. Recreating...`,
+          `  Gateway container is running but ${getGatewayLocalEndpoint()}/ is not responding. Recreating...`,

Also applies to: 5028-5028, 11187-11187, 11200-11200

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` at line 4749, Replace usages of
getDockerDriverGatewayEndpoint() in the non-Docker diagnostic message branches
with getGatewayLocalEndpoint() so diagnostics point to the legacy/local gateway
endpoint; locate the diagnostic strings that currently interpolate
getDockerDriverGatewayEndpoint() (e.g., the message template "`Gateway container
is running but ${getDockerDriverGatewayEndpoint()}/ is not responding.
Recreating...`") and update them to use getGatewayLocalEndpoint() instead,
ensuring all non-Docker paths (the same pattern also at the other occurrences)
consistently reference getGatewayLocalEndpoint().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/lib/onboard.ts`:
- Line 4749: Replace usages of getDockerDriverGatewayEndpoint() in the
non-Docker diagnostic message branches with getGatewayLocalEndpoint() so
diagnostics point to the legacy/local gateway endpoint; locate the diagnostic
strings that currently interpolate getDockerDriverGatewayEndpoint() (e.g., the
message template "`Gateway container is running but
${getDockerDriverGatewayEndpoint()}/ is not responding. Recreating...`") and
update them to use getGatewayLocalEndpoint() instead, ensuring all non-Docker
paths (the same pattern also at the other occurrences) consistently reference
getGatewayLocalEndpoint().

In `@src/lib/onboard/docker-driver-gateway-env.ts`:
- Around line 154-159: The directory/file permission flags passed to
fs.mkdirSync and fs.writeFileSync only affect newly created items; enforce
permissions for existing ones by calling fs.chmodSync after creation/write:
after fs.mkdirSync(...) ensure the directory at path.dirname(envFile) is chmod'd
to 0o700 with fs.chmodSync, and after writing the envFile with
buildDockerGatewayDebEnvFile(...) via fs.writeFileSync(...) call
fs.chmodSync(envFile, 0o600); locate usages around the envFile variable and
functions readTextFileIfPresent and buildDockerGatewayDebEnvFile to add these
explicit chmod calls.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0ec72966-4b5f-478f-a8ac-af1e82974cf0

📥 Commits

Reviewing files that changed from the base of the PR and between b0f8aed and fa305c8.

📒 Files selected for processing (4)
  • src/lib/onboard.ts
  • src/lib/onboard/docker-driver-gateway-env.test.ts
  • src/lib/onboard/docker-driver-gateway-env.ts
  • test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lib/onboard/docker-driver-gateway-env.test.ts
  • test/onboard.test.ts

Comment thread src/lib/onboard/docker-driver-gateway-env.test.ts Fixed
Comment thread src/lib/onboard/docker-driver-gateway-env.test.ts Fixed
…-config

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions
Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 25767133910
Branch: fix/gateway-bind-port-config
Requested jobs: openshell-gateway-upgrade-e2e,gateway-health-honest-e2e,double-onboard-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job Result
double-onboard-e2e ✅ success
gateway-health-honest-e2e ✅ success
openshell-gateway-upgrade-e2e ✅ success

@github-actions
Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 25768132522
Branch: fix/gateway-bind-port-config
Requested jobs: openshell-gateway-upgrade-e2e,gateway-health-honest-e2e,double-onboard-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job Result
double-onboard-e2e ✅ success
gateway-health-honest-e2e ✅ success
openshell-gateway-upgrade-e2e ✅ success

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions
Copy link
Copy Markdown
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 25771873587
Branch: fix/gateway-bind-port-config
Requested jobs: openshell-gateway-upgrade-e2e,gateway-health-honest-e2e,double-onboard-e2e
Summary: 2 passed, 1 failed, 0 skipped

Job Result
double-onboard-e2e ✅ success
gateway-health-honest-e2e ✅ success
openshell-gateway-upgrade-e2e ❌ failure

Failed jobs: openshell-gateway-upgrade-e2e. Check run artifacts for logs.

ericksoa added 2 commits May 12, 2026 18:31
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions
Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 25772601822
Branch: fix/gateway-bind-port-config
Requested jobs: openshell-gateway-upgrade-e2e,gateway-health-honest-e2e,double-onboard-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
double-onboard-e2e ⚠️ cancelled
gateway-health-honest-e2e ✅ success
openshell-gateway-upgrade-e2e ⚠️ cancelled

@github-actions
Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 25772816411
Branch: fix/gateway-bind-port-config
Requested jobs: openshell-gateway-upgrade-e2e,gateway-health-honest-e2e,double-onboard-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job Result
double-onboard-e2e ✅ success
gateway-health-honest-e2e ✅ success
openshell-gateway-upgrade-e2e ✅ success

@github-actions
Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: openshell-gateway-upgrade-e2e, test-e2e-port-overrides, cloud-onboard-e2e
Optional E2E: gateway-health-honest-e2e, double-onboard-e2e, onboard-repair-e2e, runtime-overrides-e2e, e2e-branch-validation

Workflow run

Full advisor summary

Pi Semantic E2E Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • openshell-gateway-upgrade-e2e (high): This PR edits test/e2e/test-openshell-gateway-upgrade.sh directly and changes the runner spawn shape and gateway start environment used by the upgrade flow it asserts on. The nightly job that runs this script must validate the change.
  • test-e2e-port-overrides (medium): NEMOCLAW_GATEWAY_PORT now goes through parseGatewayPort with new cross-port conflict rules (dashboard 18789-18799 range, default 8000/11434/11435, configured dashboard/vLLM/Ollama/proxy). This is the only existing job that drives the real entrypoint and Node ports module with NEMOCLAW_GATEWAY_PORT/NEMOCLAW_DASHBOARD_PORT/NEMOCLAW_VLLM_PORT/NEMOCLAW_OLLAMA_PORT overrides; regressions here would silently break onboard port validation.
  • cloud-onboard-e2e (high): Preflight, gateway HTTP/TCP readiness probes, gateway port-availability check options, and OPENSHELL_BIND_ADDRESS/OPENSHELL_GRPC_ENDPOINT plumbing are all on the onboard hot path. cloud-onboard-e2e exercises the public installer end-to-end with the real gateway and is the canonical onboarding regression for inference routing and gateway bring-up.

Optional E2E

  • gateway-health-honest-e2e (high): Gateway HTTP readiness helpers (gateway-http-readiness.ts) and the local-endpoint computation moved through the new gateway-address module; the health-honesty E2E is the closest live regression for gateway readiness/honesty signaling.
  • double-onboard-e2e (high): The PR changes gateway reuse / port-conflict / refresh logic (refreshDockerDriverGatewayReuseState, checkGatewayPortAvailable). Double-onboard exercises the reuse-vs-recreate decision the new code paths feed into.
  • onboard-repair-e2e (high): Preflight gateway port and HTTP readiness branches are touched; onboard-repair drives recovery flows that depend on these probes.
  • runtime-overrides-e2e (high): Runtime-overrides covers env-var-driven onboarding, the same surface where NEMOCLAW_GATEWAY_BIND_ADDRESS and the new gateway port validation live.
  • e2e-branch-validation (high): This Brev-driven workflow runs test/e2e/test-dashboard-remote-bind.sh, the closest existing live coverage for an operator-controlled all-interfaces bind. The new NEMOCLAW_GATEWAY_BIND_ADDRESS=0.0.0.0 path shares the same security boundary (gateway/dashboard reachable from off-host).

New E2E recommendations

  • gateway-networking (high): NEMOCLAW_GATEWAY_BIND_ADDRESS=0.0.0.0 is a new opt-in flag that intentionally exposes the OpenShell gateway on all IPv4 interfaces. Unit tests cover parsing and env-building, but no live E2E proves that (a) the OpenShell gateway actually listens on 0.0.0.0:GATEWAY_PORT after onboard, (b) the loopback default keeps the listener bound only to 127.0.0.1, and (c) preflight, gateway HTTP probe, and connect host still target 127.0.0.1 even when the bind is wildcard. This is a security-boundary change and warrants a dedicated regression.
    • Suggested test: test/e2e/test-gateway-bind-address.sh: onboard non-interactively with NEMOCLAW_GATEWAY_BIND_ADDRESS=0.0.0.0 and a custom NEMOCLAW_GATEWAY_PORT, then assert (1) ss -ltn shows 0.0.0.0: for the gateway, (2) the warning about wildcard bind appears in onboard output, (3) nemoclaw <name> status and the HTTP probe still succeed via 127.0.0.1. Repeat with the loopback default and assert the listener is 127.0.0.1-only. Wire it into nightly-e2e.yaml as gateway-bind-address-e2e.
  • port-validation (medium): parseGatewayPort now rejects overlap with the dashboard auto-allocation range and the default vLLM/Ollama/Ollama-proxy ports, and surfaces a new error path. test-e2e-port-overrides exercises basic parsing but does not cover the new conflict messages users will hit in real onboarding when NEMOCLAW_GATEWAY_PORT collides with a configured dashboard/vLLM/Ollama port.
    • Suggested test: Extend test/e2e-port-overrides.sh (or add test/e2e/test-gateway-port-conflicts.sh) that runs nemoclaw onboard --non-interactive with NEMOCLAW_GATEWAY_PORT set to 18790, 8000, 11434, 11435, and to a value matching a non-default NEMOCLAW_DASHBOARD_PORT, and asserts the documented NEMOCLAW_GATEWAY_PORT=<port> nemoclaw onboard remediation hint plus the specific conflicting env-var name appears in stderr and onboarding exits non-zero before any sandbox is created.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/lib/onboard.ts (1)

4796-4798: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the helper-derived local endpoint in this Docker-driver warning.

This is the one remaining reuse diagnostic that still hardcodes http://127.0.0.1:${GATEWAY_PORT}/. With configurable bind/connect host and HTTPS endpoint support, it can point users at the wrong address, port, or scheme while debugging a failed reuse. Switch it to getGatewayLocalEndpoint() like the other gateway diagnostics.

Suggested fix
-        `  Docker-driver gateway metadata reports healthy but http://127.0.0.1:${GATEWAY_PORT}/ is not responding. Starting a fresh gateway...`,
+        `  Docker-driver gateway metadata reports healthy but ${getGatewayLocalEndpoint()}/ is not responding. Starting a fresh gateway...`,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 4796 - 4798, Replace the hardcoded URL in
the Docker-driver gateway warning with the helper-derived local endpoint: use
getGatewayLocalEndpoint() instead of constructing
`http://127.0.0.1:${GATEWAY_PORT}/`. Locate the log emitting that message (the
console.log that mentions "Docker-driver gateway metadata reports healthy but
http://127.0.0.1:${GATEWAY_PORT}/ is not responding. Starting a fresh
gateway...") and build the message using getGatewayLocalEndpoint() so it
reflects configurable host, port and scheme; keep the rest of the wording
intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/core/gateway-address.ts`:
- Around line 18-21: The current check uses raw before trimming so
whitespace-only values like "   " don't fall back; update the logic to normalize
first (e.g., compute trimmed = String(raw ?? "").trim()), then treat trimmed ===
"" as the empty case and return fallback, and keep the existing comparisons
against DEFAULT_GATEWAY_BIND_ADDRESS and WILDCARD_GATEWAY_BIND_ADDRESS using
trimmed to decide returns.

In `@test/onboard.test.ts`:
- Around line 452-467: Test assumes default gateway envs but can be influenced
by NEMOCLAW_GATEWAY_BIND_ADDRESS/NEMOCLAW_GATEWAY_PORT; before calling
getGatewayStartEnv() snapshot the current values of
process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS and process.env.NEMOCLAW_GATEWAY_PORT
(like originalOverlayFix for NEMOCLAW_DISABLE_OVERLAY_FIX), then clear them
(delete if undefined) so getGatewayStartEnv() sees defaults, perform the expect
on getGatewayStartEnv(), and finally restore the two env vars to their original
values in the finally block; reference getGatewayStartEnv,
NEMOCLAW_GATEWAY_BIND_ADDRESS, NEMOCLAW_GATEWAY_PORT, and
NEMOCLAW_DISABLE_OVERLAY_FIX.

---

Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 4796-4798: Replace the hardcoded URL in the Docker-driver gateway
warning with the helper-derived local endpoint: use getGatewayLocalEndpoint()
instead of constructing `http://127.0.0.1:${GATEWAY_PORT}/`. Locate the log
emitting that message (the console.log that mentions "Docker-driver gateway
metadata reports healthy but http://127.0.0.1:${GATEWAY_PORT}/ is not
responding. Starting a fresh gateway...") and build the message using
getGatewayLocalEndpoint() so it reflects configurable host, port and scheme;
keep the rest of the wording intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 152d2a68-ee02-44ef-96f2-5735031ac874

📥 Commits

Reviewing files that changed from the base of the PR and between 3294d03 and 27b0e00.

📒 Files selected for processing (10)
  • docs/reference/commands.md
  • docs/reference/troubleshooting.md
  • src/lib/core/gateway-address.ts
  • src/lib/onboard.ts
  • src/lib/onboard/docker-driver-gateway-env.test.ts
  • src/lib/onboard/gateway-http-readiness.ts
  • src/lib/runner.ts
  • test/e2e/test-openshell-gateway-upgrade.sh
  • test/onboard-selection.test.ts
  • test/onboard.test.ts
✅ Files skipped from review due to trivial changes (4)
  • test/e2e/test-openshell-gateway-upgrade.sh
  • src/lib/runner.ts
  • docs/reference/troubleshooting.md
  • docs/reference/commands.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard/docker-driver-gateway-env.test.ts

Comment on lines +18 to +21
if (raw === undefined || raw === "") return fallback;
const trimmed = String(raw).trim();
if (trimmed === DEFAULT_GATEWAY_BIND_ADDRESS) return DEFAULT_GATEWAY_BIND_ADDRESS;
if (trimmed === WILDCARD_GATEWAY_BIND_ADDRESS) return WILDCARD_GATEWAY_BIND_ADDRESS;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Treat whitespace-only bind values as empty input.

At Line 18, blank handling is checked before trimming, so NEMOCLAW_GATEWAY_BIND_ADDRESS=" " throws instead of using the fallback. Normalizing empties after trim avoids surprising config failures.

Suggested patch
 export function parseGatewayBindAddress(
   envVar = "NEMOCLAW_GATEWAY_BIND_ADDRESS",
   fallback: GatewayBindAddress = DEFAULT_GATEWAY_BIND_ADDRESS,
 ): GatewayBindAddress {
   const raw = process.env[envVar];
-  if (raw === undefined || raw === "") return fallback;
   const trimmed = String(raw).trim();
+  if (raw === undefined || trimmed === "") return fallback;
   if (trimmed === DEFAULT_GATEWAY_BIND_ADDRESS) return DEFAULT_GATEWAY_BIND_ADDRESS;
   if (trimmed === WILDCARD_GATEWAY_BIND_ADDRESS) return WILDCARD_GATEWAY_BIND_ADDRESS;
   throw new Error(
     `Invalid gateway bind address: ${envVar}="${raw}" — must be either ${DEFAULT_GATEWAY_BIND_ADDRESS} or ${WILDCARD_GATEWAY_BIND_ADDRESS}`,
   );
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (raw === undefined || raw === "") return fallback;
const trimmed = String(raw).trim();
if (trimmed === DEFAULT_GATEWAY_BIND_ADDRESS) return DEFAULT_GATEWAY_BIND_ADDRESS;
if (trimmed === WILDCARD_GATEWAY_BIND_ADDRESS) return WILDCARD_GATEWAY_BIND_ADDRESS;
const trimmed = String(raw).trim();
if (raw === undefined || trimmed === "") return fallback;
if (trimmed === DEFAULT_GATEWAY_BIND_ADDRESS) return DEFAULT_GATEWAY_BIND_ADDRESS;
if (trimmed === WILDCARD_GATEWAY_BIND_ADDRESS) return WILDCARD_GATEWAY_BIND_ADDRESS;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/core/gateway-address.ts` around lines 18 - 21, The current check uses
raw before trimming so whitespace-only values like "   " don't fall back; update
the logic to normalize first (e.g., compute trimmed = String(raw ?? "").trim()),
then treat trimmed === "" as the empty case and return fallback, and keep the
existing comparisons against DEFAULT_GATEWAY_BIND_ADDRESS and
WILDCARD_GATEWAY_BIND_ADDRESS using trimmed to decide returns.

Comment thread test/onboard.test.ts
Comment on lines +452 to +467
const originalOverlayFix = process.env.NEMOCLAW_DISABLE_OVERLAY_FIX;
process.env.NEMOCLAW_DISABLE_OVERLAY_FIX = "1";
try {
expect(getGatewayStartEnv()).toMatchObject({
OPENSHELL_BIND_ADDRESS: "127.0.0.1",
OPENSHELL_SERVER_PORT: "8080",
OPENSHELL_SSH_GATEWAY_HOST: "127.0.0.1",
OPENSHELL_SSH_GATEWAY_PORT: "8080",
});
} finally {
if (originalOverlayFix === undefined) {
delete process.env.NEMOCLAW_DISABLE_OVERLAY_FIX;
} else {
process.env.NEMOCLAW_DISABLE_OVERLAY_FIX = originalOverlayFix;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make gateway env assertions hermetic

Line 455 currently assumes default gateway bind/port behavior, but inherited NEMOCLAW_GATEWAY_BIND_ADDRESS or NEMOCLAW_GATEWAY_PORT can make this test fail nondeterministically. Snapshot and clear them in this test (same pattern you used for NEMOCLAW_DISABLE_OVERLAY_FIX) before asserting defaults.

Suggested patch
-    const originalOverlayFix = process.env.NEMOCLAW_DISABLE_OVERLAY_FIX;
+    const originalOverlayFix = process.env.NEMOCLAW_DISABLE_OVERLAY_FIX;
+    const originalGatewayBindAddress = process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS;
+    const originalGatewayPort = process.env.NEMOCLAW_GATEWAY_PORT;
     process.env.NEMOCLAW_DISABLE_OVERLAY_FIX = "1";
+    delete process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS;
+    delete process.env.NEMOCLAW_GATEWAY_PORT;
     try {
       expect(getGatewayStartEnv()).toMatchObject({
         OPENSHELL_BIND_ADDRESS: "127.0.0.1",
         OPENSHELL_SERVER_PORT: "8080",
         OPENSHELL_SSH_GATEWAY_HOST: "127.0.0.1",
@@
       if (originalOverlayFix === undefined) {
         delete process.env.NEMOCLAW_DISABLE_OVERLAY_FIX;
       } else {
         process.env.NEMOCLAW_DISABLE_OVERLAY_FIX = originalOverlayFix;
       }
+      if (originalGatewayBindAddress === undefined) {
+        delete process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS;
+      } else {
+        process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS = originalGatewayBindAddress;
+      }
+      if (originalGatewayPort === undefined) {
+        delete process.env.NEMOCLAW_GATEWAY_PORT;
+      } else {
+        process.env.NEMOCLAW_GATEWAY_PORT = originalGatewayPort;
+      }
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/onboard.test.ts` around lines 452 - 467, Test assumes default gateway
envs but can be influenced by
NEMOCLAW_GATEWAY_BIND_ADDRESS/NEMOCLAW_GATEWAY_PORT; before calling
getGatewayStartEnv() snapshot the current values of
process.env.NEMOCLAW_GATEWAY_BIND_ADDRESS and process.env.NEMOCLAW_GATEWAY_PORT
(like originalOverlayFix for NEMOCLAW_DISABLE_OVERLAY_FIX), then clear them
(delete if undefined) so getGatewayStartEnv() sees defaults, perform the expect
on getGatewayStartEnv(), and finally restore the two env vars to their original
values in the finally block; reference getGatewayStartEnv,
NEMOCLAW_GATEWAY_BIND_ADDRESS, NEMOCLAW_GATEWAY_PORT, and
NEMOCLAW_DISABLE_OVERLAY_FIX.

ericksoa added 3 commits May 12, 2026 19:30
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/lib/onboard.ts (1)

4605-4637: Run the gateway/onboarding E2Es for this path before merge.

This diff now threads bind-address/port config through gateway reuse, env persistence, startup, and recovery inside the onboarding core. The unit coverage is solid, but I’d still run at least the gateway-focused E2Es (sandbox-operations-e2e and openshell-gateway-upgrade-e2e) to catch cross-step regressions.

As per coding guidelines, src/lib/onboard.ts: "This file contains core onboarding logic. Changes here affect the full sandbox creation and configuration flow." The same guideline recommends sandbox-operations-e2e and openshell-gateway-upgrade-e2e.

Also applies to: 4750-4752

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 4605 - 4637, Run the gateway-focused
end-to-end tests against the updated onboarding path in src/lib/onboard.ts to
catch cross-step regressions introduced by threading bind-address/port and env
persistence through gateway reuse/restart: specifically run
sandbox-operations-e2e and openshell-gateway-upgrade-e2e exercising the flows
that call runCaptureOpenshell, getDockerDriverGatewayEnv, isGatewayHealthy,
getDockerDriverGatewayPid/getDockerDriverGatewayRuntimeDrift,
restartDockerDriverGatewayProcessForDrift, registerDockerDriverGatewayEndpoint,
isDockerDriverGatewayHttpReady, checkGatewayPortAvailable and
getDockerDriverGatewayPortListenerPid to validate startup, reuse, drift recovery
and port/listener behavior; if failures appear, iterate locally until these E2Es
pass.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/lib/onboard.ts`:
- Around line 4605-4637: Run the gateway-focused end-to-end tests against the
updated onboarding path in src/lib/onboard.ts to catch cross-step regressions
introduced by threading bind-address/port and env persistence through gateway
reuse/restart: specifically run sandbox-operations-e2e and
openshell-gateway-upgrade-e2e exercising the flows that call
runCaptureOpenshell, getDockerDriverGatewayEnv, isGatewayHealthy,
getDockerDriverGatewayPid/getDockerDriverGatewayRuntimeDrift,
restartDockerDriverGatewayProcessForDrift, registerDockerDriverGatewayEndpoint,
isDockerDriverGatewayHttpReady, checkGatewayPortAvailable and
getDockerDriverGatewayPortListenerPid to validate startup, reuse, drift recovery
and port/listener behavior; if failures appear, iterate locally until these E2Es
pass.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 749f93fb-f35c-4f22-8b6c-4d8ce1f91da7

📥 Commits

Reviewing files that changed from the base of the PR and between f51d73f and 981c43f.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/onboard.test.ts

ericksoa added 2 commits May 12, 2026 21:16
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa merged commit 0a828da into main May 13, 2026
30 of 31 checks passed
@ericksoa ericksoa deleted the fix/gateway-bind-port-config branch May 13, 2026 04:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

configuration enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. v0.0.40 Release target VDR Linked to VDR finding

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Please allow use of alternate port (other than the second most used port for... 8080). Also allow for Address 0.0.0.0

2 participants