fix(cli): honour NEMOCLAW_LOCAL_INFERENCE_TIMEOUT for compatible-endpoint (#2403) by jason-ma-nv · Pull Request #2583 · NVIDIA/NemoClaw

jason-ma-nv · 2026-04-28T05:34:08Z

Summary

compatible-endpoint skipped the --timeout flag on openshell inference set, leaving the 60 s default active even when NEMOCLAW_LOCAL_INFERENCE_TIMEOUT was exported. ollama-local and vllm-local already pass the flag; this brings compatible-endpoint into parity so reasoning-model operators on external LAN endpoints are not silently capped at 60 seconds.

Related Issue

Fixes #2403

Changes

src/lib/onboard.ts — add --timeout to inference set args when provider is compatible-endpoint, using the same LOCAL_INFERENCE_TIMEOUT_SECS constant already applied to ollama-local and vllm-local
test/onboard-selection.test.ts — new test that spawns a fake openshell binary recording inference set args, sets NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=600, calls setupInference directly, and asserts --timeout 600 is present

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages includes SPDX header and frontmatter (new pages only)

AI Disclosure

AI-assisted — tool: Claude Code

Signed-off-by: Jason Ma jama@nvidia.com

Summary by CodeRabbit

New Features
- Added configurable timeout support for local inference setup, applied for the compatible endpoint flow.
Tests
- Added a regression test to verify the configured timeout is forwarded to the inference initialization command.

…oint (#2403) The compatible-endpoint provider skipped the --timeout flag on `openshell inference set`, leaving the 60s default in place even when NEMOCLAW_LOCAL_INFERENCE_TIMEOUT was exported. ollama-local and vllm-local already pass the flag; this brings compatible-endpoint into parity so reasoning-model operators on external LAN endpoints are not silently capped at 60 seconds. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

copy-pr-bot · 2026-04-28T05:34:11Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-04-28T05:34:20Z

📝 Walkthrough

Walkthrough

Adds plumbing so NEMOCLAW_LOCAL_INFERENCE_TIMEOUT (via LOCAL_INFERENCE_TIMEOUT_SECS) is forwarded as --timeout when setupInference configures the compatible-endpoint provider; includes a regression test verifying the value is passed into the openshell inference set invocation.

Changes

Cohort / File(s)	Summary
Timeout configuration for compatible-endpoint `src/lib/onboard.ts`	Conditionalizes `openshell inference set` argument construction to include `--timeout <LOCAL_INFERENCE_TIMEOUT_SECS>` for the `compatible-endpoint` provider.
Regression test for timeout propagation `test/onboard-selection.test.ts`	Adds a test that sets `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT`, injects a fake `openshell` binary to capture `inference set` args, and asserts `--timeout 600` is present for the `compatible-endpoint` flow.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐇 I hopped through code with whiskers twitching bright,
A timeout carried home from dark to light,
Compatible-endpoint now holds the key,
Models can think a while — set them free! 🥕

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: fixing the compatible-endpoint provider to honour the NEMOCLAW_LOCAL_INFERENCE_TIMEOUT environment variable, which directly addresses issue `#2403`.
Linked Issues check	✅ Passed	The PR fully addresses the coding requirements from issue `#2403`: plumbing NEMOCLAW_LOCAL_INFERENCE_TIMEOUT into compatible-endpoint provider so it passes --timeout to openshell inference set, with corresponding regression test added.
Out of Scope Changes check	✅ Passed	All changes are in-scope: src/lib/onboard.ts adds timeout flag for compatible-endpoint provider, and test/onboard-selection.test.ts adds a regression test validating the fix. Both directly address the linked issue requirement.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/compatible-endpoint-timeout-2403

_{Review rate limit: 9/10 reviews remaining, refill in 6 minutes.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

wscurran · 2026-04-28T14:55:16Z

✨ Thanks for submitting this PR that fixes a bug and improves the inference experience with the compatible-endpoint provider by honoring the NEMOCLAW_LOCAL_INFERENCE_TIMEOUT environment variable. This change brings the compatible-endpoint into parity with ollama-local and vllm-local, ensuring that reasoning-model operators on external LAN endpoints are not silently capped at 60 seconds.

Related open issues:

#2403 compatible-endpoint provider does not honour NEMOCLAW_LOCAL_INFERENCE_TIMEOUT (vllm-local and ollama-local do); 60s default leaks through to reasoning-model streams

ericksoa

Looks good. This is narrowly scoped to , reuses the existing local inference timeout setting, and includes a regression test that verifies receives the expected argument.

coderabbitai

🧹 Nitpick comments (1)

test/onboard-selection.test.ts (1)

3456-3463: Tighten timeout assertion to validate flag/value pairing.

Current checks can pass if "600" appears elsewhere in args. Prefer asserting "600" is the immediate value after "--timeout".

Proposed assertion hardening

-    assert.ok(
-      state.inferenceSetArgs.includes("--timeout"),
-      `Expected --timeout in inference set args, got: ${JSON.stringify(state.inferenceSetArgs)}`,
-    );
-    assert.ok(
-      state.inferenceSetArgs.includes("600"),
-      `Expected 600 in inference set args, got: ${JSON.stringify(state.inferenceSetArgs)}`,
-    );
+    const timeoutIdx = state.inferenceSetArgs.indexOf("--timeout");
+    assert.notEqual(
+      timeoutIdx,
+      -1,
+      `Expected --timeout in inference set args, got: ${JSON.stringify(state.inferenceSetArgs)}`,
+    );
+    assert.equal(
+      state.inferenceSetArgs[timeoutIdx + 1],
+      "600",
+      `Expected --timeout value 600, got: ${JSON.stringify(state.inferenceSetArgs)}`,
+    );

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/onboard-selection.test.ts` around lines 3456 - 3463, The test currently
just asserts that "--timeout" and "600" exist anywhere in
state.inferenceSetArgs; change it to find the index of "--timeout" in
state.inferenceSetArgs (using state.inferenceSetArgs.indexOf("--timeout")),
assert the index is >= 0 and index + 1 is within bounds, then assert that
state.inferenceSetArgs[index + 1] === "600" so the flag and value are paired.
This tightens the assertion in the failing test that uses
state.inferenceSetArgs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/onboard-selection.test.ts`:
- Around line 3456-3463: The test currently just asserts that "--timeout" and
"600" exist anywhere in state.inferenceSetArgs; change it to find the index of
"--timeout" in state.inferenceSetArgs (using
state.inferenceSetArgs.indexOf("--timeout")), assert the index is >= 0 and index
+ 1 is within bounds, then assert that state.inferenceSetArgs[index + 1] ===
"600" so the flag and value are paired. This tightens the assertion in the
failing test that uses state.inferenceSetArgs.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d431ed71-5511-4b15-a110-e6a47baa1ddc

📥 Commits

Reviewing files that changed from the base of the PR and between 9e2b975 and db7e8eb.

📒 Files selected for processing (2)

src/lib/onboard.ts
test/onboard-selection.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/lib/onboard.ts

## Summary Refreshes the daily docs from NemoClaw commits merged in the past 24 hours and advances the docs metadata from 0.0.29 to 0.0.31, the next version after tag v0.0.30. The updates cover documented behavior gaps found in the merged PRs listed below. ## Related Issue None. ## Changes - `docs/versions1.json` and `docs/project.json`: bump the preferred docs version to `0.0.31` for daily release preparation after latest tag `v0.0.30`. - `docs/reference/commands.md`: document non-interactive Brave Search validation fallback from #2511 / 9bfe30b, missing `--from <Dockerfile>` path validation from #2597 / 7186834, and `logs` reading OpenShell audit events from #2590 / e225dfb. - `docs/inference/use-local-inference.md`: document local inference reachability retry and host-side fallback from #2453 / 9dbe855, plus compatible-endpoint timeout coverage from #2583 / b4ef3db. - `docs/reference/troubleshooting.md`: document source-install shim fallback from #2520 / 01a177c, TLS gateway trust recovery from #1936 / 24725d2, compatible-endpoint timeout coverage from #2583 / b4ef3db, local reachability diagnostics from #2453 / 9dbe855, and host proxy `NO_PROXY` injection from #2662 / b4df07e. ## Type of Change - [ ] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [x] Doc only (includes code sample changes) ## Verification - [ ] `npx prek run --all-files` passes - [ ] `npm test` passes - [ ] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [x] Docs updated for user-facing behavior changes - [x] `make docs` builds without warnings (doc changes only) - [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) Additional verification: - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --dry-run` passed. - `git diff --check` passed. - Pre-push hooks passed through markdownlint, docs-to-skills, JSON checks, gitleaks, and version sync before `Test (skills YAML)` failed because this fresh worktree lacked `vitest/config`. - `npx prek run --all-files` could not run from the fresh worktree because `npx prek` resolved to a missing `prek@*` package; downloading `@j178/prek` was not approved. - `npm test` could not complete from the fresh worktree because dependencies and compiled `dist/lib/*` artifacts were absent. ## AI Disclosure - [x] AI-assisted — tool: OpenAI Codex --- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>  ## Summary by CodeRabbit * **Documentation** * Version updated to 0.0.31 * Local inference onboarding now includes retry logic for container reachability checks * Web search setup failure handling clarified with fallback guidance * Dockerfile path validation timing documented * Logging behavior clarified for concurrent stream reading * New TLS/certificate troubleshooting section added * Install path and proxy configuration troubleshooting updated  Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

…oint (NVIDIA#2403) (NVIDIA#2583) ## Summary `compatible-endpoint` skipped the `--timeout` flag on `openshell inference set`, leaving the 60 s default active even when `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` was exported. `ollama-local` and `vllm-local` already pass the flag; this brings `compatible-endpoint` into parity so reasoning-model operators on external LAN endpoints are not silently capped at 60 seconds. ## Related Issue Fixes NVIDIA#2403 ## Changes - `src/lib/onboard.ts` — add `--timeout` to `inference set` args when provider is `compatible-endpoint`, using the same `LOCAL_INFERENCE_TIMEOUT_SECS` constant already applied to `ollama-local` and `vllm-local` - `test/onboard-selection.test.ts` — new test that spawns a fake `openshell` binary recording `inference set` args, sets `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=600`, calls `setupInference` directly, and asserts `--timeout 600` is present ## Type of Change - [x] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [ ] Doc only (includes code sample changes) ## Verification - [x] `npx prek run --all-files` passes - [ ] `npm test` passes - [x] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [ ] Docs updated for user-facing behavior changes - [ ] `make docs` builds without warnings (doc changes only) - [ ] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages includes SPDX header and frontmatter (new pages only) ## AI Disclosure - [x] AI-assisted — tool: Claude Code --- Signed-off-by: Jason Ma <jama@nvidia.com>  ## Summary by CodeRabbit * **New Features** * Added configurable timeout support for local inference setup, applied for the compatible endpoint flow. * **Tests** * Added a regression test to verify the configured timeout is forwarded to the inference initialization command.  Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Aaron Erickson 🦞 <aerickson@nvidia.com>

## Summary Refreshes the daily docs from NemoClaw commits merged in the past 24 hours and advances the docs metadata from 0.0.29 to 0.0.31, the next version after tag v0.0.30. The updates cover documented behavior gaps found in the merged PRs listed below. ## Related Issue None. ## Changes - `docs/versions1.json` and `docs/project.json`: bump the preferred docs version to `0.0.31` for daily release preparation after latest tag `v0.0.30`. - `docs/reference/commands.md`: document non-interactive Brave Search validation fallback from NVIDIA#2511 / 9bfe30b, missing `--from <Dockerfile>` path validation from NVIDIA#2597 / 7186834, and `logs` reading OpenShell audit events from NVIDIA#2590 / e225dfb. - `docs/inference/use-local-inference.md`: document local inference reachability retry and host-side fallback from NVIDIA#2453 / 9dbe855, plus compatible-endpoint timeout coverage from NVIDIA#2583 / b4ef3db. - `docs/reference/troubleshooting.md`: document source-install shim fallback from NVIDIA#2520 / 01a177c, TLS gateway trust recovery from NVIDIA#1936 / 24725d2, compatible-endpoint timeout coverage from NVIDIA#2583 / b4ef3db, local reachability diagnostics from NVIDIA#2453 / 9dbe855, and host proxy `NO_PROXY` injection from NVIDIA#2662 / b4df07e. ## Type of Change - [ ] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [x] Doc only (includes code sample changes) ## Verification - [ ] `npx prek run --all-files` passes - [ ] `npm test` passes - [ ] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [x] Docs updated for user-facing behavior changes - [x] `make docs` builds without warnings (doc changes only) - [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) Additional verification: - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --dry-run` passed. - `git diff --check` passed. - Pre-push hooks passed through markdownlint, docs-to-skills, JSON checks, gitleaks, and version sync before `Test (skills YAML)` failed because this fresh worktree lacked `vitest/config`. - `npx prek run --all-files` could not run from the fresh worktree because `npx prek` resolved to a missing `prek@*` package; downloading `@j178/prek` was not approved. - `npm test` could not complete from the fresh worktree because dependencies and compiled `dist/lib/*` artifacts were absent. ## AI Disclosure - [x] AI-assisted — tool: OpenAI Codex --- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>  ## Summary by CodeRabbit * **Documentation** * Version updated to 0.0.31 * Local inference onboarding now includes retry logic for container reachability checks * Web search setup failure handling clarified with fallback guidance * Dockerfile path validation timing documented * Logging behavior clarified for concurrent stream reading * New TLS/certificate troubleshooting section added * Install path and proxy configuration troubleshooting updated  Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

jason-ma-nv self-assigned this Apr 28, 2026

ericksoa approved these changes Apr 29, 2026

View reviewed changes

ericksoa enabled auto-merge (squash) April 29, 2026 12:48

Merge branch 'main' into fix/compatible-endpoint-timeout-2403

db7e8eb

ericksoa disabled auto-merge April 29, 2026 12:55

coderabbitai Bot reviewed Apr 29, 2026

View reviewed changes

ericksoa merged commit b4ef3db into main Apr 29, 2026
12 checks passed

wscurran added the VDR Linked to VDR #4 finding label Apr 29, 2026

miyoungc mentioned this pull request Apr 30, 2026

docs: refresh daily docs for 0.0.31 #2724

Merged

13 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cli): honour NEMOCLAW_LOCAL_INFERENCE_TIMEOUT for compatible-endpoint (#2403)#2583

fix(cli): honour NEMOCLAW_LOCAL_INFERENCE_TIMEOUT for compatible-endpoint (#2403)#2583
ericksoa merged 2 commits intomainfrom
fix/compatible-endpoint-timeout-2403

jason-ma-nv commented Apr 28, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

copy-pr-bot Bot commented Apr 28, 2026

Uh oh!

coderabbitai Bot commented Apr 28, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

wscurran commented Apr 28, 2026

Uh oh!

ericksoa left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jason-ma-nv commented Apr 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Related Issue

Changes

Type of Change

Verification

AI Disclosure

Summary by CodeRabbit

Uh oh!

copy-pr-bot Bot commented Apr 28, 2026

Uh oh!

coderabbitai Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

wscurran commented Apr 28, 2026

Uh oh!

ericksoa left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jason-ma-nv commented Apr 28, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 28, 2026 •

edited

Loading