fix(cli): honour NEMOCLAW_LOCAL_INFERENCE_TIMEOUT for compatible-endpoint (#2403) #2583

Merged
ericksoa merged 2 commits into main from fix/compatible-endpoint-timeout-2403 on Apr 29, 2026

Conversation

@jason-ma-nv jason-ma-nv commented Apr 28, 2026

Summary

compatible-endpoint skipped the --timeout flag on openshell inference set, leaving the 60 s default active even when NEMOCLAW_LOCAL_INFERENCE_TIMEOUT was exported. ollama-local and vllm-local already pass the flag; this brings compatible-endpoint into parity so reasoning-model operators on external LAN endpoints are not silently capped at 60 seconds.

Related Issue

Fixes #2403

Changes

  • src/lib/onboard.ts — add --timeout to inference set args when provider is compatible-endpoint, using the same LOCAL_INFERENCE_TIMEOUT_SECS constant already applied to ollama-local and vllm-local
  • test/onboard-selection.test.ts — new test that spawns a fake openshell binary recording inference set args, sets NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=600, calls setupInference directly, and asserts --timeout 600 is present
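
The change to the argument list can be pictured with a minimal sketch. This is not the code in src/lib/onboard.ts: the helper names `resolveTimeoutSecs` and `withTimeout`, the `baseArgs` parameter, and the caller-supplied fallback are assumptions made for illustration. Only the provider names, the `--timeout` flag, and the environment variable name come from this PR.

```typescript
// Hypothetical sketch. Only the provider ids, the --timeout flag, and the
// NEMOCLAW_LOCAL_INFERENCE_TIMEOUT variable name are taken from the PR.
type Env = Record<string, string | undefined>;

// Providers that forward the timeout. Per the PR, the two local providers
// already did; the fix brings compatible-endpoint into the same group.
const TIMEOUT_PROVIDERS: readonly string[] = [
  "ollama-local",
  "vllm-local",
  "compatible-endpoint",
];

// Read the timeout from the environment. Falls back to `fallbackSecs` when
// the variable is unset or is not a positive whole number of seconds.
// How the real constant picks its fallback is not shown in the PR.
function resolveTimeoutSecs(env: Env, fallbackSecs: number): number {
  const raw = env["NEMOCLAW_LOCAL_INFERENCE_TIMEOUT"];
  if (raw === undefined || !/^[0-9]+$/.test(raw.trim())) {
    return fallbackSecs;
  }
  const parsed = Number(raw.trim());
  return parsed > 0 ? parsed : fallbackSecs;
}

// Append "--timeout <secs>" to an already-built `inference set` argument
// list when the provider is one of the timeout-aware providers.
function withTimeout(
  provider: string,
  baseArgs: readonly string[],
  timeoutSecs: number,
): string[] {
  return TIMEOUT_PROVIDERS.includes(provider)
    ? [...baseArgs, "--timeout", String(timeoutSecs)]
    : [...baseArgs];
}
```

Keeping the provider check in one place means a provider added later cannot silently miss the flag, which is the failure mode this PR fixes.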

Type of Change

  • [x] Code change (feature, bug fix, or refactor)
  • [ ] Code change with doc updates
  • [ ] Doc only (prose changes, no code sample modifications)
  • [ ] Doc only (includes code sample changes)

Verification

  • [x] npx prek run --all-files passes
  • [ ] npm test passes
  • [x] Tests added or updated for new or changed behavior
  • [x] No secrets, API keys, or credentials committed
  • [ ] Docs updated for user-facing behavior changes
  • [ ] make docs builds without warnings (doc changes only)
  • [ ] Doc pages follow the style guide (doc changes only)
  • [ ] New doc pages include SPDX header and frontmatter (new pages only)

AI Disclosure

  • AI-assisted — tool: Claude Code

Signed-off-by: Jason Ma <jama@nvidia.com>

Summary by CodeRabbit

  • New Features

    • Added configurable timeout support for local inference setup, applied for the compatible endpoint flow.
  • Tests

    • Added a regression test to verify the configured timeout is forwarded to the inference initialization command.

…oint (#2403)

The compatible-endpoint provider skipped the --timeout flag on
`openshell inference set`, leaving the 60s default in place even when
NEMOCLAW_LOCAL_INFERENCE_TIMEOUT was exported. ollama-local and
vllm-local already pass the flag; this brings compatible-endpoint into
parity so reasoning-model operators on external LAN endpoints are not
silently capped at 60 seconds.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jason-ma-nv jason-ma-nv self-assigned this Apr 28, 2026

copy-pr-bot Bot commented Apr 28, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai Bot commented Apr 28, 2026

Walkthrough

Adds plumbing so NEMOCLAW_LOCAL_INFERENCE_TIMEOUT (via LOCAL_INFERENCE_TIMEOUT_SECS) is forwarded as --timeout when setupInference configures the compatible-endpoint provider; includes a regression test verifying the value is passed into the openshell inference set invocation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Timeout configuration for compatible-endpoint: src/lib/onboard.ts | Conditionalizes openshell inference set argument construction to include --timeout <LOCAL_INFERENCE_TIMEOUT_SECS> for the compatible-endpoint provider. |
| Regression test for timeout propagation: test/onboard-selection.test.ts | Adds a test that sets NEMOCLAW_LOCAL_INFERENCE_TIMEOUT, injects a fake openshell binary to capture inference set args, and asserts --timeout 600 is present for the compatible-endpoint flow. |
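
The capture-and-assert idea behind the regression test can be sketched in a few lines. Note the substitution: the real test places a fake openshell binary in front of the code under test, whereas this sketch uses a simpler in-process technique, an injected recording function. `Runner`, `makeRecorder`, and `configureInference` are hypothetical names, and `configureInference` is only a stand-in, not setupInference.

```typescript
// Hypothetical sketch of capturing `inference set` arguments in-process.
// The real test uses a fake openshell binary instead of an injected runner.
type Runner = (command: string, args: readonly string[]) => void;

interface Recorder {
  run: Runner;
  inferenceSetArgs: string[] | null;
}

// Build a runner that remembers the arguments of the most recent
// `openshell inference set ...` call and ignores everything else.
function makeRecorder(): Recorder {
  const recorder: Recorder = {
    inferenceSetArgs: null,
    run(command, args) {
      if (command === "openshell" && args[0] === "inference" && args[1] === "set") {
        recorder.inferenceSetArgs = [...args];
      }
    },
  };
  return recorder;
}

// Stand-in for the code under test: it issues one `inference set` call
// that carries the configured timeout.
function configureInference(run: Runner, timeoutSecs: number): void {
  run("openshell", ["inference", "set", "--timeout", String(timeoutSecs)]);
}

const recorder = makeRecorder();
configureInference(recorder.run, 600);
// recorder.inferenceSetArgs now holds the captured argument list.
```

A fake binary exercises the real process-spawning path, so it catches quoting and PATH problems that an injected function cannot; the in-process version is shown here only because it is self-contained.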

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐇 I hopped through code with whiskers twitching bright,
A timeout carried home from dark to light,
Compatible-endpoint now holds the key,
Models can think a while — set them free! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: fixing the compatible-endpoint provider to honour the NEMOCLAW_LOCAL_INFERENCE_TIMEOUT environment variable, which directly addresses issue #2403. |
| Linked Issues check | ✅ Passed | The PR fully addresses the coding requirements from issue #2403: plumbing NEMOCLAW_LOCAL_INFERENCE_TIMEOUT into compatible-endpoint provider so it passes --timeout to openshell inference set, with corresponding regression test added. |
| Out of Scope Changes check | ✅ Passed | All changes are in-scope: src/lib/onboard.ts adds timeout flag for compatible-endpoint provider, and test/onboard-selection.test.ts adds a regression test validating the fix. Both directly address the linked issue requirement. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@wscurran wscurran added the bug, Local Models, NemoClaw CLI, and enhancement: inference labels Apr 28, 2026
@wscurran

✨ Thanks for submitting this PR that fixes a bug and improves the inference experience with the compatible-endpoint provider by honoring the NEMOCLAW_LOCAL_INFERENCE_TIMEOUT environment variable. This change brings the compatible-endpoint into parity with ollama-local and vllm-local, ensuring that reasoning-model operators on external LAN endpoints are not silently capped at 60 seconds.


Related open issues:

@ericksoa ericksoa left a comment

Looks good. This is narrowly scoped, reuses the existing local inference timeout setting, and includes a regression test that verifies the expected argument is passed.

@ericksoa ericksoa enabled auto-merge (squash) April 29, 2026 12:48
@ericksoa ericksoa disabled auto-merge April 29, 2026 12:55

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
test/onboard-selection.test.ts (1)

3456-3463: Tighten timeout assertion to validate flag/value pairing.

Current checks can pass if "600" appears elsewhere in args. Prefer asserting "600" is the immediate value after "--timeout".

Proposed assertion hardening
-    assert.ok(
-      state.inferenceSetArgs.includes("--timeout"),
-      `Expected --timeout in inference set args, got: ${JSON.stringify(state.inferenceSetArgs)}`,
-    );
-    assert.ok(
-      state.inferenceSetArgs.includes("600"),
-      `Expected 600 in inference set args, got: ${JSON.stringify(state.inferenceSetArgs)}`,
-    );
+    const timeoutIdx = state.inferenceSetArgs.indexOf("--timeout");
+    assert.notEqual(
+      timeoutIdx,
+      -1,
+      `Expected --timeout in inference set args, got: ${JSON.stringify(state.inferenceSetArgs)}`,
+    );
+    assert.equal(
+      state.inferenceSetArgs[timeoutIdx + 1],
+      "600",
+      `Expected --timeout value 600, got: ${JSON.stringify(state.inferenceSetArgs)}`,
+    );
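
The difference between the two assertion styles is easiest to see on an argument list where they disagree. The following is a minimal sketch; `flagValue` and the `misleading` argument list are hypothetical and exist only to illustrate the point made in the nitpick.

```typescript
// Return the argument immediately following `flag`, or undefined when the
// flag is absent or is the last element of the list.
function flagValue(args: readonly string[], flag: string): string | undefined {
  const idx = args.indexOf(flag);
  return idx === -1 ? undefined : args[idx + 1];
}

// Hypothetical argument list: "600" is present, but it is a stray trailing
// value and not the value paired with --timeout.
const misleading: readonly string[] = ["inference", "set", "--timeout", "60", "600"];

// Loose style: both strings appear somewhere, so this check passes even
// though the timeout actually passed is 60.
const looseCheck = misleading.includes("--timeout") && misleading.includes("600");

// Paired style: the value right after --timeout is "60", so this check
// correctly fails.
const pairedCheck = flagValue(misleading, "--timeout") === "600";
```

Here `looseCheck` evaluates to true and `pairedCheck` to false, which is the false positive the hardened assertion rules out.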
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard-selection.test.ts` around lines 3456 - 3463, The test currently
just asserts that "--timeout" and "600" exist anywhere in
state.inferenceSetArgs; change it to find the index of "--timeout" in
state.inferenceSetArgs (using state.inferenceSetArgs.indexOf("--timeout")),
assert the index is >= 0 and index + 1 is within bounds, then assert that
state.inferenceSetArgs[index + 1] === "600" so the flag and value are paired.
This tightens the assertion in the failing test that uses
state.inferenceSetArgs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d431ed71-5511-4b15-a110-e6a47baa1ddc

📥 Commits

Reviewing files that changed from the base of the PR and between 9e2b975 and db7e8eb.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/onboard-selection.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard.ts

@ericksoa ericksoa merged commit b4ef3db into main Apr 29, 2026
12 checks passed
@wscurran wscurran added the VDR label (Linked to VDR #4 finding) Apr 29, 2026
@miyoungc miyoungc mentioned this pull request Apr 30, 2026
13 tasks
miyoungc added a commit that referenced this pull request Apr 30, 2026
## Summary
Refreshes the daily docs from NemoClaw commits merged in the past 24
hours and advances the docs metadata from 0.0.29 to 0.0.31, the next
version after tag v0.0.30.
The updates cover documented behavior gaps found in the merged PRs
listed below.

## Related Issue
None.

## Changes
- `docs/versions1.json` and `docs/project.json`: bump the preferred docs
version to `0.0.31` for daily release preparation after latest tag
`v0.0.30`.
- `docs/reference/commands.md`: document non-interactive Brave Search
validation fallback from #2511 / 9bfe30b, missing `--from <Dockerfile>`
path validation from #2597 / 7186834, and `logs` reading OpenShell
audit events from #2590 / e225dfb.
- `docs/inference/use-local-inference.md`: document local inference
reachability retry and host-side fallback from #2453 / 9dbe855, plus
compatible-endpoint timeout coverage from #2583 / b4ef3db.
- `docs/reference/troubleshooting.md`: document source-install shim
fallback from #2520 / 01a177c, TLS gateway trust recovery from #1936 /
24725d2, compatible-endpoint timeout coverage from #2583 / b4ef3db,
local reachability diagnostics from #2453 / 9dbe855, and host proxy
`NO_PROXY` injection from #2662 / b4df07e.

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification
- [ ] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

Additional verification:
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --dry-run` passed.
- `git diff --check` passed.
- Pre-push hooks passed through markdownlint, docs-to-skills, JSON
checks, gitleaks, and version sync before `Test (skills YAML)` failed
because this fresh worktree lacked `vitest/config`.
- `npx prek run --all-files` could not run from the fresh worktree
because `npx prek` resolved to a missing `prek@*` package; downloading
`@j178/prek` was not approved.
- `npm test` could not complete from the fresh worktree because
dependencies and compiled `dist/lib/*` artifacts were absent.

## AI Disclosure
- [x] AI-assisted — tool: OpenAI Codex

---
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

## Summary by CodeRabbit

* **Documentation**
  * Version updated to 0.0.31
  * Local inference onboarding now includes retry logic for container
    reachability checks
  * Web search setup failure handling clarified with fallback guidance
  * Dockerfile path validation timing documented
  * Logging behavior clarified for concurrent stream reading
  * New TLS/certificate troubleshooting section added
  * Install path and proxy configuration troubleshooting updated

DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026
…oint (NVIDIA#2403) (NVIDIA#2583)

## Summary

`compatible-endpoint` skipped the `--timeout` flag on `openshell
inference set`, leaving the 60 s default active even when
`NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` was exported. `ollama-local` and
`vllm-local` already pass the flag; this brings `compatible-endpoint`
into parity so reasoning-model operators on external LAN endpoints are
not silently capped at 60 seconds.

## Related Issue

Fixes NVIDIA#2403

## Changes

- `src/lib/onboard.ts` — add `--timeout` to `inference set` args when
provider is `compatible-endpoint`, using the same
`LOCAL_INFERENCE_TIMEOUT_SECS` constant already applied to
`ollama-local` and `vllm-local`
- `test/onboard-selection.test.ts` — new test that spawns a fake
`openshell` binary recording `inference set` args, sets
`NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=600`, calls `setupInference` directly,
and asserts `--timeout 600` is present

## Type of Change

- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only)
- [ ] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages includes SPDX header and frontmatter (new pages
only)

## AI Disclosure

- [x] AI-assisted — tool: Claude Code

---
Signed-off-by: Jason Ma <jama@nvidia.com>

## Summary by CodeRabbit

* **New Features**
  * Added configurable timeout support for local inference setup, applied
    for the compatible endpoint flow.

* **Tests**
  * Added a regression test to verify the configured timeout is forwarded
    to the inference initialization command.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Aaron Erickson 🦞 <aerickson@nvidia.com>
DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026