fix(nodes): improve error handling in remote node execution by charliegillet · Pull Request #565 · rocketride-org/rocketride-server

charliegillet · 2026-03-31T00:31:07Z

Summary

Replace fragile string-based error origin detection ('RemoteException:' not in msg) with proper type checking (isinstance(e, APERR) and e.ec == Ec.RemoteException)
Always send error signals to the remote side before re-raising, preventing the remote pipeline from hanging when an already-wrapped RemoteException is caught
Narrow exception catch from BaseException to Exception to avoid intercepting KeyboardInterrupt and SystemExit

Type

fix

Testing

Verify remote node pipeline execution completes normally (no regression)
Trigger an error in local pipeline processing and confirm the remote side receives the error signal instead of hanging
Trigger a KeyboardInterrupt during remote execution and confirm it is no longer caught by the error handler

Changed files

nodes/src/nodes/remote/base/IInstance.py — callRemote() error handler
nodes/src/nodes/remote/server/IInstance.py — handleWebSocket() error handler

🤖 Generated with Claude Code

…screen Handle TASK_STATE.STOPPING in the control button to show "Stopping..." with a disabled state and distinct orange styling, preventing duplicate clicks and giving immediate visual feedback during pipeline shutdown. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace fragile string-based error origin detection with proper type checking (isinstance + error code comparison). Always send error signals to the remote side to prevent hanging connections. Narrow exception catch from BaseException to Exception to avoid intercepting KeyboardInterrupt and SystemExit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-31T00:33:23Z

Warning

Rate limit exceeded

@charliegillet has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 1 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 2 minutes and 1 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 69f8dc7b-b664-48bb-bdb9-b709f14d5bc8

📥 Commits

Reviewing files that changed from the base of the PR and between b6b2dc1 and 02adaef.

📒 Files selected for processing (4)

apps/vscode/src/providers/views/PageStatus/PageStatus.tsx
apps/vscode/src/providers/views/PageStatus/styles.css
nodes/src/nodes/remote/base/IInstance.py
nodes/src/nodes/remote/server/IInstance.py

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

ryan-t-christensen · 2026-03-31T21:25:02Z

Looks good @charliegillet - I did test the STOPPING button state — it transitioned in ~0.6ms, so it's never visible in practice. The UI change is correct but the STOPPING state is set in _terminated() (task_engine.py:551) after the subprocess has already exited, not when the stop is requested. To make this useful the server would need to transition to STOPPING at the moment the stop command is issued. As-is the button will never render. That probably warrants a separate issue.

cc @stepmikhaylov

charliegillet · 2026-04-03T18:45:40Z

@ryan-t-christensen thanks for the thorough testing and the approval! Good catch on the STOPPING state timing — you're right that since _terminated() sets it after the subprocess exits, the button state is effectively invisible. I'll open a separate issue to track moving the STOPPING transition to the moment the stop command is issued on the server side, so the UI can actually reflect it.

* feat(vscode): improve stop button feedback in Pipeline Observability screen Handle TASK_STATE.STOPPING in the control button to show "Stopping..." with a disabled state and distinct orange styling, preventing duplicate clicks and giving immediate visual feedback during pipeline shutdown. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(nodes): improve error handling in remote node execution Replace fragile string-based error origin detection with proper type checking (isinstance + error code comparison). Always send error signals to the remote side to prevent hanging connections. Narrow exception catch from BaseException to Exception to avoid intercepting KeyboardInterrupt and SystemExit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add `head_branch == 'develop'` guard to the nightly workflow's init job. The `branches: [develop]` filter on the trigger already enforces this, but static analyzers (OSSF Scorecard's DangerousWorkflow rule) can't trace that — they see workflow_run + head_sha checkout + secrets: inherit and flag the pattern critical regardless of the trigger filter. The explicit branch check ties the trust boundary to develop's branch protection (PR + code-owner review + required checks), which is the actual safety guarantee. All downstream jobs depend on init via needs:, so they inherit the gate without per-job changes. Behavior is unchanged: workflow_run runs that wouldn't have fired under the trigger filter still don't fire. Manual workflow_dispatch remains unaffected. Note: `workflow_run` triggers always execute the version of this file from main (per GitHub's rules), so this fix takes effect for runtime safety only after develop → main lands. The static-analysis finding (#760, alert #565) closes once the Scorecard scan re-runs against develop with this change. Fixes #760

…771) * fix(security): explicit branch check on nightly workflow_run trigger Add `head_branch == 'develop'` guard to the nightly workflow's init job. The `branches: [develop]` filter on the trigger already enforces this, but static analyzers (OSSF Scorecard's DangerousWorkflow rule) can't trace that — they see workflow_run + head_sha checkout + secrets: inherit and flag the pattern critical regardless of the trigger filter. The explicit branch check ties the trust boundary to develop's branch protection (PR + code-owner review + required checks), which is the actual safety guarantee. All downstream jobs depend on init via needs:, so they inherit the gate without per-job changes. Behavior is unchanged: workflow_run runs that wouldn't have fired under the trigger filter still don't fire. Manual workflow_dispatch remains unaffected. Note: `workflow_run` triggers always execute the version of this file from main (per GitHub's rules), so this fix takes effect for runtime safety only after develop → main lands. The static-analysis finding (#760, alert #565) closes once the Scorecard scan re-runs against develop with this change. Fixes #760 * fix(security): restrict workflow_dispatch to develop ref Per CodeRabbit review on PR #771: the workflow_dispatch arm in nightly.yaml accepted any ref, which combined with secrets: inherit and write permissions creates a residual privilege-escalation path even for trusted users. Tying manual dispatch to refs/heads/develop makes the trust boundary uniform across both trigger paths (workflow_run already gated to develop's branch protection chain). Operational impact is negligible: hot-fix flows go through develop already, and "republishing a known-good ref" practically always means a develop commit. Anyone wanting to publish from a feature branch must merge to develop first — which is the right policy for SOC2 change-management evidence. Refs #760 --------- Co-authored-by: Anand Ray <anand.ray@rocketride.ai>

CodeRabbit flagged that the policy text said "admin permissions on the repository" for the second-principal approver, which is the wrong authorization scope. GitHub Delegated Alert Dismissal is an org-level control: only organization owners, security managers, or holders of explicitly delegated custom roles can approve dismissal requests. Repo-admin alone is not sufficient. This matches what actually worked for alert #565: kwit75 approved as a rocketride-org owner, not via repo-admin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…lerts CodeRabbit flagged that Scorecard appeared in the scanning-tools list but wasn't covered explicitly by any dismissal-flow section. In our setup, Scorecard publishes SARIF that surfaces as code scanning alerts in GitHub's UI (that's the channel by which alert #565 — Scorecard DangerousWorkflowID — was dismissed via the two-person flow). Adds an intro sentence making this coverage explicit so an auditor reading the doc doesn't have to infer it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(security): document two-person control on alert dismissals Adds a "Vulnerability & Alert Triage" section to SECURITY.md that codifies the dismissal flow we already operate under and that GitHub's delegated alert dismissal enforces: - Request (any write maintainer, with justification) - Approval (different admin maintainer; no self-approval) - GitHub auto-applies dismissal after approval Also documents: - The four valid triage dispositions (Fix / Mitigated / False positive / Won't fix) and the evidence each requires - Secret Scanning Push Protection bypass logging - Dependabot dismissal owner + comment requirement The flow was verified end-to-end on Scorecard alert #565 (anandray requested with documented compensating controls, kwit75 approved as second principal). This commit captures the policy that maps to that audit trail; it's a documentation-only change with no code impact. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(security): correct approver-role wording for delegated dismissal CodeRabbit flagged that the policy text said "admin permissions on the repository" for the second-principal approver, which is the wrong authorization scope. GitHub Delegated Alert Dismissal is an org-level control: only organization owners, security managers, or holders of explicitly delegated custom roles can approve dismissal requests. Repo-admin alone is not sufficient. This matches what actually worked for alert #565: kwit75 approved as a rocketride-org owner, not via repo-admin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(security): align Dependabot dismissal wording with delegated-dismissal model Same fix as a42cf81 but for the Dependabot bullet on line 71. CodeRabbit caught that "repository admins" was wrong for the same reason: GitHub Delegated Alert Dismissal is an org-level control, authorized by organization owners, security managers, or holders of explicitly delegated custom roles — not repo-admin alone. The new wording also explicitly points at the two-person request → approval flow defined earlier in the section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

charliegillet and others added 2 commits March 30, 2026 17:11

charliegillet requested review from Rod-Christensen, jmaionchi and stepmikhaylov as code owners March 31, 2026 00:31

github-actions Bot added module:nodes Python pipeline nodes module:vscode VS Code extension labels Mar 31, 2026

ryan-t-christensen approved these changes Mar 31, 2026

View reviewed changes

stepmikhaylov approved these changes Apr 6, 2026

View reviewed changes

stepmikhaylov merged commit 25b0c0d into rocketride-org:develop Apr 6, 2026
11 checks passed

anandray mentioned this pull request May 6, 2026

fix(security): explicit branch check on nightly workflow_run trigger #771

Merged

7 tasks

anandray mentioned this pull request May 8, 2026

docs(security): document two-person control on alert dismissals #800

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(nodes): improve error handling in remote node execution#565

fix(nodes): improve error handling in remote node execution#565
stepmikhaylov merged 2 commits intorocketride-org:developfrom
charliegillet:fix/remote-node-error-handling

charliegillet commented Mar 31, 2026

Uh oh!

coderabbitai Bot commented Mar 31, 2026

Rate limit exceeded

Uh oh!

ryan-t-christensen commented Mar 31, 2026 •

edited

Loading

Uh oh!

charliegillet commented Apr 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

charliegillet commented Mar 31, 2026

Summary

Type

Testing

Changed files

Uh oh!

coderabbitai Bot commented Mar 31, 2026

Rate limit exceeded

Uh oh!

ryan-t-christensen commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

charliegillet commented Apr 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ryan-t-christensen commented Mar 31, 2026 •

edited

Loading