fix(nodes): improve error handling in remote node execution#565
Conversation
…screen Handle TASK_STATE.STOPPING in the control button to show "Stopping..." with a disabled state and distinct orange styling, preventing duplicate clicks and giving immediate visual feedback during pipeline shutdown. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace fragile string-based error origin detection with proper type checking (isinstance + error code comparison). Always send error signals to the remote side to prevent hanging connections. Narrow exception catch from BaseException to Exception to avoid intercepting KeyboardInterrupt and SystemExit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 2 minutes and 1 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (4)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Looks good @charliegillet - I did test the STOPPING button state — it transitioned in ~0.6ms, so it's never visible in practice. The UI change is correct but the STOPPING state is set in _terminated() (task_engine.py:551) after the subprocess has already exited, not when the stop is requested. To make this useful the server would need to transition to STOPPING at the moment the stop command is issued. As-is the button will never render. That probably warrants a separate issue. |
|
@ryan-t-christensen thanks for the thorough testing and the approval! Good catch on the STOPPING state timing — you're right that since |
* feat(vscode): improve stop button feedback in Pipeline Observability screen Handle TASK_STATE.STOPPING in the control button to show "Stopping..." with a disabled state and distinct orange styling, preventing duplicate clicks and giving immediate visual feedback during pipeline shutdown. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(nodes): improve error handling in remote node execution Replace fragile string-based error origin detection with proper type checking (isinstance + error code comparison). Always send error signals to the remote side to prevent hanging connections. Narrow exception catch from BaseException to Exception to avoid intercepting KeyboardInterrupt and SystemExit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add `head_branch == 'develop'` guard to the nightly workflow's init job. The `branches: [develop]` filter on the trigger already enforces this, but static analyzers (OSSF Scorecard's DangerousWorkflow rule) can't trace that — they see workflow_run + head_sha checkout + secrets: inherit and flag the pattern critical regardless of the trigger filter. The explicit branch check ties the trust boundary to develop's branch protection (PR + code-owner review + required checks), which is the actual safety guarantee. All downstream jobs depend on init via needs:, so they inherit the gate without per-job changes. Behavior is unchanged: workflow_run runs that wouldn't have fired under the trigger filter still don't fire. Manual workflow_dispatch remains unaffected. Note: `workflow_run` triggers always execute the version of this file from main (per GitHub's rules), so this fix takes effect for runtime safety only after develop → main lands. The static-analysis finding (#760, alert #565) closes once the Scorecard scan re-runs against develop with this change. Fixes #760
…771) * fix(security): explicit branch check on nightly workflow_run trigger Add `head_branch == 'develop'` guard to the nightly workflow's init job. The `branches: [develop]` filter on the trigger already enforces this, but static analyzers (OSSF Scorecard's DangerousWorkflow rule) can't trace that — they see workflow_run + head_sha checkout + secrets: inherit and flag the pattern critical regardless of the trigger filter. The explicit branch check ties the trust boundary to develop's branch protection (PR + code-owner review + required checks), which is the actual safety guarantee. All downstream jobs depend on init via needs:, so they inherit the gate without per-job changes. Behavior is unchanged: workflow_run runs that wouldn't have fired under the trigger filter still don't fire. Manual workflow_dispatch remains unaffected. Note: `workflow_run` triggers always execute the version of this file from main (per GitHub's rules), so this fix takes effect for runtime safety only after develop → main lands. The static-analysis finding (#760, alert #565) closes once the Scorecard scan re-runs against develop with this change. Fixes #760 * fix(security): restrict workflow_dispatch to develop ref Per CodeRabbit review on PR #771: the workflow_dispatch arm in nightly.yaml accepted any ref, which combined with secrets: inherit and write permissions creates a residual privilege-escalation path even for trusted users. Tying manual dispatch to refs/heads/develop makes the trust boundary uniform across both trigger paths (workflow_run already gated to develop's branch protection chain). Operational impact is negligible: hot-fix flows go through develop already, and "republishing a known-good ref" practically always means a develop commit. Anyone wanting to publish from a feature branch must merge to develop first — which is the right policy for SOC2 change-management evidence. Refs #760 --------- Co-authored-by: Anand Ray <anand.ray@rocketride.ai>
CodeRabbit flagged that the policy text said "admin permissions on the repository" for the second-principal approver, which is the wrong authorization scope. GitHub Delegated Alert Dismissal is an org-level control: only organization owners, security managers, or holders of explicitly delegated custom roles can approve dismissal requests. Repo-admin alone is not sufficient. This matches what actually worked for alert #565: kwit75 approved as a rocketride-org owner, not via repo-admin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lerts CodeRabbit flagged that Scorecard appeared in the scanning-tools list but wasn't covered explicitly by any dismissal-flow section. In our setup, Scorecard publishes SARIF that surfaces as code scanning alerts in GitHub's UI (that's the channel by which alert #565 — Scorecard DangerousWorkflowID — was dismissed via the two-person flow). Adds an intro sentence making this coverage explicit so an auditor reading the doc doesn't have to infer it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(security): document two-person control on alert dismissals Adds a "Vulnerability & Alert Triage" section to SECURITY.md that codifies the dismissal flow we already operate under and that GitHub's delegated alert dismissal enforces: - Request (any write maintainer, with justification) - Approval (different admin maintainer; no self-approval) - GitHub auto-applies dismissal after approval Also documents: - The four valid triage dispositions (Fix / Mitigated / False positive / Won't fix) and the evidence each requires - Secret Scanning Push Protection bypass logging - Dependabot dismissal owner + comment requirement The flow was verified end-to-end on Scorecard alert #565 (anandray requested with documented compensating controls, kwit75 approved as second principal). This commit captures the policy that maps to that audit trail; it's a documentation-only change with no code impact. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(security): correct approver-role wording for delegated dismissal CodeRabbit flagged that the policy text said "admin permissions on the repository" for the second-principal approver, which is the wrong authorization scope. GitHub Delegated Alert Dismissal is an org-level control: only organization owners, security managers, or holders of explicitly delegated custom roles can approve dismissal requests. Repo-admin alone is not sufficient. This matches what actually worked for alert #565: kwit75 approved as a rocketride-org owner, not via repo-admin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(security): align Dependabot dismissal wording with delegated-dismissal model Same fix as a42cf81 but for the Dependabot bullet on line 71. CodeRabbit caught that "repository admins" was wrong for the same reason: GitHub Delegated Alert Dismissal is an org-level control, authorized by organization owners, security managers, or holders of explicitly delegated custom roles — not repo-admin alone. The new wording also explicitly points at the two-person request → approval flow defined earlier in the section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
'RemoteException:' not in msg) with proper type checking (isinstance(e, APERR) and e.ec == Ec.RemoteException)RemoteExceptionis caughtBaseExceptiontoExceptionto avoid interceptingKeyboardInterruptandSystemExitType
fix
Testing
KeyboardInterruptduring remote execution and confirm it is no longer caught by the error handlerChanged files
nodes/src/nodes/remote/base/IInstance.py—callRemote()error handlernodes/src/nodes/remote/server/IInstance.py—handleWebSocket()error handler🤖 Generated with Claude Code