Fix company runtime retries and detached local runtime health #348

Open
heodongun wants to merge 3 commits into codex/ci-timeouts-and-test-lifecycle from fix/company-runtime-retry-truthfulness

@heodongun (Collaborator) commented Apr 5, 2026

Summary

  • surface stdout-backed OpenCode failures instead of collapsing them to (no stderr) so company execution logs show the real error
  • treat the observed OpenCode DecimalError signature as recoverable so autonomous company issues retry instead of terminally dead-ending in BLOCKED
  • report CLI-only local runtime health as offline until a real app-server instance is attached, and document the final validation evidence and residual risks

Root cause

  • company-executed OpenCode failures were emitting their useful error detail on stdout, but AgentExecutor formatted the user-visible error line from stderr only
  • detached CLI company flows persisted runtime intent as RUNNING, but local backend health was still shown as healthy even when no app-server instance existed for that app home
  • the observed OpenCode failure path ("DecimalError Invalid argument: [object Object]") could leave autonomous execution issues blocked instead of re-entering recoverable retry handling
  • one DesktopAppService test was asserting follow-up issue state before asynchronous reconciliation had finished, which prevented the full Gradle suite from going green after the behavioral fix
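
The first root cause above comes down to which stream feeds the user-visible error line. A minimal sketch of the fallback, assuming a hypothetical helper named ProcessFailureText (this is not the project's ProcessExecutionException or AgentExecutor code, and keeping "(no stderr)" as the last-resort placeholder is an assumption):

```java
// Hypothetical sketch: ProcessFailureText and displayText are illustrative names,
// not the project's actual API.
final class ProcessFailureText {
    private ProcessFailureText() {}

    /** Picks the text shown to the user for a failed process. */
    static String displayText(String stderr, String stdout) {
        // Prefer stderr when it carries any non-whitespace detail.
        if (stderr != null && !stderr.isBlank()) {
            return stderr.strip();
        }
        // Otherwise fall back to stdout, where the observed failures wrote their detail.
        if (stdout != null && !stdout.isBlank()) {
            return stdout.strip();
        }
        // Assumed placeholder when both streams are empty.
        return "(no stderr)";
    }

    public static void main(String[] args) {
        System.out.println(displayText("", "DecimalError reported on stdout"));
        System.out.println(displayText("   ", null));
    }
}
```

With this ordering, a non-blank stderr still wins, so failures that already reported correctly are unaffected.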

Changes

  • fall back to stdout for the ProcessExecutionException display text when stderr is blank
  • classify the observed OpenCode DecimalError signature as recoverable infrastructure failure in company retry logic
  • derive local runtime backend health from the live app-server instance record and expose detached CLI status as offline
  • add regression coverage for stdout-backed process failures, detached runtime status, and recoverable OpenCode decimal failures
  • stabilize the remaining DesktopAppService follow-up test so it waits for the issue reconciliation it depends on
  • add/update docs/reports/2026-04-06-company-validation-and-desktop-state-fix.md
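
The test stabilization listed above amounts to waiting for asynchronous reconciliation before asserting. The PR text does not show how the test waits, so the following is only one common approach: a small polling helper, with the hypothetical name Await, that re-checks a condition until it holds or a timeout expires.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: Await is an illustrative helper, not taken from the project.
final class Await {
    private Await() {}

    /** Polls the condition until it is true or the timeout elapses; returns the final outcome. */
    static boolean until(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A test would then assert on `Await.until(...)` returning true before inspecting follow-up issue state, rather than reading that state immediately after triggering the work.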

Test results

  • export JAVA_HOME=$(/usr/libexec/java_home -v 17) && ./gradlew test
  • export JAVA_HOME=$(/usr/libexec/java_home -v 17) && ./gradlew shadowJar
  • cd macos && swift test

Manual user-scenario validation

  • fresh CLI-created autonomous company now reports backendHealth: offline with backendLifecycleState: STOPPED before any app-server is attached
  • after starting a live app-server, fresh company flows progressed from CEO planning into execution work instead of presenting a falsely healthy detached runtime
  • in a 65-second fresh validation run, the company recovered from a failed execution run and re-entered IN_PROGRESS with a fresh retry task instead of staying terminally blocked
  • replaying the real /Users/Projects/bssm-oss/cotor-organization/cotor-test company for roughly a minute showed the runtime continuing to reconcile and restart work while issue lanes moved across DELEGATED, IN_PROGRESS, and IN_REVIEW
  • company execution logs now preserve stdout-backed OpenCode failure detail, including the observed DecimalError NDJSON payloads
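
The detached-runtime behavior validated above can be pictured as deriving health from the presence of a live app-server instance rather than from persisted intent. A minimal sketch, assuming hypothetical names (LocalRuntimeStatus, derive, an instance identifier as a string); the "healthy" and "RUNNING" values for the attached case are assumptions, while "offline" and "STOPPED" come from the validation notes:

```java
import java.util.Optional;

// Hypothetical sketch: names and the attached-case values are illustrative assumptions.
final class LocalRuntimeStatus {
    final String backendHealth;
    final String backendLifecycleState;

    private LocalRuntimeStatus(String backendHealth, String backendLifecycleState) {
        this.backendHealth = backendHealth;
        this.backendLifecycleState = backendLifecycleState;
    }

    /**
     * Derives status from the live instance record only.
     * persistedIntent (for example "RUNNING") is deliberately ignored for health.
     */
    static LocalRuntimeStatus derive(String persistedIntent, Optional<String> liveInstanceId) {
        if (liveInstanceId.isPresent()) {
            return new LocalRuntimeStatus("healthy", "RUNNING");
        }
        return new LocalRuntimeStatus("offline", "STOPPED");
    }

    public static void main(String[] args) {
        LocalRuntimeStatus detached = derive("RUNNING", Optional.empty());
        System.out.println(detached.backendHealth + " / " + detached.backendLifecycleState);
    }
}
```

The point of the design is that a CLI-only flow with a persisted RUNNING intent but no attached instance no longer reads as healthy.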

Impact

  • company runtime visibility is more truthful for CLI-only local flows
  • autonomous company execution is more resilient to the observed OpenCode decimal failure path
  • residual risk remains: the underlying OpenCode-side DecimalError still appears intermittently, but the workflow now retries and continues instead of dead-ending
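
Since the upstream error still occurs, the resilience described above rests on classifying it as recoverable. A minimal sketch of that idea, assuming a hypothetical FailureClassifier, a plain substring match on "DecimalError", and an assumed maxAttempts cap; the actual matcher and retry limits in the PR may differ:

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: the signature list, class name, and attempt cap are assumptions.
final class FailureClassifier {
    enum Kind { RECOVERABLE, TERMINAL }

    // Lowercased fragments treated as recoverable infrastructure failures.
    private static final List<String> RECOVERABLE_SIGNATURES = List.of("decimalerror");

    private FailureClassifier() {}

    static Kind classify(String errorText) {
        if (errorText == null) {
            return Kind.TERMINAL;
        }
        String normalized = errorText.toLowerCase(Locale.ROOT);
        for (String signature : RECOVERABLE_SIGNATURES) {
            if (normalized.contains(signature)) {
                return Kind.RECOVERABLE;
            }
        }
        return Kind.TERMINAL;
    }

    /** Next issue state after a failed run: retry only if recoverable and attempts remain. */
    static String nextIssueState(String errorText, int attempt, int maxAttempts) {
        boolean retry = classify(errorText) == Kind.RECOVERABLE && attempt < maxAttempts;
        return retry ? "IN_PROGRESS" : "BLOCKED";
    }
}
```

Capping attempts keeps a persistently failing run from retrying forever, so BLOCKED remains reachable once retries are exhausted.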

heodongun and others added 2 commits April 6, 2026 01:20
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@coderabbitai (Bot) commented Apr 5, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 85e07a3c-cbb3-48f3-b2bf-4af643bf9b4a

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@heodongun heodongun force-pushed the codex/ci-timeouts-and-test-lifecycle branch from 235d75a to ff20867 Compare May 6, 2026 13:02
@heodongun heodongun force-pushed the fix/company-runtime-retry-truthfulness branch from 5810401 to 470fb1d Compare May 6, 2026 13:02