refactor(orchestrator): make startingSandboxes limit configurable via feature flag #2622
Replace the hardcoded maxStartingInstancesPerNode=3 weighted semaphore with an AdjustableSemaphore driven by the MaxStartingInstancesPerNode feature flag, so per-node start/resume concurrency can be tuned at runtime without a redeploy. A background refresher resizes the semaphore every 30s.
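To make the idea of a resizable limit concrete, here is a minimal sketch of a semaphore whose capacity can change while callers are waiting. It is not the repository's utils.AdjustableSemaphore: the type adjustableSemaphore and its methods are hypothetical names chosen for illustration, and the real implementation may differ (for example by accepting a context in Acquire).

```go
package main

import (
	"errors"
	"sync"
)

// adjustableSemaphore is a hypothetical stand-in used only for illustration.
// It limits concurrent holders to `limit`, which can be changed at runtime.
type adjustableSemaphore struct {
	mu    sync.Mutex
	cond  *sync.Cond
	limit int64
	used  int64
}

// newAdjustableSemaphore rejects non-positive limits, mirroring the behavior
// described in the review (the constructor returns an error for <= 0).
func newAdjustableSemaphore(limit int64) (*adjustableSemaphore, error) {
	if limit <= 0 {
		return nil, errors.New("limit must be positive")
	}
	s := &adjustableSemaphore{limit: limit}
	s.cond = sync.NewCond(&s.mu)
	return s, nil
}

// Acquire blocks until a slot is free under the current limit.
func (s *adjustableSemaphore) Acquire() {
	s.mu.Lock()
	defer s.mu.Unlock()
	for s.used >= s.limit {
		s.cond.Wait()
	}
	s.used++
}

// TryAcquire takes a slot without blocking and reports whether it succeeded.
func (s *adjustableSemaphore) TryAcquire() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.used >= s.limit {
		return false
	}
	s.used++
	return true
}

// Release frees one slot and wakes waiters so they can re-check the limit.
func (s *adjustableSemaphore) Release() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.used > 0 {
		s.used--
	}
	s.cond.Broadcast()
}

// SetLimit changes the capacity. Raising it wakes blocked callers; lowering
// it does not evict current holders, it only delays new acquisitions.
func (s *adjustableSemaphore) SetLimit(limit int64) error {
	if limit <= 0 {
		return errors.New("limit must be positive")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.limit = limit
	s.cond.Broadcast()
	return nil
}
```

In this sketch, lowering the limit below the number of current holders takes effect gradually as holders release, which is one reasonable design choice for a start/resume throttle.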
PR Summary: Medium Risk. Reviewed by Cursor Bugbot for commit 7fcede7.
❌ 4 Tests Failed:
Code Review
Starting the background goroutine before initialization is complete can cause a resource leak if subsequent steps fail. The background refresher should track the last applied limit to avoid unnecessary semaphore broadcasts when the value hasn't changed.
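The second point can be sketched as a refresh loop that remembers the last applied limit and only resizes when the flag value actually changes. The names applyIfChanged and runRefresher, and the read/apply callbacks, are hypothetical and not taken from the PR; the sketch also skips non-positive values, as the refresher is described as doing elsewhere in this review thread.

```go
package main

import (
	"context"
	"time"
)

// applyIfChanged performs one refresh step. It returns the limit that is in
// effect afterwards: `next` if it was valid, different, and applied without
// error; otherwise the unchanged `last`.
func applyIfChanged(last, next int64, apply func(int64) error) int64 {
	if next <= 0 || next == last {
		// Invalid or unchanged value: skip the resize and its broadcast.
		return last
	}
	if err := apply(next); err != nil {
		return last
	}
	return next
}

// runRefresher polls the flag on every tick until ctx is cancelled.
// `read` and `apply` are hypothetical hooks for the flag lookup and the
// semaphore resize.
func runRefresher(
	ctx context.Context,
	interval time.Duration,
	initial int64,
	read func(context.Context) int64,
	apply func(int64) error,
) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	last := initial
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			last = applyIfChanged(last, read(ctx), apply)
		}
	}
}
```

For the first point, one option is to launch runRefresher in a goroutine only after every fallible initialization step has succeeded, with a context that is cancelled on shutdown, so that a failed startup never leaves the goroutine running.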
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 276ca2f.
An organization admin can view or raise the cap at claude.ai/admin-settings/claude-code. The cap resets at the start of the next billing period.
Once the cap resets or is raised, reopen this pull request to trigger a review.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7fcede7bba
startingLimit := cfg.FeatureFlags.IntFlag(ctx, featureflags.MaxStartingInstancesPerNode)
startingSandboxes, err := utils.NewAdjustableSemaphore(int64(startingLimit))
if err != nil {
	return nil, fmt.Errorf("failed to create starting sandboxes semaphore: %w", err)
Fall back when startup flag value is non-positive
The New function now constructs startingSandboxes directly from the MaxStartingInstancesPerNode flag. If LaunchDarkly is misconfigured to 0 or a negative value, NewAdjustableSemaphore returns an error and the orchestrator fails to start. This is a regression from the previous hardcoded limit, and it is inconsistent with the periodic refresher, which already treats values <= 0 as invalid and skips them. Startup should likewise clamp or fall back instead of hard-failing the whole service.
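A minimal sketch of such a fallback, assuming the previous hardcoded value of 3 from the PR description is an acceptable default. The helper name startingLimitOrDefault and the constant name are hypothetical and do not appear in the PR.

```go
package main

// defaultMaxStartingInstancesPerNode mirrors the previously hardcoded limit
// mentioned in the PR description; the constant name is an assumption.
const defaultMaxStartingInstancesPerNode = 3

// startingLimitOrDefault returns the flag value when it is positive and the
// default otherwise, so a misconfigured flag cannot block startup.
func startingLimitOrDefault(flagValue int) int64 {
	if flagValue <= 0 {
		return defaultMaxStartingInstancesPerNode
	}
	return int64(flagValue)
}
```

The result would be passed to the semaphore constructor in place of int64(startingLimit), which keeps the startup path consistent with the refresher's handling of non-positive values.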