Context
When CCB is launched inside a systemd service with TasksMax=N (e.g. a CI runner, a sandbox wrapper, or our own orchestrator tool that spins up per-task scopes), all provider agents share that single cgroup's task budget. One heavy agent (for example, a codex agent running pytest test/ with many workers) can exhaust the shared TasksMax and starve its siblings. We've seen codex panic with WouldBlock: Resource temporarily unavailable in exactly this scenario.
This is a budgeting/fairness issue inside the keeper's cgroup. It's not addressable by tweaking tmux or provider CLIs individually; it needs per-agent cgroup scoping.
Proposal
Per-agent cgroup v2 sub-directories under the CCB keeper's cgroup. Each provider agent's tmux pane process is migrated into agent-<name>/ with its own pids.max and memory.max.
- Feature-flag gated: CCB_PER_AGENT_SUBCGROUP=1 (default off, no behavior change for existing users)
- Graceful degradation: if delegation is unavailable (cgroup v1 host, missing controllers, no write permission), migration is a no-op with a WARNING log
- Wiring point: a single call site in lib/cli/services/runtime_launch_runtime/tmux_runtime.py::launch_tmux_runtime, right after launch_pane returns; it uses tmux display-message '#{pane_pid}' to locate the pane shell PID and writes it to cgroup.procs of the sub-cgroup
- New module: lib/provider_core/subcgroup.py (~150 lines), fully unit-tested (40 tests)
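The migration step described in the list above could look roughly like the sketch below. It is an illustration, not the code in lib/provider_core/subcgroup.py: the function names pane_pid and migrate_agent, their parameters, and the logger name are all hypothetical. It assumes the keeper's cgroup directory is already known and that the pids and memory controllers are already enabled for its children; if they are not, the limit files do not exist and the write fails, which takes the WARNING path.

```python
import logging
import os
import subprocess
from pathlib import Path
from typing import Optional

log = logging.getLogger("ccb.subcgroup")  # hypothetical logger name

FLAG = "CCB_PER_AGENT_SUBCGROUP"


def pane_pid(pane_id: str) -> int:
    """Ask tmux for the PID of the shell running in the given pane."""
    out = subprocess.run(
        ["tmux", "display-message", "-p", "-t", pane_id, "#{pane_pid}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return int(out.strip())


def migrate_agent(keeper_cgroup: Path, agent: str, pid: int,
                  pids_max: Optional[int] = None,
                  memory_max: Optional[int] = None) -> bool:
    """Move pid into keeper_cgroup/agent-<agent>/; return True on success."""
    if os.environ.get(FLAG) != "1":
        return False  # flag off: no behavior change
    sub = keeper_cgroup / f"agent-{agent}"
    try:
        sub.mkdir(exist_ok=True)
        # Apply limits before moving the process so it never runs unbounded.
        if pids_max is not None:
            (sub / "pids.max").write_text(f"{pids_max}\n")
        if memory_max is not None:
            (sub / "memory.max").write_text(f"{memory_max}\n")
        (sub / "cgroup.procs").write_text(f"{pid}\n")
    except OSError as exc:
        # cgroup v1 host, missing controllers, or no write permission.
        log.warning("per-agent sub-cgroup unavailable for %s: %s", agent, exc)
        return False
    return True
```

A call site would then be something like migrate_agent(keeper_dir, "codex", pane_pid(pane_id), pids_max=256). Note that a failure partway through can leave an empty agent-<name>/ directory behind; whether to clean that up is a design choice this sketch does not address.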
Full design doc: RFC (the referenced RFC is in sevenx's personal notes; the fork commit itself links to it).
Reference implementation on our fork branch: commit d633ddf on SevenX77/personal.
Prerequisites
For the feature to have an effect, the keeper must be started with cgroup v2 delegation, e.g.:
systemd-run --user -p "Delegate=pids memory cpu" -p TasksMax=<N> --unit ... -- ccb ...
The companion tool we use to wrap CCB into a sibling scope (claude-ccb-orchestrator) already passes this.
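As a rough illustration of how the keeper might detect whether this prerequisite is met, the sketch below parses /proc/self/cgroup for the cgroup v2 entry and checks cgroup.controllers for the controllers it needs. The helper names own_cgroup_path and missing_controllers are hypothetical, and the sketch assumes the unified hierarchy is mounted at /sys/fs/cgroup; hybrid v1/v2 hosts may need extra handling.

```python
from pathlib import Path
from typing import Iterable, Optional, Set


def own_cgroup_path(proc_cgroup_text: str) -> Optional[str]:
    """Return the cgroup v2 path from /proc/<pid>/cgroup content, or None."""
    for line in proc_cgroup_text.splitlines():
        # The v2 entry has hierarchy ID 0 and an empty controller field.
        if line.startswith("0::"):
            return line[3:]
    return None


def missing_controllers(cgroup_dir: Path, wanted: Iterable[str]) -> Set[str]:
    """Controllers from `wanted` that are not available in cgroup_dir."""
    try:
        text = (cgroup_dir / "cgroup.controllers").read_text()
    except OSError:
        # File unreadable or absent: treat every controller as missing.
        return set(wanted)
    return set(wanted) - set(text.split())
```

Typical use would be to read Path("/proc/self/cgroup").read_text(), resolve the result under /sys/fs/cgroup, and fall back to the no-op path if missing_controllers(dir, ["pids", "memory"]) is non-empty.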
Questions
- Would you be open to a PR for this? (fork-first route OK; we're happy to keep it in our fork if upstream scope is narrower)
- If yes, any preferred location for the helper module? lib/provider_core/ felt natural since it's provider-agnostic infrastructure.
- Default-on in the future, or permanently behind a flag?
Related
Sibling change already under discussion: #191 (discussion: .ccb/ccb.config → .ccb.example)
Other PRs from the same author: #185 (merged), #186 (merged), #188, #189, #190.