The cpuset overflow check in `scripts/lib/framework.sh:102` extracts the trailing integer from the cpuset string instead of the maximum:

    requested_max=$(echo "$cpu_limit" | grep -oP '\d+$')

This works by accident for the current profiles (all list their ranges in ascending order, so the last integer happens to be the max), but it breaks if a profile's cpuset is reordered or if a future profile is written with the largest range first.

| cpuset string | actual max | parser returns |
|---|---|---|
| `0-31,64-95` | 95 | 95 ✓ |
| `0-3` | 3 | 3 ✓ |
| `64-95,0-31` | 95 | 31 ✗ |
| `96-103,0-7,32-39` | 103 | 39 ✗ |

When the parser returns a too-low value, `requested_max > max_cpu` is false, so the code takes the `--cpuset-cpus="$cpu_limit"` path. Docker then rejects the cpuset (some indices are out of range on the host) and the framework container fails to start with a Docker error, instead of the intended `warn` + `--cpus` fallback.
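
The table above can be reproduced with a small script. This is a sketch, not code from the repository: `trailing_int` and `actual_max` are hypothetical helper names, and `grep -oE '[0-9]+'` is swapped in for `grep -oP '\d+'` so the script also runs where grep lacks PCRE support. For these inputs the two patterns match the same digits.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, named here only for illustration.

# Current behaviour: the run of digits anchored at the end of the string.
trailing_int() {
    echo "$1" | grep -oE '[0-9]+$'
}

# Actual maximum: every integer in the string, sorted numerically, last one kept.
actual_max() {
    echo "$1" | grep -oE '[0-9]+' | sort -n | tail -1
}

# Print both values for each cpuset string from the table.
for cpuset in "0-31,64-95" "0-3" "64-95,0-31" "96-103,0-7,32-39"; do
    printf '%-18s actual max=%-4s parser returns=%s\n' \
        "$cpuset" "$(actual_max "$cpuset")" "$(trailing_int "$cpuset")"
done
```

The two columns diverge only for the last two strings, where the largest range is not the trailing one.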
Fix
One-line change. Extract every integer, sort numerically, and take the last:

    requested_max=$(echo "$cpu_limit" | grep -oP '\d+' | sort -n | tail -1)

This computes the actual max regardless of ordering. The rest of the overflow logic stays the same; no other call sites.
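
To show how the corrected value feeds the overflow check, here is a minimal sketch of the branch. It is not the code in `scripts/lib/framework.sh`: the function names `cpuset_max` and `docker_cpu_args`, the `warn` stub, and the value passed to `--cpus` (host CPU count, taken as `max_cpu + 1`) are all assumptions for illustration. The ERE form `grep -oE '[0-9]+'` stands in for `grep -oP '\d+'` so the sketch does not depend on PCRE support.

```bash
#!/usr/bin/env bash
# Sketch only. Names and the --cpus value are assumptions, not the real framework code.

# Stand-in for the framework's warn helper: message goes to stderr.
warn() { echo "WARN: $*" >&2; }

# Highest CPU index mentioned anywhere in a cpuset string.
cpuset_max() {
    echo "$1" | grep -oE '[0-9]+' | sort -n | tail -1
}

# Print the Docker CPU flag for a cpuset, given the host's highest CPU index.
docker_cpu_args() {
    local cpu_limit=$1 max_cpu=$2
    local requested_max
    requested_max=$(cpuset_max "$cpu_limit")
    if (( ${requested_max:-0} > max_cpu )); then
        # Soft fallback: the cpuset does not fit on this host.
        warn "cpuset $cpu_limit exceeds host max CPU index $max_cpu; using --cpus"
        echo "--cpus=$(( max_cpu + 1 ))"
    else
        echo "--cpuset-cpus=$cpu_limit"
    fi
}

docker_cpu_args "64-95,0-31" 15   # 16-core host: warns, then falls back
docker_cpu_args "0-3" 15          # fits on the host: keeps the cpuset
```

With the trailing-integer parser, the first call would see 31 on a 32-core host (`max_cpu` of 31) and pass the cpuset through to Docker; with the maximum it sees 95 and takes the fallback.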
Why surface it now
The current production profiles in `scripts/lib/profiles.sh` all happen to have the largest range trailing (`0-31,64-95`, `1-31,65-95`, `0-3`, etc.), so the bug doesn't fire today. But it's a quiet trap for the next profile that breaks the convention, and the failure mode (a hard Docker error instead of the soft fallback) is worse than what the same code does today on a 16-core laptop, where it warns and continues.