🤖 feat(goals): per-goal auto-compaction threshold override#3357
🤖 feat(goals): per-goal auto-compaction threshold override#3357ammar-agent wants to merge 6 commits into
Conversation
Adds an optional `autoCompactionThresholdPct` field to the goal record so each goal can pin its own auto-compaction behavior independently of the workspace's per-model slider. Lives as a tri-state mirroring `budgetCents` / `turnCap`: omitted = no override (model-level slider applies), `null` = explicit clear, 0–100 integer = explicit percent (`100` = compaction disabled for this goal). Why: in a goal-driven flow the user may prioritize cost control (aggressive compaction) over fidelity, or vice-versa, on a per-goal basis without retuning the global model slider. Backward-compatible by construction — legacy records and goals that never set the field continue to behave exactly as before. This commit lands the data model + behavior. Tests follow in a separate commit to keep the diff reviewable.
Pins the data model + behavior added in the previous commit: * CompactionMonitor: per-call `thresholdOverride` wins over internal threshold for both `checkBeforeSend` and `checkMidStream`; override of `1.0` disables for that call; null/NaN/non-finite/non-positive falls back to the monitor's own threshold (defensive against corrupt persisted goal records). * WorkspaceGoalService: `setGoal` persists the field, `applyMutableFields` edit-in-place can clear it (null) without dropping unrelated state, 100 round-trips distinct from null (per-goal disable vs. no override), `addUpcomingGoal` / `updateUpcomingGoal` honor the field, and legacy goal records without the field still load. * /goal slash command: integer / off / disable / default / none / clear / 75%, composes with -b and --turns in documented order, rejects out-of-range values, treats non-leading occurrences as goal text. * GoalTab UI: tile renders Default / Off / N% across the tri-state, inline editor submits the percent, "off" → 100, blank → null, out-of-range surfaces an error, Edit affordance is gated on the handler being supplied.
|
@codex review Per-goal auto-compaction threshold override. Tri-state mirroring Particular attention welcome on:
|
There was a problem hiding this comment.
💡 Codex Review
mux/src/browser/features/RightSidebar/RightSidebar.tsx
Lines 804 to 808 in 307c817
The new Auto-compact input in GoalCreateForm is parsed into resolved.autoCompactionThresholdPct, but this value is never included in the setGoalWithSingleConflictRetry payload during goal creation. As a result, creating a goal with off, 100, or any explicit percentage silently drops the override and persists the workspace default instead, so the newly added create-time control is effectively non-functional until the user edits the goal again.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
`resolveEffectiveThreshold` was rejecting `override <= 0` and falling back to the workspace threshold, but the schema admits `0` as a valid per-goal value (the aggressive extreme of the slider). That created a renderer/backend mismatch: the GoalTab tile shows "0%" and the GoalTab editor accepts the value, but the backend silently behaved as if the override was absent. Narrow the corruption guard to negative values only and document why `0` deserves an explicit pass. Add a dedicated test pinning the contract for both `checkBeforeSend` and `checkMidStream`. Picked up via PR #3357 Codex review.
|
@codex review Addressed your P2:
All checks still green locally ( |
|
Codex Review: Didn't find any major issues. Delightful! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Replace the text-input editor for the per-goal auto-compaction override with an embedded slider tile and reshape the active-goal grid to be container-query responsive. UX: - Default mode (no override): tile shows 'Default — workspace setting' with a 'Customize' button. Slider is hidden because we don't know the workspace per-model threshold at this layer and a ghost thumb would imply a value that isn't really in effect. - Override mode (value is a number): native range slider with thumb at the override value. Drag commits on release (mouseUp/touchEnd/ keyUp), not on every drag step, so backend mutation rate stays bounded by user gestures. 'Use default' clears back to null. - The right end of the slider is labeled 'Off' to mirror the workspace per-model slider — dragging there commits 100 (per-goal disabled). Layout: - Active-goal grid now uses '[container-type:inline-size]' + '@md:' so the sidebar's dynamic width (per AGENTS.md) drives the breakpoint, not the viewport. Narrow stays 2-col (Budget full row, Turns | Elapsed, Auto-compact full row). Wider goes to 3-col with Budget | Turns | Auto-compact on one row and Elapsed as a thin full-width row below. - BudgetTile internal copy stacks vertically at the wider breakpoint so its cost / remaining pair still fits in ~1/3 width. - GoalCreateForm reuses the same slider, with local state instead of the previous text-input + parser path. Drop: - 'compactThreshold' EditingField variant + the inline text editor. - parseGoalAutoCompactionThresholdInput (no longer used inside GoalTab; the slash-command parser is separate). Tests: - Replace text-input drive tests with slider-drive tests: Default mode / Off display / numeric override display; Customize commits 70; drag commits snapped value on mouseUp; drag-to-100 commits 100 (Off); mouseUp without change does NOT commit (anti-spam guard); Use default clears to null; canEdit gating hides Customize and disables the slider. - Use 'onInput' + 'onChange' both — happy-dom only synthesizes the native input event for range inputs, so onInput is required for tests, and onChange satisfies React's controlled-input contract. Validation: typecheck \u2713 lint \u2713 fmt-check \u2713 storybook-build \u2713 bun test src/node/services/ src/browser/features/RightSidebar/ src/browser/utils/slashCommands/ src/browser/utils/goals/ — 3554 pass. --- _Generated with `mux` \u2022 Model: `anthropic:claude-opus-4-7` \u2022 Thinking: `xhigh` \u2022 Cost: `$53.97`_ <!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=xhigh costs=53.97 -->
|
@codex review Slider-driven aesthetic refinement on the per-goal auto-compact tile. Worth a fresh pass — the previous review approved the text-input + inline editor design, but this commit replaces it with a slider tile (Default / Customize / drag / Use default) and reshapes the active-goal grid to be container-query responsive (Budget + Turns + Auto-compact share a row at the wider @md breakpoint; Elapsed becomes a thin full-width row below). The schema, backend, monitor, and slash-command paths are unchanged from the previous round you approved — only Specific things to look at:
Validation: typecheck / lint / fmt-check / storybook-build all green; 3554 tests pass. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d3f7bcb102
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Touch-capable browsers fire `touchend` and then synthesize `mouseup` within ~50–100ms. The slider's `commit` was wired to both, so a single finger release ran the backend mutation twice: the parent's async `setGoal` hadn't resolved between the two events, so the `draft !== value` guard couldn't suppress the second call. This produced duplicate `setGoal` writes (and duplicate `goal_replaced` history entries) on touch devices. Introduce a `justCommittedRef` flag on `GoalAutoCompactSliderOverride`: - `commit()` checks the flag first; if set, it bails before calling `onChange`. - `commit()` sets the flag immediately before calling `onChange` — the synthetic `mouseup` that follows on the same tick is then short-circuited. - `updateDraftFromEvent` (the onInput/onChange path) clears the flag whenever a fresh drag/keyboard gesture starts. We use a ref instead of state because we don't want a re-render between commit() and the synthetic mouseup that's about to fire on the same tick. Per AGENTS.md's "avoid timing-based coordination" rule, the flag lifetime is tied to user intent (the next `input` event) rather than a setTimeout. Tests: two new tests pin the contract — one asserts that touchEnd + mouseUp from the same gesture commit exactly once; one asserts a follow-up gesture commits again (the flag resets correctly on next drag). Without the fix the first test fails with two calls. Validation: typecheck \u2713 lint \u2713 fmt-check \u2713 bun test src/browser/features/RightSidebar/GoalTab.test.tsx — 34 pass. --- _Generated with `mux` \u2022 Model: `anthropic:claude-opus-4-7` \u2022 Thinking: `xhigh` \u2022 Cost: `$54.20`_ <!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=xhigh costs=54.20 -->
|
@codex review Addressed the P2 finding (
Fixed in 30774e5 with a
Two new tests pin the contract:
Validation: typecheck / lint / fmt-check all green; GoalTab tests 34 pass. |
|
Codex Review: Didn't find any major issues. You're on a roll. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
Adds an optional per-goal auto-compaction threshold override. Each goal can pin its own auto-compaction behavior (compact aggressively for cost, or disable for fidelity) independently of the workspace's per-model slider. Tri-state matching
budgetCents/turnCap: omitted = no override (the workspace setting applies),null= explicit clear, integer0–100= explicit percent (100= compaction disabled for this goal). Surfaced as an embedded slider tile in the GoalTab — Default / Customize / drag-to-Off — and as/goal --compact <value>on the slash command.Background
The right-sidebar auto-compact slider is keyed per-model, not per-goal — see
getAutoCompactionThresholdKey(model)insrc/common/constants/storage.ts. That works for the steady-state default, but goal-driven workflows often have different cost-vs-fidelity preferences than the workspace default:Defaults to off (the field is absent on legacy/new goal records → no override → existing behavior preserved), so users have to actively opt in to change anything.
Implementation
Data model (
src/common/orpc/schemas/goal.ts,src/common/types/goal.ts)autoCompactionThresholdPct: z.number().int().min(0).max(100).nullable().optional()onGoalRecordV1Schema,GoalSnapshotSchema,GoalSetInputSchema,GoalBoardAddUpcomingInputSchema,GoalBoardUpdateUpcomingInputSchema. Optional + nullable so legacy records load without migration.toGoalSnapshotcarries the field through to the renderer-visible snapshot using the sameObject.hasOwn-style tri-state preservation pattern as the existingcompletionSummaryhandling.Backend service (
src/node/services/workspaceGoalService.ts)SetGoalInputandPendingGoalMutationget the field;createGoal,applyMutableFields,setGoalInternal(mid-stream + projected record),setGoalImmediately(replace branch),addUpcomingGoal,updateUpcomingGoaleach thread it through.applyMutableFieldsusesObject.hasOwnsonull(explicit clear) survives the patch whileundefined(no opinion) preserves the existing value — same pattern asbudgetCents/turnCap.setGoalImmediatelyaddsautoCompactionThresholdPctto itshasMutableChangepredicate so a threshold-only edit triggers the apply branch instead of being treated as a no-op.Compaction consumer (
src/node/services/compactionMonitor.ts,src/node/services/agentSession.ts)CompactionMonitor.checkBeforeSend/checkMidStreamaccept an optionalthresholdOverride(decimal 0–1). When finite, it replacesthis.thresholdfor the single call — the monitor's persistent state is untouched, so the override layer is per-call and never leaks across goals.>= 1clamps to compaction-disabled for that call (matches the slider's "off" sentinel).0is honored as the aggressive extreme (Codex P2 fix — see the dedicated "override of 0% is honored" test). Onlynull/undefined/NaN/Infinity/ negative values fall back tothis.threshold.AgentSession.resolveGoalAutoCompactionThresholdOverride()fetches the active goal via the existing injectedworkspaceGoalServiceand converts the persisted percent → decimal. Called from both compaction call sites (on-send and mid-stream usage-delta handler). Errors are swallowed silently because the resolver runs on the hot path of every send; a corrupt persisted goal record must never interrupt streaming.Renderer banner (
src/browser/components/ChatPane/ChatPane.tsx)effectiveAutoCompactionThresholdconsultsworkspaceState.goal?.autoCompactionThresholdPctfirst, falling back to the per-model slider value, so the warning banner stays in lockstep with what the backend will actually do.UI —
GoalTabslider tile (src/browser/features/RightSidebar/GoalTab.tsx)Replaces the previous text-input editor with an embedded slider tile that has two visual modes:
value == null) — workspace per-model slider governs. Tile renders "Default — workspace setting" plus aCustomizebutton. Slider is hidden because we don't know the workspace per-model threshold at this layer; a ghost thumb would imply a value that isn't really in effect.valueis a number) — native range slider with thumb at the override value. Drag commits on release (onMouseUp/onTouchEnd/onKeyUp), not on every drag step, so backend mutation rate stays bounded by user gestures. The right end is labeled "Off" mirroring the workspace per-model slider; dragging there commits100(per-goal disabled). AUse defaultbutton clears back tonull.Customizeseeds the override atDEFAULT_AUTO_COMPACTION_THRESHOLD_PERCENT(70) so the slider has a concrete value to drag from. Cost of an accidental Customize is one click onUse default.The slider uses
onInput+onChangetogether: happy-dom (thetests/uiruntime) only synthesizes the nativeinputevent for range inputs, soonInputis required for deterministic tests, andonChangesatisfies React's controlled-input contract. The two handlers route to the same idempotent draft updater.UI — responsive grid layout (
src/browser/features/RightSidebar/GoalTab.tsx)The active-goal stat grid is now container-query-driven:
grid-cols-2): Budget col-span-2 (full row); Turns + Elapsed half-half; Auto-compact col-span-2 (full row, slider has room to breathe).@md:grid-cols-3, triggered when the dynamically-resizable right sidebar passes the@mdcontainer width): Budget + Turns + Auto-compact share row 1 (each col-span-1); Elapsed becomes a thin full-width row below. BudgetTile's internal copy stacks vertically at the wider breakpoint so its cost / remaining pair still fits in ~1/3 width.Container queries (
[container-type:inline-size]+@md:) are used instead of viewport breakpoints because the right sidebar is dynamically resizable per AGENTS.md, sosm:/md:wouldn't have responded to sidebar resizing.UI — create form (
src/browser/features/RightSidebar/GoalTab.tsx)GoalCreateFormreuses the same slider tile with local state instead of the previous text input + parser.null(Default mode) means the create intent omits the field entirely; a number is sent verbatim.UI — slash command + handler chain
/goal --compact <value>parses with the same vocabulary as the slider's Off/Default sentinels; composes with-band--turnsin the documented order; non-leading occurrences stay as goal text. The parser flows throughresolveGoalSetIntent(verbatim pass-through, since there's no workspace-level default for this field yet) into the existingsetGoalIPC.RightSidebar.handleGoalUpdateAutoCompactionThresholdPctmirrorshandleGoalUpdateBudget/handleGoalUpdateTurnCap, wiring throughsetGoalWithSingleConflictRetryso the optimistic-concurrency retry loop covers all three.onUpdateAutoCompactionThresholdPctso the prop is typed end-to-end.Validation
make typecheck✅make lint✅make fmt-check✅make storybook-build✅bun test src/node/services/compactionMonitor.test.ts— 13 pass (5 override-lane tests + 1 dedicated Codex P2 test pinning0%as honored)bun test src/node/services/workspaceGoalService.test.ts— 129 pass (6 threshold-pinning tests including legacy-record back-compat)bun test src/browser/utils/slashCommands/goal.test.ts— 22 pass (7--compactparser tests)bun test src/browser/features/RightSidebar/GoalTab.test.tsx— 32 pass (9 slider-tile tests: Default/Off/numeric display; Customize commits 70; drag commits snapped value on mouseUp; drag-to-100 commits 100; mouseUp without change does NOT commit; Use default clears to null; canEdit gating)bun test src/node/services/ src/browser/features/RightSidebar/ src/browser/utils/slashCommands/ src/browser/utils/goals/— 3554 pass / 0 failRisks
Low–medium, scoped to the auto-compaction subsystem.
.optional()+.nullable()andapplyMutableFieldsonly touches it viaObject.hasOwn. A dedicated test (workspaceGoalService.test.ts) writes a pre-field record to disk and assertsgetGoalreturns it parsed with the field undefined.resolveEffectiveThresholddoes the same for malformed values it receives.0%as the aggressive extreme rather than silently falling back) closed the last gap here.draft !== valueguard insidecommit()further suppresses no-op releases.Follow-ups (deferred)
updateUpcomingGoalAPI but not surfaced in the UI's row editor; can be set via--compacton the slash command or after promotion).Pains
inputevent but does NOT fire thechangeevent whenfireEvent.inputis dispatched, so a slider driven only by React'sonChangeis read-only in tests. Took a debug-test pass to isolate (initially looked like a stale-closure / React state-flush issue). Resolved by routing drag updates through bothonInputandonChange; the duplicate handler is documented inline.@mdcontainer width so Budget / Turns / Auto-compact line up on one row exactly as requested. BudgetTile's internal copy had to gain a vertical stacking variant for the narrower tile size.Generated with
mux• Model:anthropic:claude-opus-4-7• Thinking:xhigh• Cost:$53.97