Add /goal slash command (session-scoped Stop hook + LLM-judge auto-clear) #4074

@qqqys

What would you like to be added?

Add a built-in /goal <text> slash command (and /goal clear) that pins a free-form objective for the rest of the session. While the goal is active:

  1. A session-scoped Stop hook is registered — every time the assistant tries to end its turn, the hook runs and asks an LLM-as-judge (via runSideQuery) whether the goal is satisfied by the current transcript.
  2. If the goal is NOT yet satisfied, the hook returns a block decision with a stopReason that re-injects the goal text as a directive ("Hook is blocking stop until: …"). The existing Stop-hook loop in client.ts then forces the model to continue working instead of returning to the prompt.
  3. If the goal IS satisfied, the hook removes itself from SessionHooksManager and returns { continue: true } — the turn ends normally, with no further "goal clear" instruction needed from the user.
  4. A footer / banner indicates the active goal so the user can see what's being enforced; /goal with no args shows the current goal; /goal clear removes it manually.
  5. First-turn injection: when /goal <text> is invoked, the command emits a submit_prompt containing a <system-reminder> that names the goal, tells the assistant to acknowledge it briefly and immediately start working, and warns that the Stop hook will block stopping until the condition holds. This bootstraps the loop without an extra user turn.

This mirrors a feature recently shipped in Anthropic's Claude Code CLI, where /goal is one of the highest-leverage primitives for long-horizon agentic work — it lets a user state an outcome once and have the model self-correct toward it across many tool turns, instead of re-prompting after each premature stop.
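The first-turn injection in step 5 could be sketched roughly as below. This is a minimal illustration, not the repo's code: `SubmitPromptAction` is a hypothetical stand-in for the CLI's `SubmitPromptActionReturn` (only the `type` and `content` fields named in this issue are modelled), `buildGoalReminder` and `goalSetAction` are made-up helper names, and the reminder wording is adapted from the text quoted later in this issue.

```typescript
// Hypothetical stand-in for SubmitPromptActionReturn; only the two fields
// mentioned in this issue are modelled here.
interface SubmitPromptAction {
  type: 'submit_prompt';
  content: string;
}

// Wraps the goal in a <system-reminder> envelope (helper name is illustrative).
function buildGoalReminder(goal: string): string {
  // JSON.stringify adds the surrounding quotes and escapes any quotes in the goal.
  const condition = JSON.stringify(goal.trim());
  return [
    '<system-reminder>',
    `A session-scoped Stop hook is now active with condition: ${condition}. ` +
      'Briefly acknowledge the goal, then immediately start (or continue) ' +
      'working toward it; treat the condition itself as your directive and ' +
      'do not pause to ask the user what to do. The hook will block stopping ' +
      'until the condition holds. It auto-clears once the condition is met.',
    '</system-reminder>',
  ].join('\n');
}

// What `/goal <text>` would return to bootstrap the first turn.
function goalSetAction(goal: string): SubmitPromptAction {
  return { type: 'submit_prompt', content: buildGoalReminder(goal) };
}

const action = goalSetAction('  make CI green ');
console.log(action.content);
```

Quoting the goal with `JSON.stringify` is one way to keep a goal that itself contains quotes from breaking the envelope; the real command may prefer a different escaping rule.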

Concrete behavior (UX spec)

> /goal ship the auth migration end-to-end: tests green, MR opened, reviewer assigned
Goal set: ship the auth migration end-to-end: tests green, MR opened, reviewer assigned

[system-reminder injected into the next turn]
A session-scoped Stop hook is now active with condition: "...". Briefly
acknowledge the goal, then immediately start (or continue) working toward
it — treat the condition itself as your directive and do not pause to ask
the user what to do. The hook will block stopping until the condition holds.
It auto-clears once the condition is met — do not tell the user to run
`/goal clear` after success; that's only for clearing a goal early.

…assistant works, tool turns, attempts to stop…

[Stop hook fires → side-query judges: NOT met → block + stopReason]
Stop hook feedback:
Goal not yet met. Outstanding: tests are still failing on packages/core;
no MR has been pushed. Continue.

…more tool turns…

[Stop hook fires → side-query judges: met → auto-remove + continue]
✅ Goal achieved (auto-cleared)

[At any time while a goal is still active]
> /goal
Active goal: ship the auth migration end-to-end: tests green, MR opened, reviewer assigned

> /goal clear
Goal cleared.
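The three invocations in the transcript above (`/goal <text>`, bare `/goal`, and `/goal clear`) could be told apart by a small pure parser. The sketch below is only an assumption about how that might look; `GoalInvocation` and `parseGoalArgs` are hypothetical names, and it assumes the command handler receives everything after `/goal` as one string.

```typescript
// Hypothetical result type for the three forms of the command.
type GoalInvocation =
  | { kind: 'show' }
  | { kind: 'clear' }
  | { kind: 'set'; text: string };

// rawArgs is assumed to be whatever the user typed after "/goal".
function parseGoalArgs(rawArgs: string): GoalInvocation {
  const text = rawArgs.trim();
  if (text === '') return { kind: 'show' };
  // Only the exact word "clear" is treated as the subcommand, so a goal such
  // as "clear the flaky test backlog" is still set as a goal.
  if (text.toLowerCase() === 'clear') return { kind: 'clear' };
  return { kind: 'set', text };
}

console.log(parseGoalArgs('ship the auth migration end-to-end'));
console.log(parseGoalArgs(''));
console.log(parseGoalArgs('clear'));
```

One consequence of this choice is that a goal whose entire text is the word "clear" cannot be set; whether that matters is a UX call.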

Why this is a small change, not a new subsystem

A lot of the plumbing already exists in @qwen-code/qwen-code-core. The audit below maps it out so triage doesn't have to.

| Capability /goal needs | Already in Qwen Code? |
| --- | --- |
| Stop hook event type | HookEventName.Stop (packages/core/src/hooks/types.ts), fired by HookEventHandler.fireStopEvent (hookEventHandler.ts:116) |
| Session-scoped hook registration (not in settings.json) | SessionHooksManager.addFunctionHook(sessionId, 'Stop', '*', cb, …) (sessionHooksManager.ts:77) |
| Stop-hook → "block + reason → re-prompt model" loop | ✅ Already wired in packages/core/src/core/client.ts:1230-1330: stopOutput.isBlockingDecision() / shouldStopExecution() triggers continuation, with iteration counting + StopHookLoop event |
| In-process function hook callback returning HookOutput | FunctionHookCallback returns HookOutput \| boolean \| undefined (types.ts:120); StopHookOutput carries stopReason (types.ts:440) |
| LLM-as-judge primitive for evaluating the goal | runSideQuery (text + JSON-mode), fast-model by default (packages/core/src/utils/sideQuery.ts, exported from core/index.ts) |
| Slash-command surface with submit_prompt return | SubmitPromptActionReturn (packages/cli/src/ui/commands/types.ts:208); example pattern in rememberCommand.ts |
| Per-session storage with auto-cleanup on session end | SessionHooksManager.removeHook(sessionId, hookId) + already cleared when the session ends |

What's actually missing is only the command that wires them together, a small UI banner to surface the active goal, and a judge prompt for the side query.
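To give a sense of how small the banner piece could be, here is a sketch of the goal state and the footer label as pure functions. The `activeGoal` shape follows the one proposed in this issue; everything else (`GoalState`, `setGoal`, `clearGoal`, `goalPillLabel`, and the 40-character default) is a hypothetical choice for illustration and not existing UIStateContext API.

```typescript
// Shape proposed in this issue for the UI state entry.
interface ActiveGoal {
  text: string;
  hookId: string;
}

// Hypothetical slice of UI state holding at most one goal.
interface GoalState {
  activeGoal: ActiveGoal | null;
}

// Setting a goal replaces any previous one (the "replace" semantics proposed here).
function setGoal(_state: GoalState, goal: ActiveGoal): GoalState {
  return { activeGoal: goal };
}

function clearGoal(_state: GoalState): GoalState {
  return { activeGoal: null };
}

// Returns the single-line pill text, or null so the footer renders nothing.
// Truncation counts UTF-16 code units, which is good enough for a sketch.
function goalPillLabel(state: GoalState, maxChars = 40): string | null {
  if (state.activeGoal === null) return null;
  const { text } = state.activeGoal;
  const shown = text.length > maxChars ? text.slice(0, maxChars - 1) + '…' : text;
  return `🎯 goal: ${shown}`;
}

let state: GoalState = { activeGoal: null };
state = setGoal(state, { text: 'ship the auth migration', hookId: 'h1' });
console.log(goalPillLabel(state, 10));
state = clearGoal(state);
console.log(goalPillLabel(state));
```

Returning `null` when no goal is set keeps the existing footer layout untouched, which matches the null-check requirement in the plan.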

Proposed implementation plan

Concrete file targets, ready to break into PRs:

  1. packages/cli/src/ui/commands/goalCommand.ts (new) — exports goalCommand: SlashCommand. Behavior:

    • /goal <text> → calls config.getHookSystem()?.getSessionHooksManager().addFunctionHook(sessionId, 'Stop', '*', goalJudgeCallback, …), stores the goal text in UIState so the banner can read it, then returns { type: 'submit_prompt', content: <system-reminder envelope> } to bootstrap the first turn.
    • /goal (no args) → returns the current goal as an info message.
    • /goal clear → removes the hook + clears UIState entry.
    • Register in packages/cli/src/services/BuiltinCommandLoader.ts next to rememberCommand.
  2. packages/core/src/goals/goalJudge.ts (new) — single function judgeGoal(config, goalText, transcript, signal): Promise<{ met: boolean; reason: string }>:

    • Calls runSideQuery with a JSON schema { met: boolean, reason: string }.
    • Fast-model default (matches relevanceSelector.ts precedent).
    • System prompt: "You are a goal-progress judge. Given a stated goal and the recent conversation/tool transcript, decide whether the goal has been fully met. If not, in reason describe the outstanding work concretely so the agent knows what to do next. Be strict — partial completion is NOT met."
  3. goalJudgeCallback (inside goalCommand.ts or a sibling module) — the FunctionHookCallback:

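    // Closes over config, goalText, sessionId, sessionHooksManager, uiState,
    // and the hookId of the registered goal hook.
    async (input, ctx) => {
      // Ask the side-query judge whether the transcript satisfies the goal.
      const { met, reason } = await judgeGoal(config, goalText, ctx?.messages ?? [], ctx?.signal);
      if (met) {
        // Goal satisfied: unregister the hook and clear the banner so the turn ends normally.
        sessionHooksManager.removeHook(sessionId, hookId);
        uiState.clearGoal();
        return new StopHookOutput({ continue: true, systemMessage: '✅ Goal achieved (auto-cleared)' });
      }
      // Goal not met: block the stop and feed the judge's reasoning back to the model.
      return new StopHookOutput({
        decision: 'block',
        stopReason: `Goal not yet met: "${goalText}". Outstanding: ${reason}`,
      });
    }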

    The existing loop in client.ts:1280 will take the block and force continuation; the StopHookLoop cap will protect against infinite spins.

  4. Banner in packages/cli/src/ui/components/Footer.tsx — read uiState.activeGoal from UIStateContext and render a single-line pill like 🎯 goal: <truncated> next to existing pills (auto-accept indicator, shell mode, etc.). Behind a null check so existing layout is unchanged when no goal is set.

  5. State: add activeGoal: { text: string; hookId: string } | null to UIStateContext with setGoal / clearGoal setters. Persisting to disk is out of scope — goals are session-scoped by design (mirrors Claude's behavior; matches SessionHooksManager's lifecycle).

  6. i18n strings — all user-facing copy through t(...) per the project's existing convention.

  7. Tests (the project has good coverage habits — *.test.ts siblings exist for every command):

    • goalCommand.test.ts: argument parsing, register/clear round-trip, banner state mutations, submit_prompt payload contains the system-reminder envelope.
    • goalJudge.test.ts: schema validation, met vs not-met branches with mocked runSideQuery.
    • Integration: register a goal, fire fireStopEvent against a transcript that doesn't satisfy it → assert block with stopReason; satisfy it → assert hook self-removes and returns continue.
    • No wait()-based UI tests — assert state via the hook layer and submit_prompt return values; rendering should be covered through component-level tests if at all (consistent with the project's existing test patterns for hooksCommand, rememberCommand).
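As a rough illustration of plan item 2, the judge could be structured so that the model call is injected and the verdict parsing is a pure function, which also makes the `goalJudge.test.ts` cases easy to write. This is a sketch under assumptions: `SideQueryFn` is a hypothetical stand-in and does not reflect the real `runSideQuery` signature, the transcript is simplified to an array of strings, the abort signal is omitted, and treating a malformed verdict as "not met" is an assumed policy that the maintainers may want to reverse.

```typescript
interface JudgeVerdict {
  met: boolean;
  reason: string;
}

// JSON Schema for the { met, reason } verdict described in the plan.
const VERDICT_SCHEMA = {
  type: 'object',
  properties: { met: { type: 'boolean' }, reason: { type: 'string' } },
  required: ['met', 'reason'],
  additionalProperties: false,
} as const;

// Hypothetical stand-in for runSideQuery; the real signature is NOT assumed.
type SideQueryFn = (request: {
  systemPrompt: string;
  userPrompt: string;
  schema: object;
}) => Promise<unknown>;

const JUDGE_SYSTEM_PROMPT =
  'You are a goal-progress judge. Given a stated goal and the recent ' +
  'conversation/tool transcript, decide whether the goal has been fully met. ' +
  'If not, in reason describe the outstanding work concretely so the agent ' +
  'knows what to do next. Be strict: partial completion is NOT met.';

// Validates the raw side-query result without trusting its shape.
function parseVerdict(raw: unknown): JudgeVerdict {
  if (typeof raw === 'object' && raw !== null) {
    const { met, reason } = raw as Record<string, unknown>;
    if (typeof met === 'boolean' && typeof reason === 'string') {
      return { met, reason };
    }
  }
  // Assumed policy: a malformed verdict counts as "not met"; the loop cap
  // bounds how long the hook can keep blocking in that case.
  return { met: false, reason: 'Judge returned an unparseable verdict.' };
}

async function judgeGoal(
  sideQuery: SideQueryFn,
  goalText: string,
  transcript: readonly string[],
): Promise<JudgeVerdict> {
  const raw = await sideQuery({
    systemPrompt: JUDGE_SYSTEM_PROMPT,
    userPrompt: `Goal: ${goalText}\n\nTranscript:\n${transcript.join('\n')}`,
    schema: VERDICT_SCHEMA,
  });
  return parseVerdict(raw);
}
```

In a unit test, `sideQuery` can be replaced by a function that resolves to a canned object, so the met and not-met branches run without any model call.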

Acceptance criteria

  • /goal <text> registers a session-scoped Stop hook and emits a one-shot system-reminder via submit_prompt.
  • When the assistant tries to stop and the goal is not met, the model receives stopReason and continues; the user does NOT see an empty "Done" turn.
  • When the LLM-judge marks the goal met, the hook auto-removes itself; the next stop attempt ends the turn normally and a ✅ Goal achieved (auto-cleared) line is shown.
  • /goal (no args) reports the active goal; /goal clear removes it.
  • An indicator (footer pill) shows when a goal is active, and disappears when cleared.
  • The existing StopHookLoop iteration cap prevents runaway loops; document the cap in the side-query prompt so the judge knows it's the last line of defense.
  • No new dependencies; no settings.json schema changes.
  • Unit + integration tests as listed above; lint + typecheck green.
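The block / continue / cap behaviour in the criteria above can be modelled with a toy loop, which may be useful as a mental model when writing the integration test. This is not the logic in client.ts: `HookResult`, `simulateStopLoop`, and the cap value passed in are all hypothetical, and the real StopHookLoop cap is whatever the existing code defines.

```typescript
// Simplified stand-in for the two outcomes a Stop hook can produce here.
type HookResult =
  | { decision: 'block'; stopReason: string }
  | { continue: true };

interface LoopOutcome {
  ended: 'goal-met' | 'cap-reached';
  stopAttempts: number;
  feedback: string[]; // stopReason texts that would be re-injected into the model
}

// Toy model: call the hook on each stop attempt until it lets the turn end
// or the iteration cap is hit.
function simulateStopLoop(
  onStop: (attempt: number) => HookResult,
  maxIterations: number,
): LoopOutcome {
  const feedback: string[] = [];
  for (let attempt = 1; attempt <= maxIterations; attempt++) {
    const result = onStop(attempt);
    if ('continue' in result) {
      return { ended: 'goal-met', stopAttempts: attempt, feedback };
    }
    feedback.push(result.stopReason);
  }
  return { ended: 'cap-reached', stopAttempts: maxIterations, feedback };
}

// A judge that reports "met" on the third stop attempt.
const outcome = simulateStopLoop(
  (attempt) =>
    attempt < 3
      ? { decision: 'block', stopReason: `Goal not yet met (attempt ${attempt})` }
      : { continue: true },
  5,
);
console.log(outcome.ended, outcome.stopAttempts, outcome.feedback.length);
```

The same shape suggests two integration assertions: a hook that never reports "met" must terminate at the cap, and one that reports "met" must end the turn on that attempt with no further feedback injected.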

Open questions for maintainers

  1. Model for the judge — default to getFastModel() (matches relevanceSelector, forget)? Or always run on the main model so judgments stay consistent with the assistant's reasoning? Suggest: fast-model default, with a hidden setting to pin to main.
  2. Transcript window — feed the judge the full session, the last N messages, or only the assistant turns since /goal was set? The latter is cheapest and matches the spirit of "did you make progress since I told you?".
  3. Interaction with subagent stops — should /goal also intercept SubagentStop, or only the top-level Stop? Suggest: top-level only, to keep subagent loops scoped to their own task.
  4. /goal re-set semantics — if a goal is already active, does /goal <new> replace it (proposed) or stack? Replace is simpler and matches Claude's behavior.
  5. Hook visibility in /hooks list: hooksCommand.ts already lists session hooks under "Session (temporary)". The goal hook will appear there for free; should it get a special label like [Goal]?
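For question 2, one possible compromise is to feed the judge only messages since the goal was set, capped at the most recent N. The sketch below shows that selection as a pure function; `TranscriptMessage`, `selectJudgeWindow`, and the idea of remembering the transcript index at which `/goal` was issued are assumptions for illustration, not existing types.

```typescript
// Hypothetical, simplified transcript entry.
interface TranscriptMessage {
  role: 'user' | 'assistant' | 'tool';
  text: string;
}

// Keeps messages from goalSetIndex onward, then only the last maxMessages of those.
function selectJudgeWindow(
  messages: readonly TranscriptMessage[],
  goalSetIndex: number,
  maxMessages: number,
): TranscriptMessage[] {
  // Clamp the start so an out-of-range index cannot throw or wrap around.
  const start = Math.max(0, Math.min(goalSetIndex, messages.length));
  const sinceGoal = messages.slice(start);
  // slice(-0) would return everything, so handle non-positive caps explicitly.
  return maxMessages > 0 ? sinceGoal.slice(-maxMessages) : [];
}

const transcript: TranscriptMessage[] = [
  { role: 'user', text: 'earlier chat' },
  { role: 'assistant', text: 'earlier reply' },
  { role: 'user', text: '/goal make CI green' },
  { role: 'assistant', text: 'running tests' },
  { role: 'tool', text: '2 failures in packages/core' },
];
console.log(selectJudgeWindow(transcript, 2, 2).map((m) => m.text));
```

Filtering further to assistant turns only, as suggested in the question, would be one extra `filter` on `role`, at the cost of hiding tool output from the judge.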

Why is this needed?

Long-horizon agentic work is the area where Qwen Code most needs a reliability primitive.

Today, the workflow loop for any multi-step task in Qwen Code looks like this:

user prompts → assistant runs N tools → assistant decides it's "done enough" → returns → user notices it stopped early → re-prompts with "keep going" → repeat 3-6 times until the actual goal is met.

Each keep going is wasted tokens, wasted attention, and — worse — an interrupt that the user has to remember to issue. For tasks like "land this migration", "make CI green", "audit and fix all the linter warnings in packages/core/", the loop above can drop the user out of flow 5+ times in a 30-minute session.

/goal collapses that loop into a single user action. The hook system already in core does the mechanical work (intercepting Stop, re-prompting the model with feedback); what's missing is a UX surface that exposes it.

Why now, why this design:

  • The infrastructure is already paid for. Qwen Code's Stop-hook plumbing (client.ts:1230-1330, StopHookOutput, SessionHooksManager, runSideQuery, StopHookLoop loop-detection) was added for the SDK and skills use cases. /goal is mostly a thin slash command on top of it: roughly 200 LOC of new code plus tests.
  • Parity with the reference implementation. This repo already treats Claude Code's CLI as a reference point for new feature design (e.g. session hooks, skills, and side queries are all inspired by Claude's primitives). /goal is one of the most-used features in that ecosystem, and it composes cleanly with Qwen's existing hook model.
  • LLM-as-judge is already a sanctioned pattern in this codebase. relevanceSelector.ts, memory/forget.ts, web-fetch.ts, followup/suggestionGenerator.ts all use runSideQuery exactly this way. We're not introducing a new architectural concept — just applying an existing one to a new event.
  • It composes with skills. Skills can already register session Stop hooks via frontmatter; /goal is the manual / interactive equivalent. The two co-exist without a special-case (both go through SessionHooksManager, both are listed under "Session (temporary)" in /hooks list).
  • Cheap to opt out. No setting flips; no migration. If /goal isn't typed, nothing changes. If it's typed, exactly one extra side-query runs per stop attempt — bounded by the StopHookLoop cap.

The net result: users get a "set an outcome and walk away" affordance, the project gets a feature that maps 1:1 onto primitives it already owns, and the maintenance surface is one slash command + one judge function.


Additional context

  • Reference: Anthropic Claude Code CLI ships a /goal command with the exact behavior described above (session-scoped Stop hook + LLM-judge auto-clear + first-turn directive injection). The system-reminder text it injects, observed verbatim, is:

    A session-scoped Stop hook is now active with condition: "<goal>". Briefly acknowledge the goal, then immediately start (or continue) working toward it — treat the condition itself as your directive and do not pause to ask the user what to do. The hook will block stopping until the condition holds. It auto-clears once the condition is met — do not tell the user to run /goal clear after success; that's only for clearing a goal early.

    This is a reasonable verbatim starting point for the system-reminder envelope; the project can refine wording as needed.

  • Related primitives already in repo (for reviewers):

    • packages/core/src/hooks/sessionHooksManager.ts — session-scoped hook storage and lookup
    • packages/core/src/hooks/hookEventHandler.ts (fireStopEvent, line 116) — the Stop event entrypoint
    • packages/core/src/core/client.ts:1230-1330 — the existing loop that turns a blocking Stop-hook output back into a model turn, with StopHookLoop event emission for diagnostics
    • packages/core/src/utils/sideQuery.ts — the side-query primitive used by relevanceSelector, forget, web-fetch, etc.
    • packages/cli/src/ui/commands/rememberCommand.ts — pattern for a slash command that returns submit_prompt to inject a directive into the next turn
    • packages/cli/src/services/BuiltinCommandLoader.ts:95-150 — where to register the new command
  • Out of scope for the first PR (could be follow-ups):

    • Persisting goals across /resume of a session.
    • Multi-goal stacking.
    • Goal templates / saved goals.
    • Surfacing per-stop judge reasoning in the transcript UI (we get it for free via stopReason; whether to render it specially is a UX call).
  • Risks / mitigations:

    • Cost of the judge per stop attempt → use the fast model + a tight transcript window; pre-existing StopHookLoop cap is the runaway protection.
    • False negatives (judge says "not done" when it is) → the loop cap and the user's ability to /goal clear cover this; in practice, the judge prompt should bias toward "met" when ambiguous to avoid spinning.
    • False positives (judge says "done" when it isn't) → the hook auto-clears but the user can simply /goal again with a refined statement; no destructive action is involved.

Happy to take this on if maintainers agree on the design. The five files listed above are the only touch points needed for a clean first PR.
