🤖 feat: advisors-as-code (configuration via .mux/advisors/<name>/ADVISOR.md)#3361

Open
ammar-agent wants to merge 2 commits into main from advisor-0dak

@ammar-agent
Collaborator

Summary

Graduate the advisor tool from experimental to GA. Configuration moves from a Settings UI + global cfg fields to per-advisor markdown files at .mux/advisors/<name>/ADVISOR.md (mirroring the skills loader). Multiple advisors can coexist; the model selects one per call via a new required advisor_name parameter. Opt-in by construction: the tool only appears when at least one ADVISOR.md is loaded — no experiment toggle to remember, no settings page to discover.

Background

The pre-GA advisor was one global model toggled via Settings → Experiments → Advisor Tool, with per-agent enable switches under Tasks. The framing answered "should the advisor be on?" — which was the wrong question. The right question is "which specialist do I hand this to?", and that's what advisors-as-code captures.

Key wins:

  • Specialization beats toggling. An ml-fellow advisor on opus + high for theorem work coexists with a code-review advisor on haiku for PR review without either starving the other.
  • One-to-one mirror of .mux/skills/. Same scope rules (project wins over global), same loader, same hot-reload (no caching, takes effect next turn), same diagnostics envelope (bad YAML doesn't crash the catalog).
  • Configuration-as-code is shareable via git. Like AGENTS.md and SKILL.md, ADVISOR.md travels with the project — a team's specialist roster lives in the repo.

Implementation

File layout

.mux/advisors/<name>/ADVISOR.md        # project scope (wins on name collision)
~/.mux/advisors/<name>/ADVISOR.md      # global scope
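
The collision rule in the layout above can be sketched as a two-pass merge. This is a minimal illustration rather than the PR's loader: `AdvisorEntry` and `mergeAdvisorScopes` are hypothetical names, and the sketch assumes each scope has already been scanned into a list.

```typescript
// Hypothetical shape for illustration; the PR's real AdvisorPackage type is not shown.
type Scope = "project" | "global";

interface AdvisorEntry {
  name: string;
  scope: Scope;
  sourcePath: string;
}

// Merge both scopes into one catalog keyed by advisor name.
// Global entries are inserted first so that a project entry with the
// same name overwrites them ("project wins on name collision").
function mergeAdvisorScopes(
  projectAdvisors: readonly AdvisorEntry[],
  globalAdvisors: readonly AdvisorEntry[],
): AdvisorEntry[] {
  const byName = new Map<string, AdvisorEntry>();
  for (const advisor of globalAdvisors) byName.set(advisor.name, advisor);
  for (const advisor of projectAdvisors) byName.set(advisor.name, advisor);
  // Sort by name so the catalog order is stable across loads.
  return Array.from(byName.values()).sort((a, b) => a.name.localeCompare(b.name));
}
```

Because nothing is cached in this sketch, calling it again on the next turn would pick up any file that was added or edited in between, which matches the hot-reload behavior described earlier.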

Frontmatter schema

| Field | Required | Notes |
| --- | --- | --- |
| description | yes | Single-line "use for X"; joined into the tool description so the model knows when to pick this advisor. |
| model | yes | Full provider:model form. |
| thinking | no | ThinkingLevel. Defaults to off. |
| max_uses_per_turn | no | null = unlimited, positive int = cap. Defaults to 3. |
| max_output_tokens | no | Same shape. Defaults to unlimited. |
| agents | no | Restrict to a subset of agents. Omitted = available to all. |

Body (after the closing ---) appends to the base advisor system prompt for per-advisor persona/voice/focus.
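
To make the defaults concrete, here is a rough sketch of how parsed frontmatter might be normalized. The field names come from the schema above; `AdvisorFrontmatter`, `ResolvedAdvisor`, and `resolveAdvisorDefaults` are hypothetical names, `thinking` is typed as a plain string because the ThinkingLevel values are not listed here, and the sample file in the comment uses made-up values.

```typescript
// Illustrative .mux/advisors/ml-fellow/ADVISOR.md (values are made up):
//
//   ---
//   description: Use for theorem work and proof review.
//   model: anthropic:claude-opus-4-7
//   thinking: high
//   max_uses_per_turn: 2
//   ---
//   Check every step of a proof before agreeing with it.

// Frontmatter as written by the user; optional fields may be omitted.
interface AdvisorFrontmatter {
  description: string;
  model: string; // full provider:model form
  thinking?: string;
  max_uses_per_turn?: number | null; // null = unlimited
  max_output_tokens?: number | null; // null = unlimited
  agents?: string[];
}

// Frontmatter after defaults are applied (hypothetical shape).
interface ResolvedAdvisor {
  description: string;
  model: string;
  thinking: string;
  maxUsesPerTurn: number | null;
  maxOutputTokens: number | null;
  agents: string[] | null; // null = available to all agents
}

const DEFAULT_MAX_USES_PER_TURN = 3;

function resolveAdvisorDefaults(fm: AdvisorFrontmatter): ResolvedAdvisor {
  return {
    description: fm.description,
    model: fm.model,
    thinking: fm.thinking ?? "off",
    // An omitted field falls back to 3; an explicit null stays unlimited.
    maxUsesPerTurn:
      fm.max_uses_per_turn === undefined ? DEFAULT_MAX_USES_PER_TURN : fm.max_uses_per_turn,
    maxOutputTokens: fm.max_output_tokens ?? null,
    agents: fm.agents ?? null,
  };
}
```

The one subtle point is that `max_uses_per_turn` distinguishes "omitted" from "null", so the sketch checks for `undefined` explicitly instead of using `??`, which would treat both the same way.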

Tool wiring

  • ToolConfiguration.advisorRuntime now carries advisors: readonly AdvisorPackage[] (already filtered by effective agent) and defaultMaxUsesPerTurn. Per-advisor budgets read directly from frontmatter at execute time.
  • AdvisorToolInputSchema adds a required advisor_name. Tool description is rebuilt per-stream with the live catalog, so the model discovers advisors there (not the system prompt — the prior <advisor-guidance> section is removed).
  • Unknown advisor_name → self-correctable error result that lists the live catalog, so the model can retry in the same turn instead of crashing it.
  • Per-advisor per-turn usage counters run independently: ml-fellow running out of budget can't starve code-review.
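
The unknown-name and budget behavior above can be sketched as follows. This is a simplified model under stated assumptions, not the PR's tool implementation: `AdvisorBudget`, `AdvisorCallResult`, and `createAdvisorTurn` are hypothetical names, the error strings are invented, and the actual model call is left out.

```typescript
interface AdvisorBudget {
  name: string;
  description: string;
  maxUsesPerTurn: number | null; // null = unlimited
}

type AdvisorCallResult =
  | { ok: true; advisorName: string; usesThisTurn: number }
  | { ok: false; error: string };

// Create per-turn state. Each advisor gets its own counter, so one
// advisor exhausting its budget does not affect the others.
function createAdvisorTurn(advisors: readonly AdvisorBudget[]) {
  const uses = new Map<string, number>();

  return function callAdvisor(advisorName: string): AdvisorCallResult {
    const advisor = advisors.find((a) => a.name === advisorName);
    if (!advisor) {
      // Return an error result (not a thrown exception) that lists the
      // live catalog, so the model can retry with a valid name.
      const catalog = advisors.map((a) => `${a.name}: ${a.description}`).join("; ");
      return {
        ok: false,
        error: `Unknown advisor "${advisorName}". Available advisors: ${catalog}`,
      };
    }

    const used = uses.get(advisor.name) ?? 0;
    if (advisor.maxUsesPerTurn !== null && used >= advisor.maxUsesPerTurn) {
      return {
        ok: false,
        error: `Advisor "${advisor.name}" reached its limit of ${advisor.maxUsesPerTurn} uses this turn.`,
      };
    }

    uses.set(advisor.name, used + 1);
    return { ok: true, advisorName: advisor.name, usesThisTurn: used + 1 };
  };
}
```

Returning errors as ordinary results is what keeps the failure self-correctable: the turn continues, and the model sees the valid names in the same response.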

Slash commands

| Command | Effect |
| --- | --- |
| /advisor | List loaded advisors + diagnostics for malformed files. |
| /advisor init <name> | Scaffold .mux/advisors/<name>/ADVISOR.md from a working template. |

ORPC: advisors.list (diagnostics-flavored) + advisors.scaffold (returns the source path).

Tool display

The advisor name becomes the lead identity pill in AdvisorToolCall — e.g., advisor [ml-fellow] · done · 3.2s — so the user sees which advisor handled the turn before drilling in for the underlying model.
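
As a small illustration of that label format, a formatter along these lines would produce the example string. `formatAdvisorPill` is a hypothetical helper; the real AdvisorToolCall component presumably renders separate UI elements rather than one string.

```typescript
// Build the one-line summary: advisor name first, then status, then duration.
function formatAdvisorPill(advisorName: string, status: string, durationMs: number): string {
  const seconds = (durationMs / 1000).toFixed(1);
  return `advisor [${advisorName}] · ${status} · ${seconds}s`;
}

console.log(formatAdvisorPill("ml-fellow", "done", 3200));
// prints "advisor [ml-fellow] · done · 3.2s"
```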

Gating refactor

  • resolveAgentForStream takes advisorAvailable: boolean (was isAdvisorExperimentEnabled) and opens the regex when any advisor is loaded; per-agent filtering via the agents: frontmatter field happens at runtime bundle assembly.
  • The EXPERIMENT_IDS.ADVISOR_TOOL gate, AdvisorToolExperimentConfig.tsx, the per-agent advisorEnabled cfg field, and the cfg.advisor{ModelString,ThinkingLevel,MaxUsesPerTurn,MaxOutputTokens} keys are all gone. Old keys in user configs are silently ignored.
  • The prior policy-driven system prompt rebuild block in aiService.ts is gone too — advisor guidance now lives in the tool description, so one build is always correct.
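
The two-stage gate described above (a global "is anything loaded?" check, then per-agent filtering) might look roughly like this. `LoadedAdvisor`, `isAdvisorAvailable`, and `advisorsForAgent` are hypothetical names, and the agent ids in the usage note are placeholders; only the `agents` field and its "omitted = all" rule come from the PR.

```typescript
interface LoadedAdvisor {
  name: string;
  agents?: readonly string[]; // omitted = available to every agent
}

// Stage 1: the coarse gate. True as soon as any advisor is loaded.
function isAdvisorAvailable(advisors: readonly LoadedAdvisor[]): boolean {
  return advisors.length > 0;
}

// Stage 2: per-agent filtering, applied when the runtime bundle is assembled.
function advisorsForAgent(
  advisors: readonly LoadedAdvisor[],
  agentId: string,
): LoadedAdvisor[] {
  return advisors.filter(
    (advisor) => advisor.agents === undefined || advisor.agents.indexOf(agentId) !== -1,
  );
}
```

With an `ml-fellow` advisor restricted to a placeholder `plan` agent and an unrestricted `code-review` advisor, the gate is open for every agent, but an `exec` agent would only see `code-review` in its catalog.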

Validation

  • 19 new unit tests for the parser + loader (project/global scope, name collision, malformed YAML diagnostics, bad directory names, agent filtering, descriptor projection, scaffold refuse-to-overwrite).
  • Rewrote advisor tool tests around the new shape: unknown-name error, per-advisor + default budget caps, unlimited sentinel, body composition into system prompt, handoff message formatting.
  • Updated aiService.test.ts harness to seed advisors via .mux/advisors/default/ADVISOR.md files instead of cfg writes.
  • make static-check green; bun test shows 7957 pass / 1 pre-existing flake (GeneralSection > persists the collapsed bash summaries display mode — passes in isolation, fails in the full sweep due to happy-dom global isolation; unrelated to this change).

Risks

  • Breaking schema change to AdvisorToolInputSchema (adds required advisor_name). The pre-GA tool was experiment-gated and effectively unreleased to most users; transcripts pinned to the old shape become unreplayable, but the failure mode is bounded to advisor tool calls. Acceptable because the GA flip is atomic with the schema change.
  • Removed cfg fields (advisorModelString, advisorThinkingLevel, advisorMaxUsesPerTurn, advisorMaxOutputTokens, agentAiDefaults.<id>.advisorEnabled) are silently ignored: the config schema uses Zod's .passthrough(), so the stale keys load without a validation error and nothing reads them. Migration path: /advisor init default then edit the file.
  • Removed the experiments.advisorTool field from the send-options chain and ExperimentsSchema. Old clients sending it will have it stripped; backend behavior is unchanged.

Pains

The pre-GA wiring threaded the experiment flag through 7+ files (aiService.ts, agentResolution.ts, resolveToolPolicy.ts, streamContextBuilder.ts, send-options chain, Settings UI). Untangling those into the simpler "is any advisor loaded?" gate took most of the diff. The biggest cleanup win: the pre-policy/post-policy system prompt rebuild dance in aiService.ts is gone, because guidance now lives in the dynamic tool description instead of the system prompt.


Generated with mux • Model: anthropic:claude-opus-4-7 • Thinking: max • Cost: $10.51

…SOR.md)

Graduate the advisor tool from experimental to GA. Configuration shifts from
Settings UI + cfg fields to per-advisor markdown files, mirroring the skills
loader pattern. Multiple advisors can coexist; the model picks one per call
via advisor_name. Opt-in by construction: the tool only appears when at least
one ADVISOR.md is loaded.

@mintlify

mintlify Bot commented May 21, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| Mux | 🟢 Ready | View Preview | May 21, 2026, 11:53 PM |

@ammar-agent
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0df6592532

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/common/orpc/schemas/advisor.ts
Codex review caught that 'model: "   "' parsed at load time but blew up at
execute time, breaking the 'invalid files surface in diagnostics' invariant.
Tighten the schema with a refine() so whitespace-only models fail at
discovery (where they're collected into invalidAdvisors) instead of
crashing the advisor turn.
@ammar-agent
Collaborator Author

@codex review

Tightened AdvisorFrontmatterSchema.model with a .refine() that rejects whitespace-only values, so the diagnostics envelope catches model: " " at load time instead of blowing up at execute time. Added a test case to lock in the behavior.
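
For readers without the diff open, the invariant being protected can be sketched without Zod. The PR itself uses a `.refine()` on the schema; the hand-written check below only mirrors the idea, and `RawAdvisorFile`, `InvalidAdvisor`, `partitionAdvisors`, and the reason strings are hypothetical.

```typescript
interface RawAdvisorFile {
  sourcePath: string;
  frontmatter: { description?: unknown; model?: unknown };
}

interface ValidAdvisor {
  sourcePath: string;
  description: string;
  model: string;
}

interface InvalidAdvisor {
  sourcePath: string;
  reason: string;
}

// A string that is empty or whitespace-only does not count as a value.
function isNonBlankString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Validate at discovery time: bad files are collected as diagnostics
// instead of failing later, when the advisor is actually called.
function partitionAdvisors(files: readonly RawAdvisorFile[]): {
  validAdvisors: ValidAdvisor[];
  invalidAdvisors: InvalidAdvisor[];
} {
  const validAdvisors: ValidAdvisor[] = [];
  const invalidAdvisors: InvalidAdvisor[] = [];
  for (const file of files) {
    const { description, model } = file.frontmatter;
    if (!isNonBlankString(description)) {
      invalidAdvisors.push({ sourcePath: file.sourcePath, reason: "description is required" });
      continue;
    }
    if (!isNonBlankString(model)) {
      invalidAdvisors.push({ sourcePath: file.sourcePath, reason: "model must not be blank" });
      continue;
    }
    validAdvisors.push({ sourcePath: file.sourcePath, description, model });
  }
  return { validAdvisors, invalidAdvisors };
}
```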

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep them coming!
