
feat(skills): orchestrator tier 2 — multi-model routing, DAG, domain memory, follow-up Q&A #8

Merged
vanterx merged 4 commits into main from feat/perf-review-orchestrator-v4-tier2
May 18, 2026

Conversation

@vanterx
Owner

@vanterx vanterx commented May 18, 2026

Summary

Continuation of the mssql-performance-review orchestrator, adding the intelligence and cost-control layer on top of the tier-1 agentic core.

  • V5 Multi-model routing — Haiku for triage/dispatch, Sonnet for deep-dive and synthesis, Opus for adversarial pass; --model-tier flag for economy/standard/maximum overrides
  • V6 Skill-graph DAG — static dependency edges + dynamic edges opened by findings; parallel subagent dispatch; full DAG walk algorithm with worked examples
  • V9 Domain memory — per-instance facts.json schema; rejection/escalation catalogue; facts the orchestrator consumes (MAXDOP, edition, RCSI state, AG topology)
  • V10 Follow-up Q&A — five-category question taxonomy; free in-context answers vs cheap new-probe dispatch; cost guard warnings; common question patterns

Reference files added: model-routing.md, skill-dag.md, domain-memory.md, followup-qa.md
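
As a rough illustration of the V6 item, the sketch below walks a dependency graph in waves: every probe whose prerequisites are done runs in parallel, and a finding can open a new edge mid-walk. It is not the algorithm from skill-dag.md, which this page does not reproduce. The function name `walk_dag`, the data shapes, and the probe name `wait-stats-review` are hypothetical; `sqlplan-review`, `query-store-review`, and the S9 finding come from the commit message.

```python
from concurrent.futures import ThreadPoolExecutor


def walk_dag(static_edges, dynamic_edges, run_probe):
    """Run probes wave by wave and return the list of waves.

    static_edges:  {probe: set of prerequisite probes}   (hypothetical shape)
    dynamic_edges: {(probe, finding): probe to open}      (hypothetical shape)
    run_probe:     callable(probe) -> set of finding ids
    """
    deps = {probe: set(prereqs) for probe, prereqs in static_edges.items()}
    done, waves = set(), []
    while len(done) < len(deps):
        # A probe is ready once all of its prerequisites have finished.
        ready = sorted(p for p in deps if p not in done and deps[p] <= done)
        if not ready:
            raise ValueError("cycle or missing prerequisite in the graph")
        # Independent probes in the same wave are dispatched in parallel.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(run_probe, ready))
        done.update(ready)
        waves.append(ready)
        # Findings may open dynamic edges to probes not scheduled so far.
        for probe, findings in zip(ready, results):
            for finding in findings:
                target = dynamic_edges.get((probe, finding))
                if target is not None and target not in done:
                    deps.setdefault(target, set()).add(probe)
    return waves


# Example: an S9 finding in sqlplan-review opens query-store-review.
findings_by_probe = {"sqlplan-review": {"S9"}}
waves = walk_dag(
    static_edges={"sqlplan-review": set(), "wait-stats-review": set()},
    dynamic_edges={("sqlplan-review", "S9"): "query-store-review"},
    run_probe=lambda probe: findings_by_probe.get(probe, set()),
)
print(waves)
```

The wave-based loop trades some parallelism for simplicity: a probe opened by a dynamic edge waits for the whole current wave to finish rather than starting as soon as its single prerequisite completes.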

Closes #6 (retargeted from tier1 base after tier1 merged)

Test plan

  • bash scripts/verify-docs.sh — 31 PASS
  • model-routing.md: adversarial pass documented as always Opus (cannot be downgraded)
  • domain-memory.md: facts.json schema present, rejection catalogue covers MAXDOP/edition/RCSI/AG
  • skill-dag.md: DAG walk algorithm and dynamic edge catalogue present
  • followup-qa.md: five-category taxonomy and cost guard thresholds present

vanterx added 4 commits May 17, 2026 22:10
…memory, follow-up Q&A

Builds on tier 1 (PR #5). Adds four cost-and-intelligence primitives without
changing the trust model (strictly offline).

- V5 Multi-model cost routing: Haiku for triage/classification/Q&A,
  Sonnet for synthesis and deep dive, Opus for the adversarial pass.
  Cuts cost ~40% vs all-Sonnet at no quality loss on high-stakes phases.
  Override via --model-tier {economy|standard|maximum} and --no-adversarial.

- V6 Skill-graph DAG: replaces fixed phase ordering with a dynamic
  dependency DAG built from artifact types and probe findings. Probes that
  depend on each other sequence correctly; everything else runs in parallel.
  Dynamic edges open during the walk based on findings (e.g., S9 sniffing in
  sqlplan-review opens an edge to query-store-review for plan-instability).
  --phases flag reverts to tier-1 fixed-phase behavior.

- V9 Domain memory: per-instance facts (MAXDOP, cores, AG topology,
  partitioning, RCSI status, trace flags, user_notes) loaded from a
  user-managed JSON at ~/.mssql-perf-review/instances/<server>.json.
  Validates every recommendation against the facts: redundant recommendations
  rejected, environment-aware escalators applied (partition alignment, AG
  primary, Standard edition LOB rebuild). Stale facts (>90 days) trigger a
  downgrade warning. Orchestrator never writes facts.json silently.

- V10 Follow-up Q&A: after the report, the orchestrator stays in the session
  to answer questions ("why this index ordering?", "why was MAXDOP not
  recommended?"). Most follow-ups are free — they read from the in-context
  evidence chain. Five-category question taxonomy, when-to-probe rules,
  refusal patterns, and structured answer format with evidence citation.
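
To make the V9 validation step concrete, here is a minimal sketch of checking one recommendation against loaded facts. Only the 90-day staleness threshold and the idea of rejecting redundant recommendations come from the commit message. The field names (`captured_on`, `settings`), the recommendation shape, and the function name are assumptions, since the facts.json schema is not shown on this page.

```python
from datetime import date

STALE_AFTER_DAYS = 90  # threshold stated in the commit message


def check_recommendation(rec, facts, today):
    """Validate one recommendation against instance facts.

    rec:   {"setting": str, "value": object}                      (hypothetical)
    facts: {"captured_on": "YYYY-MM-DD", "settings": {name: val}}  (hypothetical)
    """
    warnings = []
    age_days = (today - date.fromisoformat(facts["captured_on"])).days
    if age_days > STALE_AFTER_DAYS:
        # Stale facts are still used, but the caller is warned.
        warnings.append(f"facts are {age_days} days old; confidence downgraded")
    current = facts["settings"].get(rec["setting"])
    if current is not None and current == rec["value"]:
        # The instance already has this value, so the advice is redundant.
        return {
            "status": "rejected",
            "reason": f"{rec['setting']} is already {current}",
            "warnings": warnings,
        }
    return {"status": "kept", "reason": "", "warnings": warnings}


facts = {"captured_on": "2026-01-01", "settings": {"maxdop": 8}}
result = check_recommendation(
    {"setting": "maxdop", "value": 8}, facts, today=date(2026, 5, 18)
)
print(result["status"], result["warnings"])
```

Passing `today` explicitly keeps the staleness check deterministic and easy to test. The environment-aware escalators (partition alignment, AG primary, Standard edition LOB rebuild) are left out because their rules are not described here.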

Added:
- skills/mssql-performance-review/references/model-routing.md
- skills/mssql-performance-review/references/skill-dag.md
- skills/mssql-performance-review/references/domain-memory.md
- skills/mssql-performance-review/references/followup-qa.md
- references/README.md updated to list and explain the 4 new references
- SKILL.md: primitive list grows from 4 to 8; new sections describe DAG
  dispatch, model routing, domain memory loading, follow-up Q&A behavior
- SKILL.md still well under the 1000-line cap (363 lines)

verify-docs.sh: 31 PASS, 0 WARN, 0 FAIL

Resolves PR #6 review comment. Line 80 of model-routing.md said
'Adversarial pass is always at least Sonnet ... Opus is the default',
contradicting:
- Line 106: 'adversarial pass (always Opus or higher)'
- Line 108: 'Adversarial pass cannot be downgraded. Even on
  --model-tier economy, the adversarial check runs on Opus.'
- SKILL.md line 126: 'Opus 4.7 (cannot be downgraded even on
  --model-tier economy — quality-critical)'

Because of the contradiction, an LLM following the override-rules section
could run the adversarial pass on Sonnet at the economy tier, breaking the
stated quality guarantee. Rewrote line 80 to align with the other
safeguards: 'Adversarial pass is always Opus ... cannot be downgraded
by any tier flag.'
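
One way to make the invariant hard to contradict is to pin it in code rather than in prose. The sketch below is an assumption-laden illustration, not the contents of model-routing.md: the phase keys, the function name, and the table layout are hypothetical, and the economy overrides are left empty because this PR does not spell them out. The base routing and the effect of `maximum` follow the commit messages.

```python
BASE_ROUTING = {
    "triage": "haiku",
    "deep_dive": "sonnet",
    "synthesis": "sonnet",
    "adversarial": "opus",
}

TIER_OVERRIDES = {
    "standard": {},
    # Per the commit message, maximum escalates synthesis and deep dive.
    "maximum": {"deep_dive": "opus", "synthesis": "opus"},
    # ASSUMPTION: economy's per-phase changes are not given in this PR,
    # so no overrides are invented here.
    "economy": {},
}


def select_model(phase, tier="standard", overrides=TIER_OVERRIDES):
    """Return the model for a phase; the adversarial pass is always Opus."""
    if phase not in BASE_ROUTING:
        raise ValueError(f"unknown phase: {phase}")
    if tier not in overrides:
        raise ValueError(f"unknown tier: {tier}")
    if phase == "adversarial":
        # Pinned before any override is consulted, so no tier can downgrade it.
        return "opus"
    return overrides[tier].get(phase, BASE_ROUTING[phase])


# Even a misconfigured override table cannot move the adversarial pass.
bad_table = {"economy": {"adversarial": "sonnet"}}
print(select_model("adversarial", "economy", bad_table))
print(select_model("synthesis", "maximum"))
```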

verify-docs.sh: 31 PASS.

Resolves PR #6 review comment. Line 57 of risk-rubric.md previously said:

  'Domain memory escalators are checked once tier-2 introduces facts.json
   (see backlog plan v4). In tier 1, assume defaults from this table apply.'

This forward-reference became stale when tier 2 (this PR) introduced
references/domain-memory.md. Replaced with a direct reference to the
now-present file and the canonical facts.json path:

  'Domain memory escalators are checked against references/domain-memory.md
   when a facts file is loaded for the target instance
   (~/.mssql-perf-review/instances/<server>.json). When no facts file is
   present, the defaults in this table apply.'
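
The lookup described in the replacement text could be sketched as follows. The directory layout follows the path quoted above; the function names and the `root` parameter (added so the example can run against a temporary directory) are hypothetical. The sketch only reads, in keeping with the rule that the orchestrator never writes facts.json silently.

```python
import json
from pathlib import Path


def facts_path(server, root=None):
    """Build <root>/instances/<server>.json, defaulting to ~/.mssql-perf-review."""
    base = Path(root) if root is not None else Path.home() / ".mssql-perf-review"
    return base / "instances" / f"{server}.json"


def load_facts(server, root=None):
    """Return the parsed facts for a server, or None when no facts file exists.

    A None result signals the caller to apply the rubric's table defaults.
    """
    path = facts_path(server, root)
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

A caller would treat `None` as "no domain memory for this instance" and skip the escalator checks instead of raising an error.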

verify-docs.sh: 31 PASS.

1. Tautological 'to Opus from Opus' in model-routing.md line 53.
   The user-situation table row for 'previous review missed the obvious
   problem' described --model-tier maximum as 'escalates the adversarial
   pass to Opus from Opus'. Since adversarial already runs Opus on
   --model-tier standard (line 14), the parenthetical was tautological.
   Rewrote to describe what maximum actually adds: synthesis and
   deep-dive escalate to Opus (adversarial unchanged).

2. Bogus flag syntax in followup-qa.md line 152.
   The follow-up Q&A reference described a dispatch as
   '--model-tier maximum --no-adversarial=false'. --no-adversarial is
   a boolean flag (model-routing.md line 25), not a --flag=<value>
   token; '--no-adversarial=false' is not valid syntax. Since
   adversarial is on by default, --no-adversarial isn't needed at all
   for an adversarial-emphasising follow-up. Simplified to just
   '--model-tier maximum' with a parenthetical noting adversarial is
   the default.
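
For comparison, this is how the two flags would behave if they were handled by a conventional parser such as Python's argparse. This is an illustrative assumption: the PR does not say how the orchestrator actually parses its flags, the program name is hypothetical, and `standard` as the default tier is a guess.

```python
import argparse


def build_parser():
    """Build a parser for the two routing flags discussed above."""
    parser = argparse.ArgumentParser(prog="perf-review")  # hypothetical name
    parser.add_argument(
        "--model-tier",
        choices=["economy", "standard", "maximum"],
        default="standard",  # ASSUMPTION: default tier is not stated in the PR
    )
    # A boolean switch: present means "skip the adversarial pass",
    # absent means adversarial stays on. It takes no value.
    parser.add_argument("--no-adversarial", action="store_true")
    return parser


args = build_parser().parse_args(["--model-tier", "maximum"])
print(args.model_tier, args.no_adversarial)
```

With a `store_true` switch, `--no-adversarial=false` is rejected by argparse as an unexpected explicit argument, which matches the reasoning in this commit: the flag is either present or absent.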

verify-docs.sh: 31 PASS.
@vanterx vanterx merged commit bdd294f into main May 18, 2026
@vanterx vanterx deleted the feat/perf-review-orchestrator-v4-tier2 branch May 18, 2026 08:57