feat(skills): orchestrator tier 2 - multi-model routing, DAG, domain memory, follow-up Q&A #8
Merged
…memory, follow-up Q&A

Builds on tier 1 (PR #5). Adds four cost-and-intelligence primitives without changing the trust model (strictly offline).

- V5 Multi-model cost routing: Haiku for triage/classification/Q&A, Sonnet for synthesis and deep dive, Opus for the adversarial pass. Cuts cost ~40% vs all-Sonnet at no quality loss on high-stakes phases. Override via --model-tier {economy|standard|maximum} and --no-adversarial.
- V6 Skill-graph DAG: replaces fixed phase ordering with a dynamic dependency DAG built from artifact types and probe findings. Probes that depend on each other sequence correctly; everything else runs in parallel. Dynamic edges open during the walk based on findings (e.g., S9 sniffing in sqlplan-review opens an edge to query-store-review for plan-instability). The --phases flag reverts to tier-1 fixed-phase behavior.
- V9 Domain memory: per-instance facts (MAXDOP, cores, AG topology, partitioning, RCSI status, trace flags, user_notes) loaded from a user-managed JSON at ~/.mssql-perf-review/instances/<server>.json. Validates every recommendation against the facts: redundant recommendations are rejected and environment-aware escalators are applied (partition alignment, AG primary, Standard edition LOB rebuild). Stale facts (>90 days) trigger a downgrade warning. The orchestrator never writes facts.json silently.
- V10 Follow-up Q&A: after the report, the orchestrator stays in the session to answer questions ("why this index ordering?", "why was MAXDOP not recommended?"). Most follow-ups are free: they read from the in-context evidence chain. Five-category question taxonomy, when-to-probe rules, refusal patterns, and structured answer format with evidence citation.
Added:

- skills/mssql-performance-review/references/model-routing.md
- skills/mssql-performance-review/references/skill-dag.md
- skills/mssql-performance-review/references/domain-memory.md
- skills/mssql-performance-review/references/followup-qa.md
- references/README.md updated to list and explain the 4 new references
- SKILL.md: primitive list grows from 4 to 8; new sections describe DAG dispatch, model routing, domain memory loading, follow-up Q&A behavior
- SKILL.md still well under the 1000-line cap (363 lines)

verify-docs.sh: 31 PASS, 0 WARN, 0 FAIL
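The V6 DAG walk described above (dependent probes run in order, independent probes run in parallel, findings open new edges mid-walk) could be sketched roughly as follows. This is an illustration only, not the algorithm in skill-dag.md: `walk_dag`, `run_probe`, the `dynamic_edges` mapping, and the probe name `wait-stats-review` are all hypothetical. Only `sqlplan-review`, `query-store-review`, and the S9 finding come from the commit message.

```python
def walk_dag(deps, run_probe, dynamic_edges):
    """Walk a probe DAG in waves; each wave holds probes that may run in parallel.

    deps:          probe name -> set of prerequisite probe names
    run_probe:     callable(probe) -> set of finding codes (hypothetical interface)
    dynamic_edges: (probe, finding) -> probe to schedule once that finding appears
    """
    deps = {probe: set(pre) for probe, pre in deps.items()}  # do not mutate the caller's graph
    done, waves = set(), []
    while len(done) < len(deps):
        # A probe is ready when every prerequisite has already completed.
        ready = sorted(p for p, pre in deps.items() if p not in done and pre <= done)
        if not ready:
            raise ValueError("cycle or unsatisfiable dependency in probe graph")
        waves.append(ready)
        for probe in ready:
            for finding in run_probe(probe):
                target = dynamic_edges.get((probe, finding))
                if target is not None and target not in done:
                    # A finding opens an edge: the target now depends on this probe.
                    deps.setdefault(target, set()).add(probe)
        done.update(ready)
    return waves


# Assumed example graph: two independent probes, one dynamic edge on finding S9.
static_deps = {"sqlplan-review": set(), "wait-stats-review": set()}
edges = {("sqlplan-review", "S9"): "query-store-review"}
findings = {"sqlplan-review": {"S9"}}

waves = walk_dag(static_deps, lambda probe: findings.get(probe, set()), edges)
print(waves)  # [['sqlplan-review', 'wait-stats-review'], ['query-store-review']]
```

Grouping by waves keeps the parallelism explicit: everything in one inner list has no unmet dependency on anything else in that list.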
Resolves PR #6 review comment.

Line 80 of model-routing.md said 'Adversarial pass is always at least Sonnet ... Opus is the default', contradicting:

- Line 106: 'adversarial pass (always Opus or higher)'
- Line 108: 'Adversarial pass cannot be downgraded. Even on --model-tier economy, the adversarial check runs on Opus.'
- SKILL.md line 126: 'Opus 4.7 (cannot be downgraded even on --model-tier economy - quality-critical)'

The contradiction risked an LLM following the override-rules section and using Sonnet for the adversarial pass on economy tier, breaking the stated quality guarantee. Rewrote line 80 to align with the other safeguards: 'Adversarial pass is always Opus ... cannot be downgraded by any tier flag.'

verify-docs.sh: 31 PASS.
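The invariant this commit restores can be stated as a small routing function. The sketch below is hedged: the phase keys, the lowercase model labels, and the `route` function are hypothetical, and the economy-tier mapping for synthesis and deep dive is an assumption (the PR text only fixes what economy must not do to the adversarial pass). The standard-tier mapping and the maximum-tier escalation follow the commit messages in this PR.

```python
# Standard-tier mapping as described in the PR: Haiku for triage/classification/Q&A,
# Sonnet for synthesis and deep dive, Opus for the adversarial pass.
STANDARD = {
    "triage": "haiku",
    "classification": "haiku",
    "qa": "haiku",
    "synthesis": "sonnet",
    "deep_dive": "sonnet",
    "adversarial": "opus",
}


def route(phase, tier="standard", no_adversarial=False):
    """Return the model label for a phase, or None when the phase is skipped."""
    if phase not in STANDARD:
        raise KeyError(f"unknown phase: {phase}")
    if tier not in ("economy", "standard", "maximum"):
        raise ValueError(f"unknown tier: {tier}")
    if phase == "adversarial":
        # The tier flag never changes this branch: the pass either runs on Opus
        # or is switched off entirely with --no-adversarial.
        return None if no_adversarial else "opus"
    if tier == "maximum" and phase in ("synthesis", "deep_dive"):
        return "opus"  # maximum escalates synthesis and deep dive; adversarial is unchanged
    if tier == "economy" and phase in ("synthesis", "deep_dive"):
        return "haiku"  # ASSUMPTION: the PR does not say what economy does to these phases
    return STANDARD[phase]


print(route("adversarial", tier="economy"))  # opus
```

Handling the adversarial phase before any tier logic makes the "cannot be downgraded" rule structural instead of relying on each tier table repeating it.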
Resolves PR #6 review comment.

Line 57 of risk-rubric.md previously said: 'Domain memory escalators are checked once tier-2 introduces facts.json (see backlog plan v4). In tier 1, assume defaults from this table apply.'

This forward reference became stale when tier 2 (this PR) introduced references/domain-memory.md. Replaced it with a direct reference to the now-present file and the canonical facts.json path: 'Domain memory escalators are checked against references/domain-memory.md when a facts file is loaded for the target instance (~/.mssql-perf-review/instances/<server>.json). When no facts file is present, the defaults in this table apply.'

verify-docs.sh: 31 PASS.
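The load-or-fall-back behaviour in the rewritten sentence, together with the 90-day staleness rule from the first commit, might look like the sketch below. The function names and the `captured_at` and `maxdop` keys are hypothetical; the actual facts.json schema lives in domain-memory.md and is not reproduced in this PR conversation. The file is only ever read, matching the rule that the orchestrator never writes facts.json silently.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

STALE_AFTER = timedelta(days=90)  # threshold stated in the PR


def load_facts(server, base_dir=None, now=None):
    """Return (facts, stale). facts is None when no file exists, so defaults apply."""
    base = Path(base_dir) if base_dir else Path.home() / ".mssql-perf-review" / "instances"
    path = base / f"{server}.json"
    if not path.is_file():
        return None, False
    facts = json.loads(path.read_text(encoding="utf-8"))
    now = now or datetime.now(timezone.utc)
    # ASSUMPTION: the file records when the facts were captured as an ISO 8601 timestamp.
    captured = datetime.fromisoformat(facts["captured_at"])
    if captured.tzinfo is None:
        captured = captured.replace(tzinfo=timezone.utc)
    return facts, (now - captured) > STALE_AFTER


def is_redundant(setting, recommended_value, facts):
    """A recommendation is redundant when the instance already has that value."""
    return facts is not None and facts.get(setting) == recommended_value
```

A caller would reject a recommendation when `is_redundant` is true and attach a downgrade warning when `stale` is true.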
1. Tautological 'to Opus from Opus' in model-routing.md line 53. The user-situation table row for 'previous review missed the obvious problem' described --model-tier maximum as 'escalates the adversarial pass to Opus from Opus'. Since adversarial already runs Opus on --model-tier standard (line 14), the parenthetical was tautological. Rewrote it to describe what maximum actually adds: synthesis and deep-dive escalate to Opus (adversarial unchanged).
2. Bogus flag syntax in followup-qa.md line 152. The follow-up Q&A reference described a dispatch as '--model-tier maximum --no-adversarial=false'. --no-adversarial is a boolean flag (model-routing.md line 25), not a --flag=<value> token; '--no-adversarial=false' is not valid syntax. Since adversarial is on by default, --no-adversarial isn't needed at all for an adversarial-emphasising follow-up. Simplified to just '--model-tier maximum' with a parenthetical noting adversarial is the default.

verify-docs.sh: 31 PASS.
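To illustrate why a presence-only boolean flag does not accept '=false', here is how the two flags would behave if they were parsed with Python's argparse. This is an analogy under an assumption: the PR does not say how the skill interprets its flags, and the `prog` name and `parse` helper are hypothetical.

```python
import argparse


def parse(argv):
    parser = argparse.ArgumentParser(prog="mssql-performance-review")
    parser.add_argument(
        "--model-tier",
        choices=["economy", "standard", "maximum"],
        default="standard",
    )
    # store_true flags take no value: present means True, absent means False.
    parser.add_argument("--no-adversarial", action="store_true")
    return parser.parse_args(argv)


args = parse(["--model-tier", "maximum"])
print(args.model_tier, args.no_adversarial)  # maximum False

try:
    parse(["--no-adversarial=false"])
except SystemExit as exc:
    # argparse reports the explicit value as an error on stderr and exits with status 2.
    print("rejected, exit status", exc.code)  # rejected, exit status 2
```

Leaving the flag out is the only way to say "adversarial on", which is why the corrected dispatch is just '--model-tier maximum'.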
Summary
Continuation of the mssql-performance-review orchestrator, adding the intelligence and cost-control layer on top of the tier-1 agentic core.
- --model-tier flag for economy/standard/maximum overrides
- facts.json schema; rejection/escalation catalogue; facts the orchestrator consumes (MAXDOP, edition, RCSI state, AG topology)

Reference files added: model-routing.md, skill-dag.md, domain-memory.md, followup-qa.md

Closes #6 (retargeted from tier1 base after tier1 merged)
Test plan
- bash scripts/verify-docs.sh: 31 PASS
- model-routing.md: adversarial pass documented as always Opus (cannot be downgraded)
- domain-memory.md: facts.json schema present, rejection catalogue covers MAXDOP/edition/RCSI/AG
- skill-dag.md: DAG walk algorithm and dynamic edge catalogue present
- followup-qa.md: five-category taxonomy and cost guard thresholds present