Commit 7becf67

feat(064): codify Reviewer-Liveness Contract in execution-policy schema (#742)
Add an optional `liveness` block to `review` stages of the Glass Cockpit executionPolicy schema, expressing health-based, model-diverse reviewer failover: an ordered fallback `roster`, an SLA (`slaMinutes`, default 120), and a per-head re-trigger budget (`maxFallbacksPerHead` default 1, `maxHops` default 2 -> `escalateTo` the CEO).

The contract mirrors the MCP-3066 shell backstop (~/.mcpproxy-gatekeeper/bin/ensure-pr-gates.sh: classify_stall + route_fallback), which remains the source of truth. `substantive` request_changes is a mandatory fence (mode3) and is intentionally not a permitted `failoverStallModes` value, so it is never routed around.

- `$defs/participant` (deduped) + `$defs/reviewerLiveness`; the same-family exclusion (a roster reviewer must not share the primary's modelFamily) is a cross-entity invariant enforced by the Go test.
- Back-compat: `liveness` is optional; policies without it validate and behave unchanged.
- Add execution_policy_test.go (schema compile, accept/reject, back-compat, contract-value + same-family-exclusion checks) mirroring the spec-065 corpus_test.go precedent; reuses the existing santhosh-tekuri/jsonschema/v6 dependency (no new deps).
- Add reviewer-liveness.example.json documenting T/N/roster with rationale.
- Document the contract in agent-instructions/README.md.

Related #MCP-3068
1 parent 2f0b8e7 commit 7becf67

4 files changed

Lines changed: 433 additions & 11 deletions

specs/064-glass-cockpit/agent-instructions/README.md

Lines changed: 4 additions & 1 deletion
@@ -13,7 +13,10 @@ These are the **canonical source** for the rewritten agent brains. They evolve t
 - **Gate 3 (pre-merge)** — agents open PRs, never merge; the human merges on GitHub (branch protection enforced).
 
 ## Behavioral contract
-The required behaviors (and their probe tests) are pinned in [`../contracts/agent-instructions-contract.md`](../contracts/agent-instructions-contract.md). The execution-policy JSON shape is in [`../contracts/execution-policy.schema.json`](../contracts/execution-policy.schema.json).
+The required behaviors (and their probe tests) are pinned in [`../contracts/agent-instructions-contract.md`](../contracts/agent-instructions-contract.md). The execution-policy JSON shape is in [`../contracts/execution-policy.schema.json`](../contracts/execution-policy.schema.json) (validated by `../contracts/execution_policy_test.go`).
+
+### Reviewer-Liveness Contract (FR-014a)
+A `review` stage MAY carry an optional `liveness` block describing **health-based, model-diverse reviewer failover**: an ordered fallback `roster`, an SLA (`slaMinutes`, default 120 = 2h), and a per-head re-trigger budget (`maxFallbacksPerHead` default 1, `maxHops` default 2 → `escalateTo` the CEO). It codifies the MCP-3066 shell backstop (`~/.mcpproxy-gatekeeper/bin/ensure-pr-gates.sh`: `classify_stall` + `route_fallback`), which **remains the source of truth**; the schema mirrors it. A *substantive* `request_changes` on the current head is a mandatory fence and is never routed around, so it is not a permitted `failoverStallModes` value. When `liveness` is omitted, a review stage runs a single reviewer with no failover (existing policies are unchanged). See the worked [`../contracts/reviewer-liveness.example.json`](../contracts/reviewer-liveness.example.json) for the contract values and their rationale.
 
 ## Roster mapping (live company `16edd8ed-…`)
 | Agent | adapterType | Instruction file | Activate for dry-run? |
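
The failover rules described in the Reviewer-Liveness Contract paragraph can be sketched as a small decision function. This is only an illustration: `Liveness`, `StallMode`, and `Decide` are hypothetical names, the order in which the budgets are checked is an assumption, and `ensure-pr-gates.sh` remains the source of truth for the real behavior.

```go
package main

import "fmt"

// StallMode names the classify_stall outcomes described in the contract.
type StallMode string

const (
	Silent         StallMode = "silent"           // mode1: no verdict past the SLA on a green head
	StaleCIPending StallMode = "stale_ci_pending" // mode2: request_changes only for CI that is now green
	Substantive    StallMode = "substantive"      // mode3: mandatory fence, never routed around
)

// Liveness is a hypothetical Go mirror of the `liveness` JSON block.
type Liveness struct {
	MaxFallbacksPerHead int
	MaxHops             int
	FailoverStallModes  []StallMode
	Roster              []string // fallback reviewer names, in order
}

// Decide returns "wait", "fallback:<name>", or "escalate".
// Assumption: the hop budget is checked before the per-head budget.
func Decide(l Liveness, mode StallMode, fallbacksOnHead, totalHops int) string {
	eligible := false
	for _, m := range l.FailoverStallModes {
		if m == mode {
			eligible = true
			break
		}
	}
	if !eligible {
		return "wait" // covers substantive request_changes
	}
	if totalHops >= l.MaxHops || totalHops >= len(l.Roster) {
		return "escalate" // hop budget or roster exhausted
	}
	if fallbacksOnHead >= l.MaxFallbacksPerHead {
		return "wait" // per-head budget spent until the PR head moves
	}
	return "fallback:" + l.Roster[totalHops]
}

func main() {
	l := Liveness{
		MaxFallbacksPerHead: 1,
		MaxHops:             2,
		FailoverStallModes:  []StallMode{Silent, StaleCIPending},
		Roster:              []string{"KimiReviewer", "GeminiCritic", "GLMReviewer"},
	}
	fmt.Println(Decide(l, Silent, 0, 0))         // fallback:KimiReviewer
	fmt.Println(Decide(l, Substantive, 0, 0))    // wait
	fmt.Println(Decide(l, Silent, 1, 1))         // wait
	fmt.Println(Decide(l, StaleCIPending, 0, 1)) // fallback:GeminiCritic
	fmt.Println(Decide(l, Silent, 0, 2))         // escalate
}
```

With the default values, the third roster entry is never reached in this sketch because `maxHops` (2) is exhausted first; whether the shell backstop behaves the same way is not stated in this commit.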

specs/064-glass-cockpit/contracts/execution-policy.schema.json

Lines changed: 97 additions & 10 deletions
@@ -2,9 +2,77 @@
   "$schema": "https://json-schema.org/draft/2020-12/schema",
   "$id": "https://mcpproxy.app/specs/064-glass-cockpit/execution-policy.schema.json",
   "title": "Glass Cockpit executionPolicy (attached to a Paperclip issue)",
-  "description": "The gate configuration the cockpit attaches to issues. Mirrors Paperclip's IssueExecutionStage model (stage types: review|approval). Gate 2 (per-spec design) and Gate 3 (pre-merge) are 'approval' stages with a user participant; the adversarial review (FR-011) is a 'review' stage with the Critic agent participant placed before the user gate.",
+  "description": "The gate configuration the cockpit attaches to issues. Mirrors Paperclip's IssueExecutionStage model (stage types: review|approval). Gate 2 (per-spec design) and Gate 3 (pre-merge) are 'approval' stages with a user participant; the adversarial review (FR-011) is a 'review' stage with the Critic agent participant placed before the user gate. A 'review' stage MAY additionally carry a 'liveness' block (the Reviewer-Liveness Contract, FR-014a) describing health-based, model-diverse reviewer failover; see $defs/reviewerLiveness.",
   "type": "object",
   "required": ["mode", "stages"],
+  "$defs": {
+    "participant": {
+      "type": "object",
+      "required": ["type"],
+      "properties": {
+        "type": { "type": "string", "enum": ["user", "agent"] },
+        "userId": { "type": "string", "description": "Required when type=user (the board)." },
+        "agentId": { "type": "string", "description": "Required when type=agent (e.g. the Gemini Critic)." },
+        "modelFamily": {
+          "type": "string",
+          "description": "Optional provider/model family of an agent participant (e.g. 'openai', 'moonshot', 'gemini', 'glm'). Used by the Reviewer-Liveness Contract to enforce the same-family exclusion: a fallback reviewer must not share the primary reviewer's family (FR-011 model diversity)."
+        }
+      }
+    },
+    "reviewerLiveness": {
+      "type": "object",
+      "description": "Reviewer-Liveness Contract — health-based, model-diverse reviewer failover for a 'review' stage (FR-014a). Codifies the MCP-3066 shell backstop (~/.mcpproxy-gatekeeper/bin/ensure-pr-gates.sh: classify_stall + route_fallback) as a first-class contract. The shipped scripts remain the source of truth; values here mirror them. When this block is omitted, the stage runs a single reviewer (participants[0]) with no failover, so pre-existing policies validate and behave unchanged.",
+      "required": ["roster"],
+      "additionalProperties": false,
+      "properties": {
+        "slaMinutes": {
+          "type": "integer",
+          "minimum": 1,
+          "default": 120,
+          "description": "T — minutes the primary reviewer may be silent on the CURRENT head before the stage is classed a silent stall (classify_stall mode1) and a fallback becomes eligible. Default 120 (2h), matching the shell backstop SLA."
+        },
+        "maxFallbacksPerHead": {
+          "type": "integer",
+          "minimum": 0,
+          "default": 1,
+          "description": "N — at most this many fallback hops are triggered per head SHA; the budget resets when the PR head moves. Default 1, mirroring the per-head marker state/fallback-<pr>-<sha>."
+        },
+        "maxHops": {
+          "type": "integer",
+          "minimum": 1,
+          "default": 2,
+          "description": "Cumulative fallback hops across the roster before escalating to escalateTo. Default 2, mirroring the hop ledger state/fallback-<pr>.hops -> escalate_ceo."
+        },
+        "escalateTo": {
+          "$ref": "#/$defs/participant",
+          "description": "Who receives the escalation issue once maxHops is exhausted (the shell backstop's escalate_ceo; normally the CEO agent)."
+        },
+        "failoverStallModes": {
+          "type": "array",
+          "uniqueItems": true,
+          "default": ["silent", "stale_ci_pending"],
+          "description": "Which classify_stall outcomes make a fallback eligible. 'silent' = no verdict past the SLA on a green head (mode1); 'stale_ci_pending' = a request_changes whose only reason is pending/not-green CI on a now-green head (mode2). 'substantive' is intentionally NOT a permitted value: a substantive request_changes on the current head is a mandatory fence (mode3) and is NEVER routed around.",
+          "items": { "type": "string", "enum": ["silent", "stale_ci_pending"] }
+        },
+        "roster": {
+          "type": "array",
+          "minItems": 1,
+          "description": "Ordered, model-diverse fallback chain tried in sequence when the primary reviewer (the stage's participants[0]) stalls. Reviewers sharing the primary's modelFamily MUST be excluded (the backstop drops the gpt-5.5 'Critic' because the primary CodexReviewer is also gpt-5.5). The same-family exclusion is a cross-entity invariant enforced by the contract test, not by this schema.",
+          "items": {
+            "type": "object",
+            "required": ["agentId", "modelFamily"],
+            "additionalProperties": false,
+            "properties": {
+              "agentId": { "type": "string", "description": "Paperclip agent id of the fallback reviewer." },
+              "name": { "type": "string", "description": "Human-readable reviewer name, e.g. 'KimiReviewer'." },
+              "model": { "type": "string", "description": "Concrete model id, e.g. 'kimi-k2', 'gemini-2.5', 'glm-4.7'." },
+              "modelFamily": { "type": "string", "description": "Provider/model family for the same-family exclusion (e.g. 'moonshot', 'gemini', 'glm')." }
+            }
+          }
+        }
+      }
+    }
+  },
   "properties": {
     "mode": {
       "type": "string",
@@ -29,15 +97,11 @@
           "participants": {
             "type": "array",
             "minItems": 1,
-            "items": {
-              "type": "object",
-              "required": ["type"],
-              "properties": {
-                "type": { "type": "string", "enum": ["user", "agent"] },
-                "userId": { "type": "string", "description": "Required when type=user (the board)." },
-                "agentId": { "type": "string", "description": "Required when type=agent (e.g. the Gemini Critic)." }
-              }
-            }
+            "items": { "$ref": "#/$defs/participant" }
+          },
+          "liveness": {
+            "$ref": "#/$defs/reviewerLiveness",
+            "description": "Optional Reviewer-Liveness Contract; meaningful only on a 'review' stage. See $defs/reviewerLiveness."
           }
         }
       }
@@ -56,6 +120,29 @@
       "stages": [
         { "type": "approval", "label": "Pre-merge", "participants": [ { "type": "user", "userId": "local-board" } ] }
       ]
+    },
+    {
+      "mode": "normal",
+      "stages": [
+        {
+          "type": "review",
+          "label": "Adversarial review with liveness failover",
+          "participants": [ { "type": "agent", "agentId": "5b94562c", "modelFamily": "openai" } ],
+          "liveness": {
+            "slaMinutes": 120,
+            "maxFallbacksPerHead": 1,
+            "maxHops": 2,
+            "escalateTo": { "type": "agent", "agentId": "2dbf9388" },
+            "failoverStallModes": ["silent", "stale_ci_pending"],
+            "roster": [
+              { "agentId": "fdaa1d4c", "name": "KimiReviewer", "model": "kimi-k2", "modelFamily": "moonshot" },
+              { "agentId": "d89dd1db", "name": "GeminiCritic", "model": "gemini-2.5", "modelFamily": "gemini" },
+              { "agentId": "caf2df03", "name": "GLMReviewer", "model": "glm-4.7", "modelFamily": "glm" }
+            ]
+          }
+        },
+        { "type": "approval", "label": "Pre-merge", "participants": [ { "type": "user", "userId": "local-board" } ] }
+      ]
     }
   ]
 }
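
The schema leaves the same-family exclusion to the contract test because it compares a roster entry against `participants[0]`, which a per-object schema rule cannot express directly. `execution_policy_test.go` is not shown in this view, so the following is not that test. It is a minimal standard-library sketch of the check, assuming exact string comparison of `modelFamily`; the type names and `SameFamilyViolations` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Only the fields needed for the check are decoded; everything else is ignored.
type rosterEntry struct {
	AgentID     string `json:"agentId"`
	ModelFamily string `json:"modelFamily"`
}

type stage struct {
	Type         string `json:"type"`
	Participants []struct {
		ModelFamily string `json:"modelFamily"`
	} `json:"participants"`
	Liveness *struct {
		Roster []rosterEntry `json:"roster"`
	} `json:"liveness"`
}

type policy struct {
	Stages []stage `json:"stages"`
}

// SameFamilyViolations returns the agentId of every roster reviewer whose
// modelFamily equals that of the stage's primary reviewer (participants[0]).
func SameFamilyViolations(raw []byte) ([]string, error) {
	var p policy
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, err
	}
	var bad []string
	for _, s := range p.Stages {
		if s.Type != "review" || s.Liveness == nil || len(s.Participants) == 0 {
			continue // the contract only applies to review stages with liveness
		}
		primary := s.Participants[0].ModelFamily
		if primary == "" {
			continue // modelFamily is optional on participants; nothing to compare
		}
		for _, r := range s.Liveness.Roster {
			if r.ModelFamily == primary {
				bad = append(bad, r.AgentID)
			}
		}
	}
	return bad, nil
}

func main() {
	// Trimmed copy of the example policy added to the schema in this commit.
	const example = `{
	  "mode": "normal",
	  "stages": [{
	    "type": "review",
	    "participants": [{ "type": "agent", "agentId": "5b94562c", "modelFamily": "openai" }],
	    "liveness": { "roster": [
	      { "agentId": "fdaa1d4c", "modelFamily": "moonshot" },
	      { "agentId": "d89dd1db", "modelFamily": "gemini" },
	      { "agentId": "caf2df03", "modelFamily": "glm" }
	    ]}
	  }]
	}`
	bad, err := SameFamilyViolations([]byte(example))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(bad)) // 0: no roster entry shares the primary's family
}
```

A policy whose roster included an `openai` reviewer alongside an `openai` primary would still pass schema validation, which is why a separate check of this kind is needed.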
