Context
PDD capability policy checks need a bounded agentic reviewer that can follow local import/wrapper chains to collect evidence for ambiguous effects that deterministic checks miss (e.g. notificationClient.sendRefundNotice(...) resolving to resend.emails.send(...)).
Task
Create a new PDD module agentic_reviewer by adding pdd/prompts/agentic_reviewer_python.prompt.
The module must:
- Accept as input:
  - A parsed contract/effect IR (from pdd/prompts/contract_ir_python.prompt schema: {modal, action, resource} list)
  - A list of target artifact file paths to inspect
  - Bounds config: max_files, max_follow_depth, max_search_results, max_runtime_seconds
- Implement bounded evidence collection:
  - Read target artifact: extract imports, calls, env reads, file writes, network calls, logging calls
  - Inspect dependency manifests: package.json, requirements.txt, pyproject.toml, go.mod
  - For ambiguous local symbols: follow local import/definition chains up to max_follow_depth (default: 2)
  - Optionally inspect co-located tests/docs that mention the same symbol
  - Hard stop at max_files files inspected
- For ambiguous effects, call a constrained LLM classifier with structured input:
  {"contract_effects": [...], "target": "typescript", "observed_evidence": [...], "deterministic_findings": []}
  The LLM must return a strict JSON findings list. On invalid JSON or LLM unavailability, emit no agentic findings (graceful fallback; do not propagate the exception).
- Return normalized findings using the same schema as deterministic findings, plus extra fields:
  - source: "agentic_reviewer"
  - severity: "warning" (default)
  - confidence: float
  - effect: {action, resource}
  - message: str
  - evidence: [{file, line, excerpt}]
  - agent_steps: [...]
- When evidence is insufficient, return a finding with judgment "unknown" and a confidence below 0.5:
  {"judgment": "unknown", "confidence": <value below 0.5>, "message": "Insufficient evidence..."}
- Mode: read-only, no network access, no code execution.
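The bounded following described in the list above could be sketched roughly as below. This is only an illustration under assumptions, not the module's actual design: the Bounds dataclass, the collect_files helper, the pre-resolved local_imports map, and the max_files default of 20 are all hypothetical. It also assumes that target files sit at depth 0, so max_follow_depth=2 allows two hops away from a target.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """Traversal limits; field names follow the bounds config in this issue."""
    max_files: int = 20        # assumed default, not specified in the issue
    max_follow_depth: int = 2  # default stated in the issue


def collect_files(targets, local_imports, bounds):
    """Breadth-first walk over local imports, bounded by depth and file count.

    local_imports is a hypothetical pre-resolved map {file: [local files it
    imports]}; a real reviewer would build it by parsing files as it reads them.
    Returns (files in inspection order, agent_steps log).
    """
    targets = list(dict.fromkeys(targets))  # drop duplicates, keep order
    visited, steps = [], []
    seen = set(targets)
    queue = deque((path, 0) for path in targets)
    while queue:
        if len(visited) >= bounds.max_files:
            steps.append("stop: max_files reached")  # hard stop
            break
        path, depth = queue.popleft()
        visited.append(path)
        steps.append(f"read {path} (depth {depth})")
        if depth >= bounds.max_follow_depth:
            continue  # symbols beyond the depth limit are not traversed
        for dep in local_imports.get(path, []):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, depth + 1))
    return visited, steps
```

With a chain a.ts → b.ts → c.ts → d.ts and max_follow_depth=2, this sketch reads a.ts, b.ts and c.ts and never opens d.ts. The steps log is one possible source for the agent_steps field.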
Reference files
pdd/prompts/contract_ir_python.prompt — input IR schema
pdd/prompts/evidence_manifest_python.prompt — output schema conventions
context/agentic_checkup_example.py — agentic module patterns
context/agentic_checkup_orchestrator_example.py — orchestrator patterns
Acceptance Criteria
- Module agentic_reviewer is importable and callable with contract IR + file paths + bounds
- Given a TypeScript file that calls notificationClient.sendRefundNotice(email) and a wrapper file clients/notificationClient.ts that contains resend.emails.send(...), and a contract with MUST_NOT send email, the reviewer returns a warning finding with evidence from both files
- Following stops at max_follow_depth=2; symbols beyond that depth are not traversed
- Invalid LLM JSON (e.g. truncated or non-JSON response) causes graceful fallback: no findings emitted, no exception raised
- When evidence is ambiguous or missing, returns judgment: unknown with confidence < 0.5
- All bounds (max_files, max_follow_depth, max_search_results) are respected
- Unit tests cover: wrapper-following (email hidden behind local client), unknown dependency classified as possible violation, insufficient evidence returning unknown, max-depth cutoff, invalid LLM JSON fallback
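The fallback and unknown-judgment criteria above might be exercised with something like the following sketch. The helper names parse_llm_findings and unknown_finding, the REQUIRED_KEYS set, and the choice to reject the whole reply when any entry is malformed are assumptions made for illustration. The fixed confidence of 0.25 is an arbitrary value below 0.5.

```python
import json

# Hypothetical minimum set of keys each LLM finding must carry.
REQUIRED_KEYS = {"effect", "message", "evidence", "confidence"}


def parse_llm_findings(raw):
    """Parse the classifier reply; return [] for any malformed response."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return []
    if not isinstance(data, list):
        return []
    findings = []
    for item in data:
        if not isinstance(item, dict) or not REQUIRED_KEYS.issubset(item):
            return []  # assumed strictness: one bad entry voids the reply
        conf = item["confidence"]
        if isinstance(conf, bool) or not isinstance(conf, (int, float)):
            return []
        findings.append({
            "source": "agentic_reviewer",
            "severity": item.get("severity", "warning"),
            "confidence": float(conf),
            "effect": item["effect"],
            "message": item["message"],
            "evidence": item["evidence"],
        })
    return findings


def unknown_finding(message="Insufficient evidence to classify this effect"):
    """Result for the insufficient-evidence case (confidence below 0.5)."""
    return {
        "source": "agentic_reviewer",
        "judgment": "unknown",
        "confidence": 0.25,  # arbitrary illustrative value below 0.5
        "message": message,
    }
```

LLM unavailability (timeouts, connection errors) would need a similar try/except around the classifier call itself, which this sketch leaves out.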
PDD Command Hint
change, sync
Split Contract
Command sequence: change → sync
Allowed write set:
pdd/prompts/agentic_reviewer_python.prompt
Acceptance criteria:
- New pdd/prompts/agentic_reviewer_python.prompt is created and agentic_reviewer module is generated
- Given a target file calling a local wrapper that resolves to resend.emails.send(), reviewer returns a warning finding with evidence from both files when contract forbids sending email
- Symbol following stops at max_follow_depth; beyond that depth is not traversed
- Invalid LLM JSON triggers graceful fallback: no findings emitted and no exception raised
- Insufficient evidence returns judgment: unknown with confidence < 0.5
- All bounds (max_files, max_follow_depth, max_search_results) are enforced
- Tests cover wrapper-following, unknown dependency, insufficient evidence, max-depth cutoff, invalid LLM JSON fallback
Independently mergeable: True
Scope rule: Do not expand beyond this contract or implement sibling sub-issue work. If the contract is insufficient, report the gap instead.
PDD Command Hint: This is a new feature. Use change → sync (modify prompts, then generate and validate code).
Parent: #1371