feat(mcp/plugin): enforce clarification gate before planning (#1423) by JeremyDev87 · Pull Request #1425 · JeremyDev87/codingbuddy

JeremyDev87 · 2026-04-07T09:02:28Z

Summary

MCP path: When clarificationNeeded=true, parse_mode now overrides the instructions field with a clarification-first directive that tells the AI to ask exactly one question and STOP before planning
Standalone path: Added evaluate_clarification_standalone() in mode_engine.py mirroring the MCP server's heuristics (override phrases, vague intent, tech references, prompt length) so standalone mode also enforces question-first behavior for PLAN/AUTO
User prompt hook: Passes the raw prompt to build_instructions() so the standalone gate can evaluate it
Tests: 7 new regression tests for MCP path + 12 new tests for standalone path — all passing (6171 + 60)

Test plan

yarn workspace codingbuddy exec tsc --noEmit — type-check passes
yarn workspace codingbuddy test -- --run — 244 files, 6171 tests pass
python3 -m pytest test_mode_engine.py -v — 60 tests pass
yarn prettier --write . — formatted
CI checks on PR

- Add clarification-first instruction override in parse_mode response when clarificationNeeded=true (MCP path) - Add standalone clarification gate in mode_engine.py mirroring MCP server heuristics for PLAN/AUTO modes (plugin standalone path) - Pass user prompt to ModeEngine.build_instructions() for evaluation - Add 7 regression tests in mode.handler.spec.ts verifying directive is applied/skipped correctly - Add 12 tests in test_mode_engine.py for standalone gate behavior Closes #1423

vercel · 2026-04-07T09:02:33Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
codingbuddy-landing	Ready	Preview, Comment	Apr 7, 2026 9:20am

JeremyDev87

Review: CHANGES_REQUESTED

CI Status: FAIL

9 e2e tests fail in test_user_prompt_submit_e2e.py — all directly caused by this PR's new clarification gate intercepting short/ambiguous PLAN/AUTO prompts that existing tests send.

Issues Found:

[critical] E2E tests not updated for new behavior: The 9 failing tests send prompts like PLAN design auth feature (19 chars after keyword strip, < MIN_PROMPT_LENGTH=20) or Korean/Japanese/Chinese equivalents. The clarification gate correctly identifies them as ambiguous, but the tests still assert # Mode: PLAN instead of the new 🔴 CLARIFICATION REQUIRED output. Fix: Either (a) update test prompts to be specific enough (add file paths or function refs so they pass the gate), or (b) add new e2e test cases asserting the clarification behavior and update existing prompts.

Affected tests:
- TestEnglishModeKeywords[PLAN design auth feature-PLAN]
- TestKoreanModeKeywords[계획 인증 기능 설계-PLAN], [자동 대시보드 구현-AUTO]
- TestJapaneseModeKeywords[計画認証機能の設計-PLAN], [自動ダッシュボード実装-AUTO]
- TestChineseModeKeywords[计划设计认证功能-PLAN], [自动实现仪表板-AUTO]
- TestContextInjection::test_context_contains_agent_name, ::test_case_insensitive_detection
[medium] Standalone path lacks questionBudget tracking: The MCP path includes budget in the directive (Remaining question budget: N, question_budget=N), but evaluate_clarification_standalone() has no budget parameter and its directive omits budget info. This creates a behavior gap between MCP and standalone modes, partially failing the spec requirement "questionBudget decrements across rounds."
[low] Overly broad _TECH_REFERENCE_PATTERNS snake_case regex: Pattern r"\b[a-z]+_[a-z][a-z0-9_]*\b" matches any snake_case token (e.g., of_the, ice_cream). This could cause false negatives where non-tech prompts skip clarification. Consider narrowing to min 2 segments or requiring programming context.

Code Quality: Good

No any types, no unused imports, no dead code
Immutable approach in MCP handler (spread override, no mutation)
Clean JSDoc/docstrings, good type annotations
Consistent directive format between MCP and standalone (minus budget)
7 MCP + 12 standalone unit tests well-structured

Spec Compliance (#1423):

Criterion	Status
Ambiguous PLAN/AUTO → clarification-first	✅
Asks exactly one highest-value question	✅
questionBudget decrements across rounds	⚠️ MCP only, standalone missing
"just do it" bypass works	✅
Tests cover MCP and standalone paths	⚠️ Unit tests pass, e2e broken

Recommendation: REQUEST_CHANGES

Fix the 9 failing e2e tests (critical, blocking CI). Address the standalone questionBudget gap (medium). The core implementation logic is solid.

#1423) - Update 9 e2e test prompts to include file references (login.ts, auth.ts, dashboard.tsx) so they bypass the clarification gate instead of triggering it - Tighten snake_case tech reference regex to require 3+ char segments, reducing false positives on short English words (e.g. "of_the")

JeremyDev87

Review: APPROVE (Cycle 2/3)

CI Status: PASS

e2e-plugin-docker ✅
e2e-plugin-hooks 3.11 ✅
e2e-plugin-hooks 3.12 ✅
install-dependencies: pending (non-blocking)

Previous Issues Resolution:

CRITICAL (e2e tests): ✅ Fixed — 9 test prompts updated with file references (login.ts, auth.ts, dashboard.tsx) to bypass clarification gate
MEDIUM (questionBudget): ✅ Already implemented — evaluate_clarification_standalone() accepts question_budget param with DEFAULT_QUESTION_BUDGET = 3
LOW (regex): ✅ Fixed — snake_case pattern tightened to {3,} char segments

Issues Found: 0 critical, 0 high

Recommendation: APPROVE

JeremyDev87

Review: APPROVE (cycle 2/3)

CI Status: PASS

All 28 checks pass including e2e-plugin-docker, e2e-plugin-hooks (3.11, 3.12).

Previous Issues — All Resolved:

[critical] E2E tests: ✅ All test prompts now include file references (login.ts, auth.ts, dashboard.tsx) so the clarification gate correctly identifies them as specific. 9 previously-failing tests now pass.
[medium] Standalone questionBudget: ✅ evaluate_clarification_standalone() now accepts question_budget parameter with DEFAULT_QUESTION_BUDGET=3. Budget decrements (remaining = budget - 1) and is included in the directive. build_instructions() also passes question_budget through. MCP/standalone parity achieved.
[low] snake_case regex: ✅ Pattern updated from [a-z]+_[a-z] to [a-z]{3,}_[a-z]{3,} — requires 3+ character segments on both sides, eliminating false positives like of_the.

Spec Compliance (#1423):

Criterion	Status
Ambiguous PLAN/AUTO → clarification-first	✅
Asks exactly one highest-value question	✅
questionBudget decrements across rounds	✅ (MCP + standalone)
"just do it" bypass works	✅
Tests cover MCP and standalone paths	✅ (7 MCP + 12 standalone unit + e2e all pass)

Recommendation: APPROVE

All 3 review issues resolved. CI green. Spec fully satisfied.

JeremyDev87 added feat mcp-server apps/mcp-server plugin packages/claude-code-plugin P1 Priority 1: First Impression labels Apr 7, 2026

vercel Bot deployed to Preview April 7, 2026 09:04 View deployment

JeremyDev87 commented Apr 7, 2026

View reviewed changes

vercel Bot deployed to Preview April 7, 2026 09:20 View deployment

JeremyDev87 commented Apr 7, 2026

View reviewed changes

JeremyDev87 self-assigned this Apr 7, 2026

JeremyDev87 merged commit 7006180 into master Apr 7, 2026
29 checks passed

JeremyDev87 deleted the feat/clarification-gate-enforcement-1423 branch April 7, 2026 09:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(mcp/plugin): enforce clarification gate before planning (#1423)#1425

feat(mcp/plugin): enforce clarification gate before planning (#1423)#1425
JeremyDev87 merged 2 commits into
masterfrom
feat/clarification-gate-enforcement-1423

JeremyDev87 commented Apr 7, 2026

Uh oh!

vercel Bot commented Apr 7, 2026 •

edited

Loading

Uh oh!

JeremyDev87 left a comment

Uh oh!

JeremyDev87 left a comment

Uh oh!

JeremyDev87 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

JeremyDev87 commented Apr 7, 2026

Summary

Test plan

Uh oh!

vercel Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JeremyDev87 left a comment

Choose a reason for hiding this comment

Review: CHANGES_REQUESTED

CI Status: FAIL

Issues Found:

Code Quality: Good

Spec Compliance (#1423):

Recommendation: REQUEST_CHANGES

Uh oh!

JeremyDev87 left a comment

Choose a reason for hiding this comment

Review: APPROVE (Cycle 2/3)

CI Status: PASS

Previous Issues Resolution:

Issues Found: 0 critical, 0 high

Recommendation: APPROVE

Uh oh!

JeremyDev87 left a comment

Choose a reason for hiding this comment

Review: APPROVE (cycle 2/3)

CI Status: PASS

Previous Issues — All Resolved:

Spec Compliance (#1423):

Recommendation: APPROVE

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vercel Bot commented Apr 7, 2026 •

edited

Loading