yg/phase a6 policy query tests #91

Open
couragehong wants to merge 2 commits into feat/go-migration from yg/phase-a6-policy-query-tests

@couragehong couragehong commented Apr 28, 2026

Warning: This PR is created by Claude.

Summary

  • What: Adds internal/policy/query_test.go and internal/domain/query_test.go. Tests only; no production code changes.
  • Why: policy/query.go (a bit-identical port of the Python query_processor.py) and the helpers in domain/query.go (IsReliable/IsPhase/ExtractPayloadText) had zero tests. If they drift silently, recall accuracy degrades without any signal.
  • Scope: Ports 10 of the 11 tests in agents/tests/test_retriever.py::TestQueryProcessor and adds invariants the Python suite did not guard. The remaining classes are all N/A, either because no Go equivalent exists or because of D21/D28.

Mapping against Python

| Python class | Handling | Reason |
|---|---|---|
| TestQueryProcessor (11) | 10 ported | Only test_format_for_search is N/A: Go has no format_for_search |
| TestQueryProcessorMultilingual (8) | N/A | D21: agent pre-translates; Go ParsedQuery has no Language field |
| TestSearcher / TestExpandPhaseChains | N/A | The Go equivalent of searcher is service/recall.go (Phase 5, many TODOs) |
| TestSynthesizer* (3 classes) | N/A | D28: agent-delegated; Go has no synthesis at all |
| TestDisplayTextLocalization / FormatAnswer | N/A | Display/synthesis area, outside recall |

Hardening points (not guarded by Python)

  • Covers all 8 QueryIntent values and all 5 TimeScope values (Python covers 4 of 8 and 2 of 5)
  • Rule iteration order precedence (intent + time)
  • Exact cap values (len == 5/10/15, not <=)
  • AND assertions (stricter than Python's OR)
  • GENERAL intent fall-through contract
  • Stop-word filter + length filter + dedup count of exactly 1
  • 1999 negative boundary (enforces the 20\d{2} pattern)
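
To illustrate why an exact-cap assertion is stricter than an upper-bound one, here is a minimal sketch. `capStrings`, `checkUpperBound`, and `checkExact` are hypothetical names invented for this example and are not helpers from internal/policy; only the cap value 5 comes from the description above.

```go
package main

import "fmt"

// maxExpansions mirrors the expansion cap of 5 named in the PR description.
const maxExpansions = 5

// capStrings is a hypothetical stand-in for a truncation step.
// It is NOT the actual helper used in internal/policy.
func capStrings(items []string, limit int) []string {
	if len(items) > limit {
		return items[:limit]
	}
	return items
}

// checkUpperBound is the weaker assertion: it still passes if the cap shrinks.
func checkUpperBound(got []string) bool { return len(got) <= maxExpansions }

// checkExact is the stricter assertion, meaningful only for overflowing input.
func checkExact(got []string) bool { return len(got) == maxExpansions }

func main() {
	overflow := []string{"a", "b", "c", "d", "e", "f", "g"}

	good := capStrings(overflow, 5)
	regressed := capStrings(overflow, 3) // simulated regression to a smaller cap

	fmt.Println(checkUpperBound(good), checkExact(good))
	fmt.Println(checkUpperBound(regressed), checkExact(regressed))
}
```

With input that overflows the cap, the upper-bound check accepts the simulated regression, while the exact check rejects it.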

Validation

  • go test -count=1 -race ./internal/policy/ ./internal/domain/ → ok (~1.4s, 27 functions / 68 subtests)
  • gofmt -l / go vet → clean
  • Pre-review applied: all HIGH findings from 3 sub-agents (adversarial / Python parity / Go style) are addressed (OR→AND, exact-cap, rule precedence, slices.Contains, avoiding package doc collisions, table-driven IsPhase)

Cross-Agent Invariants

Only two test files are added. scripts/bootstrap-mcp.sh, the agent scripts, the Codex/Claude/Gemini/OpenAI instruction files, SKILL.md, commands/rune/*.toml, and AGENT_INTEGRATION.md are all unmodified, so every invariant is trivially satisfied.

Notes for Reviewers

  • Risk: TestSearchHit_IsPhase/group_id_pointer_to_empty_string locks in the current behavior (IsPhase=true based solely on the pointer being present). If the team decides that an empty string means "not a phase", the test must be updated together with production IsPhase. Marked with TODO(yg) in the code.
  • BC: none (test-only).
  • Follow-up: policy/{rerank,novelty,record_builder,payload_text,pii}.go, domain/schema.go (comparison against Python golden output recommended), lifecycle/shutdown.go::ZeroizeDEK (needs a guard against dead-store optimization).
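
The pointer-to-empty-string risk can be sketched as follows. This is an illustration under assumptions: `searchHit` is a trimmed-down hypothetical type, the field name `GroupID` is inferred from the subtest name, and `isPhaseStrict` is only one possible alternative, not a decided design.

```go
package main

import "fmt"

// searchHit is a hypothetical, trimmed-down stand-in for domain.SearchHit.
// The field name GroupID is an assumption based on the subtest name.
type searchHit struct {
	GroupID *string
}

// isPhase looks only at pointer presence, which is the behavior
// the PR says the test currently locks in.
func (h searchHit) isPhase() bool { return h.GroupID != nil }

// isPhaseStrict is one possible alternative if an empty group ID
// were defined to mean "not a phase".
func (h searchHit) isPhaseStrict() bool {
	return h.GroupID != nil && *h.GroupID != ""
}

func main() {
	empty := ""
	hit := searchHit{GroupID: &empty}
	fmt.Println(hit.isPhase(), hit.isPhaseStrict()) // true false
}
```

The two predicates differ only for a non-nil pointer to an empty string, which is why a team decision on that case would change both the test and the production code.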

🤖 Generated with Claude Code

couragehong and others added 2 commits April 28, 2026 10:34
…yloadText

Port the Python TestQueryProcessor suite (agents/tests/test_retriever.py)
to Go and extend coverage to every QueryIntent / TimeScope plus invariants
the Python suite did not gate explicitly.

internal/policy/query_test.go (black-box, package policy_test):
  - All 7 explicit intents + GENERAL fallback (was 5 of 8 in Python).
  - All 4 explicit time scopes + ALL_TIME default (was 2 of 5 in Python).
  - Entity extraction: quoted strings + capitalized + tech regex.
  - Keyword extraction: stop-word filter, length filter, dedup.
  - Query expansion: original included, intent variants, 5-cap.
  - Cleaning: lowercase, whitespace collapse, trim, trailing punctuation
    (question mark preserved; period/bang/comma/semicolon/colon stripped
    including consecutive runs).
  - Field caps: entities <= 10, keywords <= 15, expansions <= 5.
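
A minimal sketch of the cleaning rules listed above, for readers who want a concrete picture. `cleanQuerySketch` is a hypothetical function written for this note; the step order and the exact punctuation set are assumptions, and the real cleanQuery in policy/query.go may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanQuerySketch is a hypothetical approximation of the cleaning rules
// described in the commit message. It is not the production cleanQuery.
func cleanQuerySketch(q string) string {
	q = strings.ToLower(q)
	// strings.Fields splits on any run of whitespace (spaces, tabs, newlines)
	// and drops leading/trailing whitespace, so Join both collapses and trims.
	q = strings.Join(strings.Fields(q), " ")
	// Strip trailing runs of . ! , ; : while leaving '?' intact.
	q = strings.TrimRight(q, ".!,;:")
	// Removing punctuation can expose a trailing space ("hello ." -> "hello ").
	return strings.TrimSpace(q)
}

func main() {
	fmt.Printf("%q\n", cleanQuerySketch("  Why   did we\tchoose\nPostgres?  "))
	fmt.Printf("%q\n", cleanQuerySketch("Use Redis!!.,"))
}
```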

internal/domain/query_test.go:
  - SearchHit.IsReliable: supported/partially_supported true; everything
    else false (incl. empty string and unrelated values).
  - SearchHit.IsPhase: pointer presence drives the predicate (documented
    via a subtest, since pointer-to-empty-string also returns true).
  - ExtractPayloadText: standard path, missing payload, payload not a map,
    text not a string, text missing, nil metadata, payload nil.
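
The ExtractPayloadText cases above can be pictured with the sketch below. Everything here is assumed for illustration: the signature, the map keys "payload" and "text", and the empty-string result on failure are guesses based on the case names, not the actual domain/query.go code.

```go
package main

import "fmt"

// extractPayloadTextSketch is a hypothetical approximation: it reads
// metadata["payload"]["text"] and returns "" on any mismatch.
// The key names and the empty-string fallback are assumptions.
func extractPayloadTextSketch(metadata map[string]any) string {
	// Indexing a nil map is safe in Go and yields the zero value (nil here),
	// so the comma-ok assertion covers nil metadata, a missing payload,
	// a nil payload, and a payload that is not a map.
	payload, ok := metadata["payload"].(map[string]any)
	if !ok {
		return ""
	}
	// Covers both a missing text key and a non-string text value.
	text, ok := payload["text"].(string)
	if !ok {
		return ""
	}
	return text
}

func main() {
	ok := map[string]any{"payload": map[string]any{"text": "hello"}}
	fmt.Printf("%q\n", extractPayloadTextSketch(ok))
	fmt.Printf("%q\n", extractPayloadTextSketch(nil))
}
```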

Multilingual cases skipped: Go ParsedQuery has no Language field
(D21 - agent pre-translates), so the regex/LLM split that exists in
the Python QueryProcessor does not exist here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address findings from a 3-agent pre-PR review (adversarial / Python parity /
Go style). Same scope as the previous commit; assertions made stricter so
plausible regressions are caught.

policy:
  - Replace OR with AND in TestParse_KeywordsRetainsContentTerms and
    TestParse_EntitiesCapitalizedAndTechPatterns. Half-broken extraction
    no longer slips through.
  - Cap tests assert exact equality (len == 5/10/15) on overflowing input,
    not just upper bound. A cap regression to a smaller value now fails.
  - Add TestParse_IntentRulePrecedence and TestParse_TimeRulePrecedence to
    gate the rule iteration order (silent reordering would be undetected
    by the per-intent / per-scope tests alone).
  - Add TestParse_EntityOrCleanedSurfacesToken — preserves the disjunction
    Python asserts in test_retriever.py:25 (entity verbatim OR token in
    cleaned), which the original Go split lost.
  - Add TestParse_ExpansionGeneralIntentNoPrefixes — gates the
    fall-through contract for intents not in generateExpansions's switch.
  - Tighten TestParse_ExpansionContainsCleanedAndIntentVariants: must use
    a query that actually matches DecisionRationale, and the assertion
    requires a prefix-form intent variant in the output.
  - Rename TestParse_KeywordsDeduplicated to
    TestParse_KeywordsDedupAfterLowercase — clarifies that the dedup is
    of already-lowercase strings (cleanQuery happens upstream).
  - Extend TestParse_TimeScope: this_week / this_quarter / past 3 months /
    past year / 1999 negative boundary (must NOT match 20\d{2}).
  - Extend TestParse_Cleaned: tab and newline whitespace collapse cases.
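
As a sketch of what the precedence and boundary tests guard, consider first-match-wins rules held in an ordered slice. The rule list, the scope labels, and `detectTimeScopeSketch` are hypothetical; only the `20\d{2}` pattern and the all-time default come from the text above. An ordered slice is assumed here because Go map iteration order is unspecified.

```go
package main

import (
	"fmt"
	"regexp"
)

// timeRule pairs a scope label with the pattern that triggers it.
type timeRule struct {
	scope   string
	pattern *regexp.Regexp
}

// timeRules is a hypothetical rule list. The labels are illustrative;
// 20\d{2} is the year pattern quoted in the commit message.
var timeRules = []timeRule{
	{"this_week", regexp.MustCompile(`this week`)},
	{"year_match", regexp.MustCompile(`20\d{2}`)},
}

// detectTimeScopeSketch returns the scope of the first matching rule,
// so reordering timeRules changes the result for overlapping queries.
func detectTimeScopeSketch(q string) string {
	for _, r := range timeRules {
		if r.pattern.MatchString(q) {
			return r.scope
		}
	}
	return "all_time" // default when no rule matches
}

func main() {
	fmt.Println(detectTimeScopeSketch("what changed this week on the 2024 roadmap"))
	fmt.Println(detectTimeScopeSketch("decisions made in 2024"))
	fmt.Println(detectTimeScopeSketch("decisions made in 1999"))
}
```

A query that matches two rules exposes the ordering, and "1999" falls through to the default because it does not contain "20" followed by two digits.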

domain:
  - TestSearchHit_IsReliable: comment cites the 4 canonical Certainty
    values from Python schema and justifies the 6 chosen test cases.
  - TestSearchHit_IsPhase: convert to table-driven for consistency, mark
    pointer-to-empty-string with a TODO referencing the team decision.
  - TestExtractPayloadText: doc clarifies Go is intentionally narrower
    than Python (D32 — drops the 3 fallback paths), not a literal port.

Style:
  - Replace hand-rolled `contains` helper with `slices.Contains` (Go 1.25).
  - Move file-level header comments below `package` lines to avoid
    colliding with package doc comments.
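
For illustration, the OR-to-AND tightening can be expressed with `slices.Contains`, which has been in the Go standard library since Go 1.21. `hasAny` and `hasAll` are hypothetical helpers written for this note; the actual tests may assert membership inline.

```go
package main

import (
	"fmt"
	"slices"
)

// hasAny is the weaker OR-style check: one surviving term is enough.
func hasAny(items []string, want ...string) bool {
	for _, w := range want {
		if slices.Contains(items, w) {
			return true
		}
	}
	return false
}

// hasAll is the stricter AND-style check: every expected term must be present.
func hasAll(items []string, want ...string) bool {
	for _, w := range want {
		if !slices.Contains(items, w) {
			return false
		}
	}
	return true
}

func main() {
	// Simulated half-broken extraction: "latency" was dropped.
	keywords := []string{"postgres", "migration"}

	fmt.Println(hasAny(keywords, "postgres", "latency")) // true
	fmt.Println(hasAll(keywords, "postgres", "latency")) // false
}
```

The OR form passes on the half-broken result, while the AND form fails it.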

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@couragehong couragehong changed the title from "Yg/phase a6 policy query tests" to "yg/phase a6 policy query tests" Apr 28, 2026
@couragehong couragehong requested a review from esifea April 28, 2026 22:20