feat(polish): faithfulness judge integration (Phase 3 of polish-fact-check)#36

Merged: silversurfer562 merged 1 commit into main from feat/polish-fact-check-phase-3 on May 16, 2026

@silversurfer562

Summary

Stacked on #35 (Phase 2) — please review and merge that first.

  • Ship Phase 3 of the polish-fact-check spec: wrap attune_rag.eval.faithfulness.FaithfulnessJudge as a post-polish step. Every polished file gets scored against its source files; below-threshold scores append a ## Faithfulness review block listing the unsupported claims and the judge's reasoning.
  • New src/attune_author/faithfulness/ package: judge wrapper (sync-bridged via asyncio.run), FaithfulnessConfig dataclass, char-based cost estimator that gates the call before paying for it, and a format_review_block / apply_review_block pair matching the Phase 1 soft-fail shape.
  • Wired into generator.apply_polish_results after the Phase 1 fact-check pass. Per-process cost telemetry counters reset at run_maintenance start and summary-logged at end.
  • Off by default; opt in with enabled = true (the config default is enabled = false). The judge is best-effort: a missing attune-rag[claude] extra, a missing ANTHROPIC_API_KEY, an over-budget cost estimate, or a transient API failure each degrades silently rather than blocking the polish. CI lanes can opt into strict mode with block_polish_on_unavailable = true.
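
To illustrate the sync bridge and the best-effort contract described above, here is a minimal sketch. It assumes a hypothetical `JudgeResult` shape and takes the async scorer as a parameter instead of importing `FaithfulnessJudge`, whose real signature is not shown in this PR. `run_judge_best_effort` and `JudgeUnavailable` are illustrative names, not the package's API.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Callable, Coroutine, Optional


@dataclass
class JudgeResult:
    # Hypothetical result shape; the real judge may return other fields.
    score: float
    unsupported_claims: list[str]
    reasoning: str


class JudgeUnavailable(RuntimeError):
    """Raised only in strict mode, when the judge could not produce a score."""


def run_judge_best_effort(
    score_async: Callable[[str, list[str]], Coroutine[Any, Any, JudgeResult]],
    polished: str,
    sources: list[str],
    *,
    strict: bool = False,
) -> Optional[JudgeResult]:
    """Call an async scorer from synchronous code; return None on failure."""
    try:
        # asyncio.run starts a fresh event loop, so this must not be called
        # from code that is already running inside an event loop.
        return asyncio.run(score_async(polished, sources))
    except Exception as exc:
        if strict:
            # Mirrors block_polish_on_unavailable = true: surface the failure.
            raise JudgeUnavailable(str(exc)) from exc
        # Default behaviour: degrade silently and let the polish proceed.
        return None
```

Keeping the bridge at a single call site means the rest of the pipeline never sees a coroutine, at the cost of one event loop start per judged file.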

Motivation

Phase 1 catches mistakes after generation (regex/AST matching for known-bad shapes — invented imports, unknown CLI flags, broken links, wrong counts). Phase 2 prevents them during generation (ground-truth context injection). Phase 3 covers the middle ground: an LLM-as-judge that catches shapes Phase 1 can't pattern-match (e.g. the missing-security-callout for 0.0.0.0 from attune-ai PR #351's fixture) and that Phase 2 can't fully prevent.

Configuration

[tool.attune-author.fact-check.faithfulness]
enabled = true                # opt-in; defaults to false
threshold = 0.95              # below this triggers a review block
budget_per_file_usd = 0.10    # skip if estimated cost exceeds cap
model = "claude-sonnet-4-6"   # haiku is ~1/3 the cost
block_polish_on_unavailable = false  # set true in strict CI lanes

End-of-run telemetry log:

INFO Faithfulness judge: 11 call(s), 2 skipped, estimated cost $0.0537

Test plan

  • Unit tests: 30 new tests under tests/unit/faithfulness/ covering the budget gate, every skip path (disabled, missing file, no sources, over-budget, missing extra, missing key, transient failure), the happy path with mocked FaithfulnessJudge, the below-threshold review-block append, env-var override (ATTUNE_AUTHOR_FAITHFULNESS=off), telemetry counters + reset, and unexpected-exception swallowing
  • Full attune-author suite: 926 passed, 37 pre-existing skips
  • ruff check clean across all touched files
  • Live-LLM acceptance (gated): threshold calibration against the ops-dashboard pre-fix / post-fix fixture (tasks 3.3 + 3.4) — scheduled to land alongside Phase 2's live-LLM acceptance run so a single real-API cycle covers both phases' open items. The placeholder default threshold=0.95 is documented as pre-calibration in decisions.md.

Notes for review

  • uv.lock is intentionally excluded — pre-existing drift (lockfile recorded attune-author 0.6.1), separate cleanup PR.
  • Why opt-in: the judge makes real Anthropic API calls. We shouldn't bill users for it silently on first run after install. Phase 1 (no API calls) defaults on; Phase 3 (real API calls) defaults off. Easy to flip per-project via pyproject.toml.
  • Why asyncio.run: the existing polish pipeline is synchronous and FaithfulnessJudge.score is async. The bridge is at one call site; expanding to native-async polish is a future change if drift surfaces.
  • Why a char-based cost estimate rather than a real tokenizer: the budget gate only needs precision on the order of its $0.10 cap, and a chars/4 token estimate is accurate to about 20%, which is well inside that tolerance. Documented in decisions.md.
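
The chars/4 reasoning can be made concrete with a small sketch. The prices below are illustrative placeholders chosen only to reflect the roughly 3:1 Sonnet-to-Haiku ratio mentioned in this PR, not a real price table; the `"hypothetical-haiku"` key and `within_budget` helper are assumptions, and output tokens are ignored for simplicity.

```python
CHARS_PER_TOKEN = 4  # heuristic named in the review notes

# Placeholder prices in USD per million input tokens (assumed, not official).
PRICE_PER_MTOK_USD = {
    "claude-sonnet-4-6": 3.00,
    "hypothetical-haiku": 1.00,
}


def estimate_cost_usd(polished: str, sources: list[str], model: str) -> float:
    """Estimate input cost from character counts alone (no tokenizer)."""
    total_chars = len(polished) + sum(len(s) for s in sources)
    tokens = total_chars / CHARS_PER_TOKEN
    return tokens / 1_000_000 * PRICE_PER_MTOK_USD[model]


def within_budget(estimate_usd: float, budget_usd: float = 0.10) -> bool:
    """The judge is skipped only when the estimate exceeds the cap."""
    return estimate_usd <= budget_usd
```

Under these placeholder prices, 40,000 characters of input come to about 10,000 tokens, or $0.03. A 20% error puts the true figure between $0.024 and $0.036, still far from the $0.10 cap, which is why a tokenizer adds little here.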

🤖 Generated with Claude Code

…check)

Phase 3 adds an opt-in faithfulness judge that scores polished
documents against the source files they were generated from.
When the score falls below the configured threshold, a
`## Faithfulness review` block listing the unsupported claims and
the judge's reasoning is appended to the polished file.

Pairs with Phase 1 (AST fact-check after generation) and Phase 2
(ground-truth context injection before generation) to give three
distinct interventions against polish-pass hallucinations.

New package: src/attune_author/faithfulness/
  - judge wrapper around attune_rag.eval.faithfulness.FaithfulnessJudge
    via asyncio.run (the polish pipeline is sync)
  - FaithfulnessConfig: threshold (0.95 pre-calibration default),
    budget_per_file_usd ($0.10), model (Sonnet 4.6 — Haiku is ~1/3
    the cost), block_polish_on_unavailable for strict CI
  - estimate_cost_usd: chars-to-tokens heuristic + per-model price
    lookup, used as the budget gate so we never invoke the judge
    when the estimated cost exceeds the cap
  - format_review_block + apply_review_block: soft-fail formatter
    matching the Phase 1 ## Unresolved references shape
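
A minimal sketch of what the formatter pair could look like, assuming the block is plain markdown appended to the end of the file. The exact wording and layout of the real block are not shown in this commit, so everything inside the block body here is illustrative.

```python
def format_review_block(
    score: float,
    threshold: float,
    unsupported_claims: list[str],
    reasoning: str,
) -> str:
    """Render a soft-fail review section (layout is an assumption)."""
    lines = [
        "## Faithfulness review",
        "",
        f"Score {score:.2f} is below the threshold {threshold:.2f}.",
        "",
        "Unsupported claims:",
    ]
    lines += [f"- {claim}" for claim in unsupported_claims]
    lines += ["", f"Judge reasoning: {reasoning}"]
    return "\n".join(lines) + "\n"


def apply_review_block(
    polished: str,
    score: float,
    threshold: float,
    unsupported_claims: list[str],
    reasoning: str,
) -> str:
    """Append the block only when the score falls below the threshold."""
    if score >= threshold:
        return polished
    block = format_review_block(score, threshold, unsupported_claims, reasoning)
    return polished.rstrip("\n") + "\n\n" + block
```

Because the block is appended and nothing is rewritten, a below-threshold score leaves the polished content intact and simply flags it for a human reader.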

Wiring:
  - generator._run_faithfulness_judge runs after _run_fact_check
    on every polished file. Reads optional pyproject config.
  - generator._faithfulness_telemetry / reset_faithfulness_telemetry:
    per-process counters; run_maintenance resets them at start and
    logs INFO summary at end (calls, skipped, total estimated $).
  - ATTUNE_AUTHOR_FAITHFULNESS=off env override.
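
The counters and the env override could be sketched roughly as follows. Only `reset_faithfulness_telemetry`, the `ATTUNE_AUTHOR_FAITHFULNESS=off` variable, and the summary-line format come from this PR; the other helper names, the logger name, and the case-insensitive match on "off" are assumptions.

```python
import logging
import os

logger = logging.getLogger("attune_author.faithfulness")  # hypothetical name

# Per-process counters; module-level state is reset at the start of each run.
_telemetry = {"calls": 0, "skipped": 0, "estimated_cost_usd": 0.0}


def reset_faithfulness_telemetry() -> None:
    _telemetry.update(calls=0, skipped=0, estimated_cost_usd=0.0)


def record_call(estimated_cost_usd: float) -> None:
    _telemetry["calls"] += 1
    _telemetry["estimated_cost_usd"] += estimated_cost_usd


def record_skip() -> None:
    _telemetry["skipped"] += 1


def judge_enabled(config_enabled: bool) -> bool:
    """The env override wins over the pyproject setting."""
    override = os.environ.get("ATTUNE_AUTHOR_FAITHFULNESS", "")
    if override.strip().lower() == "off":
        return False
    return config_enabled


def summary_line() -> str:
    return (
        f"Faithfulness judge: {_telemetry['calls']} call(s), "
        f"{_telemetry['skipped']} skipped, "
        f"estimated cost ${_telemetry['estimated_cost_usd']:.4f}"
    )


def log_summary() -> None:
    logger.info(summary_line())
```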

Best-effort contract: missing attune-rag[claude], missing
ANTHROPIC_API_KEY, over-budget estimates, transient API failures
all degrade silently. The judge never blocks the polish.

Tests: 30 new tests under tests/unit/faithfulness/ covering the
budget gate, every skip path, the happy path, the
below-threshold review-block append, env-var override, telemetry
reset, and unexpected-exception swallowing. Full suite: 926
passed, 37 pre-existing skips.

Threshold calibration (tasks 3.3, 3.4) deferred to the same
real-LLM run that closes Phase 2's live-LLM acceptance gate —
folding two API cycles into one. Default of 0.95 is documented as
pre-calibration in decisions.md.

Spec: docs/specs/polish-fact-check/

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@silversurfer562 force-pushed the feat/polish-fact-check-phase-3 branch from 42f3aa6 to 3a9da04 on May 16, 2026 07:59
@silversurfer562 merged commit d3c5f3e into main on May 16, 2026
12 checks passed
@silversurfer562 deleted the feat/polish-fact-check-phase-3 branch on May 16, 2026 07:59
silversurfer562 added a commit that referenced this pull request May 22, 2026
#38)

Follow-up to the polish-fact-check Phase 3 PR (#36) that landed
the faithfulness judge. This commit adds the local helpers used
to smoke-test the judge end-to-end without paying for a full
attune-author regenerate cycle.

Two changes:

1. scripts/test_faithfulness.py
   Tiny harness that picks the smallest feature in features.yaml
   (fewest source files) and regenerates its 3 core kinds
   (concept/task/reference) with telemetry-reset + summary-print
   + review-block detection. Cost on Haiku 4.5 ≈ $0.03 per run.
   Refuses to run without ANTHROPIC_API_KEY in env.

   Usage:
       uv run python scripts/test_faithfulness.py
       uv run python scripts/test_faithfulness.py <feature_name>

2. pyproject.toml: enable the judge for attune-author's own
   self-dogfood help regeneration. With this, anyone running
   `attune-author regenerate` against attune-author with auth
   available exercises the Phase 3 pipeline end-to-end —
   matches the pattern attune-author already uses for the
   polish pass (live API calls during dogfood).

   Configured on Haiku 4.5 (~1/3 the cost of Sonnet 4.6) since
   the threshold + budget defaults are pre-calibration and a
   cheaper model is fine for the initial measurement pass.
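
The harness's two guard steps (pick the smallest feature, refuse to run without a key) might look something like the sketch below. The schema of features.yaml is not shown in this commit, so the example assumes the file has already been parsed into a mapping from feature name to its list of source files. Both function names and the alphabetical tie-break are hypothetical.

```python
from typing import Mapping


def pick_smallest_feature(features: Mapping[str, list[str]]) -> str:
    """Return the feature with the fewest source files.

    Ties are broken alphabetically so repeated runs pick the same feature.
    """
    if not features:
        raise ValueError("no features defined")
    return min(sorted(features), key=lambda name: len(features[name]))


def require_api_key(env: Mapping[str, str]) -> None:
    """Exit early instead of failing partway through a paid run."""
    if not env.get("ANTHROPIC_API_KEY"):
        raise SystemExit("ANTHROPIC_API_KEY is not set; refusing to run")
```

Choosing the feature with the fewest source files keeps the judge's input small, which is what holds the smoke-test cost down.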

Why ship this as a follow-up rather than baking it into #36:
the Phase 3 PR was scoped to the implementation + tests; the
spec defines `enabled=false` as the global default (opt-in,
since the judge makes real API calls). Flipping it on for the
attune-author repo itself is a per-project preference, not a
default change. Same shape as how attune-author has always
defaulted polish-strict on for its own dogfood while the
package default is lenient.

Post-Phase-0 of the sibling-subscription-auth spec
(attune-ai PR #406), this also exercises the subscription-
routing path for Claude Code users — though the wire-up to
actually use claude_agent_sdk lives in Phase 1, which hasn't
shipped yet, so today this still requires ANTHROPIC_API_KEY.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>