Skip to content

feat(query): gate LIP semantic rerank on !MixedModels#208

Merged
SimplyLiz merged 1 commit intomainfrom
lip-mixed-models-gate
Apr 15, 2026
Merged

feat(query): gate LIP semantic rerank on !MixedModels#208
SimplyLiz merged 1 commit intomainfrom
lip-mixed-models-gate

Conversation

@SimplyLiz
Copy link
Copy Markdown
Owner

Summary

  • When the LIP index contains vectors from more than one embedding model, cosine similarity between those vectors is mathematically meaningless — the vector spaces are different bases. CKB previously ignored this and silently ranked on garbage.
  • Adds Engine.lipSemanticAvailable() — a 60 s TTL-cached probe of lip.IndexStatus() — and gates both semantic call sites in searchSymbols (SemanticSearchWithLIP fallback at symbols.go:519, RerankWithLIP at :646) behind it. Mixed-model state → pure lexical fallback.
  • Surfaces the state via a new lip_mixed_models DegradationWarning (capability 70%) so users learn why semantic ranking is inactive instead of quietly getting worse results.

No change to the internal/lip client, the LIP daemon protocol, or the rerank function signatures — the gate is entirely on the Engine side.

Test plan

  • go test ./internal/query/... -run 'TestLipSemanticAvailable|TestGetDegradationWarnings' -v — 6 new tests covering healthy, mixed, daemon-down, TTL-cache, warning-emitted, and no-warning-before-first-probe.
  • go test ./internal/query/... — full package suite still passes.
  • go build ./... clean.
  • gofmt clean.
  • Manual: run ckb doctor against a real LIP daemon — confirm MixedModels reporting unchanged.
  • Manual: exercise searchSymbols with single-model index (rerank fires) vs. induced mixed-model index (rerank skipped, warning present in response metadata).

🤖 Generated with Claude Code

When the LIP index contains vectors from more than one embedding model
(e.g. during a partial re-index after a model upgrade), cosine similarity
across those vector spaces is mathematically meaningless. The MixedModels
flag was already surfaced in `ckb doctor` but no query path consumed it —
so RerankWithLIP and SemanticSearchWithLIP silently ranked on garbage.

Adds a cached health check on Engine (60 s TTL) that short-circuits both
semantic call sites when the daemon reports MixedModels, falling back to
pure lexical ranking. Surfaces the state via a `lip_mixed_models`
DegradationWarning so users learn why semantic ranking is inactive.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🟢 Change Impact Analysis

Metric Value
Risk Level LOW 🟢
Files Changed 5
Symbols Changed 4
Directly Affected 0
Transitively Affected 0

Blast Radius: 0 modules, 0 files, 0 unique callers

📝 Changed Symbols (4)
Symbol File Type Confidence
internal/query/degradation.go internal/query/degradation.go modified 30%
internal/query/engine.go internal/query/engine.go modified 30%
internal/query/lip_health.go internal/query/lip_health.go added 30%
internal/query/symbols.go internal/query/symbols.go modified 30%

Recommendations

  • ℹ️ coverage: 4 symbols have low mapping confidence. Index may be stale.
    • Action: Run 'ckb index' to refresh the SCIP index

Generated by CKB

@github-actions
Copy link
Copy Markdown

NFR Tests ✅ 39 unchanged

Comparing PR against main branch (dynamic baseline).

Regressions: 0 ✅

Thresholds: WARN ≥ +5% • FAIL ≥ +10%

All scenarios
Scenario Change Actual (B) Base (B) Time
analyzeChange / large +0.0% 193,169 193,169 723µs
analyzeChange / medium +0.0% 38,575 38,575 184µs
analyzeChange / small +0.0% 4,046 4,046 50µs
analyzeChange / xlarge +0.0% 387,417 387,417 1.508677ms
analyzeImpact / large +0.0% 17,966 17,966 112µs
analyzeImpact / small +0.0% 1,924 1,924 19µs
batchGet / large +0.0% 11,789 11,789 84µs
batchGet / small +0.0% 4,733 4,733 27µs
batchSearch / large +0.0% 90,816 90,816 334µs
batchSearch / medium +0.0% 18,036 18,036 192µs
batchSearch / small +0.0% 3,379 3,379 23µs
explore / large +0.0% 94,262 94,262 591µs
explore / small +0.0% 4,253 4,253 34µs
findReferences / large +0.0% 445,943 445,943 3.949561ms
findReferences / medium +0.0% 44,123 44,123 518µs
findReferences / small +0.0% 4,395 4,395 68µs
getAffectedTests / large +0.0% 7,521 7,521 50µs
getAffectedTests / medium +0.0% 3,110 3,110 29µs
getAffectedTests / small +0.0% 903 903 16µs
getAffectedTests / xlarge +0.0% 14,870 14,870 76µs
getArchitecture / large +0.0% 6,690 6,690 60µs
getArchitecture / small +0.0% 960 960 14µs
getCallGraph / deep +0.0% 15,238 15,238 89µs
getCallGraph / shallow +0.0% 887 887 17µs
getHotspots / large +0.0% 16,748 16,748 151µs
getHotspots / small +0.0% 886 886 36µs
listEntrypoints / large +0.0% 23,798 23,798 142µs
listEntrypoints / small +0.0% 4,795 4,795 47µs
prepareChange / large +0.0% 16,194 16,194 138µs
prepareChange / small +0.0% 2,483 2,483 32µs
searchSymbols / large +0.0% 90,246 90,246 945µs
searchSymbols / medium +0.0% 17,766 17,766 271µs
searchSymbols / small +0.0% 3,588 3,588 40µs
summarizeDiff / large +0.0% 19,939 19,939 327µs
summarizeDiff / small +0.0% 2,133 2,133 36µs
traceUsage / large +0.0% 7,728 7,728 87µs
traceUsage / small +0.0% 725 725 13µs
understand / large +0.0% 460,608 460,608 2.600332ms
understand / small +0.0% 5,555 5,555 41µs

* = new scenario, compared against static baseline

@github-actions
Copy link
Copy Markdown

CKB Analysis

Risk Files +219 -3 Modules

🎯 4 changed → 0 affected · 🔥 1 hotspot · 📊 1 complex · 💣 1 blast · 📚 197 stale

Risk factors: Touches 1 hotspot(s)

👥 Suggested: @lisa.welsch1985@gmail.com (100%), @talantyyr@gmail.com (40%), @lisa@tastehub.io (40%)

Metric Value
Impact Analysis 4 symbols → 0 affected 🟢
Doc Coverage 6.598984771573605% ⚠️
Complexity 1 violations ⚠️
Coupling 0 gaps
Blast Radius 0 modules, 0 files
Index indexed (0s) 🆕
🎯 Change Impact Analysis · 🟢 LOW · 4 changed → 0 affected
Metric Value
Symbols Changed 4
Directly Affected 0
Transitively Affected 0
Modules in Blast Radius 0
Files in Blast Radius 0

Symbols changed in this PR:

Recommendations:

  • ℹ️ 4 symbols have low mapping confidence. Index may be stale.
    • Action: Run 'ckb index' to refresh the SCIP index
💣 Blast radius · 0 symbols · 1 tests · 0 consumers

Tests that may break:

  • internal/query/lip_health_test.go
🔥 Hotspots · 1 volatile files
File Churn Score
internal/query/symbols.go 8.89
📊 Complexity · 1 violations
File Cyclomatic Cognitive
internal/query/symbols.go ⚠️ 85 ⚠️ 226
💡 Quick wins · 10 suggestions
📚 Stale docs · 197 broken references

Generated by CKB · Run details

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 93.54839% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/query/symbols.go 0.0% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@          Coverage Diff          @@
##            main    #208   +/-   ##
=====================================
  Coverage   43.0%   43.0%           
=====================================
  Files        525     526    +1     
  Lines      81073   81101   +28     
=====================================
+ Hits       34896   34935   +39     
+ Misses     43709   43702    -7     
+ Partials    2468    2464    -4     
Flag Coverage Δ
unit 43.0% <93.5%> (+<0.1%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

📢 Thoughts on this report? Let us know!

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Copy link
Copy Markdown

CKB review failed to generate output.

@SimplyLiz SimplyLiz merged commit 3e1a659 into main Apr 15, 2026
17 of 18 checks passed
@SimplyLiz SimplyLiz deleted the lip-mixed-models-gate branch April 15, 2026 10:53
SimplyLiz added a commit that referenced this pull request Apr 15, 2026
CKB used to re-probe IndexStatus on a 60 s TTL every time the rerank
path checked whether the LIP index was mixed-models. The LIP daemon has
pushed IndexChanged frames to all active sessions since v1.5.0, so the
polling was pure debt.

New internal/lip/subscribe.go opens a long-lived connection, pings
IndexStatus every 3 s to flush the daemon's broadcast channel (the
session loop drains queued pushes only after writing a response), reads
every frame in a loop, and routes index_changed and index_status by
type tag. Reconnects with exponential backoff to 30 s on daemon drop.

Engine owns one subscriber, started in NewEngine and cancelled in
Close. The cached availability/mixed flags are now written by the
subscriber, not the query path — lipSemanticAvailable is lock-free
RLock+read, no RPC. Worst-case staleness for the rerank gate drops
from 60 s to ~3 s.

Tests rewritten: the fake daemon now serves multiple requests per
connection and tests wait for the first health frame before asserting,
plus a new TestLipSubscriber_ReusesSingleConnection verifies the hot
path issues zero requests.

Also: CHANGELOG entries for #208 and #209 which landed on develop
without one.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SimplyLiz added a commit that referenced this pull request Apr 15, 2026
* feat(lip): push-driven health via long-lived subscribe

CKB used to re-probe IndexStatus on a 60 s TTL every time the rerank
path checked whether the LIP index was mixed-models. The LIP daemon has
pushed IndexChanged frames to all active sessions since v1.5.0, so the
polling was pure debt.

New internal/lip/subscribe.go opens a long-lived connection, pings
IndexStatus every 3 s to flush the daemon's broadcast channel (the
session loop drains queued pushes only after writing a response), reads
every frame in a loop, and routes index_changed and index_status by
type tag. Reconnects with exponential backoff to 30 s on daemon drop.

Engine owns one subscriber, started in NewEngine and cancelled in
Close. The cached availability/mixed flags are now written by the
subscriber, not the query path — lipSemanticAvailable is lock-free
RLock+read, no RPC. Worst-case staleness for the rerank gate drops
from 60 s to ~3 s.

Tests rewritten: the fake daemon now serves multiple requests per
connection and tests wait for the first health frame before asserting,
plus a new TestLipSubscriber_ReusesSingleConnection verifies the hot
path issues zero requests.

Also: CHANGELOG entries for #208 and #209 which landed on develop
without one.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: drop stray cmd/ckb-bench/version_test.go

Had a package-level init() that printed cartographer version on import
under the cartographer build tag — not a test, and leaked output into
unrelated test runs when the tag was enabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant