docs(genetics): PROBE-CHAODA-1000G spike RUN — single-method LFD AUC 0.624, below bar (ndarray #219)#505
…0.624, below bar

The 1-day spike substitute named in genetics-probes-v1.md has been run
against the shipped kernel (ndarray PR #219). Records the measurement as a
CONJECTURE->FINDING update and relocates the probe's prerequisite.

MEASURED (ndarray #219, deterministic synthetic 5-lane Gaussian mixture):
  mean cluster score 0.6749, mean outlier score 0.7500
  ROC-AUC (Mann-Whitney U) = 0.6240

FINDING: the shipped single-method leaf-LFD anomaly_scores reaches only AUC
0.624 on the easiest possible case (clean clusters + far outliers), well
below the >= 0.85 bar. Mechanical cause: leaf LFD measures intra-leaf
geometry complexity, not inter-leaf isolation, so isolated outliers and
dense-cluster points share a score band under global min-max normalisation.
The CHAODA ensemble of Ishaq et al. 2021 combines several graph-based
signals; only the LFD signal is shipped.

Changes:
- genetics-probes-v1.md: add the prominent FINDING block under
  PROBE-CHAODA-1000G; mark the spike DONE; add a Status column + blocker
  note to the Sequencing table; extend DAG-honesty with the new prerequisite
  deliverable D-GEN-CHAODA-ENSEMBLE (port the multi-method ensemble, re-run
  the spike, gate at AUC >= 0.85 BEFORE genomic fixtures).
- GENETIC_RESEARCH_VIA_STACK.md S 1.4: add a MEASURED CAVEAT block. The
  pattern match is real but the single shipped signal is NOT a working
  novel-variant detector as-is; caveated, not retracted.

Evidence-before-build: the gap is caught before any
adapter-genetics-experimental (D-GEN-1..4) spend.

https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
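For readers unfamiliar with the metric in the MEASURED block, the following is a minimal sketch of ROC-AUC computed through the Mann-Whitney U statistic. It is not the code from ndarray #219; the function name `roc_auc_mann_whitney` and the toy scores are hypothetical and chosen only for illustration.

```rust
/// ROC-AUC via the Mann-Whitney U statistic: the fraction of
/// (outlier, cluster) pairs in which the outlier scores higher,
/// with ties counted as one half.
/// Hypothetical helper, not the ndarray #219 implementation.
fn roc_auc_mann_whitney(outlier_scores: &[f64], cluster_scores: &[f64]) -> f64 {
    assert!(!outlier_scores.is_empty() && !cluster_scores.is_empty());
    let mut u = 0.0;
    for &o in outlier_scores {
        for &c in cluster_scores {
            if o > c {
                u += 1.0;
            } else if o == c {
                u += 0.5;
            }
        }
    }
    u / (outlier_scores.len() as f64 * cluster_scores.len() as f64)
}

fn main() {
    // Toy scores for illustration only; these are not the measured data.
    let outliers = [0.9, 0.6, 0.4];
    let cluster = [0.5, 0.3, 0.7, 0.2];
    let auc = roc_auc_mann_whitney(&outliers, &cluster);
    println!("AUC = {auc:.4}");
}
```

An AUC of 0.5 means outlier and cluster scores are indistinguishable, so the measured 0.624 is only slightly better than chance, which is why it sits so far under the 0.85 bar.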
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 98db536c99
> **D-GEN-CHAODA-ENSEMBLE (new, prerequisite to PROBE-CHAODA-1000G):** port
> the multi-method CHAODA anomaly ensemble into `ndarray::hpc::clam`
> alongside the existing leaf-LFD `anomaly_scores`. Re-run the ndarray #219
> spike; gate at AUC ≥ 0.85 on the synthetic mixture *before* genomic
Update the probe to use the ensemble
This new prerequisite frames the work as adding the ensemble alongside the existing `anomaly_scores`, while the PROBE-CHAODA-1000G run step above still tells runners to compute `anomaly_scores(held_out_bytes, vec_len=5)`, the exact single-LFD path now documented at AUC 0.624. If the plan is followed literally, then even after D-GEN-CHAODA-ENSEMBLE is completed the genomic-fixture probe will keep measuring the known-bad signal instead of the ported ensemble. The deliverable should therefore also update the probe/run API that runners are expected to call.
Confirmed real — fixed in 4ac7e35.
You're right: D-GEN-CHAODA-ENSEMBLE framed the work as adding the ensemble alongside anomaly_scores, while PROBE-CHAODA-1000G Step 3 still told runners to call anomaly_scores(held_out_bytes, vec_len=5) — the exact single-LFD path the FINDING measures at AUC 0.624. Followed literally, the genomic probe would re-measure the known-bad signal.
Changes:
- Step 3 now computes the gated score via the ported multi-method ensemble, and runs single-method `anomaly_scores` only as the known-bad baseline column the ensemble must beat. AUC is computed for both (ensemble = gated; single-LFD ≈ 0.62 = regression floor).
- Pass condition gates the ensemble AUC ≥ 0.85; the single-LFD baseline is recorded, not gated.
- Fail modes add: ensemble AUC ≈ single-LFD baseline ⇒ port added no lift, escalate before any genomic-fixture spend.
- D-GEN-CHAODA-ENSEMBLE now specifies a new scoring entry point (`ensemble_anomaly_scores`, name TBD), keeps single-method `anomaly_scores` unchanged as the documented baseline/regression (the ndarray #219 `auc < 0.85` tripwire stays green on it), and makes wiring Step 3 to the new entry point an explicit part of the deliverable.
Commit: 4ac7e35.
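The revised pass/fail rules above can be sketched as a small decision function. This is an illustration only, not code from the repository: `GateOutcome`, `gate`, and especially `NO_LIFT_MARGIN` are hypothetical, since the thread says "ensemble AUC ≈ single-LFD baseline" without giving a numeric tolerance.

```rust
/// Hypothetical outcome type for the revised Step 3 gate.
#[derive(Debug, PartialEq)]
enum GateOutcome {
    /// Ensemble AUC meets the bar.
    Pass,
    /// Ensemble beats the baseline but is still under the bar.
    BelowBar,
    /// Ensemble is roughly at the single-LFD baseline: the port added no lift.
    NoLift,
}

/// The bar named in the PR.
const AUC_BAR: f64 = 0.85;
/// ASSUMPTION: the tolerance for "approximately equal" is not stated in the
/// thread; 0.02 is a placeholder.
const NO_LIFT_MARGIN: f64 = 0.02;

/// Only the ensemble AUC is gated; the single-LFD baseline is recorded and
/// used solely to detect the "no lift" fail mode.
fn gate(ensemble_auc: f64, baseline_auc: f64) -> GateOutcome {
    if ensemble_auc >= AUC_BAR {
        GateOutcome::Pass
    } else if ensemble_auc - baseline_auc <= NO_LIFT_MARGIN {
        GateOutcome::NoLift
    } else {
        GateOutcome::BelowBar
    }
}

fn main() {
    // 0.624 is the measured single-LFD baseline; the ensemble values are made up.
    for ensemble_auc in [0.91, 0.75, 0.63] {
        println!("{ensemble_auc}: {:?}", gate(ensemble_auc, 0.624));
    }
}
```

Checking the bar first keeps the baseline purely informational on a passing run, matching "recorded, not gated".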
…le, not the known-bad single-LFD path
Codex caught a real inconsistency: D-GEN-CHAODA-ENSEMBLE framed the work
as adding the ensemble "alongside" anomaly_scores, while PROBE-CHAODA-1000G
Step 3 still instructed runners to call anomaly_scores(held_out_bytes,
vec_len=5) — the exact single-LFD path the FINDING documents at AUC 0.624.
Followed literally, the genomic-fixture probe would re-measure the known-bad
signal instead of the ported ensemble.
Fixes:
- PROBE-CHAODA-1000G Step 3: run the ported multi-method ensemble for the
gated score; run single-method anomaly_scores only as the known-bad
baseline column the ensemble must beat. Compute AUC for BOTH (ensemble =
gated, single-LFD ~ 0.62 = regression floor).
- Pass condition: gate the ENSEMBLE AUC >= 0.85; the single-LFD baseline is
recorded, not gated. Per-quartile separation rephrased to top vs bottom
ensemble-score quartile.
- Fail modes: add "ensemble AUC ~ single-LFD baseline => port added no lift,
escalate before genomic-fixture spend."
- D-GEN-CHAODA-ENSEMBLE: specify a NEW scoring entry point
(ensemble_anomaly_scores, name TBD), keep single-method anomaly_scores
unchanged as baseline/regression (the ndarray #219 auc<0.85 tripwire stays
green on it), and make wiring PROBE-CHAODA-1000G Step 3 to the new entry
point an explicit part of the deliverable.
https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
Summary
The 1-day spike substitute named in `.claude/plans/genetics-probes-v1.md` (merged in #503) has been run against the shipped kernel; see AdaWorldAPI/ndarray PR #219. This PR records the measurement as a CONJECTURE→FINDING update and relocates `PROBE-CHAODA-1000G`'s prerequisite.
Measured (ndarray #219, deterministic synthetic 5-lane Gaussian mixture)
- mean cluster score: 0.6749
- mean outlier score: 0.7500
- ROC-AUC (Mann-Whitney U): 0.6240
FINDING
The shipped single-method leaf-LFD `anomaly_scores` reaches only AUC 0.624 on the easiest possible case (clean clusters + deliberately far outliers), well below the ≥ 0.85 bar a novelty detector needs. Mechanical cause: leaf `LFD = log₂(|B(c,r)|/|B(c,r/2)|)` measures intra-leaf geometry complexity, not inter-leaf isolation, so isolated outliers and dense-cluster points share a score band under global min-max normalisation. The CHAODA ensemble of Ishaq et al. 2021 combines several graph-based signals (relative/component cardinality, graph neighbourhood, random-walk stationary distribution, vertex degree); only the LFD signal is shipped.
This does not retract `PROBE-CHAODA-1000G`; it relocates the prerequisite. Porting the multi-method CHAODA ensemble is now the true P0 work item (new candidate deliverable D-GEN-CHAODA-ENSEMBLE), ahead of the genomic-fixture pipeline. The §1.4 "unsupervised novel-variant detection" hand-off claim is caveated, not retracted: the pattern match is sound; the single shipped signal is not yet sufficient.
Changes
- `.claude/plans/genetics-probes-v1.md`: `PROBE-CHAODA-1000G`
- `docs/GENETIC_RESEARCH_VIA_STACK.md` §1.4: a MEASURED CAVEAT block (the shipped composition is not a working novel-variant detector as-is).
Why this is the right outcome
This is the evidence-before-build payoff the probe-spec discipline exists for: the gap is caught before any `adapter-genetics-experimental` (D-GEN-1..4) spend, and before the §1.4 claim reaches an external genomics audience uncaveated.
Cross-refs
- `.claude/plans/genetics-probes-v1.md` (merged in #503, "docs(genetics): probe spec v1 + Salient cite-rot fix"): the probe being substituted for.
- `ndarray/src/hpc/clam.rs:1493-1567`: the `anomaly_scores` under test.
https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
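To make the mechanical cause in the FINDING concrete, here is a minimal sketch of the leaf LFD formula followed by global min-max normalisation. It is not the code in `ndarray/src/hpc/clam.rs`; the helper names `leaf_lfd` and `min_max`, and the toy distances, are assumptions made for illustration. The point is that the score is computed purely from distances inside a leaf, so how far the leaf lies from everything else never enters.

```rust
/// LFD = log2(|B(c, r)| / |B(c, r/2)|), computed from the distances of a
/// leaf's points to its center c. Hypothetical helper, not the shipped kernel.
fn leaf_lfd(dists_to_center: &[f64], radius: f64) -> f64 {
    let full = dists_to_center.iter().filter(|&&d| d <= radius).count();
    let half = dists_to_center.iter().filter(|&&d| d <= radius / 2.0).count();
    // Guard against an empty half-ball so the ratio stays finite.
    (full as f64 / half.max(1) as f64).log2()
}

/// Global min-max normalisation of raw scores into [0, 1].
fn min_max(xs: &[f64]) -> Vec<f64> {
    let lo = xs.iter().cloned().fold(f64::INFINITY, f64::min);
    let hi = xs.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    if hi == lo {
        return vec![0.0; xs.len()];
    }
    xs.iter().map(|x| (x - lo) / (hi - lo)).collect()
}

fn main() {
    // Toy in-leaf distances (illustrative only).
    let dense_leaf = [0.0, 0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 1.0];
    // A leaf far from every cluster but with the same internal geometry:
    // its in-leaf distances, and therefore its LFD, are identical.
    let isolated_leaf = dense_leaf;
    let singleton_leaf = [0.0];

    let raw = [
        leaf_lfd(&dense_leaf, 1.0),
        leaf_lfd(&isolated_leaf, 1.0),
        leaf_lfd(&singleton_leaf, 0.0),
    ];
    println!("raw = {raw:?}, normalised = {:?}", min_max(&raw));
}
```

Because `leaf_lfd` takes no argument describing the leaf's position relative to other leaves, an isolated leaf and a dense-cluster leaf with similar internal shape land in the same band after normalisation, which is consistent with the overlap the spike measured.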