test: stabilize flaky Poisson(10) niterations convergence test #98
Merged
The `Poisson(10)` niterations convergence sub-test used a fixed 700 Monte-Carlo samples per run. With a large `λ` the Poisson variance is high, so the KL noise floor sat right at the stable-point `stdthreshold` (5e-2). The test is deterministic (`StableRNG(42)`-seeded) but sits on the pass/fail boundary, so small numerical perturbations from a different ExponentialFamily.jl / Julia / BLAS version in downstream CI flip it.

Raise `niterations_nsamples` from 700 to 4000 (matching the `nsamples_range` ceiling that already passes), lowering the tail rolling-std from ~0.09 to ~0.015, comfortably below the threshold.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
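To illustrate why the sample count is the relevant lever, here is a minimal, standard-library-only sketch. It is not the repository's test code. It assumes the sample mean as a stand-in for the KL estimate, uses `MersenneTwister` instead of `StableRNG`, and introduces the hypothetical helpers `rand_poisson` and `estimate_spread`. The numbers it prints are therefore not comparable to the rolling-std figures quoted above; only the trend (less spread with more samples) carries over.

```julia
using Random, Statistics

# One Poisson(λ) draw via Knuth's multiplication method (adequate for moderate λ).
function rand_poisson(rng::AbstractRNG, λ::Real)
    limit = exp(-λ)
    k, p = 0, 1.0
    while true
        p *= rand(rng)
        p <= limit && return k
        k += 1
    end
end

# Spread of a Monte-Carlo estimate (here: the sample mean, an assumed proxy)
# across `nruns` independent repetitions with `nsamples` draws each.
function estimate_spread(rng::AbstractRNG, λ::Real, nsamples::Int; nruns::Int = 200)
    estimates = [mean(rand_poisson(rng, λ) for _ in 1:nsamples) for _ in 1:nruns]
    return std(estimates)
end

rng = MersenneTwister(42)
spread_700 = estimate_spread(rng, 10, 700)
spread_4000 = estimate_spread(rng, 10, 4000)

# For the sample mean the theoretical standard error is sqrt(λ / n).
println("n = 700:  ", spread_700, "  (theory ", sqrt(10 / 700), ")")
println("n = 4000: ", spread_4000, "  (theory ", sqrt(10 / 4000), ")")
```

For this proxy the spread shrinks roughly as `1/sqrt(n)`, so an estimate sitting at a fixed threshold with 700 samples gains headroom at 4000.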
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #98 +/- ##
=======================================
Coverage 99.42% 99.42%
=======================================
Files 14 14
Lines 520 520
=======================================
Hits 517 517
Misses 3 3
Nimrais approved these changes on Jul 3, 2026
Problem

The `Poisson` projection convergence test fails intermittently in the ExponentialFamily.jl downstream CI job (e.g. run 28360311006). The failure is in the `Poisson(10)` case of `test/projection/projected_to_poisson_tests.jl`.

Root cause

`test_projection_convergence` runs a `niterations` convergence sub-test that sweeps `niterations = 100:50:1000` with a fixed `niterations_nsamples = 700` Monte-Carlo samples per run, then `test_convergence_to_stable_point` requires the tail's rolling std to fall below `stdthreshold = 5e-2`.

For `λ = 10` the Poisson variance is large, so 700 samples leave a KL noise floor right at the `0.05` threshold. The test is deterministic (everything is `StableRNG(42)`-seeded), so it does not randomly flip, but it sits on the pass/fail boundary, and small numerical drift from a different ExponentialFamily.jl / Julia / BLAS version in the downstream job tips it over. That is what makes it look flaky.

The companion `nsamples` sweep already runs up to 4000 samples and passes, confirming that more samples is the correct lever.

Fix

Raise `niterations_nsamples` from `700` to `4000` for the `Poisson(10)` testset only (matching the `nsamples_range` ceiling that already passes). No shared helpers, other distributions, or convergence criteria are touched.

Verification

Reproduced the exact CI failure locally (same `max div = 0.278`) and swept sample counts via `niterations_nsamples`.

All three Poisson test items pass locally (8/8). The `Poisson(10)` item is slower (~12s vs ~4s), an acceptable cost for the ~3×-below-threshold margin.

🤖 Generated with Claude Code
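For readers unfamiliar with the stable-point criterion mentioned under Root cause, the following is a rough sketch of what a "tail rolling std below `stdthreshold`" check could look like. The actual implementation of `test_convergence_to_stable_point` is not shown in this PR, so the names `rolling_std` and `tail_is_stable`, the window length, the tail length, and the sample data are all hypothetical.

```julia
using Statistics

# Standard deviation over each sliding window of length `window`.
rolling_std(xs::AbstractVector{<:Real}, window::Int) =
    [std(view(xs, i:i+window-1)) for i in 1:(length(xs) - window + 1)]

# Hypothetical criterion: the last `tail` rolling-std values must all be
# below `stdthreshold` (default mirrors the 5e-2 quoted in this PR).
function tail_is_stable(xs::AbstractVector{<:Real}; window::Int = 5,
                        tail::Int = 5, stdthreshold::Real = 5e-2)
    stds = rolling_std(xs, window)
    return all(<(stdthreshold), last(stds, tail))
end

# Made-up divergence traces: one that settles, one that keeps oscillating.
settled = [0.40, 0.22, 0.15, 0.12, 0.11, 0.10, 0.10, 0.11, 0.10, 0.10, 0.11, 0.10]
noisy   = [0.40, 0.22, 0.15, 0.30, 0.05, 0.28, 0.02, 0.25, 0.08, 0.27, 0.03, 0.26]

println("settled trace stable? ", tail_is_stable(settled))
println("noisy trace stable?   ", tail_is_stable(noisy))
```

A trace whose tail fluctuates with a standard deviation close to the threshold can pass or fail depending on tiny numerical differences, which matches the boundary behaviour described above.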