Instrument SimTable for rare ActionIsntFinished flake#99
Merged
Conversation
Action rejections in apply_action were silently swallowed via let _ =, masking a rare CI stall where run_street exhausts its iteration cap and bring_it_in() then fails with ActionIsntFinished. The eprintln!s here fire only on the smoking-gun paths (rejected primary action; rejected fallback for Bet/Raise; run_street exit with incomplete betting), so green runs stay silent. The next failure on CI will surface the offending PlayerAction, error, seat, street, and per-seat state under the failing test's captured stderr. No behavior change. Next move when CI fails again The diagnostic output will appear under the failing test's ---- ... stdout ---- block. With one captured failure, we can: 1. Identify which action the decider returned that the table rejected. 2. Identify which street and what seat state caused it. 3. Decide between the three fix paths from earlier (surface-and-force-fold, decider-rule fix, or seeded-RNG plumbing) with a concrete reproducer in hand instead of a probability guess.
The CI flakes in tests/exploitative_play_smoke.rs (ActionIsntFinished
stalls, no-rule-fires assertions) traced to RuleBasedDecider::decide
calling rand::rng() — the thread-local — at decider.rs:126, plus the
deck shuffle using the same thread-local at cards.rs:467. Across 1,000
hands of probabilistic play, rare RNG samples produced pathological
sequences that the assertions couldn't tolerate.
This change is additive only:
* BotDecider trait gains decide_seeded / on_new_hand_with_rng with
default impls that forward to the existing methods. All three
shipped deciders (RuleBasedDecider, JokerDecider,
ExploitativeDecider) override decide_seeded; JokerDecider also
gains new_with_rng and overrides on_new_hand_with_rng.
* SimTable gains an Option<SmallRng> field and with_seed / with_rng
builder methods. When set, run_hand uses shuffle_in_place_with on
the deck and dispatches deciders through decide_seeded; when
unset, the original thread-local path is preserved.
* Cards gains shuffle_in_place_with; shuffle_in_place delegates.
* The three smoke tests opt into deterministic runs via .with_seed(0).
Verified across 50 consecutive runs of the smoke suite (50 pass, 0
fail), 9,059 lib tests, all doc tests, all sibling integration tests
including bot-training. No downstream impact — existing BotDecider
impls (e.g. TelemetryDecider in examples) compile unchanged.
Optional follow-ups
1. Heuristic-calibration gap: the seed sweep showed BotProfile::loose_passive produces VPIP ~0.13 and PFR ~0.06 in heads-up play —
nit-like, not LP-like. Either RangeStrategy::loose_passive's range is too tight or RuleBasedDecider's preflop calling logic ignores profile
width. Worth a separate investigation; not blocking this PR.
2. Decide whether to keep the previous instrumentation: the eprintln! smoking-gun diagnostics in src/bot/sim.rs::apply_action/run_street
and the enriched panic message in tests/exploitative_play_smoke.rs::accumulated_lp_stats_cause_profile_deviation are now
belt-and-suspenders — they don't fire on the green path, only on rare regressions. Recommend keeping them as cheap insurance, but happy to
revert if you'd rather the PR be tight.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Action rejections in apply_action were silently swallowed via let _ =,
masking a rare CI stall where run_street exhausts its iteration cap and
bring_it_in() then fails with ActionIsntFinished. The eprintln!s here
fire only on the smoking-gun paths (rejected primary action; rejected
fallback for Bet/Raise; run_street exit with incomplete betting), so
green runs stay silent. The next failure on CI will surface the
offending PlayerAction, error, seat, street, and per-seat state under
the failing test's captured stderr. No behavior change.
Next move when CI fails again
The diagnostic output will appear under the failing test's ---- ... stdout ---- block. With one captured failure, we can: