Instrument SimTable for rare ActionIsntFinished flake by folkengine · Pull Request #99 · ImperialBower/pkcore

folkengine · 2026-05-16T16:40:53Z

Action rejections in apply_action were silently swallowed via let _ =,
masking a rare CI stall where run_street exhausts its iteration cap and
bring_it_in() then fails with ActionIsntFinished. The eprintln!s here
fire only on the smoking-gun paths (rejected primary action; rejected
fallback for Bet/Raise; run_street exit with incomplete betting), so
green runs stay silent. The next failure on CI will surface the
offending PlayerAction, error, seat, street, and per-seat state under
the failing test's captured stderr. No behavior change.

Next move when CI fails again

The diagnostic output will appear under the failing test's ---- ... stdout ---- block. With one captured failure, we can:

Identify which action the decider returned that the table rejected.
Identify which street and what seat state caused it.
Decide between the three fix paths from earlier (surface-and-force-fold, decider-rule fix, or seeded-RNG plumbing) with a concrete reproducer in hand instead of a probability guess.

Action rejections in apply_action were silently swallowed via let _ =, masking a rare CI stall where run_street exhausts its iteration cap and bring_it_in() then fails with ActionIsntFinished. The eprintln!s here fire only on the smoking-gun paths (rejected primary action; rejected fallback for Bet/Raise; run_street exit with incomplete betting), so green runs stay silent. The next failure on CI will surface the offending PlayerAction, error, seat, street, and per-seat state under the failing test's captured stderr. No behavior change. Next move when CI fails again The diagnostic output will appear under the failing test's ---- ... stdout ---- block. With one captured failure, we can: 1. Identify which action the decider returned that the table rejected. 2. Identify which street and what seat state caused it. 3. Decide between the three fix paths from earlier (surface-and-force-fold, decider-rule fix, or seeded-RNG plumbing) with a concrete reproducer in hand instead of a probability guess.

The CI flakes in tests/exploitative_play_smoke.rs (ActionIsntFinished stalls, no-rule-fires assertions) traced to RuleBasedDecider::decide calling rand::rng() — the thread-local — at decider.rs:126, plus the deck shuffle using the same thread-local at cards.rs:467. Across 1,000 hands of probabilistic play, rare RNG samples produced pathological sequences that the assertions couldn't tolerate. This change is additive only: * BotDecider trait gains decide_seeded / on_new_hand_with_rng with default impls that forward to the existing methods. All three shipped deciders (RuleBasedDecider, JokerDecider, ExploitativeDecider) override decide_seeded; JokerDecider also gains new_with_rng and overrides on_new_hand_with_rng. * SimTable gains an Option<SmallRng> field and with_seed / with_rng builder methods. When set, run_hand uses shuffle_in_place_with on the deck and dispatches deciders through decide_seeded; when unset, the original thread-local path is preserved. * Cards gains shuffle_in_place_with; shuffle_in_place delegates. * The three smoke tests opt into deterministic runs via .with_seed(0). Verified across 50 consecutive runs of the smoke suite (50 pass, 0 fail), 9,059 lib tests, all doc tests, all sibling integration tests including bot-training. No downstream impact — existing BotDecider impls (e.g. TelemetryDecider in examples) compile unchanged. Optional follow-ups 1. Heuristic-calibration gap: the seed sweep showed BotProfile::loose_passive produces VPIP ~0.13 and PFR ~0.06 in heads-up play — nit-like, not LP-like. Either RangeStrategy::loose_passive's range is too tight or RuleBasedDecider's preflop calling logic ignores profile width. Worth a separate investigation; not blocking this PR. 2. Decide whether to keep the previous instrumentation: the eprintln! smoking-gun diagnostics in src/bot/sim.rs::apply_action/run_street and the enriched panic message in tests/exploitative_play_smoke.rs::accumulated_lp_stats_cause_profile_deviation are now belt-and-suspenders — they don't fire on the green path, only on rare regressions. Recommend keeping them as cheap insurance, but happy to revert if you'd rather the PR be tight.

folkengine added 3 commits May 16, 2026 09:22

bump

88f863c

folkengine merged commit ec7a21c into main May 17, 2026
10 of 11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Instrument SimTable for rare ActionIsntFinished flake#99

Instrument SimTable for rare ActionIsntFinished flake#99
folkengine merged 3 commits into
mainfrom
checks

folkengine commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

folkengine commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant