fix(bandit): return neutral reward for non-finite fitness inputs by GrigoryEvko · Pull Request #2 · FusionBrainLab/gigaevo-core

GrigoryEvko · 2026-05-15T00:00:55Z

compute_bandit_reward(NaN, _) returns NaN. That NaN feeds into the SlidingWindowUCB1 mean and silently bricks routing: every arm's score becomes NaN, score > best_score is always False (any ordered comparison against NaN is False), and the first arm in dict order always wins. Trigger: any path that produces a non-finite fitness, e.g. a crashed validity stage emitting NaN.
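
A minimal sketch of the failure mode, assuming the selection loop is a plain "keep the best score seen so far" scan seeded with the first arm. `select_arm` and the arm names are hypothetical and are not taken from the gigaevo-core source; they only illustrate why NaN scores pin the choice to the first dict entry.

```python
import math


def select_arm(scores: dict[str, float]) -> str:
    """Return the arm with the highest score (hypothetical argmax scan)."""
    items = iter(scores.items())
    best_arm, best_score = next(items)  # seed with the first arm in dict order
    for arm, score in items:
        # Any ordered comparison involving NaN evaluates to False,
        # so a NaN on either side means the incumbent is never replaced.
        if score > best_score:
            best_arm, best_score = arm, score
    return best_arm


healthy = {"arm_a": 0.1, "arm_b": 0.9, "arm_c": 0.5}
poisoned = {"arm_a": math.nan, "arm_b": math.nan, "arm_c": math.nan}

print(select_arm(healthy))   # arm_b
print(select_arm(poisoned))  # arm_a
```

No exception is raised in the poisoned case, which is why the problem is easy to miss: the router keeps returning a valid arm, just always the same one.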

Fix: return 0.0 when either input is non-finite. Tests: finite-input parity, NaN child, +inf parent.

compute_bandit_reward used to compute exp(min(max(NaN, 0), _MAX)) → NaN when either input was non-finite. The NaN then flowed through SlidingWindowUCB1.update_reward → mean_reward → UCB score, after which every arm's score is NaN, "score > best_score" is always False, and the first arm in dict iteration order is always selected, so exploration is silently bricked. Non-finite inputs do occur in practice: a validity stage crash yields either a sentinel or a NaN fitness, depending on the acceptor. Add a finite-input guard that returns 0.0, the neutral reward, so the sliding-window mean stays well-defined. Tests cover the finite path (unchanged), a NaN child, and a +inf parent.
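
A rough sketch of the guard, under stated assumptions. The PR only quotes the expression exp(min(max(x, 0), _MAX)); here x is assumed to be the child-minus-parent fitness delta and `_MAX` is given an arbitrary value. The function names `unguarded_reward` and `guarded_reward` are hypothetical stand-ins for the before and after versions of compute_bandit_reward.

```python
import math

_MAX = 50.0  # assumed clamp value; the real constant is not shown in the PR


def unguarded_reward(child_fitness: float, parent_fitness: float) -> float:
    """Pre-fix shape: a NaN input survives max(), min() and exp()."""
    delta = child_fitness - parent_fitness  # assumption: reward is driven by the delta
    return math.exp(min(max(delta, 0.0), _MAX))


def guarded_reward(child_fitness: float, parent_fitness: float) -> float:
    """Post-fix shape: non-finite inputs short-circuit to the neutral reward."""
    if not (math.isfinite(child_fitness) and math.isfinite(parent_fitness)):
        return 0.0  # neutral reward named in the PR
    return unguarded_reward(child_fitness, parent_fitness)


print(math.isnan(unguarded_reward(math.nan, 1.0)))  # True
print(guarded_reward(math.nan, 1.0))                # 0.0
print(guarded_reward(1.0, math.inf))                # 0.0
```

`math.isfinite` rejects NaN, +inf and -inf in one check, so the three test cases listed in the commit message (finite parity, NaN child, +inf parent) all go through the same branch structure.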

``_StructuredOutputRouter._maybe_fire_failure_hook`` used to swallow hook exceptions silently (``except Exception: pass``). The hook is observability-only, so re-raising would mask the real LLM failure — but a *silent* swallow loses telemetry whenever the hook itself has a bug, hiding bandit-side regressions. Replace ``pass`` with ``logger.warning(...)`` so the original exception still propagates, yet a broken hook is visible in operator logs. Audit item FusionBrainLab#2 from the PR FusionBrainLab#13 bug hunt.
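
The change described above might look roughly like the following. This is a standalone sketch, not the router's actual method: the free function `maybe_fire_failure_hook`, its signature, the logger name, and the log message are all assumptions made for illustration.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("structured_output_router_sketch")  # hypothetical logger name


def maybe_fire_failure_hook(
    hook: Optional[Callable[[BaseException], None]],
    error: BaseException,
) -> None:
    """Invoke an observability-only hook without letting it mask `error`."""
    if hook is None:
        return
    try:
        hook(error)
    except Exception:
        # Previously `pass`. Logging keeps the hook's own bug visible while the
        # caller remains free to re-raise the original LLM failure.
        logger.warning("failure hook raised; continuing", exc_info=True)


def broken_hook(error: BaseException) -> None:
    raise RuntimeError("bug inside the hook")


# The hook's RuntimeError is logged, not raised, so the caller's error wins.
maybe_fire_failure_hook(broken_hook, ValueError("LLM call failed"))
```

Passing `exc_info=True` attaches the hook's traceback to the warning, which is usually what an operator needs to tell a broken hook apart from the underlying LLM failure.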

GrigoryEvko wants to merge 1 commit into FusionBrainLab:main from GrigoryEvko:fix/bandit-finite-guard
