fix: guard division-by-zero and uninitialized state in metrics/masking code #1575

Closed
mooreneural wants to merge 1 commit into NVIDIA-BioNeMo:main from mooreneural:mooreneural/BioNeMoore


  • mlm_memmap.py: normalize codon weights only when mean > 0 to prevent NaN propagating into np.random.binomial when all token weights are zero
  • mlm_memmap.py: clamp conditional random-replace probability to [0, 1] and guard against mask_replace_prob == 1.0 causing ZeroDivisionError
  • dead_latents.py: initialize _last_avg_nonzero in __init__ so get_stats() is safe to call before the first update()
  • evo2_dataset.py: remove duplicate ASCII 45 ('-') entry in VALID_DNA_AND_DEGENERATE

Description

Four defensive bug fixes across the codon MLM preprocessing pipeline and SAE evaluation utilities.

Usage

mlm_memmap.py — codon weight normalization: Previously, passing a codon_weights array where all weights for the sequence tokens were zero caused division-by-zero → NaN → crash inside np.random.binomial. No API change; the normalization is simply skipped when the mean is zero and masking proceeds with the flat mlm_probability.
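
A minimal sketch of the guard described above, assuming the weights are rescaled by their mean before being multiplied by the base rate. The helper name `masking_probabilities` and the final clip are illustrative assumptions, not the code in mlm_memmap.py.

```python
import numpy as np


def masking_probabilities(codon_weights: np.ndarray, mlm_probability: float) -> np.ndarray:
    """Per-token masking probabilities (hypothetical helper, not the PR's code)."""
    weights = np.asarray(codon_weights, dtype=np.float64)
    mean = weights.mean() if weights.size else 0.0
    if mean > 0:
        # Rescale the weights to average 1, then apply the base rate.
        probs = mlm_probability * weights / mean
    else:
        # All-zero weights: skip normalization and use the flat rate,
        # so no NaN can reach np.random.binomial.
        probs = np.full(weights.shape, mlm_probability)
    # Assumption: keep every entry a valid probability.
    return np.clip(probs, 0.0, 1.0)


# All-zero weights now yield a valid mask instead of raising.
mask = np.random.binomial(1, masking_probabilities(np.zeros(8), 0.15)).astype(bool)
```

With all-zero weights, every position is masked with probability `mlm_probability`, which matches the fallback behavior the description states.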

mlm_memmap.py — random replacement probability: Calling process_item with mask_replace_prob=1.0 (or any combo where random_replace_prob / (1 - mask_replace_prob) > 1.0) raised ZeroDivisionError or ValueError. Now guarded with a zero-denominator check and clamped to [0, 1].
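
The guard can be sketched as below, assuming the usual MLM scheme in which the random-replace decision is made only among selected tokens that were not replaced by the mask token. The function name `conditional_random_prob` and the choice to return 0.0 when no tokens remain are assumptions for illustration.

```python
def conditional_random_prob(mask_replace_prob: float, random_replace_prob: float) -> float:
    """Probability of random replacement among tokens not replaced by the mask token.

    Hypothetical helper; the real logic lives inside process_item.
    """
    remaining = 1.0 - mask_replace_prob
    if remaining <= 0.0:
        # Every selected token gets the mask token, so none are left
        # for random replacement and the division is skipped.
        return 0.0
    # Clamp so the result is always a valid probability.
    return min(max(random_replace_prob / remaining, 0.0), 1.0)
```

For example, `conditional_random_prob(1.0, 0.1)` returns 0.0 instead of raising, and `conditional_random_prob(0.8, 0.5)` is clamped to 1.0.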

dead_latents.py — pre-update get_stats(): DeadLatentTracker().get_stats() now works immediately after construction without needing a prior update() call.

tracker = DeadLatentTracker(hidden_dim=4096)
stats = tracker.get_stats()  # previously AttributeError, now returns 0.0 avg_nonzero
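
To illustrate why initializing the attribute in the constructor matters, here is a simplified stand-in. The class name `DeadLatentTrackerSketch`, its internals, and the `dead_latents` key are assumptions; only `hidden_dim`, `update()`, `get_stats()`, and `_last_avg_nonzero` come from the description above.

```python
import numpy as np


class DeadLatentTrackerSketch:
    """Simplified stand-in for illustration; not the class from dead_latents.py."""

    def __init__(self, hidden_dim: int) -> None:
        self.hidden_dim = hidden_dim
        # Number of samples on which each latent was nonzero.
        self._fire_counts = np.zeros(hidden_dim, dtype=np.int64)
        # Set here so get_stats() has a value before any update() call.
        self._last_avg_nonzero = 0.0

    def update(self, activations: np.ndarray) -> None:
        """Record a (batch, hidden_dim) array of latent activations."""
        active = activations != 0
        self._fire_counts += active.sum(axis=0)
        # Average number of nonzero latents per sample in this batch.
        self._last_avg_nonzero = float(active.sum(axis=1).mean())

    def get_stats(self) -> dict:
        return {
            "avg_nonzero": self._last_avg_nonzero,
            "dead_latents": int((self._fire_counts == 0).sum()),
        }
```

If `_last_avg_nonzero` were assigned only inside `update()`, calling `get_stats()` on a fresh instance would raise AttributeError, which is the failure the PR removes.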


copy-pr-bot Bot commented May 18, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai Bot commented May 18, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


