
fix: allow zero-gradient initial points in check_all#63

Closed
habakan wants to merge 1 commit into pymc-devs:main from habakan:fix/allow-zero-gradient-init

Conversation


@habakan habakan commented Apr 7, 2026


Problem

check_all calls array_all_finite_and_nonzero on transformed_gradient, which rejects
mathematically valid initial points whose gradient is exactly zero by raising BadInitGrad.

Reproducer

Sampling from a standard normal N(0, 1) starting at the origin:

  • logp(x) = -x²/2, gradient = -x
  • At x = 0, gradient = 0 (the mode)
  • check_all rejects the point → BadInitGrad
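The reproducer can be sketched as standalone Rust (the functions below are illustrative and do not use the nuts-rs API):

```rust
// Illustrative sketch of the reproducer, not nuts-rs code.
fn logp_std_normal(x: f64) -> f64 {
    -0.5 * x * x // logp of N(0, 1) up to a constant
}

fn grad_std_normal(x: f64) -> f64 {
    -x // d/dx of -x^2/2
}

fn main() {
    let x0 = 0.0; // the mode of N(0, 1)
    let grad = grad_std_normal(x0);
    // The gradient is finite but exactly zero, which the old check rejects.
    assert!(grad.is_finite());
    assert_eq!(grad, 0.0);
    println!("logp({x0}) = {}, grad = {grad}", logp_std_normal(x0));
}
```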

Fix

// Before
if !math.array_all_finite_and_nonzero(&self.transformed_gradient) {
    return false;
}

// After
if !math.array_all_finite(&self.transformed_gradient) {
    return false;
}
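The behavioral difference can be sketched with simplified stand-ins for the two helpers (these are one plausible reading of their semantics, not the actual nuts-rs implementations; either reading rejects an all-zero gradient):

```rust
// Simplified stand-ins for the two checks; illustrative only.
fn array_all_finite(xs: &[f64]) -> bool {
    xs.iter().all(|x| x.is_finite())
}

fn array_all_finite_and_nonzero(xs: &[f64]) -> bool {
    xs.iter().all(|x| x.is_finite() && *x != 0.0)
}

fn main() {
    let grad_at_mode = [0.0, 0.0]; // gradient of an isotropic normal at its mode
    // The old check rejects the point; the proposed check accepts it.
    assert!(!array_all_finite_and_nonzero(&grad_at_mode));
    assert!(array_all_finite(&grad_at_mode));
}
```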

Why this is safe

  1. Consistency with check_untransformed
    The sibling function has always used array_all_finite only. This change makes the two consistent.

  2. Consistent with Stan / PyMC
    Stan's initialize.hpp checks std::isfinite(sum(gradient)) only — zero gradients are not rejected.

  3. No impact on step size search
    initialize_trajectory sets a random momentum p before the first leapfrog step, so the position
    moves via q += ε * p even when the initial gradient is zero.
    If the search does get stuck, nuts-rs already has two safety nets: a 100-iteration cap and a
    fallback to initial_step.
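Point 3 can be checked with a toy leapfrog step (a sketch, not the nuts-rs integrator; `eps` is the step size and `p` a sampled momentum):

```rust
// One leapfrog step; illustrative, not the nuts-rs integrator.
fn leapfrog_step(q: f64, p: f64, eps: f64, grad: impl Fn(f64) -> f64) -> (f64, f64) {
    let p_half = p + 0.5 * eps * grad(q); // half momentum update (grad(q) = 0 here)
    let q_new = q + eps * p_half;         // position moves because p != 0
    let p_new = p_half + 0.5 * eps * grad(q_new);
    (q_new, p_new)
}

fn main() {
    let grad = |x: f64| -x; // gradient of logp for N(0, 1)
    let (q_new, _p_new) = leapfrog_step(0.0, 1.0, 0.1, grad);
    // Even though grad(0) = 0, the random momentum moves the position.
    assert!(q_new != 0.0);
}
```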


Changes

File                                      Description
src/dynamics/transformed_hamiltonian.rs   array_all_finite_and_nonzero → array_all_finite (one line)
tests/zero_gradient_init.rs               regression test

@aseyboldt
Member

Sorry for the delay, I somehow missed the issue...

Rejecting initial points with zero gradient is by design, because we choose the initial mass matrix based on the gradient. Is there any specific reason you need this?

@habakan
Author

habakan commented Apr 20, 2026

Thank you for the clarification!

Some context: I ran into this when testing with an initial point set to the expected value (the mode of the posterior). For example, starting inference from x=0 on N(0,1) produces a zero gradient and gets rejected with BadInitGrad.

You're right that the check is intentional. After reading the code more carefully, I can see that in array_update_var_inv_std_grad, a zero gradient gets clamped to 1e-20 before .recip() is called, yielding 1e20 — a finite value — so the fill_invalid = 1.0 fallback never fires. The result is std ≈ 3.16e10, which would make the initial mass matrix wildly off. My claim that issue #7 already handles this case was incorrect.
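The failure mode described above can be traced numerically. This is a sketch of the clamp-then-recip behavior as described in the comment; the 1e-20 constant and the fill_invalid fallback come from that description, and the surrounding logic is an assumption, not the actual array_update_var_inv_std_grad source:

```rust
// Sketch of the clamp-then-recip path; mirrors the description above,
// not the actual array_update_var_inv_std_grad implementation.
fn inv_std_from_grad(grad: f64, fill_invalid: f64) -> f64 {
    let clamped = grad.abs().max(1e-20); // a zero gradient becomes 1e-20
    let inv = clamped.recip();           // 1e20: finite, so no fallback
    if inv.is_finite() { inv } else { fill_invalid }
}

fn main() {
    let inv_std = inv_std_from_grad(0.0, 1.0);
    // The fallback never fires: we get 1e20 instead of fill_invalid = 1.0,
    // so the derived scale for the initial mass matrix is wildly off.
    assert_eq!(inv_std, 1e20);
    assert!(inv_std.is_finite());
}
```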

I'll close this PR. If zero-gradient initial points are ever worth supporting in the future, the right fix would be to handle zero explicitly in array_update_var_inv_std_grad (treating it as fill_invalid) rather than relaxing the check in check_all.

@habakan habakan closed this Apr 20, 2026
@habakan habakan reopened this Apr 20, 2026
@habakan habakan closed this Apr 20, 2026
