
[BUG] normal_approximation crashes with math.sqrt domain error when estimate drifts past [0,1] #244

@kamran-rapidfireAI

Summary — verified end-to-end

NormalApproximationStrategy.normal_approximation (and WilsonApproximationStrategy.wilson_approximation via inheritance) raises ValueError: math domain error when an algebraic metric estimate lands just outside its declared value_range=(0, 1) due to ordinary IEEE-754 accumulation in the user's accumulate_metrics_fn.

This is a hard crash of Experiment.run_evals, not a silent metric drop. The exception escapes through the final-aggregation path at controller.py:699 (_compute_final_metrics_for_pipelines calling aggregator.online_strategy.add_confidence_interval_info), which is NOT wrapped by the broad try/except at controller.py:1248-1251 (that handler only protects a separate live-metrics path). run_evals re-raises the ValueError, so the caller is left with no results (results is None in the repro notebook).

Verified end-to-end via a working notebook that uses rapidfireai's normal public API (Experiment, RFLangChainRagSpec, RFAPIModelConfig, RFGridSearch) with real OpenAI gpt-4o-mini calls over a small FAISS+PyMuPDF RAG. No source-poking, no monkey-patching. Notebook: https://gist.github.com/kamran-rapidfireAI/a510d4c15aba56d7968bed4167404407

End-to-end repro

The notebook constructs an accumulate_metrics_fn that returns

{"DriftMetric": {"value": 1.0 + 1e-12, "is_algebraic": True, "value_range": (0, 1)}}

simulating the float-drift scenario this issue describes. Run output:

[accumulate] true_mean=1.0 drifted_mean=1.000000000001 over_upper_bound? True
run_evals raised: ValueError math domain error
run_evals completed in 84.5s. results type: NoneType

Stack trace captured in ~/rapidfireai/logs/bug244_repro/rapidfire.log:

File ".../rapidfireai/experiment.py", line 539, in run_evals
    results = self.controller.run_multi_pipeline_inference(...)
File ".../rapidfireai/evals/scheduling/controller.py", line 1544, in run_multi_pipeline_inference
    final_results = self._compute_final_metrics_for_pipelines(...)
File ".../rapidfireai/evals/scheduling/controller.py", line 699, in _compute_final_metrics_for_pipelines
    cumulative_metrics = aggregator.online_strategy.add_confidence_interval_info(...)
File ".../rapidfireai/evals/metrics/online_strategies.py", line 104, in add_confidence_interval_info
    estimate, lower, upper = self.get_confidence_interval_algebraic(...)
File ".../rapidfireai/evals/metrics/online_strategies.py", line 314, in get_confidence_interval_algebraic
    return self.normal_approximation(...)
File ".../rapidfireai/evals/metrics/online_strategies.py", line 294, in normal_approximation
    std_error = math.sqrt(variance)
ValueError: math domain error

Minimal direct repro (without run_evals)

from rapidfireai.evals.metrics.online_strategies import NormalApproximationStrategy
NormalApproximationStrategy().get_confidence_interval_algebraic(
    estimate=1.0000000002, sample_size=36, value_range=(0, 1))
# ValueError: math domain error

This can be triggered in practice when accumulate_metrics_fn returns a weighted mean of [0, 1]-valued per-batch metrics: e.g. a Precision@k that is 1.0 on every batch can accumulate to 1.0 + 1 ULP rather than exactly 1.0, depending on the weight values. The same crash occurs for an estimate just below the lower bound, e.g. estimate = -1e-10.
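To illustrate why an estimate just outside the range reaches math.sqrt with a negative argument, here is a minimal sketch. It assumes the strategy uses the textbook normal-approximation standard error sqrt(p * (1 - p) / n) after rescaling the estimate to [0, 1]; only the line std_error = math.sqrt(variance) is confirmed by the stack trace above. The function name normal_ci_sketch and the z=1.96 default are hypothetical and are not taken from online_strategies.py.

```python
import math


def normal_ci_sketch(estimate, sample_size, value_range=(0.0, 1.0), z=1.96):
    """Hypothetical stand-in for a normal-approximation CI (assumed formula)."""
    lo, hi = value_range
    # Rescale the estimate to [0, 1] so it can be treated like a proportion.
    p = (estimate - lo) / (hi - lo)
    # Assumed variance formula: negative whenever p < 0 or p > 1.
    variance = p * (1.0 - p) / sample_size
    # math.sqrt raises ValueError for a negative argument.
    std_error = math.sqrt(variance)
    margin = z * std_error * (hi - lo)
    return estimate, estimate - margin, estimate + margin


# An in-range estimate works; estimates just outside the range do not.
print(normal_ci_sketch(0.5, 36))
for drifted in (1.0000000002, -1e-10):
    try:
        normal_ci_sketch(drifted, 36)
    except ValueError as exc:
        print(f"estimate={drifted!r} raised ValueError: {exc}")
```

Under this assumption, any p outside [0, 1] makes p * (1 - p) negative, no matter how small the overshoot is, which matches the crash on both 1.0000000002 and -1e-10.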

Workaround

Clamp value to the exact value_range boundary inside accumulate_metrics_fn before returning. We use a 1e-9 snap tolerance:

def _clamp01(v): return 1.0 if v >= 1.0-1e-9 else (0.0 if v <= 1e-9 else v)
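The same workaround can be sketched for an arbitrary value_range rather than only (0, 1). The helper name snap_to_range and its tol parameter are hypothetical and not part of rapidfireai; the behavior mirrors _clamp01 above, including the fact that values within the tolerance just inside a boundary are also snapped onto it.

```python
def snap_to_range(value, value_range=(0.0, 1.0), tol=1e-9):
    """Snap a metric value onto the nearest boundary of value_range.

    Hypothetical helper: anything at or beyond a boundary, or within
    `tol` inside it, is returned as the exact boundary value.
    """
    lo, hi = value_range
    if value >= hi - tol:
        return float(hi)
    if value <= lo + tol:
        return float(lo)
    return value


# Apply inside accumulate_metrics_fn just before returning the metric dict.
drifted_mean = 1.0 + 1e-12
metric = {
    "DriftMetric": {
        "value": snap_to_range(drifted_mean, (0, 1)),
        "is_algebraic": True,
        "value_range": (0, 1),
    }
}
```

Note that, like _clamp01, this also hides values that are far outside the range, so it is best suited to metrics that are bounded by construction and can only drift by rounding error.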

Environment

  • rapidfireai: main (HEAD 91d94de); same crash on 0.15.2 PyPI
  • Python 3.12, Linux
  • Setup used in the verification notebook: experiment_name bug244_repro, endpoint bug244_gpt4omini, model gpt-4o-mini via MLflow gateway, 1 config, 2 questions, 1 shard
