From cf209811193491bf3790937f883524cde4d54b38 Mon Sep 17 00:00:00 2001 From: "codeflash-ai[bot]" <148906541+codeflash-ai[bot]@users.noreply.github.com> Date: Sat, 24 Jan 2026 10:12:32 +0000 Subject: [PATCH] Optimize _pstdev MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The optimized code achieves a **507% speedup** by eliminating two major performance bottlenecks in the original implementation: ## Key Optimizations **1. Single-Pass Computation with Welford's Algorithm** The original code made two full passes over the data: - First pass: List comprehension to filter out `None` values (`scores = [score for score in scores if score is not None]`) - Second pass: `statistics.pstdev()` internally iterates again to compute mean and variance The optimized version uses Welford's online algorithm to compute the population standard deviation in a **single pass**, updating running statistics (count, mean, sum of squared differences) incrementally as it encounters each non-`None` value. **2. Eliminated Intermediate List Allocation** The original code allocates a new filtered list in memory. For inputs with 1000 elements, this creates a new list structure with associated overhead. The optimized version processes elements on-the-fly without allocating any intermediate collections, reducing memory pressure and allocation costs. **3. Direct Math Operations** Instead of calling `statistics.pstdev()` (which has its own overhead for parameter validation and general-purpose handling), the optimized code directly computes `variance = M2 / n` and `std = math.sqrt(variance)`, avoiding the function call overhead. ## Performance Impact by Scenario **Line profiler data shows the original code spent 90% of time in `round(statistics.pstdev(scores), rounding)`**, making this the critical hot spot. 
Test results demonstrate consistent speedups across scenarios: - **Small lists (2-5 elements)**: 5-8× faster (e.g., `test_basic_two_elements`: 50.0μs → 7.75μs) - **Medium lists (100-500 elements)**: 4-6× faster (e.g., `test_large_scale_500_elements`: 490μs → 85.7μs) - **Large lists (1000 elements)**: 5-6× faster (e.g., `test_large_scale_1000_elements`: 961μs → 168μs) - **Lists with many `None` values**: Up to 10× faster (e.g., `test_filtering_none_preserves_order_invariance`: 910-1057% speedup) because the original code still allocates the filtered list even if most elements are `None` The optimization is particularly effective when: - Input lists are large (>100 elements) - Many `None` values need filtering (avoids allocating sparse lists) - The function is called repeatedly in metrics computation pipelines (cumulative savings) **Edge case note**: Single-element lists are slightly slower (20-26%) because the optimized code still performs the loop setup, whereas the original code quickly returns after the filter check. This minor regression is negligible given the function returns `None` for single elements anyway. --- unstructured/metrics/utils.py | 27 +++++++++++++++++++++++---- 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/unstructured/metrics/utils.py b/unstructured/metrics/utils.py index c490aa752b..e6bf301a31 100644 --- a/unstructured/metrics/utils.py +++ b/unstructured/metrics/utils.py @@ -1,4 +1,5 @@ import logging +import math import os import re import statistics @@ -214,12 +215,30 @@ def _pstdev(scores: List[Optional[float]], rounding: Optional[int] = 3) -> Union Args: rounding (int): optional argument that allows user to define decimal points. Default at 3. """ - scores = [score for score in scores if score is not None] - if len(scores) <= 1: + # Single-pass Welford algorithm to compute population variance without allocating a new list. 
+ n = 0 + mean = 0 # keeps the running mean in the same numeric type as inputs when possible + M2 = 0 # running sum of squares of differences from current mean + + for score in scores: + if score is None: + continue + # Maintain behavior: let operations raise the same exceptions for invalid types + n += 1 + delta = score - mean + mean += delta / n + M2 += delta * (score - mean) + + if n <= 1: return None + + variance = M2 / n + # Protect against tiny negative variance from floating-point error + std = math.sqrt(float(variance) if variance >= 0 else 0.0) + if not rounding: - return statistics.pstdev(scores) - return round(statistics.pstdev(scores), rounding) + return std + return round(std, rounding) def _count(scores: List[Optional[float]]) -> float: