[Enhancement] Stop looping when runtime is stable (CF-934)#967
Conversation
…sistent-loop-break
…sistent-loop-break
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
…sistent-loop-break
…sistent-loop-break
The optimization achieves a **437% speedup** by eliminating expensive function calls and using more efficient algorithms for median calculation and min/max operations. **Key optimizations applied:** 1. **Custom median calculation**: Replaced `statistics.median(recent)` with a custom implementation using `sorted(recent)` and direct indexing. This eliminates the overhead of the statistics module's generic median function. 2. **Reused sorted array**: The sorted array from median calculation is reused for min/max operations (`recent_sorted[0]`, `recent_sorted[-1]`) instead of calling `min(recent)` and `max(recent)` separately, eliminating redundant iterations. 3. **Replaced `statistics.mean()` calls**: Substituted `mean(recent[:half])` and `mean(recent[half:])` with direct `sum()/length` calculations, removing function call overhead. 4. **Early termination optimization**: Changed the `all()` generator expression to an explicit loop with early `break`, avoiding unnecessary iterations when the first non-conforming value is found. **Performance impact analysis:** The line profiler shows dramatic improvements in the most expensive operations: - `first = mean(recent[:half])`: 2.07ms → 47.7μs (98% reduction) - `second = mean(recent[half:])`: 1.54ms → 35.4μs (98% reduction) - `m = median(recent)`: 220μs → eliminated, replaced with ~55μs custom implementation **Hot path significance:** Based on the function reference, `should_stop()` is called in the main test loop (`pytest_runtestloop`) after every iteration to determine if benchmarking should continue. This makes it a critical hot path function where the 437% speedup directly translates to faster benchmark completion times. **Test case performance:** The optimization performs exceptionally well across all test scenarios, with speedups ranging from 400-500% for typical cases involving stable data, fluctuations, and large datasets. Even edge cases with insufficient data show 10-25% improvements.
⚡️ Codeflash found optimizations for this PR📄 438% (4.38x) speedup for
|
…sistent-loop-break
…sistent-loop-break
|
@claude review this |
|
Claude finished @aseembits93's task in 3m 26s —— View job Code Review Complete ✅I've reviewed this PR and identified several issues that need attention, along with some positive observations. Critical Issues 🔴1. Type Hint Error in
|
| raise FileNotFoundError(msg) | ||
|
|
||
| if should_make_copy: | ||
| file_path = path |
| @@ -0,0 +1,2 @@ | |||
| def calculate_best_summed_runtime(grouped_runtime_info: dict[any, list[int]]) -> int: | |||
There was a problem hiding this comment.
entire file for a single function 🥲
| # 2) Window spread is small | ||
| r_max = recent[0] | ||
| r_min = recent[-1] | ||
| spread_ok = (r_max - r_min) / r_min <= spread_rel_tol |
|
|
||
| while total_time >= SHORTEST_AMOUNT_OF_TIME: # need to run at least one for normal tests | ||
| count += 1 | ||
| total_time = self._get_total_time(session) |
There was a problem hiding this comment.
see if total_time changes inside this loop
There was a problem hiding this comment.
it doesn't, I checked
|
|
||
| if self.enable_stability_check: | ||
| loop_end = _ORIGINAL_PERF_COUNTER_NS() | ||
| dt = loop_end - loop_start # nano-seconds |
There was a problem hiding this comment.
_ORIGINAL_PERF_COUNTER_NS() - loop_start
|
|
||
| elapsed += dt | ||
|
|
||
| best_runtime_until_now = calculate_best_summed_runtime(self.usable_runtime_data_by_test_case) |
| window_size = int(STABILITY_WINDOW_SIZE * estimated_total_loops + 0.5) | ||
| if ( | ||
| count >= session.config.option.codeflash_min_loops | ||
| and window_size > 1 |
There was a problem hiding this comment.
maybe remove the check here
|
|
||
| count: int = 0 | ||
| runtimes = [] | ||
| elapsed = 0.0 |
There was a problem hiding this comment.
if elapsed is in ns then it's an int
did some experiments for this with (optimize-me & docling), with the current values, I got
Details