⚡️ Speed up method `InitDecorator.visit_ClassDef` by 65% in PR #363 (`part-1-windows-fixes`) #784
Merged
KRRT7 merged 1 commit on Sep 30, 2025
The optimized code achieves a **65% speedup** by precomputing AST nodes that were previously recreated for every class processed.

**Key optimizations:**

1. **Precomputed AST components in `__init__`**: Instead of reconstructing identical AST nodes (like `ast.Name`, `ast.arg`, `ast.Constant`) on every `visit_ClassDef` call, the optimized version creates them once during initialization and reuses them. This removes the AST node construction overhead seen in the profiler: the lines creating decorator keywords and super() call components dropped from ~2ms total to ~0.6ms.
2. **Optimized decorator presence check**: Replaced the `any()` generator expression with a `for/else` loop that stops as soon as it finds an existing `codeflash_capture` decorator. Like `any()`, the loop stops at the first match, but it avoids the generator allocation overhead.
3. **Reduced per-class AST construction**: The decorator is now built once per class from precomputed components, rather than reconstructing all keywords and function references from scratch each time.

**Performance impact by test type:**

- **Basic cases** (single class with simple `__init__`): ~140-220% faster, benefiting from reduced AST node construction
- **Edge cases** (classes needing synthetic `__init__`): ~100-150% faster, particularly benefiting from prebuilt super() call components
- **Large scale** (many methods/classes): ~17-40% faster, where the per-class savings add up across many iterations

The optimization is most effective for workloads that process many classes, because the upfront precomputation cost is amortized across multiple `visit_ClassDef` calls. It directly addresses the bottleneck identified in the profiler: repetitive AST node creation.
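The first two optimizations can be illustrated with a minimal sketch. This is not the actual `InitDecorator` from `codeflash/verification/instrument_codeflash_capture.py`: the class name `CaptureDecoratorAdder`, the helper `instrument`, and the keyword arguments `label` and `enabled` are hypothetical, chosen only to show the pattern of building class-independent nodes once and using `for/else` for the presence check.

```python
import ast


class CaptureDecoratorAdder(ast.NodeTransformer):
    """Hypothetical sketch; not the real InitDecorator from codeflash."""

    def __init__(self, decorator_name: str = "codeflash_capture") -> None:
        self.decorator_name = decorator_name
        # Class-independent nodes: built once here, reused on every visit.
        self._func = ast.Name(id=decorator_name, ctx=ast.Load())
        # `enabled` is an illustrative keyword, not a real codeflash argument.
        self._enabled_kw = ast.keyword(arg="enabled", value=ast.Constant(value=True))

    def visit_ClassDef(self, node: ast.ClassDef) -> ast.ClassDef:
        for item in node.body:
            if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                self._decorate(item, node.name)
                break
        return node

    def _decorate(self, init: ast.FunctionDef, class_name: str) -> None:
        for dec in init.decorator_list:
            target = dec.func if isinstance(dec, ast.Call) else dec
            if isinstance(target, ast.Name) and target.id == self.decorator_name:
                break  # already decorated; leave the method untouched
        else:
            # Runs only when the loop finished without hitting `break`.
            # Only the class-specific keyword is created per class.
            label_kw = ast.keyword(arg="label", value=ast.Constant(value=class_name))
            init.decorator_list.insert(
                0,
                ast.Call(func=self._func, args=[], keywords=[label_kw, self._enabled_kw]),
            )


def instrument(source: str) -> str:
    """Parse `source`, decorate each class's __init__, and return new source."""
    tree = CaptureDecoratorAdder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Calling `instrument` on a module adds the decorator to each undecorated `__init__` and leaves already decorated ones alone. One caveat of this pattern is that the precomputed node objects are shared between every decorator that uses them, so it is only safe as long as later passes do not mutate those nodes in place.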
⚡️ This pull request contains optimizations for PR #363

If you approve this dependent PR, these changes will be merged into the original PR branch `part-1-windows-fixes`.

📄 65% (0.65x) speedup for `InitDecorator.visit_ClassDef` in `codeflash/verification/instrument_codeflash_capture.py`

⏱️ Runtime: 1.19 milliseconds → 718 microseconds (best of 116 runs)

✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
🔎 Concolic Coverage Tests and Runtime
`codeflash_concolic_nh__l8ip/tmpjzz9bzb_/test_concolic_coverage.py::test_InitDecorator_visit_ClassDef`

To edit these changes, `git checkout codeflash/optimize-pr363-2025-09-30T01.44.19` and push.