⚡️ Speed up method TreeSitterAnalyzer.find_functions by 28% in PR #1561 (add/support_react) #1593
Merged
claude[bot] merged 2 commits on Feb 20, 2026
The optimization achieves a **27% runtime improvement** (18.2ms → 14.3ms) by making two key changes to the tree traversal logic:

## Primary Optimization: Iterative DFS with Explicit Stack

The original code used Python recursion to traverse the syntax tree, making a recursive call for each child node. This approach incurs significant overhead from:

- Python's function call machinery (stack frame creation, argument passing)
- Repeated keyword argument unpacking on every recursive call
- Deep call stacks for nested code structures

The optimized version replaces recursion with an **iterative depth-first search using an explicit stack**. Each stack entry stores `(node, current_class, current_function)` as a tuple, and the traversal loop pops nodes and processes them iteratively. This eliminates function call overhead entirely and reduces memory pressure from deep recursion.

**Impact on workloads**: The line profiler shows the `_walk_tree_for_functions` method dropped from 70ms to 42ms (40% improvement). Test results confirm larger speedups for deeply nested code:

- 50 levels of nesting: **37.3% faster** (974μs → 709μs)
- 100 functions: **25.4% faster** (1.33ms → 1.06ms)
- Large source files with mixed content: **32.6% faster** (2.62ms → 1.97ms)

## Secondary Optimization: Cached Function Type Sets

The original code reconstructed the `function_types` set on every node visit (12,665 times in profiling), repeatedly adding "arrow_function" and "method_definition" based on flags. The optimized version caches these sets in `_function_types_cache` keyed by `(include_methods, include_arrow_functions)`. Since these flags are constant per traversal, the set is built once and reused for all nodes.

**Impact**: While a smaller contributor than the iterative traversal, this eliminates ~20ms of redundant set operations visible in the line profiler (lines building and modifying `function_types` accounted for ~15% of original runtime).
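The iterative traversal described under Primary Optimization can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Node`, `FUNCTION_TYPES`, and `walk_iterative` are hypothetical names, and `Node` is a plain stand-in for a tree-sitter node. Only the `(node, current_class, current_function)` stack entry comes from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a syntax tree node (not the tree-sitter API)."""

    type: str
    name: str | None = None
    children: list[Node] = field(default_factory=list)


# Assumed set of node types treated as functions in this sketch.
FUNCTION_TYPES = frozenset(
    {"function_declaration", "arrow_function", "method_definition"}
)


def walk_iterative(root: Node) -> list[tuple[str | None, str | None, str | None]]:
    """Collect (name, enclosing_class, enclosing_function) for each function node."""
    found = []
    # Each entry carries the context that recursion would have passed as arguments.
    stack = [(root, None, None)]
    while stack:
        node, current_class, current_function = stack.pop()
        if node.type == "class_declaration":
            current_class = node.name
        elif node.type in FUNCTION_TYPES:
            found.append((node.name, current_class, current_function))
            current_function = node.name
        # Push children in reverse so the leftmost child is popped first,
        # which keeps the same pre-order as a recursive walk.
        for child in reversed(node.children):
            stack.append((child, current_class, current_function))
    return found


tree = Node("program", children=[
    Node("class_declaration", "Widget", [
        Node("method_definition", "render", [Node("arrow_function", "onClick")]),
    ]),
    Node("function_declaration", "helper"),
])
result = walk_iterative(tree)
```

Because the context travels inside each stack entry, nesting depth is limited by available memory rather than by Python's recursion limit.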
## Trade-offs

For very small/empty inputs, the optimization shows minor slowdowns (7-10% on empty source, whitespace-only files) due to cache initialization overhead. However, these edge cases are not representative of real-world usage where the function analyzes actual code with multiple functions. All realistic test cases with actual functions show speedups of **8-42%**, with the largest gains on complex, deeply nested, or large codebases, exactly the scenarios where this analyzer would be used in production.

The optimization maintains identical behavior and correctness across all 54 test cases while dramatically improving performance for production workloads.
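The cache whose initialization cost is mentioned above might look something like the sketch below. It is an illustration under assumptions, not the PR's implementation: the class name `FunctionTypeCache`, the base node types, and the `builds` counter are hypothetical. Only the `_function_types_cache` name and its `(include_methods, include_arrow_functions)` key come from the description.

```python
from __future__ import annotations


class FunctionTypeCache:
    """Hypothetical holder for per-flag function type sets."""

    def __init__(self) -> None:
        # Keyed by (include_methods, include_arrow_functions), as in the description.
        self._function_types_cache: dict[tuple[bool, bool], frozenset[str]] = {}
        self.builds = 0  # illustration only: counts how often a set is constructed

    def function_types(
        self, include_methods: bool, include_arrow_functions: bool
    ) -> frozenset[str]:
        key = (include_methods, include_arrow_functions)
        cached = self._function_types_cache.get(key)
        if cached is None:
            # Assumed base set; the real analyzer may use different node types.
            types = {"function_declaration", "function_expression"}
            if include_arrow_functions:
                types.add("arrow_function")
            if include_methods:
                types.add("method_definition")
            cached = frozenset(types)
            self._function_types_cache[key] = cached
            self.builds += 1
        return cached


cache = FunctionTypeCache()
for _ in range(1000):  # simulate 1000 node visits with constant flags
    types = cache.function_types(include_methods=True, include_arrow_functions=False)
```

With constant flags the set is built once no matter how many nodes are visited. On an empty input the first lookup still pays the construction cost, which is consistent with the small slowdowns reported for empty and whitespace-only sources.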
PR Review Summary
Prek Checks: ✅ Fixed
Code Review: ✅ No critical issues found. The optimization correctly transforms a recursive DFS tree traversal into an iterative one with an explicit stack.
Test Coverage
Notes:
Last updated: 2026-02-20
⚡️ This pull request contains optimizations for PR #1561

If you approve this dependent PR, these changes will be merged into the original PR branch `add/support_react`.

📄 28% (0.28x) speedup for `TreeSitterAnalyzer.find_functions` in `codeflash/languages/javascript/treesitter_utils.py`

⏱️ Runtime: 18.2 milliseconds → 14.3 milliseconds (best of 15 runs)
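As a reading aid for the headline figures, here is a small sketch of the arithmetic. It assumes the speedup is computed as original runtime divided by optimized runtime, minus one; that formula is an assumption rather than something stated on this page. The helper names are hypothetical.

```python
def speedup(original: float, optimized: float) -> float:
    """Relative speedup, assuming the convention original / optimized - 1."""
    return original / optimized - 1.0


def runtime_reduction(original: float, optimized: float) -> float:
    """Fraction of the original runtime that was saved."""
    return 1.0 - optimized / original


# Rounded timings quoted above, in milliseconds.
headline_speedup = speedup(18.2, 14.3)              # about 0.27
headline_reduction = runtime_reduction(18.2, 14.3)  # about 0.21
```

With the rounded timings this gives roughly 27%. The reported 28% is presumably computed from unrounded measurements.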
✅ Correctness verification report:
To edit these changes, run `git checkout codeflash/optimize-pr1561-2026-02-20T08.59.35` and push.