Commit c14c7ae
Optimize _find_type_node
The optimized code achieves a **143% speedup** (from 169μs to 69.3μs) through three key performance improvements:
**1. Eliminated Dictionary Reconstruction Overhead (25.3% of original time)**
The original code rebuilt the `type_declarations` dictionary on every invocation, recursive calls included (329 times in the profiler). Hoisting it to a module-level constant, `TYPE_DECLARATIONS`, removes this overhead entirely. The profiler shows ~340μs spent rebuilding this dict in the original version.
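A minimal sketch of the hoisting pattern follows. The dictionary contents and the helper names `kind_rebuilt` and `kind_hoisted` are hypothetical, chosen only for illustration; the actual entries of `TYPE_DECLARATIONS` are not shown in this message.

```python
# Hypothetical entries; the real TYPE_DECLARATIONS contents are an assumption.
TYPE_DECLARATIONS = {
    "class_declaration": "class",
    "interface_declaration": "interface",
    "enum_declaration": "enum",
}


def kind_rebuilt(node_type):
    # Before: the dict literal is evaluated again on every call.
    type_declarations = {
        "class_declaration": "class",
        "interface_declaration": "interface",
        "enum_declaration": "enum",
    }
    return type_declarations.get(node_type)


def kind_hoisted(node_type):
    # After: a single lookup against the module-level constant.
    return TYPE_DECLARATIONS.get(node_type)
```

Both helpers return the same value for any input; only the per-call allocation differs.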
**2. Replaced Recursion with Iterative Stack-Based DFS**
The original code made 319 recursive calls (`_find_type_node(child, type_name, source_bytes)` at 638μs). Each recursive call incurs Python function call overhead including frame creation, argument passing, and local variable setup. The optimized version uses an explicit stack to traverse the tree iteratively, eliminating this overhead entirely. This is especially impactful in the deep tree test case, which shows **211% speedup** (146μs → 47μs).
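The recursion-to-stack change can be sketched as below. `Node`, `find_recursive`, and `find_iterative` are hypothetical stand-ins, not the real `_find_type_node` or its parser node type; the sketch only illustrates the traversal technique.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a syntax-tree node.
    type: str
    children: list = field(default_factory=list)


def find_recursive(node, wanted):
    # Before: one Python function call per visited node.
    if node.type == wanted:
        return node
    for child in node.children:
        found = find_recursive(child, wanted)
        if found is not None:
            return found
    return None


def find_iterative(root, wanted):
    # After: an explicit stack replaces the call stack.
    stack = [root]
    while stack:
        node = stack.pop()
        if node.type == wanted:
            return node
        # Push children reversed so the leftmost child is popped first,
        # which keeps the same left-to-right pre-order as the recursion.
        stack.extend(reversed(node.children))
    return None
```

Because children are pushed in reverse, both functions return the same first match in pre-order, and the iterative version is not bounded by Python's recursion limit.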
**3. Direct Byte Comparison Instead of UTF-8 Decoding**
The original code decoded byte slices to strings 12 times (`source_bytes[...].decode("utf8")` at 13μs). The optimized version encodes `type_name` to bytes once at function entry (10.7μs for 10 calls), then performs direct byte-to-byte comparison without any decoding. This is particularly effective for multibyte UTF-8 names, as shown in the UTF-8 test case with **25.1% speedup** (2.24μs → 1.79μs).
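As a rough illustration of the comparison change, the sketch below contrasts the two approaches. The helper names and the sample source are hypothetical; in the real function the offsets would come from the parser's node positions.

```python
def name_matches_decode(source_bytes, start, end, type_name):
    # Before: decode the slice to str on every comparison.
    return source_bytes[start:end].decode("utf8") == type_name


def name_matches_bytes(source_bytes, start, end, name_bytes):
    # After: compare raw bytes; the caller encodes the name once.
    return source_bytes[start:end] == name_bytes


# Hypothetical sample with a multibyte identifier.
source = "class Größe {}".encode("utf8")
name = "Größe"
name_bytes = name.encode("utf8")  # encoded once, reused for every comparison
start = source.index(name_bytes)
end = start + len(name_bytes)
```

For valid UTF-8 input the two checks agree, since equal strings encode to equal bytes and vice versa (no Unicode normalization is applied in either path).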
**Performance Analysis by Test Case:**
- Simple cases show modest improvements (0-2μs) because there is little traversal overhead to remove
- Nested/deep traversals show dramatic gains (e.g., 211% on 300-depth tree) where recursion elimination matters most
- UTF-8 handling improves 25% by avoiding repeated decode operations
- A few edge cases show minor regression (1-6% slower) due to stack manipulation overhead, but these are dwarfed by gains in realistic workloads
The optimization preserves exact behavior, including traversal order (pushing children in reverse maintains left-to-right DFS), return types, and edge case handling, while delivering significant runtime improvements, especially for deep syntax trees, a common scenario when parsing Java source code.
1 parent 41b08a9 commit c14c7ae
1 file changed
Lines changed: 25 additions & 17 deletions
| Original file lines | New file lines | Change |
|---|---|---|
| 19-24 | 19-30 | 6 lines added (new lines 22-27) |
| 253-275 | 259-283 | 17 lines removed (original lines 256-272), 19 lines added (new lines 262-280) |