Skip to content

Commit c14c7ae

Browse files
Optimize _find_type_node
The optimized code achieves a **143% speedup** (from 169μs to 69.3μs) through three key performance improvements: **1. Eliminated Dictionary Reconstruction Overhead (25.3% of original time)** The original code rebuilt the `type_declarations` dictionary on every recursive call (329 times in the profiler). By hoisting it to a module-level constant `TYPE_DECLARATIONS`, this overhead is completely eliminated. The profiler shows ~340μs spent creating this dict repeatedly in the original version. **2. Replaced Recursion with Iterative Stack-Based DFS** The original code made 319 recursive calls (`_find_type_node(child, type_name, source_bytes)` at 638μs). Each recursive call incurs Python function call overhead including frame creation, argument passing, and local variable setup. The optimized version uses an explicit stack to traverse the tree iteratively, eliminating this overhead entirely. This is especially impactful in the deep tree test case, which shows **211% speedup** (146μs → 47μs). **3. Direct Byte Comparison Instead of UTF-8 Decoding** The original code decoded byte slices to strings 12 times (`source_bytes[...].decode("utf8")` at 13μs). The optimized version encodes `type_name` to bytes once at function entry (10.7μs for 10 calls), then performs direct byte-to-byte comparison without any decoding. This is particularly effective for multibyte UTF-8 names, as shown in the UTF-8 test case with **25.1% speedup** (2.24μs → 1.79μs). **Performance Analysis by Test Case:** - Simple cases show modest improvements (0-2μs) due to lower overhead - Nested/deep traversals show dramatic gains (e.g., 211% on 300-depth tree) where recursion elimination matters most - UTF-8 handling improves 25% by avoiding repeated decode operations - A few edge cases show minor regression (1-6% slower) due to stack manipulation overhead, but these are dwarfed by gains in realistic workloads The optimization preserves exact behavior including traversal order (reversed children maintain left-to-right DFS), return types, and edge case handling while delivering significant runtime improvements especially for deep syntax trees—a common scenario when parsing Java source code.
1 parent 41b08a9 commit c14c7ae

1 file changed

Lines changed: 25 additions & 17 deletions

File tree

codeflash/languages/java/context.py

Lines changed: 25 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,12 @@
1919
if TYPE_CHECKING:
2020
from tree_sitter import Node
2121

22+
TYPE_DECLARATIONS = {
23+
"class_declaration": "class",
24+
"interface_declaration": "interface",
25+
"enum_declaration": "enum",
26+
}
27+
2228
logger = logging.getLogger(__name__)
2329

2430

@@ -253,23 +259,25 @@ def _find_type_node(node: Node, type_name: str, source_bytes: bytes) -> tuple[No
253259
Tuple of (node, type_kind) where type_kind is "class", "interface", or "enum".
254260
255261
"""
256-
type_declarations = {
257-
"class_declaration": "class",
258-
"interface_declaration": "interface",
259-
"enum_declaration": "enum",
260-
}
261-
262-
if node.type in type_declarations:
263-
name_node = node.child_by_field_name("name")
264-
if name_node:
265-
node_name = source_bytes[name_node.start_byte : name_node.end_byte].decode("utf8")
266-
if node_name == type_name:
267-
return node, type_declarations[node.type]
268-
269-
for child in node.children:
270-
result, kind = _find_type_node(child, type_name, source_bytes)
271-
if result:
272-
return result, kind
262+
# Encode the search name once to avoid repeated UTF-8 decodes of slices.
263+
type_name_bytes = type_name.encode("utf8")
264+
265+
# Use an explicit stack for DFS to avoid recursion overhead.
266+
stack: list[Node] = [node]
267+
268+
while stack:
269+
current = stack.pop()
270+
if current.type in TYPE_DECLARATIONS:
271+
name_node = current.child_by_field_name("name")
272+
if name_node:
273+
# Compare bytes directly to avoid decoding the slice to str.
274+
if source_bytes[name_node.start_byte : name_node.end_byte] == type_name_bytes:
275+
return current, TYPE_DECLARATIONS[current.type]
276+
277+
# Push children in reverse order so that the leftmost child is processed first,
278+
# preserving the original recursive traversal order.
279+
for child in reversed(current.children):
280+
stack.append(child)
273281

274282
return None, ""
275283

0 commit comments

Comments
 (0)