Optimize _add_behavior_instrumentation

codeflash-ai[bot] · web-flow · commit 464967e0622f · 2026-02-20T10:00:31.000Z
The optimized code achieves a **196% speedup** (from 13.3ms to 4.49ms) through two focused optimizations that target the hottest paths identified by the line profiler.

## Key Optimizations

### 1. Early Exit in `wrap_target_calls_with_treesitter` (Primary Driver)

The profiler shows that in the original code, 55.5% of `wrap_target_calls_with_treesitter`'s time (9.7ms out of 17.5ms) was spent in `_collect_calls`, which parses Java code with tree-sitter. The optimization adds:

```python
body_text = "\n".join(body_lines)
if func_name not in body_text:
    return list(body_lines), 0
```

This string membership check skips the tree-sitter parse entirely when the target function's name does not appear in the test method body. Many test methods never call the function being instrumented, so the check removes the parse for a large share of inputs. The annotated tests confirm this pattern: tests whose bodies contain no call to the target function show the largest speedups, 639% for large methods and 1018% for complex expressions.

### 2. Optimized `_is_test_annotation` (Secondary Improvement)

The profiler shows `_is_test_annotation` being called 1,950 times, spending 100% of its time (1.21ms) on regex matching. The optimization replaces the regex with direct string checks:

```python
if not stripped_line.startswith("@Test"):
    return False
if len(stripped_line) == 5:  # exactly "@Test"
    return True
next_char = stripped_line[5]
return next_char == " " or next_char == "("
```

This avoids regex overhead for the 1,737 non-`@Test` annotations, which are rejected immediately by `startswith()`. The profiler shows the time spent in this function dropping from 1.21ms to 0.91ms, about 25% less.

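The string-based check can be compared against a regex on a handful of inputs. The sketch below is illustrative only: the real `_TEST_ANNOTATION_RE` pattern is not shown in this commit, so `ASSUMED_TEST_ANNOTATION_RE`, the two function names, and `SAMPLES` are hypothetical stand-ins.

```python
import re

# Hypothetical stand-in: the actual _TEST_ANNOTATION_RE pattern is not part of
# this commit, so this pattern is an assumption made for illustration.
ASSUMED_TEST_ANNOTATION_RE = re.compile(r"@Test(?:[ (]|\Z)")


def is_test_annotation_regex(stripped_line: str) -> bool:
    """Regex-based check, using the assumed pattern above."""
    return bool(ASSUMED_TEST_ANNOTATION_RE.match(stripped_line))


def is_test_annotation_fast(stripped_line: str) -> bool:
    """String-based check mirroring the logic added in the diff."""
    if not stripped_line.startswith("@Test"):
        return False
    if len(stripped_line) == 5:  # exactly "@Test"
        return True
    return stripped_line[5] in (" ", "(")


# Illustrative inputs, already stripped of surrounding whitespace.
SAMPLES = [
    "@Test",
    "@Test(timeout = 100)",
    "@Test public void addsNumbers() {",
    "@TestFactory",
    "@TestTemplate",
    "@Override",
    "@test",
    "@Tes",
    "",
]


def implementations_agree(samples: list) -> bool:
    """Return True when both checks give the same answer on every sample."""
    return all(
        is_test_annotation_regex(s) == is_test_annotation_fast(s) for s in samples
    )
```

Under this assumed pattern, `@TestFactory` and `@TestTemplate` are rejected by both versions. Whether the real regex accepts other whitespace after `@Test` (a tab, for example) cannot be confirmed from the commit; if it does, the string check would differ on those inputs.
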
## Performance Impact by Test Type

The annotated tests show that the optimization's effectiveness varies by workload:

- **Empty/simple methods**: 107-154% faster (early exit dominates)
- **Methods with complex expressions**: 396-1018% faster (avoids parsing large expression trees)
- **Large methods with many statements**: 510-639% faster (early exit + reduced AST traversal)
- **Methods with actual function calls**: 111-152% faster (smaller benefit since tree-sitter must run)

## Context and Production Impact

Based on `function_references`, this function is called from test discovery in `test_instrumentation.py`, specifically for behavior instrumentation that captures return values. The early exit optimization is particularly valuable here because:

1. Test discovery processes many test methods, but typically only a subset call the target function
2. The function operates on the hot path during test suite instrumentation
3. Large test suites with 100+ test methods (see test case showing 154% speedup for 150 methods) benefit significantly

The optimization maintains correctness: all test cases pass with identical output, confirming that the early exit safely bypasses work that produces no changes when the function isn't present.

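The correctness argument for the early exit rests on the substring check being conservative: a call to the target must contain its name, so a missing name rules out any call, while a spurious match (a comment, a longer identifier) only costs one parse. A minimal sketch of that guard, assuming a hypothetical `wrap_with_early_exit` helper and a counting stand-in for the tree-sitter step (neither exists in the codebase):

```python
from typing import Callable, List, Tuple

# Signature assumed for the expensive step: (body_lines, func_name) -> (lines, wrapped_count).
WrapFn = Callable[[List[str], str], Tuple[List[str], int]]


def wrap_with_early_exit(
    body_lines: List[str], func_name: str, expensive_wrap: WrapFn
) -> Tuple[List[str], int]:
    """Skip the expensive step when func_name cannot occur in the body."""
    body_text = "\n".join(body_lines)
    if func_name not in body_text:
        # No occurrence of the name means no call to wrap: return a copy unchanged.
        return list(body_lines), 0
    return expensive_wrap(body_lines, func_name)


def make_counting_wrapper() -> Tuple[WrapFn, List[str]]:
    """Build a hypothetical stand-in for the parser step that records each invocation."""
    calls: List[str] = []

    def fake_expensive_wrap(body_lines: List[str], func_name: str) -> Tuple[List[str], int]:
        calls.append(func_name)
        # Naive count of lines that appear to call func_name; no real parsing is done.
        wrapped = sum(1 for line in body_lines if f"{func_name}(" in line)
        return list(body_lines), wrapped

    return fake_expensive_wrap, calls
```

With this stand-in, a body such as `["// address lookup"]` searched for `add` still reaches the expensive step, because `add` is a substring of `address`, and reports zero wrapped calls. The guard never skips a body that contains a real call; it only fails to skip some bodies that do not.
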
diff --git a/codeflash/languages/java/instrumentation.py b/codeflash/languages/java/instrumentation.py
@@ -77,7 +77,12 @@ def _is_test_annotation(stripped_line: str) -> bool:
         @TestFactory
         @TestTemplate
     """
-    return bool(_TEST_ANNOTATION_RE.match(stripped_line))
+    if not stripped_line.startswith("@Test"):
+        return False
+    if len(stripped_line) == 5:
+        return True
+    next_char = stripped_line[5]
+    return next_char == " " or next_char == "("
 
 
 def _is_inside_lambda(node: Any) -> bool:
@@ -152,8 +157,11 @@ def wrap_target_calls_with_treesitter(
     """
     from codeflash.languages.java.parser import get_java_analyzer
 
-    analyzer = get_java_analyzer()
     body_text = "\n".join(body_lines)
+    if func_name not in body_text:
+        return list(body_lines), 0
+
+    analyzer = get_java_analyzer()
     body_bytes = body_text.encode("utf8")
     prefix_len = len(_TS_BODY_PREFIX_BYTES)