fix: Java line profiler timeout and test categorization #1470
Merged
mashraf-222 merged 1 commit on Feb 12, 2026
Fixed two critical bugs preventing Java optimization E2E workflows:

Issue 1: Line profiler timeout was too short (15s) for Maven operations, causing timeouts before tests could complete. Maven needs time for JVM startup, dependency resolution, and test execution.

Issue 2: Test result categorization failed to match original test file names to instrumented test files, causing all existing unit tests to show as 0 passed/failed instead of their actual results.

Both issues blocked Java optimization from completing successfully.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problems Fixed
Issue 1: Line Profiler Maven Timeout (15 seconds)
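Java line profiler tests were timing out after 15 seconds during Maven test execution. Maven requires significantly more time for JVM startup, dependency resolution, and test execution.
Symptom: the line profiler run hit the timeout before the tests could complete.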
Impact: Line profiler could not complete, blocking optimization candidate generation and preventing E2E Java optimization workflows.
Issue 2: Test Result Categorization Failure
Test result parsing could not match original test file names (e.g., FibonacciTest.java) to their instrumented counterparts (FibonacciTest__perfinstrumented.java).
Symptom: All existing unit tests showed as "Passed: 0, Failed: 0" instead of their actual results.
Impact: The tests appeared not to be running when they actually were; their results were miscategorized as "Generated Regression Tests".
Root Causes
Issue 1: Inadequate Timeout Logic
Location: codeflash/languages/java/test_runner.py:1516
The line profiler function chose its timeout with the or operator. When timeout=15 was passed from upstream callers (from the INDIVIDUAL_TESTCASE_TIMEOUT constant), this evaluated to 15 instead of ensuring a minimum of 120 seconds. The or operator only provides a default when the value is None or otherwise falsy, not when it is a small positive integer.
Issue 2: Missing Fallback Matching
Location: codeflash/verification/parse_test_output.py:1069
JUnit XML results reference the original class name (e.g., FibonacciTest), which resolves to the original file path (FibonacciTest.java). However, the test type lookup only searched by instrumented file paths. The system had a get_test_type_by_original_file_path() method available but was not using it as a fallback.
Solutions Implemented
Fix 1: Enforce Minimum Timeout (120s)
File: codeflash/languages/java/test_runner.py
Lines: 1511-1517
Changed the timeout logic to enforce a minimum of 120 seconds, using max() with a min_timeout value instead of or.
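The difference between the two patterns can be sketched as follows. This is a minimal illustration and not the code from test_runner.py: the function names and the MIN_MAVEN_TIMEOUT constant are hypothetical, and the handling of a None timeout is an assumption.

```python
from typing import Optional

# Hypothetical name; stands in for the 120-second minimum described in this PR.
MIN_MAVEN_TIMEOUT = 120


def timeout_with_or(timeout: Optional[int]) -> int:
    """Default-only pattern: `or` replaces the value only when it is None or falsy."""
    return timeout or MIN_MAVEN_TIMEOUT


def timeout_with_floor(timeout: Optional[int]) -> int:
    """Floor pattern: the result is never below MIN_MAVEN_TIMEOUT."""
    # Assumption: a missing timeout falls back to the minimum itself.
    requested = timeout if timeout is not None else MIN_MAVEN_TIMEOUT
    return max(requested, MIN_MAVEN_TIMEOUT)


if __name__ == "__main__":
    for value in (None, 15, 300):
        print(value, timeout_with_or(value), timeout_with_floor(value))
```

With a small positive value such as 15, the default-only pattern passes it through unchanged, while the floor pattern raises it to the minimum. Larger values such as 300 are left alone by both.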
With an upstream value of 15, max(15, 120) = 120, ensuring Maven always gets sufficient time regardless of upstream timeout values.
Fix 2: Add Original File Path Fallback
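File: codeflash/verification/parse_test_output.py
Lines: 1069-1074
Added a two-stage lookup: the parser first searches by instrumented file path and, if that finds nothing, falls back to get_test_type_by_original_file_path().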
This allows JUnit XML results referencing original class names to correctly map to their test type.
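A two-stage lookup of this kind might look roughly like the sketch below. Only the method name get_test_type_by_original_file_path() comes from this PR; its signature, the TestKind enum, the TestFileRegistry class, the instrumented-path lookup method, and resolve_test_type() are all hypothetical stand-ins for illustration.

```python
from enum import Enum
from pathlib import Path
from typing import Dict, Optional


class TestKind(Enum):
    """Hypothetical test categories."""
    EXISTING_UNIT_TEST = "existing_unit_test"
    GENERATED_REGRESSION = "generated_regression"


class TestFileRegistry:
    """Hypothetical registry that indexes test files by both of their paths."""

    def __init__(self) -> None:
        self._by_instrumented: Dict[Path, TestKind] = {}
        self._by_original: Dict[Path, TestKind] = {}

    def add(self, original: Path, instrumented: Path, kind: TestKind) -> None:
        self._by_original[original] = kind
        self._by_instrumented[instrumented] = kind

    def get_test_type_by_instrumented_file_path(self, path: Path) -> Optional[TestKind]:
        return self._by_instrumented.get(path)

    def get_test_type_by_original_file_path(self, path: Path) -> Optional[TestKind]:
        # Assumed signature; the PR only names this method.
        return self._by_original.get(path)


def resolve_test_type(registry: TestFileRegistry, path: Path) -> Optional[TestKind]:
    """Stage 1: instrumented path. Stage 2: fall back to the original path."""
    kind = registry.get_test_type_by_instrumented_file_path(path)
    if kind is None:
        kind = registry.get_test_type_by_original_file_path(path)
    return kind


registry = TestFileRegistry()
registry.add(
    original=Path("FibonacciTest.java"),
    instrumented=Path("FibonacciTest__perfinstrumented.java"),
    kind=TestKind.EXISTING_UNIT_TEST,
)
```

In this sketch, a path derived from the JUnit XML class name (FibonacciTest.java) misses the first lookup and is resolved by the second, while a path that matches neither index still yields None.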
Code Changes
codeflash/languages/java/test_runner.py (+6 lines)
- min_timeout variable (120s)
- max() instead of or
codeflash/verification/parse_test_output.py (+5 lines)
- get_test_type_by_original_file_path()
Testing
Test Environment
- /home/ubuntu/code/codeflash/code_to_optimize/java/
- Fibonacci.java with existing FibonacciTest.java (11 unit tests)
- uv run codeflash --file src/main/java/com/example/Fibonacci.java --no-pr --yes --verbose
Results Before Fixes
Issue 1 (Timeout):
Timeout: 15 seconds ❌
Issue 2 (Categorization):
Results After Fixes
Issue 1 (Timeout):
Timeout: 120 seconds ✅ (8x longer, proper Maven timeout)
Issue 2 (Categorization):
Verification:
Impact
These fixes unblock Java optimization E2E workflows.
Note: The line profiler may still time out on slow machines or large projects, but it now has a proper baseline timeout that works for typical Maven operations. The 120s minimum aligns with other Maven timeout minimums already established in the codebase (see run_behavioral_tests at line 322).
Related Work
- JAVA_TESTCASE_TIMEOUT constant (120s) from config_consts.py
- run_behavioral_tests() function
- get_test_type_by_original_file_path() method (no new code needed)