Commit 52ec0a9
authored
Optimize JavaAssertTransformer._infer_type_from_assertion_args
Runtime improvement (primary): the optimized version runs ~11% faster overall (10.3ms -> 9.23ms). Line-profiles show the hot work (argument splitting and literal checks) is measurably reduced.
What changed (concrete):
- Added a fast-path in _split_top_level_args: if the args string contains none of the "special" delimiters (quotes, braces, parens), we skip the character-by-character parser and return either args_str.split(",") or [args_str].
- Moved several literal/cast regexes into __init__ as precompiled attributes (self._FLOAT_LITERAL_RE, self._DOUBLE_LITERAL_RE, self._LONG_LITERAL_RE, self._INT_LITERAL_RE, self._CHAR_LITERAL_RE, self._cast_re) and replaced re.match(...) for casts with self._cast_re.match(...).
Why this speeds things up:
- str.split is implemented in C and is orders of magnitude faster than a Python-level loop that iterates characters, manages stack depth, and joins fragments. The fast-path catches the common simple cases (no nested parentheses/quotes/generics) and lets the interpreter use the highly-optimized C split, which is why very large comma-separated inputs show the biggest wins (e.g., the 1000-arg test goes from ~1.39ms to ~67.5μs).
- Precompiling regexes removes repeated compilation overhead and lets .match be executed directly on a compiled object. The original code used re.match(...) in-place for cast detection which implicitly compiles the pattern or goes through the module-level cache; using a stored compiled pattern is cheaper and eliminates that runtime cost.
- Combined, these changes reduce the time spent inside _split_top_level_args and _type_from_literal (the line profilers show reduced wall time for those functions), producing the measured global runtime improvement.
Behavioral/compatibility notes:
- The fast-path preserves original behavior: when no special delimiter is present it simply splits on commas (or returns a single entry), otherwise it falls back to the full, safe parser that respects nested delimiters and strings.
- Some microbenchmarks regress slightly (a few single-case timings in the annotated tests are a bit slower); this is expected because we add a small _special_re.search check for every call. The overall trade-off was accepted because it yields substantial savings in the common and expensive cases (especially large/simple comma-separated argument lists).
- The optimization is most valuable when this function is exercised many times or on long/simple argument lists (hot paths that produce many simple comma-separated tokens). It is neutral or slightly negative for a handful of small or highly-nested inputs, but those are rare in the benchmarks.
Tests and workload guidance:
- Big wins: large-scale, many-argument inputs or many repeated calls where arguments are simple comma-separated literals (annotated tests show up to ~20x speedups for such cases).
- No/low impact: complex first arguments with nested parentheses/generics or many quoted strings — the safe parser still runs there, so correctness is preserved; timings remain similar.
- Small regressions: a few microbench cases (very short inputs or certain char-literal checks) are marginally slower due to the extra quick search, but these regressions are small relative to the global runtime improvement.
Summary:
By routing simple/common inputs to str.split (C-level speed) and eliminating per-call regex compilation for literal/cast detection, the optimized code reduces time in the hot parsing and literal-detection paths, producing the observed ~11% runtime improvement while maintaining correctness for nested/quoted input via the fallback parser.1 parent 342a9c5 commit 52ec0a9
1 file changed
Lines changed: 19 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
198 | 198 | | |
199 | 199 | | |
200 | 200 | | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
201 | 210 | | |
202 | 211 | | |
203 | 212 | | |
| |||
972 | 981 | | |
973 | 982 | | |
974 | 983 | | |
975 | | - | |
| 984 | + | |
976 | 985 | | |
977 | 986 | | |
978 | 987 | | |
979 | 988 | | |
980 | 989 | | |
981 | 990 | | |
| 991 | + | |
| 992 | + | |
| 993 | + | |
| 994 | + | |
| 995 | + | |
| 996 | + | |
| 997 | + | |
| 998 | + | |
| 999 | + | |
982 | 1000 | | |
983 | 1001 | | |
984 | 1002 | | |
| |||
0 commit comments