Commit ce1f8da
Optimize get_optimized_code_for_module
Runtime improvement (primary): the optimized version reduces end-to-end runtime from 48.6 ms to 43.5 ms, an 11% overall speedup. Many hot-call scenarios (repeated lookups, large mappings, and bulk iterations) see much larger per-case gains: up to ~80% on repeated calls and ~50% on some large-map lookups in the annotated tests.
What changed (concrete optimizations)
- CodeStringsMarkdown.file_to_path:
  - Replaced a two-step .get(...) / indexing pattern with a single try/except KeyError around self._cache["file_to_path"]. This avoids multiple dict lookups and branches when the cache exists.
  - Builds and caches the mapping only on the KeyError path, so a successful fast-path return is a single dict access.
- get_optimized_code_for_module:
  - Compute str(relative_path) once (str_relative) and reuse it instead of calling str(...) repeatedly.
  - Avoid constructing full lists of keys and Path objects when searching for similar filenames:
    - Iterate file_to_code_context keys directly (no temporary available_files list unless needed).
    - Use os.path.basename(f) instead of Path(f).name to avoid allocating Path objects; os.path.basename is a small function that only does string operations (find the last separator, slice), which is much cheaper for simple basename extraction.
  - Defer construction of available_files (list(file_to_code_context.keys())) until it is actually needed for logging, avoiding unnecessary allocations in the common case.
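The cache change described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the class name CodeStringsSketch, the code_strings layout, and the builds counter are all hypothetical, and only the try/except KeyError shape around self._cache["file_to_path"] comes from the description.

```python
class CodeStringsSketch:
    """Hypothetical stand-in for CodeStringsMarkdown; only the caching shape matters."""

    def __init__(self, code_strings):
        # Assumed layout: a list of (file_path, code) pairs.
        self.code_strings = code_strings
        self._cache = {}
        self.builds = 0  # demo-only counter: how many times the mapping was built

    def file_to_path(self):
        try:
            # Fast path: one dict access when the cache is already populated.
            return self._cache["file_to_path"]
        except KeyError:
            # Slow path: build the mapping once, store it, and return it.
            self.builds += 1
            mapping = {str(path): path for path, _code in self.code_strings}
            self._cache["file_to_path"] = mapping
            return mapping


cs = CodeStringsSketch([("pkg/a.py", "x = 1"), ("pkg/b.py", "y = 2")])
first = cs.file_to_path()   # builds and caches the mapping
second = cs.file_to_path()  # served from the cache
print(first is second, cs.builds)
```

The try/except form pays for exception handling only on the first call; every later call is a plain subscript with no extra membership test.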
Why this speeds things up (technical reasons)
- Less Python-level work and fewer allocations: the original code performed more dict lookups, created temporary lists, and built many Path objects inside a list comprehension. Each Path(...) call allocates a Python object and runs path-parsing logic, which is expensive in hot loops. The optimized code constructs fewer objects and takes fewer interpreter-level branches.
- Fewer lookups: switching to try/except for the cached value reduces the number of dictionary key operations on the hot path (successful cache hit path becomes a single access).
- Cheaper basename extraction: os.path.basename only performs string operations (locate the last separator, slice) and returns a str, so it avoids constructing a Path object for each candidate, which lowers per-iteration overhead when scanning many keys.
- Deferred work: only produce heavy values (available_files list) when we actually need them for a warning/debug path, so the common successful-case remains minimal.
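The points above can be illustrated with a small lookup function. This is a sketch under stated assumptions rather than the actual get_optimized_code_for_module: the function name find_code, the log parameter, the message wording, and the empty-string fallback are hypothetical. Only the techniques (compute str(relative_path) once, iterate keys directly, use os.path.basename, build available_files lazily) are taken from the description.

```python
import os.path


def find_code(file_to_code_context, relative_path, log=print):
    """Hypothetical lookup with a similar-filename warning on a miss."""
    str_relative = str(relative_path)  # computed once and reused below
    code = file_to_code_context.get(str_relative)
    if code is not None:
        return code  # common case: no scan and no extra allocations

    target_name = os.path.basename(str_relative)
    # Iterate the dict keys directly: no temporary key list, no Path objects.
    similar = [f for f in file_to_code_context if os.path.basename(f) == target_name]
    if similar:
        log(f"No exact match for {str_relative}; similar files: {similar}")
    else:
        # The full key list is built only here, purely for the log message.
        available_files = list(file_to_code_context)
        log(f"No match for {str_relative}; available files: {available_files}")
    return ""  # assumed fallback value for this sketch


context = {"src/a.py": "A = 1", "lib/b.py": "B = 2"}
messages = []
hit = find_code(context, "src/a.py", log=messages.append)
miss = find_code(context, "other/b.py", log=messages.append)
```

In this sketch the hit path touches the dict once and never reaches the scan, which is the case the deferred construction is meant to keep cheap.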
How this affects existing workloads (based on tests and likely hot paths)
- Big wins when the function is called many times or the mapping is large:
  - Repeated calls to the same path (hot path) benefit heavily because file_to_path cache access and the simple get(...) are cheap.
  - Large mappings where we occasionally scan keys for similarity gain because we avoid Path allocations and unnecessary list construction.
- Minimal/zero impact for simple single-shot calls where no scanning occurs beyond the direct dict get.
- A few tests show micro-regressions (~0–2% slower in isolated cases). These are tiny and reasonable trade-offs for the improved aggregate runtime and much larger wins on hot workloads; a single extra function call or slightly different branching can explain differences of this size.
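To check the basename claim on a given machine, a micro-benchmark along these lines can be used. It is a rough sketch: the key pattern, the 1000-entry size, and the repeat count are arbitrary assumptions, and the timings it prints vary by interpreter version and platform, so no expected numbers are given.

```python
import os.path
import timeit
from pathlib import Path

# Hypothetical key set resembling a large file_to_code_context mapping.
keys = [f"pkg/sub{i}/module_{i}.py" for i in range(1000)]


def names_via_path():
    # Original approach: one Path object allocated per key.
    return [Path(f).name for f in keys]


def names_via_basename():
    # Optimized approach: string operations only.
    return [os.path.basename(f) for f in keys]


# Both approaches must agree before their timings are worth comparing.
same_names = names_via_path() == names_via_basename()

t_path = timeit.timeit(names_via_path, number=20)
t_base = timeit.timeit(names_via_basename, number=20)
print(f"Path(f).name:        {t_path:.4f} s")
print(f"os.path.basename(f): {t_base:.4f} s")
```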
Behavioral/key-dependency notes
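- Semantics preserved: fallback logic, similarity detection and logging behavior remain functionally the same. The only behavioral change is the internal ordering of checks and how we detect basenames; that produces equivalent results for path strings that name a file. The two differ only for a path that ends in a separator, where os.path.basename returns an empty string and Path(...).name returns the last component.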
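The equivalence claim can be checked directly. The sketch below uses posixpath and PurePosixPath so the result does not depend on the host platform (os.path is posixpath on POSIX systems); the sample paths are made up for illustration.

```python
import posixpath
from pathlib import PurePosixPath

# Typical file paths: both approaches return the same basename.
samples = ["pkg/module.py", "module.py", "a/b/c/d.txt"]
agree = all(posixpath.basename(s) == PurePosixPath(s).name for s in samples)

# Edge case: a trailing separator is where the two approaches diverge.
trailing = "pkg/subdir/"
basename_result = posixpath.basename(trailing)   # empty string
path_name_result = PurePosixPath(trailing).name  # last component, "subdir"

print(agree, repr(basename_result), repr(path_name_result))
```

If the keys of file_to_code_context are always file paths, the edge case does not arise and the swap is behavior-preserving.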
- New import of os is local and trivial; no new external dependencies.
Which test cases benefit most (from annotated_tests)
- Repeated-calls and large-map iteration tests show the largest improvements (repeated_calls_use_cached_file_to_path, large_mapping_retrieve_multiple_entries, and the large-map loop).
- Tests that exercise the “scan for similar filename” logic also improve because os.path.basename avoids Path allocations across many keys (large_scale_many_entries_similar_filenames_detected_among_many).
- A few single-call tests show negligible change or very small regressions, which is an acceptable trade-off given the substantial wins on hot paths.
Summary
- Primary win: 11% overall runtime reduction (with much larger wins on hot paths).
- How: reduce dict lookups, avoid temporary lists, eliminate Path(...) allocations in tight loops, reuse computed strings, and defer expensive work.
- Trade-offs: minor micro-regressions in a couple of edge micro-benchmarks, but these are acceptable given the improved throughput and much larger gains where it matters (repeated and large-scale calls).
1 parent ea66451 commit ce1f8da
2 files changed
Lines changed: 17 additions & 9 deletions
Lines changed: 11 additions & 4 deletions
File 1 (11 additions & 4 deletions): hunks at original lines 21-26 and 557-581 (new lines 21-27 and 558-588). Diff content not preserved.
File 2 (6 additions & 5 deletions): hunk at original lines 331-342 (new lines 331-343). Diff content not preserved.