
batch optimizations. #2055

Merged
KRRT7 merged 18 commits into main from perf/defer-cli-imports
Apr 10, 2026

@KRRT7 KRRT7 commented Apr 10, 2026

Summary

Systematic startup optimization across the CLI command paths:

  1. Defer heavy module-level imports in cli.py, main.py, env_utils.py, shell_utils.py, cmd_auth.py, and models.py: move Rich, libcst, pydantic, isort, tomlkit, sentry_sdk, and other heavy libraries from module level into the functions that use them
  2. Early-exit command dispatch: auth and compare commands skip banner, telemetry, and version check entirely
  3. Inline trivial helpers: is_LSP_enabled() (one-liner env var check) inlined to avoid importing lsp.helpers on the happy path
  4. Backport libcst visitor dispatch cache from codeflash-python: caches MatcherDecoratableTransformer/Visitor dispatch tables by class type (24x faster per instantiation)
  5. Fix ruff auto-format workflow accidentally rewriting version.py placeholders via uv-dynamic-versioning
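
As a minimal sketch of the deferred-import pattern in item 1 (not code from this PR), the function below imports its dependency inside the body, so the cost is paid on first call rather than at module import. The function name is hypothetical and `colorsys` is only a lightweight stand-in for a heavy library such as Rich or libcst.

```python
import sys


def render_color_report(rgb: tuple[float, float, float]) -> str:
    """Format an RGB triple as HLS, loading the helper module lazily."""
    # Deferred import: the module is loaded on the first call, not when this
    # file is imported. `colorsys` is a stand-in for a heavy dependency.
    import colorsys

    hue, light, sat = colorsys.rgb_to_hls(*rgb)
    return f"h={hue:.2f} l={light:.2f} s={sat:.2f}"


if __name__ == "__main__":
    print("loaded before call:", "colorsys" in sys.modules)
    print(render_color_report((1.0, 0.0, 0.0)))
    print("loaded after call:", "colorsys" in sys.modules)
```

After the first call the module sits in `sys.modules`, so repeated calls only pay for a dictionary lookup.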

A/B Benchmark (Azure Standard_D4s_v5, Python 3.13, hyperfine --warmup 3 --min-runs 30)

| Command | main | optimize | Speedup |
|---|---|---|---|
| --version | 53ms | 53ms | unchanged (already fast-pathed) |
| --help | 342ms | 73ms | 4.7x |
| compare --help | 326ms | 73ms | 4.5x |
| auth status | 983ms | 159ms | 6.2x |

Module import times

| Module | Before | After | Speedup |
|---|---|---|---|
| models.py | 633ms | 125ms | 5.1x |

libcst visitor dispatch cache

| Metric | Without cache | With cache | Speedup |
|---|---|---|---|
| Per instantiation | 28ms | 1.2ms | 24x |

Test plan

  • codeflash --version / --help / auth status / compare --help all work correctly
  • check_formatter_installed() and get_codeflash_api_key() work with deferred imports
  • from codeflash.models.models import TestResults, CodeString, CoverageData works
  • libcst cache monkeypatch installs correctly and caches dispatch tables
  • ruff check passes clean
  • CI passes (full test suite)

KRRT7 and others added 4 commits April 9, 2026 23:08
Move heavy module-level imports in cli.py (console, env_utils,
code_utils, config_parser, lsp.helpers, version) into the functions
that actually use them. Split main.py imports so parse_args() is
called before loading the full stack — --help exits via argparse
before any heavy modules load.

Benchmark (Azure Standard_D4s_v5, Python 3.13, hyperfine --min-runs 30):
  --help: 297ms → 39ms (7.7x faster)
  --version: 17ms (unchanged)

Restructure main() command dispatch so auth and compare exit early
without loading telemetry (sentry, posthog), version_check, or the
banner. Defer cmd_auth.py imports into functions.

auth status: ~1000ms → 237ms (4.2x)
compare --help: ~297ms → 38ms (7.9x)
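
The two commits above (parse arguments first, then return early for lightweight commands) can be sketched roughly as follows. This is an illustration under assumptions, not the actual `main.py`: the program name, subcommands, and helper functions are hypothetical, and `json` stands in for the heavy stack (telemetry, banner, version check).

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example-cli")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("auth", help="lightweight command")
    sub.add_parser("optimize", help="command that needs the full stack")
    return parser


def run_auth() -> str:
    # Lightweight path: returns before anything heavy is imported.
    return "auth: ok"


def run_optimize() -> str:
    # Heavy path: the import cost is paid only when this command runs.
    import json  # stand-in for telemetry, banner, version check, etc.

    return "optimize: " + json.dumps({"loaded": True})


def main(argv: list[str] | None = None) -> str:
    # Parse first: `--help` exits inside argparse before any heavy import.
    args = build_parser().parse_args(argv)
    if args.command == "auth":
        return run_auth()  # early exit, heavy modules never load
    if args.command == "optimize":
        return run_optimize()
    return "no command"
```

The ordering is the point of the design: anything that can finish using only argparse output does so before the expensive imports are reached.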
@KRRT7 KRRT7 force-pushed the perf/defer-cli-imports branch from d3f10a6 to a019fdc April 10, 2026 04:20
@github-actions github-actions Bot added the workflow-modified label (This PR modifies GitHub Actions workflows) Apr 10, 2026
uv-dynamic-versioning rewrites version.py on every `uv run`, so the
ruff auto-format job was inadvertently committing dev version strings.
Restore version.py files after formatting and revert the ones already
changed on this branch.
@KRRT7 KRRT7 force-pushed the perf/defer-cli-imports branch from a019fdc to 992e91a April 10, 2026 04:21
KRRT7 and others added 3 commits April 9, 2026 23:29
Defer console, formatter, code_utils, registry, and lsp.helpers imports
from module level into the functions that use them. Inline is_LSP_enabled
(a one-liner env var check) to avoid importing lsp.helpers on the happy
path of get_codeflash_api_key.

auth status: 237ms → 160ms on Azure Standard_D4s_v5.
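
A rough sketch of the inlining idea in this commit, assuming the helper really is a single environment-variable comparison: the check is written directly at the call site, so the happy path never imports the helper's module. The function and both variable names (`EXAMPLE_API_KEY`, `EXAMPLE_LSP_MODE`) are hypothetical, not the ones codeflash uses.

```python
import os


def get_api_key() -> str:
    """Return the API key, touching the LSP flag only on the error path."""
    key = os.environ.get("EXAMPLE_API_KEY", "")  # hypothetical variable name
    if key:
        return key  # happy path: no helper module imported

    # Inlined one-liner instead of importing a helper module for this check.
    # EXAMPLE_LSP_MODE is a hypothetical variable name.
    if os.environ.get("EXAMPLE_LSP_MODE", "").lower() == "true":
        raise RuntimeError("API key missing (LSP mode)")
    raise RuntimeError("API key missing")
```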
Move libcst, rich.tree.Tree, console, comparator, code_utils, registry,
lsp.helpers, and LspMarkdownMessage from module-level to the methods that
use them. Only pydantic and TestType remain at module level (needed for
class definitions).

models.py import: 633ms → 125ms on Azure Standard_D4s_v5.
@KRRT7 KRRT7 force-pushed the perf/defer-cli-imports branch from d9e3e1d to 436d642 April 10, 2026 04:38
github-actions Bot and others added 2 commits April 10, 2026 04:39
Cache the visitor dispatch tables that libcst rebuilds on every
MatcherDecoratableTransformer/Visitor instantiation. The tables
depend only on the class, not the instance, so caching by type is
safe. Saves ~27ms per visitor instantiation (24x faster).

Also fix pre-existing ruff F821 in cli.py (missing exit_with_message
import in process_pyproject_config).
@KRRT7 KRRT7 force-pushed the perf/defer-cli-imports branch from 5922b4c to b533f50 April 10, 2026 04:46
KRRT7 added 8 commits April 9, 2026 23:59
Measures median wall-clock time for --version, --help, auth status,
and compare --help across 30 runs with 3 warmups.

Usage:
  codeflash compare main codeflash/optimize \
    --script "python benchmarks/bench_cli_startup.py" \
    --script-output benchmarks/results.json
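
The contents of `benchmarks/bench_cli_startup.py` are not shown in this conversation. A minimal sketch of the measurement it describes (median wall-clock time over N runs after a few warmups) might look like the following; the function name and defaults are assumptions.

```python
import statistics
import subprocess
import sys
import time


def median_wall_clock_ms(cmd: list[str], runs: int = 30, warmups: int = 3) -> float:
    """Return the median wall-clock time of `cmd` in milliseconds."""
    # Warmup runs populate caches (bytecode, filesystem) and are discarded.
    for _ in range(warmups):
        subprocess.run(cmd, check=True, capture_output=True)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Measures bare interpreter startup as a stand-in for a CLI command.
    cmd = [sys.executable, "-c", "pass"]
    print(f"{median_wall_clock_ms(cmd, runs=5, warmups=1):.1f} ms")
```

The median is less sensitive than the mean to the occasional slow run caused by unrelated system activity.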
The test only needs project_root, not a full Optimizer (which requires
an API key). Also adds missing __init__.py to tests/benchmarks/.

- test_benchmark_libcst_multi_file: discover_functions + get_code_optimization_context across 10 real source files
- test_benchmark_libcst_pipeline: full discover → extract → replace → merge pipeline on one file

Imports in cmd_auth.py were moved into function bodies, so mock
patches must target the source modules instead of cmd_auth's namespace.
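
To show why the patch target has to change, here is a self-contained sketch using two throwaway in-memory modules (`example_source` and `example_consumer`, both hypothetical). Once the import lives inside the function body, the consumer module no longer has the name as an attribute, so the patch has to be applied where the name is defined.

```python
import sys
import types
from unittest.mock import patch

# Build two throwaway modules in memory so the example is self-contained.
source = types.ModuleType("example_source")
source.get_token = lambda: "real-token"
sys.modules["example_source"] = source


def status() -> str:
    # Deferred import: the name is looked up on the source module per call.
    from example_source import get_token

    return "token=" + get_token()


consumer = types.ModuleType("example_consumer")
consumer.status = status
sys.modules["example_consumer"] = consumer

# Patching the consumer's namespace fails: it has no `get_token` attribute.
try:
    with patch("example_consumer.get_token", return_value="fake-token"):
        pass
    consumer_patch_failed = False
except AttributeError:
    consumer_patch_failed = True

# Patching the source module works, because the deferred import reads from it.
with patch("example_source.get_token", return_value="fake-token"):
    patched_result = consumer.status()
```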
Use O(1) frozenset membership test with type identity before falling
through to isinstance MRO traversal. Backported from codeflash-python.

5 scenarios: primitives, nested dicts, DB rows, deep nesting,
and identity types (frozenset/range/complex/Decimal/OrderedDict).
Move the 4 most common return-value types (str, list/tuple, dict) to
`orig_type is T` identity checks at the top of the dispatch chain,
before the frozenset lookup.  A single pointer comparison is cheaper
than a frozenset hash, and these types need special handling anyway
(temp-path normalization, recursive comparison, superset support).

Before: dict traversed ~8 isinstance checks before being handled.
After:  dict is handled at check #3 via `orig_type is dict`.

The isinstance fallbacks remain as slow-paths for subclasses (deque,
ChainMap, defaultdict, scipy dok_matrix, etc.).

Backported from codeflash-python dispatch ordering.
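
The dispatch ordering described in the last two comparator commits can be sketched as below. This is a simplified illustration, not the real comparator: the set of "identity types" is an assumption, the strict type-equality check is a simplification, and temp-path normalization and superset support are omitted.

```python
from collections import defaultdict, deque
from decimal import Decimal

# Exact types whose values can be compared with == (assumed set).
_IDENTITY_TYPES = frozenset({int, bool, float, complex, bytes, type(None), range, Decimal, frozenset})


def values_equal(orig: object, new: object) -> bool:
    orig_type = type(orig)
    if orig_type is not type(new):
        return False  # simplification: the real comparator may be more lenient

    # 1) Identity checks for the most common types: one pointer comparison each.
    if orig_type is str:
        return orig == new
    if orig_type is list or orig_type is tuple:
        return len(orig) == len(new) and all(values_equal(a, b) for a, b in zip(orig, new))
    if orig_type is dict:
        return orig.keys() == new.keys() and all(values_equal(v, new[k]) for k, v in orig.items())

    # 2) O(1) frozenset membership on the exact type.
    if orig_type in _IDENTITY_TYPES:
        return orig == new

    # 3) Slow path: isinstance walks the MRO, needed for subclasses.
    if isinstance(orig, (list, tuple, deque)):
        return len(orig) == len(new) and all(values_equal(a, b) for a, b in zip(orig, new))
    if isinstance(orig, dict):
        return orig.keys() == new.keys() and all(values_equal(v, new[k]) for k, v in orig.items())
    return orig == new
```

Plain `dict`, `list`, `tuple`, and `str` values never reach the `isinstance` calls, while subclasses such as `deque` or `defaultdict` still get correct handling through the fallback.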
Windows defaults to cp1252 which can't decode some source file bytes.
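
The exact change in this commit is not shown in the conversation. One common remedy for this class of failure, sketched here as an assumption, is to pass the encoding explicitly when reading source files instead of relying on the platform default. The helper name is hypothetical.

```python
import tempfile
from pathlib import Path


def read_source(path: Path) -> str:
    # Explicit encoding: do not depend on the platform default (cp1252 on
    # many Windows setups).
    return path.read_text(encoding="utf-8")


# "Á" encodes to the bytes C3 81 in UTF-8; byte 0x81 is undefined in cp1252.
sample = "name = 'Á'\n"

with tempfile.TemporaryDirectory() as tmp:
    source_path = Path(tmp) / "example.py"
    source_path.write_bytes(sample.encode("utf-8"))
    text = read_source(source_path)

try:
    sample.encode("utf-8").decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True
```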
@KRRT7 KRRT7 merged commit 72a41a5 into main Apr 10, 2026
53 of 61 checks passed
@KRRT7 KRRT7 deleted the perf/defer-cli-imports branch April 10, 2026 07:00
@KRRT7 KRRT7 changed the title from "perf: defer cli.py imports for 7.7x faster --help" to "batch optimizations." Apr 10, 2026

Labels

workflow-modified This PR modifies GitHub Actions workflows


1 participant