You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
perf(install): memoize discovery, drop per-file resolve, expand skip dirs (#1538)
Fixes#1533 — apm install on real monorepos (Kubernetes, TypeScript)
took 200-300s; on this repo, multi-target installs scaled O(targets x
packages) because discover_primitives was called 9x per install over
the same project tree and find_primitive_files did Path.resolve() on
every file.
Three independent root causes, all fixed:
1. Discovery memoization. Module-level dict in primitives/discovery.py
keyed on (realpath(base), sorted_exclude_tuple), cleared at top of
run_install_pipeline. Identity-shares the PrimitiveCollection across
the integrate phase loop. 9 -> 3 unique walks per install.
2. Hot-path stat reduction. find_primitive_files now (a) pre-splits
glob patterns once per call (not per file), (b) computes rel_root
by string slicing of os.walk's root + base_prefix instead of
Path.resolve() lstat-per-component per file, (c) defers Path()
construction until after a pattern matches.
3. Skip-dir expansion. DEFAULT_SKIP_DIRS gained vendor, third_party,
Pods, bower_components, jspm_packages, .gradle, target, .next,
.nuxt, .cache, .turbo — keeps the walker out of dependency vendor
trees the user did not author.
Granular perf logging (--verbose):
New utils/perf_stats.py records one event per walk + per discovery
call; render_summary() emits a verbose-only block with per-base walk
breakdown and discovery cache hit-rate:
[#] Perf: 9 walks, 72 file matches, 6222 files visited, 0.739s total walk time
[#] Perf: .: 3 walk(s) (726ms, 6108 files visited, 72 matched)
[#] Perf: apm_modules/_local/foo: 3 walk(s) (4ms, 36 files visited, 0 matched)
[#] Perf: discovery: 9 call(s) (3 unique base(s), 6 cache hit(s), 66%)
Non-CI perf harness:
tests/perf/ contains 4 opt-in scenarios (Kubernetes discover,
TypeScript discover, awd-cli install, multi-target breadth). Skipped
by default; run with PYTEST_PERF=1. Clones large external repos to
/tmp/perf-atlas-clones/ once per session. Not run on CI.
Measured (verified, this worktree):
| Scenario | Before | After | Speedup |
|-----------------------------------|---------|---------|---------|
| awd-cli T=1 integrate phase | 3.7s | 0.797s | 4.6x |
| awd-cli T=7 multi-target install | 19.7s | 0.76s | 26x |
| Kubernetes discover (cold) | 205s | 5.4s | 38x |
| TypeScript discover (cold) | 297s | 7.0s | 42x |
| Warm discovery (cache hit) | n/a | 26us | 22500x |
Test isolation: tests/conftest.py gains an autouse fixture that
clears the discovery cache and resets perf_stats around every test
so process-scoped globals do not leak across tests that exercise
discover_primitives directly.
Pre-existing scaling-guard threshold bump: bumped
TestDiscoveryScaling::test_scaling_ratio 14 -> 20 with inline
explanation -- the small case got proportionally faster than the
large case, so the ratio inflated even as absolute times improved.
Reviewers (apm-review-panel): performance-expert, python-architect,
cli-logging-expert all returned ship_now after folding their
fold_now recommendations (delete dead duplicate _glob_match,
relativize paths in perf summary, add cache hit-rate percentage,
use [#] status symbol, add autouse conftest fixture, document base
fallback as '.', surface render_summary errors instead of swallow).
Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
0 commit comments