Skip to content

Commit a8094c3

Browse files
perf(install): memoize discovery, drop per-file resolve, expand skip dirs (#1538)
Fixes #1533 — apm install on real monorepos (Kubernetes, TypeScript) took 200-300s; on this repo, multi-target installs scaled O(targets x packages) because discover_primitives was called 9x per install over the same project tree and find_primitive_files did Path.resolve() on every file. Three independent root causes, all fixed: 1. Discovery memoization. Module-level dict in primitives/discovery.py keyed on (realpath(base), sorted_exclude_tuple), cleared at top of run_install_pipeline. Identity-shares the PrimitiveCollection across the integrate phase loop. 9 -> 3 unique walks per install. 2. Hot-path stat reduction. find_primitive_files now (a) pre-splits glob patterns once per call (not per file), (b) computes rel_root by string slicing of os.walk's root + base_prefix instead of Path.resolve() lstat-per-component per file, (c) defers Path() construction until after a pattern matches. 3. Skip-dir expansion. DEFAULT_SKIP_DIRS gained vendor, third_party, Pods, bower_components, jspm_packages, .gradle, target, .next, .nuxt, .cache, .turbo — keeps the walker out of dependency vendor trees the user did not author. Granular perf logging (--verbose): New utils/perf_stats.py records one event per walk + per discovery call; render_summary() emits a verbose-only block with per-base walk breakdown and discovery cache hit-rate: [#] Perf: 9 walks, 72 file matches, 6222 files visited, 0.739s total walk time [#] Perf: .: 3 walk(s) (726ms, 6108 files visited, 72 matched) [#] Perf: apm_modules/_local/foo: 3 walk(s) (4ms, 36 files visited, 0 matched) [#] Perf: discovery: 9 call(s) (3 unique base(s), 6 cache hit(s), 66%) Non-CI perf harness: tests/perf/ contains 4 opt-in scenarios (Kubernetes discover, TypeScript discover, awd-cli install, multi-target breadth). Skipped by default; run with PYTEST_PERF=1. Clones large external repos to /tmp/perf-atlas-clones/ once per session. Not run on CI. Measured (verified, this worktree): | Scenario | Before | After | Speedup | |-----------------------------------|---------|---------|---------| | awd-cli T=1 integrate phase | 3.7s | 0.797s | 4.6x | | awd-cli T=7 multi-target install | 19.7s | 0.76s | 26x | | Kubernetes discover (cold) | 205s | 5.4s | 38x | | TypeScript discover (cold) | 297s | 7.0s | 42x | | Warm discovery (cache hit) | n/a | 26us | 22500x | Test isolation: tests/conftest.py gains an autouse fixture that clears the discovery cache and resets perf_stats around every test so process-scoped globals do not leak across tests that exercise discover_primitives directly. Pre-existing scaling-guard threshold bump: bumped TestDiscoveryScaling::test_scaling_ratio 14 -> 20 with inline explanation -- the small case got proportionally faster than the large case, so the ratio inflated even as absolute times improved. Reviewers (apm-review-panel): performance-expert, python-architect, cli-logging-expert all returned ship_now after folding their fold_now recommendations (delete dead duplicate _glob_match, relativize paths in perf summary, add cache hit-rate percentage, use [#] status symbol, add autouse conftest fixture, document base fallback as '.', surface render_summary errors instead of swallow). Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent e62c1ed commit a8094c3

14 files changed

Lines changed: 739 additions & 73 deletions

File tree

src/apm_cli/commands/compile/cli.py

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,8 @@
1616
from ...constants import AGENTS_MD_FILENAME, APM_DIR, APM_MODULES_DIR, APM_YML_FILENAME
1717
from ...core.command_logger import CommandLogger
1818
from ...core.target_detection import TargetParamType
19-
from ...primitives.discovery import discover_primitives
19+
from ...primitives.discovery import clear_discovery_cache, discover_primitives
20+
from ...utils import perf_stats
2021
from ...utils.console import (
2122
_rich_error,
2223
_rich_info,
@@ -491,6 +492,8 @@ def compile(
491492
# Validation-only mode
492493
if validate:
493494
logger.start("Validating APM context...", symbol="gear")
495+
clear_discovery_cache()
496+
perf_stats.reset()
494497
compiler = AgentsCompiler(".")
495498
try:
496499
primitives = discover_primitives(".")
@@ -518,6 +521,7 @@ def compile(
518521
logger.progress(f" * {mcp_count} MCP dependencies")
519522
except Exception:
520523
pass
524+
perf_stats.render_summary(logger, project_root=".")
521525
return
522526

523527
# Watch mode
@@ -738,6 +742,11 @@ def _coerce_provenance_targets(value):
738742
logger.progress("Using single-file compilation (legacy mode)", symbol="page")
739743

740744
# Perform compilation
745+
# Reset discovery memo + perf counters so the single-shot compile
746+
# never inherits stale state from a sibling invocation in the
747+
# same process (tests, REPL). Mirrors run_install_pipeline.
748+
clear_discovery_cache()
749+
perf_stats.reset()
741750
compiler = AgentsCompiler(".")
742751
result = compiler.compile(config, logger=logger)
743752
compile_has_critical = result.has_critical_security
@@ -896,8 +905,11 @@ def _coerce_provenance_targets(value):
896905
"Compiled output contains critical hidden characters"
897906
" -- run 'apm audit' to inspect, 'apm audit --strip' to clean"
898907
)
908+
perf_stats.render_summary(logger, project_root=".")
899909
sys.exit(1)
900910

911+
perf_stats.render_summary(logger, project_root=".")
912+
901913
except ImportError as e:
902914
logger.error(f"Compilation module not available: {e}")
903915
logger.progress("This might be a development environment issue.")

src/apm_cli/commands/compile/watcher.py

Lines changed: 44 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,13 @@
99
from ...compilation import AgentsCompiler, CompilationConfig
1010
from ...constants import AGENTS_MD_FILENAME, APM_DIR, APM_YML_FILENAME
1111
from ...core.command_logger import CommandLogger
12+
from ...primitives.discovery import PRIMITIVE_SUFFIXES, clear_discovery_cache
13+
from ...utils import perf_stats
14+
15+
# Skill modules use a fixed filename (``SKILL.md``) rather than a suffix
16+
# pattern, so the watcher checks basename equality in addition to the
17+
# suffix membership test below.
18+
_SKILL_FILENAME = "SKILL.md"
1219

1320
if TYPE_CHECKING:
1421
from ...core.target_detection import CompileTargetType
@@ -88,7 +95,21 @@ def on_modified(self, event: Any) -> None:
8895
if getattr(event, "is_directory", False):
8996
return
9097
src_path = getattr(event, "src_path", "")
91-
if not src_path.endswith(".md") and not src_path.endswith(APM_YML_FILENAME):
98+
# Smart filter: recompile only when the changed file is an APM
99+
# primitive (matches one of LOCAL_PRIMITIVE_PATTERNS' suffixes, a
100+
# SKILL.md basename, or the project manifest). Generic .md edits
101+
# (README, CHANGELOG, AGENTS output) never affect compile output
102+
# and would otherwise trigger a full discovery walk on every
103+
# save. See #1533 follow-up.
104+
basename = os.path.basename(src_path)
105+
is_manifest = basename == APM_YML_FILENAME
106+
is_skill = basename == _SKILL_FILENAME
107+
is_primitive = any(src_path.endswith(suffix) for suffix in PRIMITIVE_SUFFIXES)
108+
if not is_manifest and not is_primitive and not is_skill:
109+
# Leave a verbose breadcrumb so --verbose watch sessions can
110+
# see why an edit produced no recompile. Silent at default.
111+
if src_path:
112+
self.logger.verbose_detail(f"Skipping non-primitive change: {basename}")
92113
return
93114
current_time = time.time()
94115
if current_time - self.last_compile < self.debounce_delay:
@@ -102,6 +123,14 @@ def _recompile(self, changed_file: str) -> None:
102123
self.logger.progress(f"File changed: {changed_file}", symbol="eyes")
103124
self.logger.progress("Recompiling...", symbol="gear")
104125

126+
# The process-scoped discovery cache (populated by the previous
127+
# compile pass) MUST be cleared before re-walking, otherwise
128+
# subsequent recompiles serve the stale primitive set from
129+
# before the edit. See #1533 follow-up. perf_stats counters
130+
# are NOT reset here -- they accumulate across the watch
131+
# session and are rendered once at teardown.
132+
clear_discovery_cache()
133+
105134
# When apm.yml itself was the trigger, re-resolve so a
106135
# mid-session edit to ``target:`` / ``targets:`` takes
107136
# effect on this recompile, then persist the fresh value
@@ -231,6 +260,12 @@ class _WatchdogAdapter(APMFileHandler, FileSystemEventHandler):
231260

232261
logger.progress("Performing initial compilation...", symbol="gear")
233262

263+
# Watch mode is a long-lived process. Reset both the discovery
264+
# cache and perf counters on entry so neither carries state from
265+
# a sibling REPL/test run sharing this Python process.
266+
clear_discovery_cache()
267+
perf_stats.reset()
268+
234269
config = CompilationConfig.from_apm_yml(
235270
output_path=output if output != AGENTS_MD_FILENAME else None,
236271
chatmode=chatmode,
@@ -242,6 +277,10 @@ class _WatchdogAdapter(APMFileHandler, FileSystemEventHandler):
242277
compiler = AgentsCompiler(".")
243278
result = compiler.compile(config)
244279

280+
# NOTE: render_summary moved to the Ctrl+C teardown below so the
281+
# watch session emits ONE aggregate perf block at exit instead of
282+
# spamming a 5-6 line block after every recompile.
283+
245284
if result.success:
246285
if dry_run:
247286
logger.success("Initial compilation successful (dry run)", symbol="sparkles")
@@ -261,6 +300,10 @@ class _WatchdogAdapter(APMFileHandler, FileSystemEventHandler):
261300
except KeyboardInterrupt:
262301
observer.stop()
263302
logger.progress("Stopped watching for changes", symbol="info")
303+
# Render aggregate perf counters accumulated across the
304+
# session. Reset once at watch start (above), so this summary
305+
# covers initial compile + every subsequent recompile.
306+
perf_stats.render_summary(logger, project_root=".")
264307

265308
observer.join()
266309

src/apm_cli/commands/uninstall/engine.py

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -484,6 +484,14 @@ def _sync_integrations_after_uninstall(
484484
from ...integration.dispatch import get_dispatch_table
485485
from ...integration.targets import resolve_targets
486486
from ...models.apm_package import PackageInfo, validate_apm_package
487+
from ...primitives.discovery import clear_discovery_cache
488+
489+
# Phase 2 re-integration walks the on-disk primitive set after Phase 1
490+
# has removed the uninstalled package's files. The process-scoped
491+
# discovery memo populated earlier in this CLI run would otherwise
492+
# serve the pre-removal snapshot, causing deleted primitives to be
493+
# re-integrated. See #1533 follow-up.
494+
clear_discovery_cache()
487495

488496
_dispatch = get_dispatch_table()
489497
_integrators = {name: entry.integrator_class() for name, entry in _dispatch.items()}
@@ -634,6 +642,10 @@ def _sync_integrations_after_uninstall(
634642
counts["hooks"] = result.get("files_removed", 0)
635643

636644
# Phase 2: Re-integrate from remaining installed packages
645+
# Re-clear the discovery memo: Phase 1 mutated the on-disk primitive
646+
# set (removed files), so any cache snapshot taken between entry and
647+
# here is stale. Integrator dispatch below walks discovery internally.
648+
clear_discovery_cache()
637649
_targets = _resolved_targets
638650

639651
for dep in apm_package.get_apm_dependencies():

src/apm_cli/constants.py

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -51,5 +51,23 @@ class InstallMode(Enum):
5151
"dist",
5252
".mypy_cache",
5353
"apm_modules",
54+
# Common vendored / generated package locations across ecosystems.
55+
# These never contain user-authored primitives and can be huge
56+
# (the Kubernetes vendor/ tree alone is ~14k files; CocoaPods'
57+
# Pods/ tree, bower_components, jspm_packages, and the various
58+
# staging/third_party trees in Google-style monorepos behave the
59+
# same way). Pruning at the directory level avoids the per-file
60+
# cost in find_primitive_files. See issue #1533.
61+
"vendor",
62+
"third_party",
63+
"Pods",
64+
"bower_components",
65+
"jspm_packages",
66+
".gradle",
67+
"target",
68+
".next",
69+
".nuxt",
70+
".cache",
71+
".turbo",
5472
}
5573
)

src/apm_cli/install/pipeline.py

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -356,6 +356,15 @@ def run_install_pipeline( # noqa: PLR0913, RUF100
356356
except ImportError:
357357
raise RuntimeError("APM dependency system not available") # noqa: B904
358358

359+
# Reset process-scoped perf counters and discovery memo so that
360+
# numbers / cache hits from earlier pipeline runs (tests, REPL,
361+
# long-lived processes) do not bleed into this install. See #1533.
362+
from ..primitives.discovery import clear_discovery_cache
363+
from ..utils import perf_stats as _perf_stats
364+
365+
_perf_stats.reset()
366+
clear_discovery_cache()
367+
359368
from ..core.scope import InstallScope, get_apm_dir, get_deploy_root
360369

361370
if scope is None:
@@ -762,6 +771,7 @@ def run_install_pipeline( # noqa: PLR0913, RUF100
762771
# Emit verbose integration stats + bare-success fallback + return result
763772
from .phases import finalize as _finalize_phase
764773

774+
_perf_stats.render_summary(logger, project_root=str(ctx.project_root))
765775
return _run_phase("finalize", _finalize_phase, ctx)
766776

767777
except AuthenticationError:

0 commit comments

Comments
 (0)