perf: per-root re-analysis (remove IncludeOnly pass)#111
Open
pfitzseb wants to merge 3 commits into
Open
Conversation
…yed global dict derived_all_includes returned (uri2included, include_dict) where include_dict was keyed by objectids of include-call EXPRs. Objectids change on every reparse, so the tuple never compared equal and Salsa's early-exit never fired: any edit invalidated derived_roots, derived_roots_for_uri, derived_includes, derived_all_julia_files, derived_is_indirect_file, and everything downstream, even when the include structure was unchanged. The graph is now derived_uri2included, built from the per-file (and value-stable) derived_file_include_records via a shared walker that also replaces the duplicated IncludeOnly traversal. Include resolution for the semantic pass moved into semantic_pass/followinclude, which build an objectid map per entered file at traversal time — no objectid ever lands in a Salsa value. Files without a filesystem path (unsaved buffers) now resolve absolute include paths in the records path, matching the old IncludeOnly behavior. After an edit, derived_roots re-verifies in ~1ms (only the edited file's records recompute) and derived_roots_for_uri is served from cache. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
derived_static_lint_meta_for_root seeded meta and ran check_all over every Julia file in the workspace for every root: the work was O(files x roots), any edit anywhere invalidated every root's meta, and check_all results for files outside a root's closure were never read (the diagnostics pass only walks the closure). Introduce derived_include_closure(rt, uri) — the transitive include closure from the stable include graph — and scope meta seeding, check_all, and resolve_remaining_getfields! (plus derived_deved_package_meta's seeding) to it. Each root's lint state now only depends on its own closure's CSTs, so edits outside the closure re-verify from cache (~0.1ms measured) instead of re-running the whole pass (~33ms on a 57-file workspace). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
derived_static_lint_all_diagnostics was one global derived node looping all roots, so any edit forced collect_hints over every root's closure. Replace it with derived_static_lint_diagnostics_for_root, keyed per root; the per-file derived_static_lint_diagnostics now unions the results of the roots whose closure contains the file (preserving the cross-root Set dedup). The per-root pass iterates derived_include_closure instead of the old ad-hoc worklist, which also fixes a latent non-termination: the old walk had no visited set, so an include cycle inside a project (a.jl <-> b.jl) looped forever. Pinned by a new test. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
Profiling showed that a single keystroke forced a full re-lint of the entire
workspace, once per root. Three compounding causes:
derived_all_includesreturned a tuple of the include graph and a globalDict{objectid(EXPR) → URI}used by the semantic pass to resolveinclude()calls. Objectids change on every reparse, so this value nevercompared equal after any edit — backdating never fired, and everything
downstream (
derived_roots,derived_roots_for_uri,derived_is_indirect_file, every lint meta) invalidated even when theinclude structure was untouched.
derived_roots_for_uriadditionallyre-ran a per-root BFS on every navigation request.
derived_static_lint_meta_for_rootseeded meta and rancheck_all+resolve_remaining_getfields!over every Julia file in the workspace forevery root. That made the work O(files × roots) — and the results for
files outside a root's include closure were pure waste, since the
diagnostics pass only ever reads the closure. It also meant any edit
anywhere invalidated every root's meta.
derived_static_lint_all_diagnosticslooped all roots, so any edit re-ran
collect_hintsfor every root'sclosure.
Measured baseline (LanguageServer package, 57 files, 12 roots, warm caches):
~35 ms per root per keystroke, i.e. every root paid the full price on every
edit regardless of where the edit was.
Implementation
Commit 1 — include graph from per-file records. The graph
(
derived_uri2included) is now built by a worklist over the existingper-file, value-stable
derived_file_include_records— its value only changeswhen include structure changes, so early-exit protects all graph consumers
on ordinary edits. The objectid dict is gone from the Salsa layer entirely:
semantic_pass/followincludebuild an objectid→target map per entered fileat traversal time from a shared walker (which also replaces the duplicated
IncludeOnlytraversal — one include-detection implementation instead oftwo). One behavioral edge preserved and pinned by test: files without a
filesystem path (unsaved buffers) still resolve absolute include paths.
Commit 2 — lint meta scoped to the include closure. New
derived_include_closure(rt, uri)(transitive closure over the stable graph).Meta seeding,
check_all, andresolve_remaining_getfields!— in bothderived_static_lint_meta_for_rootandderived_deved_package_meta— nowcover the closure instead of all files. Each root's lint state depends only on
its own closure's CSTs, so edits outside it re-verify from cache.
Commit 3 — per-root diagnostics.
derived_static_lint_all_diagnosticsreplaced by per-root
derived_static_lint_diagnostics_for_root; the per-filequery unions results from
derived_roots_for_uri, preserving the cross-rootSetdedup. Iterating the closure instead of the old ad-hoc worklist alsofixed a latent non-termination: the old walk had no visited-set, so an include
cycle inside a project would loop forever (pinned by a new test).
Results
Per-keystroke lint cost went from O(workspace files × roots) to O(edited
root's closure). Measured: editing a file recomputes its own root in ~33 ms
(unchanged, as expected — the closure spans the whole src tree), while an
unrelated root drops from ~35 ms to ~0.1 ms;
derived_rootsre-verifiesin ~1 ms and
derived_roots_for_uriserves from cache in ~4 µs. Behavior isotherwise identical: full suite green (3658 passed), including all 585
staticlint tests, plus 22 new tests in
test/test_includes.jlcovering graphshape, value stability under include-preserving edits, closure computation,
cross-file resolution after edits, the unsaved-buffer edge, and cycle
termination.
@davidanthoff I know you added the separate
IncludeOnlypass for a reason, but I can't quite remember the reasoning.derived_includesis still around, so maybe that's good enough?