Optimize Find References while preserving alias and workspace refresh behavior #116

Merged
AJenbo merged 2 commits into PHPantom-dev:main from MingJen:fix/find-references-performance
May 10, 2026

@MingJen
Contributor

@MingJen MingJen commented May 10, 2026

Summary

Improves project-wide Find References in two ways: it now avoids reading most
files from disk by filtering on pre-built span metadata before touching content,
and it discovers PHP files that were added to the workspace after the editor
started. A separate fix removes a deadlock in the virtual-member (PHPDoc mixin)
resolution cache.

Changes

  • Adds span-only pre-filter to all cross-file reference scanners. Before
    loading any file content, each scanner now checks whether the file's symbol
    spans contain a name that could plausibly match the target. Files with no
    candidate spans are skipped entirely. Applies to class, member, function,
    constant, and Laravel string-key reference searches.
  • Makes file content loading lazy. Content is now read from disk only after
    a span passes the pre-filter, instead of being loaded upfront for every file
    in the workspace.
  • Replaces the O(N·classes) descendant walk with a GTI-index BFS. The
    previous algorithm iterated every known class on each expansion step to find
    subclasses. The new algorithm does a breadth-first walk of the precomputed
    gti_index (the reverse-inheritance map built by Go-to-Implementation),
    reducing the traversal from O(depth × total_classes) to O(hierarchy_size).
  • Refreshes workspace index on subsequent searches. ensure_workspace_indexed
    now re-runs the filesystem walk on every call, but only parses files not yet
    in symbol_maps. A new workspace_indexed: AtomicBool flag lets the code log
    whether it is doing an initial scan or a cheaper refresh walk.
  • Fixes virtual-member cache deadlock. In phpdoc.rs, the Mutex guard for
    the shared mixin cache was held across an else branch that attempted to
    acquire a thread-local cache — which could deadlock in some call paths. Fixed
    by extracting the map.get() result with Arc::clone before releasing the
    guard.
  • Fixes lock-hold scope in resolve_class_fully_inner. The cache.lock()
    call now immediately chains .insert() rather than holding the guard across a
    separate assignment.
  • Adds deduplication in find_class_references. Calls locations.dedup()
    after the sort to remove any duplicate entries that survived the unique-push
    logic.
  • Adds tracing instrumentation. find_references and
    ensure_workspace_indexed now emit tracing::info! timing spans, making it
    easy to measure performance from LSP logs.
  • Adds benches/references.rs benchmark suite.
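The two lock-scope fixes in the list above follow a common Rust pattern, sketched here under stated assumptions. `SharedCache`, `LOCAL_CACHE`, `lookup_members`, and `store_members` are hypothetical names, not the ones used in phpdoc.rs. One plausible cause of a guard outliving its use (an assumption about this PR, though the language behavior itself holds in editions before 2024) is that a temporary created in an `if let` scrutinee lives until the end of the whole `if`/`else`.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for the shared mixin cache.
type SharedCache = Mutex<HashMap<String, Arc<Vec<String>>>>;

thread_local! {
    // Hypothetical stand-in for the thread-local cache used as a fallback.
    static LOCAL_CACHE: RefCell<HashMap<String, Arc<Vec<String>>>> =
        RefCell::new(HashMap::new());
}

/// Looks up `class` in the shared cache, then falls back to the thread-local one.
fn lookup_members(shared: &SharedCache, class: &str) -> Option<Arc<Vec<String>>> {
    // The guard is a temporary of this `let` statement only: the Arc is cloned
    // out and the lock is released before the fallback path runs.
    let hit = shared.lock().unwrap().get(class).map(Arc::clone);
    if let Some(members) = hit {
        return Some(members);
    }
    // No shared-cache guard is held here.
    LOCAL_CACHE.with(|cache| cache.borrow().get(class).cloned())
}

/// Stores `members`, chaining `.insert()` onto `lock()` so the guard is
/// dropped at the end of the statement.
fn store_members(shared: &SharedCache, class: &str, members: Vec<String>) {
    shared
        .lock()
        .unwrap()
        .insert(class.to_string(), Arc::new(members));
}
```

Cloning the `Arc` costs one reference-count increment, which is usually cheaper than the risk of holding a lock across unrelated work.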

How It Works

Pre-filter + lazy load (class / function / member / constant references):

  1. For each file in the workspace snapshot, scan only symbol_map.spans for
    names that could match the target, accounting for aliased imports
    (use Foo as Bar) by resolving through file_imports.
  2. Files with no candidate span are skipped with continue before any
    Url::parse or get_file_content_arc call.
  3. Within a passing file, file_content is a lazy Option<Arc<String>> that
    is only populated on the first span that actually matches.
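The three steps above can be sketched roughly as follows. This is a simplified illustration, not the code in src/references/mod.rs: `Span`, `FileMeta`, `is_candidate`, `resolves_to`, and the `load` callback are hypothetical, and unqualified names without an import are assumed to resolve to themselves.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified symbol span: the name as written plus its byte offset.
struct Span {
    name: String,
    start: usize,
}

/// Hypothetical per-file metadata: spans and an alias -> FQN import map.
struct FileMeta {
    spans: Vec<Span>,
    imports: HashMap<String, String>,
}

/// Cheap metadata-only check: could this span plausibly name the target?
fn is_candidate(meta: &FileMeta, span: &Span, target: &str) -> bool {
    let short = target.rsplit('\\').next().unwrap_or(target);
    span.name == short || meta.imports.get(&span.name).map(String::as_str) == Some(target)
}

/// Precise check: resolve the written name through the imports, then compare.
fn resolves_to(meta: &FileMeta, span: &Span, target: &str) -> bool {
    let resolved = meta
        .imports
        .get(&span.name)
        .map(String::as_str)
        .unwrap_or(span.name.as_str());
    resolved == target
}

/// Returns sorted (uri, zero-based line) pairs; `load` runs at most once per file.
fn find_references<F>(
    files: &HashMap<String, FileMeta>,
    target: &str,
    mut load: F,
) -> Vec<(String, usize)>
where
    F: FnMut(&str) -> Option<Arc<String>>,
{
    let mut hits = Vec::new();
    for (uri, meta) in files {
        // Steps 1 and 2: skip the file before any content is touched.
        if !meta.spans.iter().any(|s| is_candidate(meta, s, target)) {
            continue;
        }
        // Step 3: content stays None until a span actually matches.
        let mut content: Option<Arc<String>> = None;
        for span in &meta.spans {
            if !resolves_to(meta, span, target) {
                continue;
            }
            if content.is_none() {
                content = load(uri.as_str());
            }
            let text = match &content {
                Some(text) => text,
                None => break, // file could not be read
            };
            // Content is only needed to turn the byte offset into a line number.
            let end = span.start.min(text.len());
            let line = text.as_bytes()[..end].iter().filter(|&&b| b == b'\n').count();
            hits.push((uri.clone(), line));
        }
    }
    hits.sort();
    hits
}
```

In this sketch, a file that imports a different class under the same short name passes the pre-filter but never triggers a disk read, because no span resolves to the target.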

Descendant collection via gti_index:

  • collect_class_hierarchy seeds a HashSet with the target FQN and its
    ancestors (unchanged), then does a BFS over gti_index — a
    HashMap<String, Vec<String>> that maps each FQN to its direct subclasses.
  • New subclasses are pushed onto a VecDeque; the loop terminates naturally
    when no new entries are found. The old iterative O(N) loop and the helper
    methods class_is_descendant_of / ancestor_in_set are removed.
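A minimal sketch of the breadth-first walk described above, assuming the map shape given in the description (`HashMap<String, Vec<String>>` from FQN to direct subclasses). The function name and signature are hypothetical, and starting the BFS from the target only (with ancestors pre-seeded into the set) is an assumption about the real collect_class_hierarchy.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collects the target, its (already known) ancestors, and every transitive
/// subclass reachable through the reverse-inheritance map.
fn collect_hierarchy(
    gti_index: &HashMap<String, Vec<String>>,
    target: &str,
    ancestors: &[&str],
) -> HashSet<String> {
    // Seed the set with the target and its ancestors.
    let mut seen: HashSet<String> = ancestors.iter().map(|a| a.to_string()).collect();
    seen.insert(target.to_string());

    // Walk downward from the target; each class is enqueued at most once.
    let mut queue: VecDeque<String> = VecDeque::new();
    queue.push_back(target.to_string());

    while let Some(class) = queue.pop_front() {
        if let Some(children) = gti_index.get(&class) {
            for child in children {
                // `insert` returns false for classes already seen, which also
                // guards against cycles in malformed input.
                if seen.insert(child.clone()) {
                    queue.push_back(child.clone());
                }
            }
        }
    }
    seen
}
```

Each class in the hierarchy is dequeued once and each subclass edge is examined once, which is where the O(hierarchy_size) bound comes from.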

Workspace freshness:

  • ensure_workspace_indexed is called once per find_references request
    (already behind tokio::spawn so it does not block the LSP thread).
  • On the first call, workspace_indexed is false; the full disk walk runs and
    all new files are parsed. The flag is set to true after the walk.
  • On subsequent calls, the walk runs again but phase2_work contains only files
    whose URIs are absent from symbol_maps — so only newly created files pay the
    parse cost.
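The refresh behavior above can be illustrated with a small sketch. The `Workspace` struct, the `walked` parameter (standing in for the filesystem walk), and the placeholder parse step are hypothetical. Only the names ensure_workspace_indexed, workspace_indexed, symbol_maps, and phase2_work come from the description.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical, reduced workspace state.
struct Workspace {
    /// URI -> parsed symbols (a placeholder count stands in for the real map).
    symbol_maps: Mutex<HashMap<String, usize>>,
    /// False until the first full walk has completed.
    workspace_indexed: AtomicBool,
}

impl Workspace {
    fn new() -> Self {
        Workspace {
            symbol_maps: Mutex::new(HashMap::new()),
            workspace_indexed: AtomicBool::new(false),
        }
    }

    /// `walked` stands in for the result of the filesystem walk.
    /// Returns whether this was the initial scan and which URIs were parsed.
    fn ensure_workspace_indexed(&self, walked: &[&str]) -> (bool, Vec<String>) {
        let initial = !self.workspace_indexed.load(Ordering::Acquire);

        // Only files whose URIs are absent from symbol_maps need parsing.
        let phase2_work: Vec<String> = {
            let maps = self.symbol_maps.lock().unwrap();
            walked
                .iter()
                .filter(|uri| !maps.contains_key(**uri))
                .map(|uri| uri.to_string())
                .collect()
        };

        for uri in &phase2_work {
            let parsed = uri.len(); // placeholder for the real parse step
            self.symbol_maps.lock().unwrap().insert(uri.clone(), parsed);
        }

        self.workspace_indexed.store(true, Ordering::Release);
        (initial, phase2_work)
    }
}
```

A second call with one extra path in `walked` reports a refresh rather than an initial scan and parses only the new file. In this sketch the flag changes only what is reported, not what is parsed.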

Tests

New tests in tests/integration/references.rs:

  • function_references_include_aliased_import_usage — verifies that
    use function Foo\bar as baz; baz() is found when searching for Foo\bar.
  • class_references_include_aliased_import_usage — verifies that
    use App\Models\User as Account; new Account() is found when searching for
    App\Models\User.
  • workspace_index_refreshes_after_new_file_is_added — writes a new PHP file
    to disk after the initial index, then calls find references again and asserts
    the newly created file appears in the results.

@codecov-commenter

codecov-commenter commented May 10, 2026

Codecov Report

❌ Patch coverage is 65.02058% with 85 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.71%. Comparing base (8faf01a) to head (1c0c1ca).
⚠️ Report is 2 commits behind head on main.

Files with missing lines    Patch %    Lines
src/references/mod.rs       61.88%     85 Missing ⚠️

❌ Your patch status has failed because the patch coverage (65.02%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #116      +/-   ##
==========================================
- Coverage   86.83%   86.71%   -0.13%     
==========================================
  Files         170      170              
  Lines      108353   108396      +43     
==========================================
- Hits        94091    93995      -96     
- Misses      14262    14401     +139     

@AJenbo AJenbo force-pushed the fix/find-references-performance branch from eb73bf4 to 78cbe01 Compare May 10, 2026 18:51
@AJenbo AJenbo force-pushed the fix/find-references-performance branch from 78cbe01 to 1c0c1ca Compare May 10, 2026 19:05
@AJenbo AJenbo merged commit 6c35273 into PHPantom-dev:main May 10, 2026
6 of 7 checks passed
@AJenbo
Collaborator

AJenbo commented May 10, 2026

Nice work. My original intent was to have a byte scanner filter the files for candidates and only parse what could have a match, but this seems to work well enough that it isn't needed, and at the same time it is an overall boost to the function.