Integrate PackageCache with goto definition#1172
Integrate PackageCache with goto definition#1172DavisVaughan wants to merge 6 commits intooak/base-package-cachefrom
PackageCache with goto definition#1172Conversation
- Include `INDEX` in the cache - Add `PackageCache` trait - Add `struct TestPackageCache` with testing feature
2b60581 to
f01caad
Compare
- Stored in `WorldState` - Used in `goto_definition()` - Split out `oak_layers/` from `oak_index/` so that `oak_index` is purely about the `SemanticIndex`, because `oak_layers` needs the heavier package cache utilities
cae49f2 to
4b8b21b
Compare
There was a problem hiding this comment.
I'm thinking this whole file should move out of harp into something like aether_r_process eventually
| /// Map of package name to package definitions for installed libraries. | ||
| /// Only usable when a package cache is available. | ||
| pub(crate) library_definitions: Option<LibraryDefinitions>, |
There was a problem hiding this comment.
WorldState now gets library_definitions if we have a cache available
| } | ||
|
|
||
| pub(crate) struct Console { | ||
| r_home: PathBuf, |
There was a problem hiding this comment.
Unused, but I think will be nice for the DAP + PackageCache ideas
| /// The result of resolving a name against the external scope chain. | ||
| #[derive(Debug, Clone, PartialEq, Eq)] | ||
| pub struct ExternalDefinition { | ||
| file: Url, | ||
| name: String, | ||
| range: TextRange, | ||
| } |
There was a problem hiding this comment.
This used to be an enum that knew if the definition came from a project file vs a package, but now that I'm actually filling out the package side of things, I think they really boil down to the same structure, so I've flattened it
| pub fn resolve_in_package( | ||
| name: &str, | ||
| package: &str, | ||
| visibility: PackageDefinitionVisibility, | ||
| library: &Library, | ||
| library_definitions: Option<&LibraryDefinitions>, | ||
| ) -> Option<ExternalDefinition> { | ||
| let library_definitions = library_definitions?; | ||
| let package = library.get(package)?; | ||
| let package_definitions = library_definitions.get(&package)?; | ||
| let definition = package_definitions.get(name)?; | ||
|
|
||
| if visibility != definition.visibility() { | ||
| return None; | ||
| } | ||
|
|
||
| let file = package_definitions.file(definition.file_id()); | ||
| let file = Url::from_file_path(file).log_err()?; | ||
|
|
||
| Some(ExternalDefinition { | ||
| file, | ||
| name: name.to_string(), | ||
| range: definition.range(), | ||
| }) | ||
| } |
There was a problem hiding this comment.
The real crux of goto-definition for a package export
| pub struct Namespace { | ||
| /// Names of objects exported with `export()` | ||
| pub exports: Vec<String>, | ||
| pub exports: SortedVec<String>, |
There was a problem hiding this comment.
As we build PackageDefinitions, we categorize top level definitions as Internal vs Exported based on whether or not they exist as exports in the Namespace (this helps with :: vs ::: goto definition resolution).
Because we search exports for every top level definition in the package, it seemed useful to make this a sorted vec we can binary search in
There was a problem hiding this comment.
I have left imports and package_imports alone for the moment
There was a problem hiding this comment.
The core file for "given a R/ directory, collect all top level definitions"
| /// Cache used for looking up package sources. `dyn` to allow easy swapping to | ||
| /// `TestPackageCache` in test files. | ||
| cache: Arc<dyn PackageCache>, |
There was a problem hiding this comment.
We have both struct PackageCache and struct TestPackageCache that can be slotted in here, where the latter is very nice for testing in goto-definition.rs and externals.rs, since PackageCache itself requires a live R session and internet access.
We do this with a small trait PackageCache that requires a simple get() method.
It's dyn rather than using a generic like LibraryDefinitions<C: PackageCache> because that "leaks" the generic cache details way up the chain of callers (all the way up through goto-definition, it got gross).
I had extensive conversations with Claude about this design and we both decided this was a nice way to handle this kind of thing, and that Zed also does this kind of thing as well (with a "fake" Filesystem for similar reasons).
The vtable hit required by using dyn should be negligible vs the overhead of the cache itself
PackageCache with goto definitionPackageCache with goto definition
Branched from #1166
Each commit is self contained, with the final one being the brunt of the work. I can make the first few commits separate PRs if you prefer.
Some initial setup changes:
r_home: PathBufthrough to both the LSP andConsoleby callingr_home_setup()way earlieroak_index_vecis now its own crate, spun out fromoak_index. It just holdsIndexVec, a useful concept outside of theSemanticIndex(this should probably just beoak_indexafter we rename the existingoak_indextooak_semantic)oak_layersis now its own crate, spun out fromoak_index. This allowsoak_indexto be solely focused on theSemanticIndex, since the "layers" concepts need access to the heavierPackageCachedependencyThen 4b8b21b is the major feature commit.
New
oak_package_definitionscrate, which holds astruct LibraryDefinitions. This is likeLibrary, but instead of holding metadata about a package, it holds the top level definitions exposed by the package, withfile+rangefields for jumping to a package definition. We makePackageDefinitionsfor a package by doing the following for each package file: reading it, parsing it, creating a semantic index, categorizing thefile_exports(). Then, for now, we just throw away the semantic index for that file.The
WorldStatenow holds aLibraryDefinitionsalongside the preexistingLibrary. I'm not sure if this is right setup longterm, or if these should be one object, but for now it allowsLibraryto be lightweight.goto_definition()then getsLibraryDefinitionsand can use them to determine definition locations for external packages.Some questions / challenges along the way:
Libraryshould probably be renamedLibraryMetadatasince that's all it is really about, and we are likely going to have things likeLibraryDefinitionsand eventuallyLibraryHelpthat all use different caches / dependenciesCurrently the
PackageCachecachesDESCRIPTION,NAMESPACE, andINDEXalongside theR/sources, which felt right at the time but now I am not so sure! I am starting to think that insteadPackageCacheshould only have theR/sources, and we should treat the.libPaths()as our source of truth for those other files, accessed viaLibrary, rather than having them in two places.R/,DESCRIPTION, andNAMESPACEall present), but I no longer think we need to do that, and should instead just keep 1 source of truth for the metadata files.I had originally wanted
LibraryDefinitionsto really be something likeLibrarySemanticIndexeswhere it would hold a package worth ofSemanticIndexobjects for reuse across LSP features. But this is not easy!SemanticIndexis notSend + Syncdue toDefinitionKindholdingRSyntaxNode, so that means we can't put aLibrarySemanticIndexesin theWorldState, since it needs to be snapshotted and shuffled between threads. I think we are going to need to resolve this eventually for Salsa purposes, but for now I just cache thePackageDefinitionobjects extracted from the semantic index instead, which are send+sync.