Skip to content

Read metadata of installed packages lazily#1273

Merged
lionel- merged 5 commits into
mainfrom
oak/salsa-30-lazy-package-metadata
Jun 19, 2026
Merged

Read metadata of installed packages lazily#1273
lionel- merged 5 commits into
mainfrom
oak/salsa-30-lazy-package-metadata

Conversation

@lionel-

@lionel- lionel- commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Branched from #1271
Progress towards #1212
Closes #1265

Uses same approach as file contents in #1271 to read NAMESPACE and DESCRIPTION lazily. This way the only thing that we need to initialise a library of installed packages are the folder names and the presence of a DESCRIPTION file. Unlike workspace packages, we don't read the Package field and just assume that a folder with a description file is indeed a package named after the folder name.

This avoids I/O and parsing costs for installed packages that are not imported or mentioned in the workspace files.

Comment thread crates/oak_db/src/inputs.rs Outdated
Comment on lines +204 to +205
/// Mtime of the package's `DESCRIPTION` file, or [`FileRevision::zero`]
/// when it can't be stat'd. Drives the lazy [`Package::description`]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the "cant be stat'd" case of returning zero, we could end up stuck with an outdated version of the file right? Like we read it once and never again because we never move off 0?

Is it better to force it to read it every time in the case where we cant determine a FileRevision?

(I have no idea if you can even do that with salsa's caching model)

@lionel- lionel- Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should get refreshed by the scanner on the next file event, which is the ty model.

Claude says we could use this to force a reread of the query next time: https://github.com/salsa-rs/salsa/blob/86e69d7eacdbc4ee87e183a147af02ee5d932ca1/src/database.rs#L76-L80

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok let's try out the approach of reporting an untracked query if the stat failed. Untested and would be very unlikely in real usage though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I'm not super worried about it, more just curious about how you can express this in salsa, so report-untracked-read is neat to learn about

Comment thread crates/oak_db/src/inputs.rs Outdated
Comment on lines +218 to +225
/// In-memory `NAMESPACE`, checked by [`Package::namespace`] before it
/// touches disk. Mirrors [`File::source_text_override`]: `None` means
/// "read from disk". The scanners always leave it `None`. It's the
/// injection point unit tests use to supply a namespace for a synthetic
/// path with no file behind it, and the natural hook for an editor
/// editing a package's `NAMESPACE` live.
#[returns(ref)]
pub version: Option<String>,
#[returns(ref)]
pub namespace: Namespace,
pub namespace_override: Option<Namespace>,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But not for description?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it seems nothing currently uses this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed for description rn

use crate::Package;

#[salsa::tracked]
impl Package {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move all of pub struct Package { here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And merge with package_resolve.rs?

@lionel- lionel- Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the input to package.rs. I'd keep package_resolve.rs separate for symmetry with file_resolve.rs though.

/// scanning a library with hundreds of packages, so we defer it.
///
/// A missing or unparseable `NAMESPACE` yields an empty `Namespace`.
#[salsa::tracked(returns(ref))]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lru bound for this and description()?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have 600 packages installed fwiw 😬

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But those installed packages will not be queried if they are not part of the dependency graph of the workspace.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we port completions to salsa, we should make sure completing e.g. library() doesn't depend on package details besides the name. If we need more than the name, then we may want LRU.

Comment thread crates/oak_db/src/package.rs Outdated
/// installed package's `NAMESPACE` eagerly would dominate the cost of
/// scanning a library with hundreds of packages, so we defer it.
///
/// A missing or unparseable `NAMESPACE` yields an empty `Namespace`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's interesting to decide when to use -> Namespace vs -> Result<Namespace>

I generally agree with this approach for this one though

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh actually i see description() uses Option<Description>. Should we be consistent across these two methods?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since reading DESCRIPTION is important to get a version, and since versions are involved in what sources we get from oak_sources, I think I prefer not to use the empty file approach for that one.

Comment thread crates/oak_db/src/package.rs Outdated
Comment on lines +61 to +63
/// A narrow query over [`Package::description`]: editing `DESCRIPTION`
/// without changing `Version:` backdates here, so downstream isn't
/// disturbed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

Comment thread crates/oak_db/src/package.rs Outdated
Comment thread crates/oak_scan/src/packages.rs Outdated
@lionel- lionel- force-pushed the oak/salsa-29-optional-contents branch from cc27487 to 2e72224 Compare June 16, 2026 23:12
@lionel- lionel- force-pushed the oak/salsa-30-lazy-package-metadata branch 2 times, most recently from 056ecde to 59eb794 Compare June 17, 2026 08:27
@lionel- lionel- force-pushed the oak/salsa-29-optional-contents branch from ab82cf4 to ab16ff6 Compare June 19, 2026 14:06
Base automatically changed from oak/salsa-29-optional-contents to main June 19, 2026 14:06
@lionel- lionel- force-pushed the oak/salsa-30-lazy-package-metadata branch from bc0615d to 42ad025 Compare June 19, 2026 14:07
@lionel- lionel- merged commit b4cb2e3 into main Jun 19, 2026
17 checks passed
@lionel- lionel- deleted the oak/salsa-30-lazy-package-metadata branch June 19, 2026 14:07
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve on laziness of package metadata reading during LSP startup

2 participants