Skip to content

Use Salsa infrastructure for cross-file goto-definition #1254

Merged
lionel- merged 5 commits into
mainfrom
oak/salsa-18-goto-def-migration
Jun 19, 2026
Merged

Use Salsa infrastructure for cross-file goto-definition #1254
lionel- merged 5 commits into
mainfrom
oak/salsa-18-goto-def-migration

Conversation

@lionel-

@lionel- lionel- commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Branched from #1253
Progress towards #1212
Progress towards #1149

  • Use new Salsa infrastructure for cross-file queries (via source()d files in scripts, or file collation in packages)
  • File gains a line_index() query.
  • PositionEncoding is stored in LspConfig, mirroring r-a
  • Non-redundant parked LSP-layer tests are restored (either in oak_db, oak_ide, or ark).

One regression with the Salsa infra: we no longer analyse files outside of workspace. In the future we could add these as Salsa inputs via oak_scan. Would also require a non-lsp file watcher for out-of-workspace updates.

I recommend testing with #1259 to benefit from bug fixes and improvements in ulterior branches.

@lionel- lionel- changed the title Oak/salsa 18 goto def migration @lionel- Use Salsa infrastructure for cross-file goto-definition Jun 3, 2026
@lionel- lionel- changed the title @lionel- Use Salsa infrastructure for cross-file goto-definition Use Salsa infrastructure for cross-file goto-definition Jun 3, 2026
@lionel- lionel- force-pushed the oak/salsa-17-interning branch from 59a2813 to 0cd9628 Compare June 4, 2026 14:20
@lionel- lionel- force-pushed the oak/salsa-18-goto-def-migration branch from 54473e0 to 399f3f3 Compare June 4, 2026 14:20
@lionel- lionel- force-pushed the oak/salsa-17-interning branch 2 times, most recently from 25c9de8 to 798be33 Compare June 11, 2026 12:36
@lionel- lionel- force-pushed the oak/salsa-18-goto-def-migration branch 2 times, most recently from fa5b1c0 to cacffa1 Compare June 11, 2026 13:14
file.resolve_at(db, offset)
.into_iter()
.filter_map(|def| {
let range = def.name_range(db)?;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see name_range "returns None for Import kinds". That's correct here? Just checking

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and that should never happen since resolve_at() resolves imports

Comment on lines +98 to +101
return match index.definition_at(offset) {
Some((scope, def_id, _)) => {
self.definition(db, scope, def_id).into_iter().collect()
},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another peculiarity of 2 Definition types is that we can't just return the SemanticIndex's Definition captured in the _. I understand why and all that, just pointing out that we nearly had it "right there"

/// the tracked [`File::definitions()`], so they stay cached and
/// identity-stable even though this lookup isn't.
pub fn resolve_at(self, db: &'db dyn Db, offset: TextSize) -> Option<Definition<'db>> {
/// the tracked [`File::definitions()`] and looked up via

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So has the meaning of resolve_at() changed quite a bit? It looks like the impl is quite different now

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns Vec instead of Option, handles the cursor-on-def case, and is more flow-sensitive via reaching_definitions().

) -> Option<Definition<'db>> {
let index = self.semantic_index(db);
if let DefinitionKind::Import {
file: target_url,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a FilePath?

@lionel- lionel- Jun 15, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, but likely not. The Url is just a vessel in oak_semantic and doesn't need to guarantee lexical normalisation or anything of that sort. On the other hand it wouldn't hurt much. Overall I'd say let's worry about this later.

/// Session-wide position encoding for offset <-> LSP-position conversion.
/// One value for the whole session, not per document. Hard-coded to UTF-16,
/// the encoding we advertise at `initialize`.
pub(crate) position_encoding: PositionEncoding,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I generally agree with this, but can we actually implement what it says?

i.e. go ahead and remove position_encoding from Document. It does not look too hard, as every usage of document.position_encoding seems very close to an LSP boundary where you have the WorldState available.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is out of scope for this PR but eventually yes

Comment on lines 13 to 42
pub(crate) fn goto_definition(
document: &Document,
params: GotoDefinitionParams,
state: &WorldState,
) -> anyhow::Result<Option<GotoDefinitionResponse>> {
let uri = params.text_document_position_params.text_document.uri;
let uri = &params.text_document_position_params.text_document.uri;
let position = params.text_document_position_params.position;

let offset = from_proto::offset_from_position(
position,
&document.line_index,
document.position_encoding,
)?;
let db = &state.db;
let encoding = state.config.position_encoding;

let index = document.semantic_index();
let root = document.syntax()?;
let pos = oak_ide::FilePosition {
file: uri.clone(),
offset,
};
let targets = oak_ide::goto_definition(&index, &root, &pos);

let links: Vec<_> = targets
.into_iter()
.filter_map(|target| nav_target_to_link(target, document).log_err())
.collect();

if !links.is_empty() {
return Ok(Some(GotoDefinitionResponse::Link(links)));
}

// Within-file resolution found nothing. Fall back to the workspace
// indexer for a best-effort cross-file lookup of the identifier at the
// cursor. Non-identifier cursors (operators, parens, whitespace) return
// `None`. TODO(salsa): Replace the indexer step with a proper cross-file
// (file imports) lookup.
if let Some(link) = fallback_link_at(document, position, &uri)? {
return Ok(Some(GotoDefinitionResponse::Link(vec![link])));
}

Ok(None)
}

/// Look up the identifier at `position` in the workspace indexer (current
/// file first, then other files). Returns `None` when the cursor isn't on
/// an identifier or the symbol isn't in the index.
fn fallback_link_at(
document: &Document,
position: Position,
uri: &Url,
) -> anyhow::Result<Option<LocationLink>> {
let point = document.tree_sitter_point_from_lsp_position(position)?;
let Some(node) = document.ast.root_node().find_closest_node_to_point(point) else {
log::warn!("Failed to find the closest node to point {point}.");
let Some(file) = db.file_by_path(&FilePath::from_url(uri)) else {
return Ok(None);
};

if !node.is_identifier() {
let offset = from_proto::offset_from_position(position, file.line_index(db), encoding)?;

let targets = oak_ide::goto_definition(db, file, offset);
if targets.is_empty() {
return Ok(None);
}

let symbol = node.node_as_str(&document.contents)?;
let Some((file_id, entry)) =
indexer::find_in_file(symbol, uri).or_else(|| indexer::find(symbol))
else {
return Ok(None);
};
// An ambiguous name (e.g. defined on both arms of an `if`/`else`) resolves
// to several bindings; the client offers all of them.
let links = targets
.iter()
.map(|target| nav_target_to_link(db, encoding, target))
.collect::<anyhow::Result<Vec<_>>>()?;

Ok(Some(LocationLink {
origin_selection_range: None,
target_uri: file_id.as_uri().clone(),
target_range: entry.range,
target_selection_range: entry.range,
}))
Ok(Some(GotoDefinitionResponse::Link(links)))
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like how cleanly this reads on the LSP impl side of things!

Comment on lines +55 to +66
// TODO: a `source()` target outside every workspace root never becomes
// a `File`, so `file_by_path()` misses it and the names it injects stay
// invisible. Minting can't happen here, so the work belongs on the
// write side in `oak_scan`. We should carry the resolved path on the
// directive even when no `File` exists (today the miss returns `None`
// and drops it), then have `oak_scan` enumerate source directives after
// a scan, mint an `OrphanRoot` `File` from disk for each
// out-of-workspace target, and iterate for `source()` chains. A file
// watcher is only needed for freshness (re-reading after an external
// edit), plus GC to drop the orphan once the directive goes away.
// TODO(diagnostics): Until we support out-of-workspace sourced files,
// should we at least lint so user knows that we can't analyse the file?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we want to support out of workspace source() files?

i.e. it might actually be more practically useful to only lint to discourage this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be useful in monorepos but not high priority for sure

Comment thread crates/oak_db/src/file_resolve.rs Outdated
Comment thread crates/oak_db/src/file_resolve.rs Outdated
Comment thread crates/oak_db/src/file_resolve.rs Outdated
file: target_url,
name: forwarded,
..
} = index.definitions(scope)[def_id].kind()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a reaching_definitions() definition could still be an Import from another file? I had some intuition that a reaching definition would only be local to this file

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you source() a file you get local imports that you need to resolve

@lionel- lionel- force-pushed the oak/salsa-17-interning branch from e4f1586 to b80334b Compare June 19, 2026 13:51
Base automatically changed from oak/salsa-17-interning to main June 19, 2026 13:52
@lionel- lionel- force-pushed the oak/salsa-18-goto-def-migration branch from d73c0d5 to 77bd0a4 Compare June 19, 2026 13:52
@lionel- lionel- merged commit 34994a1 into main Jun 19, 2026
17 checks passed
@lionel- lionel- deleted the oak/salsa-18-goto-def-migration branch June 19, 2026 13:53
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants