Skip to content

feat: split VectorIndexResolver into peek/resolve/ensure_loaded#18

Merged
anoop-narang merged 4 commits into
mainfrom
feat/resolver-trait
Apr 4, 2026
Merged

feat: split VectorIndexResolver into peek/resolve/ensure_loaded#18
anoop-narang merged 4 commits into
mainfrom
feat/resolver-trait

Conversation

@anoop-narang

Copy link
Copy Markdown
Collaborator

Summary

Split the VectorIndexResolver trait into three methods to match DataFusion's sync/async pipeline:

  • peek(key) — sync, cheap: returns VectorIndexMeta (metric, scalar_kind, schema, key_col, config). Used by the optimizer rule to validate ANN rewrites without needing loaded index objects.
  • resolve(key) — sync: returns fully loaded Arc<RegisteredTable> from cache. Used by the executor.
  • ensure_loaded(key) — async: loads the index from catalog/storage on cache miss. Called by the physical planner before building execution plans.

USearchRegistry implements all three (always cached, ensure_loaded is a no-op). Production systems can implement catalog-backed loading in ensure_loaded.

Changes

  • registry.rs: Add VectorIndexMeta struct, split trait, implement for USearchRegistry
  • rule.rs: Use peek() instead of resolve() for optimizer validation
  • planner.rs: Call ensure_loaded() (async) before resolve() in physical planner
  • lib.rs: Export new types

Backward Compatibility

  • USearchRegistry implements the full trait — existing users just need to update the trait import
  • resolve() still works as before for the executor path

Replace direct USearchRegistry dependency with a VectorIndexResolver
trait throughout the optimizer rule, physical planner, executor, and
UDTF. USearchRegistry implements the trait (backward compatible).

This enables production systems to implement a catalog-backed resolver
that treats the in-memory index as a cache rather than the source of
truth, following the same publication model as sorted/BM25 indexes.

Key changes:
- Add VectorIndexResolver trait with resolve(name) -> Option<Arc<RegisteredTable>>
- USearchRegistry implements VectorIndexResolver
- All consumers now accept Arc<dyn VectorIndexResolver>
- Add into_resolver() convenience method on USearchRegistry
Split the resolver trait to match DataFusion's sync/async pipeline:
- peek(key) — sync, cheap: returns VectorIndexMeta (metric, schema)
  for the optimizer rule to validate ANN rewrites without loading
- resolve(key) — sync: returns loaded RegisteredTable from cache
- ensure_loaded(key) — async: loads index on cache miss, called by
  the physical planner before building execution plans

The optimizer rule uses peek (sync, no I/O).
The planner calls ensure_loaded (async) then resolve.
The executor uses resolve (sync, cache only).

This enables catalog-backed resolution where ensure_loaded downloads
and loads index files on demand, while the optimizer stays sync and fast.
Comment thread src/udtf.rs
};

let registered = self.registry.get(&table_name).ok_or_else(|| {
let registered = self.registry.resolve(&table_name).ok_or_else(|| {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 — UDTF bypasses ensure_loaded, breaking catalog-backed implementations.

TableFunctionImpl::call() is sync, so it cannot call async fn ensure_loaded(). For USearchRegistry this is fine (always cached), but for any production VectorIndexResolver where ensure_loaded actually does I/O, vector_usearch() will always return "not registered" for tables that haven't been pre-loaded — with no indication of why.

The PR introduces VectorIndexResolver as an extension point for production catalog-backed systems, but the UDTF silently doesn't support that path.

Two options:

Option A — Document the limitation so implementors know to pre-load before calling vector_usearch():

Suggested change
let registered = self.registry.resolve(&table_name).ok_or_else(|| {
// NOTE: The UDTF path is sync and cannot call ensure_loaded().
// Production VectorIndexResolver implementations must pre-load
// any table before it can be queried via vector_usearch().
let registered = self.registry.resolve(&table_name).ok_or_else(|| {

Option B — Add a sync peek_resolve() / try_resolve_blocking() method to the trait that implementations can override to do a synchronous cache lookup + blocking load, falling back to resolve() for the base case. This is more invasive but makes the contract explicit for implementors.

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Issues

[P1] src/udtf.rs:88USearchUDTF::call() calls resolve() without calling ensure_loaded() first. Since call() is sync, it can't call async fn ensure_loaded(). For USearchRegistry this is harmless (always cached), but for any production VectorIndexResolver where ensure_loaded does real I/O, every call to vector_usearch() will fail with "not registered" for tables that haven't been pre-loaded. No error or warning signals why.

This is the exact scenario the PR is designed to enable, and the UDTF silently breaks it.

Action Required

Either document the limitation on USearchUDTF (tables must be pre-loaded before use with vector_usearch() for catalog-backed implementations), or add a mechanism for sync callers to trigger loading. Option A from the inline comment is the minimal fix.

@anoop-narang

Copy link
Copy Markdown
Collaborator Author

Addressed in 649cc86. This keeps vector_usearch() as-is architecturally, but makes the current limitation explicit: USearchUDTF::call() is synchronous, does not call ensure_loaded(), and is therefore cache-only for async-backed VectorIndexResolver implementations. I also updated the cache-miss error to say the table is not loaded locally and to direct users to the optimizer/planner vector query path or pre-loading first.

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Issues

planner.rs:365usearch_execute calls resolve() without calling ensure_loaded() first. The comment claims eviction is "handled correctly", but for any VectorIndexResolver implementation that supports cache eviction (which is the stated purpose of this trait split), the entry won't be reloaded — it just fails with a confusing error. The index will be evicted between plan and execute time, resolve() returns None, and the query dies with "not in registry at execute time".

This is a direct contract violation: the trait docstring for ensure_loaded says it loads on cache miss, but that guarantee is only upheld in plan_extension, not in usearch_execute.

Action Required

In usearch_execute (planner.rs ~line 364), add ensure_loaded before resolve:

// Re-fetch at execute time; reload if evicted between plan and execute.
params.registry.ensure_loaded(&params.table_name).await?;
let registered = params.registry.resolve(&params.table_name).ok_or_else(|| {
    DataFusionError::Execution(format!(
        "USearchExec: table '{}' not in registry at execute time",
        params.table_name
    ))
})?;

Also update the comment — the current "so cache eviction between plan and execute is handled correctly" is factually wrong without the ensure_loaded call. For USearchRegistry (always cached) this is harmless, but the PR's stated goal is enabling production catalog-backed resolvers where eviction is real and this path will silently break.

Comment thread src/planner.rs
) -> Result<Vec<RecordBatch>> {
// Re-fetch at execute time so cache eviction between plan and execute is handled correctly.
let registered = params.registry.get(&params.table_name).ok_or_else(|| {
let registered = params.registry.resolve(&params.table_name).ok_or_else(|| {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolve() is called without a preceding ensure_loaded(). For async-backed resolvers that support cache eviction, the index will be gone here and the query fails — eviction is not actually handled, the error message is just friendlier.

Fix: call ensure_loaded first:

params.registry.ensure_loaded(&params.table_name).await?;
let registered = params.registry.resolve(&params.table_name).ok_or_else(|| { ... })?;

Also update the comment on line 364 — "handled correctly" implies the index is reloaded on eviction, which it is not without this call.

Comment thread src/registry.rs
/// Contains enough info for the optimizer to validate the ANN rewrite
/// and build the logical plan node, without needing the loaded index objects.
#[derive(Debug, Clone)]
pub struct VectorIndexMeta {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scalar_kind, key_col, and config are not read by the only caller of peek() — the optimizer rule uses only metric and schema. Carrying three extra fields in VectorIndexMeta increases implementor burden: a catalog-backed resolver must provide these in the sync peek() path even though they're only needed after ensure_loaded. Consider trimming VectorIndexMeta to { metric, schema } and letting callers get the rest via resolve() once the index is loaded. If you want to keep them for forward-compat, a doc note explaining which fields are actually required by the optimizer would help implementors.

Comment thread src/registry.rs
/// On cache miss or stale entry, downloads files and loads the index.
/// On cache hit with current generation, returns immediately.
/// Called from the async physical planner before building the execution plan.
async fn ensure_loaded(&self, name: &str) -> datafusion::common::Result<()>;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trait contract between ensure_loaded and resolve has a latent TOCTOU window: a concurrent eviction between the two calls produces an error rather than a retry. For USearchRegistry this is fine (never evicts), but production async-backed resolvers need to be aware. Consider documenting that implementations should guarantee the entry stays resident for at least one resolve() call after ensure_loaded returns successfully (e.g. via a short-lived pin/refcount), or explicitly note that callers must handle the None case with a retry.

@anoop-narang anoop-narang merged commit f9b4610 into main Apr 4, 2026
5 checks passed
@anoop-narang anoop-narang deleted the feat/resolver-trait branch April 4, 2026 05:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant