feat: add INDEX_FAISS — vanilla faiss adapter IndexNode#1584

Merged
sre-ci-robot merged 1 commit into zilliztech:main from foxspy:faiss-passthrough on May 16, 2026
Conversation

Collaborator

@foxspy commented Apr 15, 2026

issue: #1583

Summary

  • Introduce INDEX_FAISS ("FAISS"), a thin adapter over upstream (vanilla) faiss's index_factory DSL. Users select the concrete faiss index via faiss_index_name; all other knobs pass through verbatim to faiss.

  • Covered faiss families (runtime-param dispatch): IVF and all derivatives (IVFFlat / IVFPQ / IVFSQ / IVFRaBitQ / BinaryIVF / IVFFlatPanorama), HNSW and all derivatives, standalone PQ, PreTransform wrappers (OPQ / PCA), Refine wrappers (RFlat / Refine(...)), and SVS Vamana (gated by FAISS_ENABLE_SVS).

  • Core infrastructure:

    • New BaseConfig::CaptureRawJson virtual hook (default no-op) — lets a config subclass keep a JSON snapshot before Config::Load consumes declared fields; other configs unaffected.
    • FaissConfig declares only faiss_index_name; all other JSON keys land in raw_params for forwarding. Declared keys are filtered out automatically via __DICT__ introspection, so there is no hardcoded blacklist.
    • Upstream-bound helper at thirdparty/faiss/faiss/cppcontrib/knowhere/SearchParamsDispatch.{h,cpp}: make_search_params factory, try_set_search_param setter, and whitelist queries (supported_search_params, is_supported_build_param). Pure faiss-types API with no Knowhere symbols — candidate for a future upstream faiss PR.
  • Behavior contract:

    • Unknown faiss knobs surface as invalid_args with the offending key in the error message — not silently dropped.
    • Stringified numeric/boolean values (e.g. "nprobe": "16", "check_relative_distance": "false") are coerced, matching Knowhere's native FormatAndCheck leniency for typed fields.
    • Concurrent searches use per-request SearchParametersXxx — no shared index state mutation.
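To make the first two points of the behavior contract concrete, here is a minimal, self-contained sketch. It is not the adapter's code: `ivf_search_whitelist`, `coerce_int`, `coerce_bool`, and `validate_ivf_params` are hypothetical names, the whitelist contents are assumed for illustration, and a plain `std::map` of strings stands in for the JSON object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a per-family whitelist (contents assumed).
inline const std::set<std::string>& ivf_search_whitelist() {
    static const std::set<std::string> keys = {"nprobe", "max_codes"};
    return keys;
}

// Accept "16" the same way a native 16 would be accepted; reject "16x".
inline std::optional<long> coerce_int(const std::string& raw) {
    try {
        std::size_t pos = 0;
        const long value = std::stol(raw, &pos);
        if (pos != raw.size()) return std::nullopt;  // trailing characters
        return value;
    } catch (const std::exception&) {
        return std::nullopt;  // not a number, or out of range
    }
}

// Accept the stringified booleans "true" and "false" only.
inline std::optional<bool> coerce_bool(const std::string& raw) {
    if (raw == "true") return true;
    if (raw == "false") return false;
    return std::nullopt;
}

// Validate every raw key. An unknown key is reported by name, not dropped.
inline std::map<std::string, long> validate_ivf_params(
        const std::map<std::string, std::string>& raw) {
    std::map<std::string, long> out;
    for (const auto& [key, value] : raw) {
        if (ivf_search_whitelist().count(key) == 0) {
            throw std::invalid_argument(
                    "invalid_args: unknown faiss param '" + key + "'");
        }
        const auto parsed = coerce_int(value);
        if (!parsed) {
            throw std::invalid_argument(
                    "invalid_args: bad value for '" + key + "'");
        }
        out[key] = *parsed;
    }
    return out;
}
```

Under these assumptions, a typo such as `nprob` fails loudly with the key in the message, while `"nprobe": "16"` is accepted as the integer 16.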

Example config

{
  "index_type": "FAISS",
  "faiss_index_name": "OPQ16,IVF1024,PQ16x8",
  "metric_type": "L2",
  "nprobe": 16
}
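Applied to a config like this one, the capture step described in the summary could behave roughly as follows. This is a sketch under stated assumptions, not the PR's implementation: `capture_raw_params` is a hypothetical name, the declared-key set is passed in as a literal (the PR derives it from `__DICT__` introspection), and treating `index_type` and `metric_type` as declared keys is an assumption made only for the example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Flat string map standing in for the incoming JSON object.
using Params = std::map<std::string, std::string>;

// Keep only the keys that the typed config does not declare.
// Everything returned here would be forwarded to faiss unchanged.
inline Params capture_raw_params(const Params& json,
                                 const std::set<std::string>& declared) {
    Params raw;
    for (const auto& [key, value] : json) {
        if (declared.count(key) == 0) {
            raw.emplace(key, value);
        }
    }
    return raw;
}
```

With the assumed declared set `{index_type, metric_type, faiss_index_name}`, only `nprobe` would remain in the forwarded parameters.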

Test plan

  • New tests/ut/test_faiss_vanilla.cc: 22 Catch2 cases / 136 assertions (tag [faiss_vanilla]) covering config capture, factory creation, Flat / IVF / HNSW build+search, BitsetView filtering, serialize / deserialize roundtrip (+ mmap), range search capability probe, GetVectorByIds capability probe, binary BIVF path, OPQ+IVF+PQ (PreTransform recursion), IVF+Refine wrapper, standalone PQ, stringified-value coercion, error surfacing (invalid factory, typo, unknown family knob), Size() estimate, concurrent search isolation.
  • Full existing UT suite passes with no regressions on the standard CI build.
  • FAISS_ENABLE_SVS builds pick up one additional SVS Vamana search test.
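The "concurrent search isolation" case rests on search parameters travelling with each request instead of being written onto the index. A toy illustration of that design, with a hypothetical `ToyIvf` over 1-D points standing in for a real faiss index and `SearchParams` standing in for a per-family `SearchParametersXxx` object:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Per-request knobs; never stored on the index.
struct SearchParams {
    std::size_t nprobe = 1;
};

// Toy inverted-file index: lists[i] holds the points assigned to centroids[i].
struct ToyIvf {
    std::vector<float> centroids;
    std::vector<std::vector<float>> lists;

    // const member: the search reads the index and never mutates it.
    float nearest(float q, const SearchParams& params) const {
        assert(centroids.size() == lists.size());
        // Rank the lists by distance from the query to their centroid.
        std::vector<std::size_t> order(centroids.size());
        std::iota(order.begin(), order.end(), std::size_t{0});
        std::sort(order.begin(), order.end(),
                  [&](std::size_t a, std::size_t b) {
                      return std::fabs(centroids[a] - q) <
                             std::fabs(centroids[b] - q);
                  });
        // Scan only the nprobe closest lists.
        float best = std::numeric_limits<float>::quiet_NaN();
        float best_dist = std::numeric_limits<float>::infinity();
        const std::size_t probes = std::min(params.nprobe, order.size());
        for (std::size_t i = 0; i < probes; ++i) {
            for (float x : lists[order[i]]) {
                const float d = std::fabs(x - q);
                if (d < best_dist) {
                    best_dist = d;
                    best = x;
                }
            }
        }
        return best;
    }
};
```

Because `nearest` is `const` and takes its knobs by reference, two requests with different `nprobe` values cannot affect each other through the index, which is the property the test case is described as checking.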

🤖 Generated with Claude Code

@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: foxspy


@mergify (bot) added the dco-passed label on Apr 15, 2026
@foxspy force-pushed the faiss-passthrough branch 2 times, most recently from 392d5d0 to 2d40758, on April 15, 2026 13:19
Comment thread src/index/faiss/faiss.cc Outdated
if constexpr (std::is_same_v<DataType, fp32>) {
    auto out = std::make_unique<float[]>(nq * dim);
    for (int64_t i = 0; i < nq; ++i) {
        index_->reconstruct(ids[i], out.get() + i * dim);
Collaborator

This is going to produce an imprecise representation in most cases.
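One way to see the concern: when an index stores compressed codes instead of raw vectors, reconstruction can only return the decoded approximation. The sketch below uses a toy uniform 8-bit scalar quantizer as an assumed stand-in; it is not faiss's codec, and `encode` / `decode` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Toy uniform 8-bit quantizer over [lo, hi]. Illustrative only.
inline std::uint8_t encode(float x, float lo, float hi) {
    const float t = (std::clamp(x, lo, hi) - lo) / (hi - lo);
    return static_cast<std::uint8_t>(std::lround(t * 255.0f));
}

// Decoding maps the code back to the centre of its bucket, so the
// original value is recovered only up to half a quantization step.
inline float decode(std::uint8_t code, float lo, float hi) {
    return lo + (hi - lo) * (static_cast<float>(code) / 255.0f);
}
```

A value that is encoded and then decoded generally differs from the input by up to half a step, so a caller expecting the exact stored vector would get an approximation.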

Comment thread src/index/faiss/faiss.cc Outdated
if (!index_) {
    return false;
}
// Best-effort probe: try reconstruct(0, ...) and return false on any exception.
Collaborator

I'm not sure I follow

Collaborator Author

I'd rather lock this interface down for now. It needs a proper audit before we expose it, and the cost of explaining its current semantics outweighs the benefit.
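For readers following the thread, the pattern that the code comment calls a "best-effort probe" can be sketched as below. This is an illustration under assumptions, not the code under review: `ToyIndex`, `ToyFlat`, and `can_reconstruct` are hypothetical, and standard exceptions stand in for whatever the real index throws.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Base type whose default reconstruct() reports "not supported" by throwing.
struct ToyIndex {
    virtual ~ToyIndex() = default;
    virtual void reconstruct(std::int64_t /*id*/, float* /*out*/) const {
        throw std::runtime_error("reconstruct not implemented");
    }
};

// A type that does support reconstruction (one 1-D vector per id).
struct ToyFlat : ToyIndex {
    std::vector<float> data;
    void reconstruct(std::int64_t id, float* out) const override {
        // at() throws std::out_of_range when the id is absent.
        *out = data.at(static_cast<std::size_t>(id));
    }
};

// Probe: attempt reconstruct(0, ...) and treat any exception as "no".
inline bool can_reconstruct(const ToyIndex* index) {
    if (!index) {
        return false;
    }
    try {
        float scratch = 0.0f;
        index->reconstruct(0, &scratch);
        return true;
    } catch (const std::exception&) {
        return false;
    }
}
```

In this toy version the probe also returns false for an empty `ToyFlat`, since id 0 does not exist yet. A probe of this shape cannot tell "unsupported" apart from "nothing stored", which may be part of why its semantics are hard to explain.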

@foxspy force-pushed the faiss-passthrough branch 3 times, most recently from 8903b31 to c094492, on April 22, 2026 12:59
Adds a new IndexNode registered as INDEX_FAISS = "FAISS" that acts as a
thin adapter over upstream (vanilla) faiss's index_factory DSL. Users
select the concrete faiss index via faiss_index_name; all other knobs
are forwarded verbatim to faiss's own ParameterSpace (build) and
per-family SearchParametersXxx (search), so any param faiss accepts
works without per-index Knowhere wrapper code.

Framework change:
  - BaseConfig::CaptureRawJson(const Json&) virtual hook (default
    no-op) called from LoadConfig between FormatAndCheck and
    Config::Load. Lets FaissConfig keep a snapshot of the JSON
    before Config::Load drops keys it doesn't recognize.

Adapter (src/index/faiss/):
  - FaissConfig: only faiss_index_name is typed-declared; other JSON
    keys land in raw_params for forwarding.
  - faiss_dispatch: validates each raw key against faiss-owned
    whitelists (supported_build_param_names + quantizer_* prefix;
    supported_search_params per index family), coerces JSON values
    (accepts stringified numbers/booleans to match Knowhere's
    native FormatAndCheck leniency), and delegates family-specific
    field setters to the upstream-bound helper.
  - FaissIndexNode<DataType> template instantiated for fp32 and
    bin1. Implements Build, Search (+ BitsetView), Serialize /
    Deserialize (+ IO_FLAG_MMAP), capability-probed RangeSearch /
    GetVectorByIds / HasRawData, coarse Size() estimate. AnnIterator
    and CalcDistByIDs inherit not_implemented from the base class.
    fp16 / bf16 / int8 / sparse are not registered.

Upstream-bound helper (thirdparty/faiss/faiss/cppcontrib/knowhere/):
  - SearchParamsDispatch.{h,cpp} — faiss-types-only API with
    make_search_params factory (recurses into PreTransform / Refine),
    try_set_search_param setter (walks nested wrappers, dispatches
    to IVF / HNSW / PQ / SVS Vamana), and whitelist queries
    (supported_search_params, is_supported_build_param). MIT-licensed;
    no nlohmann::json or Knowhere symbols leaked in. Candidate for a
    future upstream faiss PR.

Behavior contract:
  - Unknown faiss knobs surface as invalid_args with the offending
    key in the error message (not silently dropped).
  - Stringified numeric / boolean values are coerced, matching
    Knowhere's native Config::FormatAndCheck leniency.
  - Concurrent searches use per-request SearchParametersXxx — no
    shared index state mutation.
  - SVS Vamana support is compiled in when FAISS_ENABLE_SVS is
    defined (e.g. x86 image builds); otherwise the SVS code paths
    are omitted cleanly.

Tests (tests/ut/test_faiss_vanilla.cc, tag [faiss_vanilla]):
  22 Catch2 cases / 136 assertions covering: config capture, factory
  creation, Flat / IVF / HNSW build+search, BitsetView filtering,
  serialize / deserialize roundtrip, range search capability probe,
  GetVectorByIds capability probe, binary BIVF path, OPQ+IVF+PQ
  (PreTransform recursion), IVF+Refine wrapper, standalone PQ,
  stringified-value coercion, error surfacing (invalid factory,
  typo, unknown family knob), Size() estimate, concurrent search
  isolation. FAISS_ENABLE_SVS builds pick up one additional SVS
  Vamana test.

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
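The commit message describes a setter that walks nested wrappers until it reaches the family that owns a given search key. A minimal sketch of that walk follows, with a hypothetical `Node` chain standing in for faiss's wrapper and index types. The real `try_set_search_param` signature is not shown in this conversation, so nothing here should be read as its API.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical node in a wrapper chain such as PreTransform -> IVF.
struct Node {
    std::string family;              // label for this level, e.g. "IVF"
    std::set<std::string> accepted;  // search keys this level owns (assumed)
    std::unique_ptr<Node> inner;     // wrapped index, or nullptr at the leaf
};

// Walk outermost to innermost and return the first family that accepts
// the key. An empty string means no level accepted it, which a caller
// could turn into an invalid_args error.
inline std::string find_owner(const Node* node, const std::string& key) {
    for (; node != nullptr; node = node->inner.get()) {
        if (node->accepted.count(key) != 0) {
            return node->family;
        }
    }
    return "";
}
```

For an `OPQ16,IVF1024,PQ16x8` style chain, a key like `nprobe` would be expected to resolve at the IVF level, after the PreTransform wrapper has been skipped.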
@foxspy force-pushed the faiss-passthrough branch from c094492 to 3317812 on April 22, 2026 16:00
@alexanderguzhva
Collaborator

/lgtm

@mergify (bot) removed the ci-passed label on May 16, 2026
@sre-ci-robot merged commit 95b6f47 into zilliztech:main on May 16, 2026
12 checks passed