Faiss handler with flat index #11839

Merged
sejubar merged 47 commits into releases/25.14.0 from faiss_ds2 on Jan 8, 2026

ea-rus (Collaborator) commented Oct 31, 2025

Description

Continuation of #11750

Query examples:

CREATE DATABASE db_faiss
WITH
    ENGINE = 'duckdb_faiss';

create knowledge base kb_faiss
using storage = db_faiss.kb_faiss,
embedding_model={"provider": "openai", "model_name": "text-embedding-3-small"};

insert into kb_faiss (id, content, legs) values (1, 'duck', 2);
insert into kb_faiss (id, content, legs) values (2, 'cat', 4);

select * from kb_faiss where
 legs=4
 and content = 'cat'
 and hybrid_search=true;

Keyword search

Implemented using the DuckDB fts extension.

How it works: when a keyword search runs and the FTS index doesn't exist, the index is created.
The index is dropped whenever a record is inserted into the KB, because DuckDB does not update an FTS index after inserts; the next keyword search rebuilds it.
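
The index lifecycle described above (build lazily on first keyword search, drop on every insert) can be sketched in plain Python. This is a toy model and not the handler's code: `LazyKeywordIndex` and its methods are hypothetical names, and the in-memory inverted index stands in for the DuckDB FTS index, which in DuckDB would be created and dropped with `PRAGMA create_fts_index` and `PRAGMA drop_fts_index`.

```python
from typing import Dict, List, Optional, Set


class LazyKeywordIndex:
    """Toy model of a keyword index that is built on demand and dropped on insert."""

    def __init__(self) -> None:
        self.rows: Dict[int, str] = {}                     # id -> content
        self._index: Optional[Dict[str, Set[int]]] = None  # token -> ids, None = no index
        self.builds = 0                                    # how many times the index was built

    def insert(self, row_id: int, content: str) -> None:
        self.rows[row_id] = content
        # The index would be stale after this insert, so drop it
        # (stand-in for dropping the FTS index).
        self._index = None

    def search(self, keyword: str) -> List[int]:
        # Create the index only when a keyword search actually needs it.
        if self._index is None:
            self._build()
        return sorted(self._index.get(keyword.lower(), set()))

    def _build(self) -> None:
        index: Dict[str, Set[int]] = {}
        for row_id, content in self.rows.items():
            for token in content.lower().split():
                index.setdefault(token, set()).add(row_id)
        self._index = index
        self.builds += 1
```

The trade-off this models: inserts stay cheap because they never touch the index, while the first keyword search after an insert pays the full rebuild cost.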

Other updates:

  • Added a lock on the faiss index file to prevent it from being opened and modified by different KBs or MindsDB instances
    • Disabled on Windows, where faiss appears to apply this lock already
  • RAM usage checks:
    • On insert: forecast the size of the inserted data in RAM; at least 1 GB of RAM should be free. This check runs on every 10k inserted records
    • On load: forecast the size in RAM before the index is loaded; if the index is larger than 1 GB, at least 1 GB of RAM should be free after loading
  • If the inserted dataset is larger than MAX_INSERT_BATCH_SIZE, it is split into batches of that size and inserted batch by batch.
    The user request stays synchronous (the user waits until the insert completes)
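
A minimal sketch of how batch splitting and the insert-time RAM check could fit together, not the handler's actual code. Assumptions: the value of `MAX_INSERT_BATCH_SIZE` is made up (the PR names the constant but not its value); all function names are hypothetical; the size forecast assumes a flat index storing float32 vectors (`n * dim * 4` bytes); the check is read as "1 GB must remain free after the chunk is added"; and `get_free_ram` is injected by the caller (in practice it might return `psutil.virtual_memory().available`).

```python
from typing import Callable, Iterator, Sequence

GB = 1024 ** 3
MAX_INSERT_BATCH_SIZE = 50_000  # hypothetical value, not taken from the PR
RAM_CHECK_INTERVAL = 10_000     # from the PR: check on every 10k inserted records
MIN_FREE_RAM = 1 * GB           # from the PR: at least 1 GB should be free


def split_into_batches(rows: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def estimate_vectors_bytes(n_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Forecast RAM needed for n float32 vectors of dimension dim."""
    return n_vectors * dim * bytes_per_value


def ensure_ram(required_bytes: int, free_bytes: int) -> None:
    """Raise if adding required_bytes would leave less than MIN_FREE_RAM free."""
    if free_bytes - required_bytes < MIN_FREE_RAM:
        raise MemoryError(
            f"insert needs ~{required_bytes} bytes, only {free_bytes} bytes free"
        )


def insert_all(
    rows: Sequence,
    dim: int,
    insert_batch: Callable[[Sequence], None],
    get_free_ram: Callable[[], int],
    batch_size: int = MAX_INSERT_BATCH_SIZE,
) -> int:
    """Insert rows synchronously, batch by batch, re-checking RAM every 10k rows."""
    inserted = 0
    for batch in split_into_batches(rows, batch_size):
        for chunk in split_into_batches(batch, RAM_CHECK_INTERVAL):
            ensure_ram(estimate_vectors_bytes(len(chunk), dim), get_free_ram())
            insert_batch(chunk)
            inserted += len(chunk)
    return inserted  # the caller only gets control back once everything is inserted
```

Because `insert_all` returns only after the last chunk is written, the caller blocks for the whole insert, which matches the synchronous behaviour described in the list above.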

Fixes https://linear.app/mindsdb/issue/FQE-1830/faiss-handler-v1-flat-index-kw-search

Type of change

  • ⚡ New feature (non-breaking change which adds functionality)

Verification Process

To ensure the changes are working as expected:

  • Test Location: Specify the URL or path for testing.
  • Verification Steps: Outline the steps or queries needed to validate the change. Include any data, configurations, or actions required to reproduce or see the new functionality.

Additional Media:

  • I have attached a brief loom video or screenshots showcasing the new functionality or change.

Checklist:

  • My code follows the style guidelines (PEP 8) of MindsDB.
  • I have appropriately commented on my code, especially in complex areas.
  • Necessary documentation updates are either made or tracked in issues.
  • Relevant unit and integration tests are updated or added.

torrmal and others added 9 commits October 15, 2025 12:50
ea-rus changed the title from "Faiss ds2" to "Faiss handler with flat index" on Nov 1, 2025
ea-rus changed the base branch from develop to releases/25.12.0 on November 21, 2025 12:43
ea-rus changed the base branch from releases/25.12.0 to releases/25.11.1 on November 24, 2025 18:02
ea-rus changed the base branch from releases/25.11.1 to releases/v26.0.0 on November 27, 2025 17:18
ea-rus changed the base branch from releases/v26.0.0 to releases/25.11.1 on November 28, 2025 08:01
github-actions (Bot) commented Dec 19, 2025

Coverage Report

File                        Stmts   Miss   Cover   Missing
mindsdb/integrations/handlers/bigquery_handler
   __init__.py                 14      3     79%   7–9
   bigquery_handler.py        140     21     85%   53, 57, 62–65, 68, 88, 109–113, 119, 141–143, 159–161, 313–314
mindsdb/integrations/handlers/file_handler
   file_handler.py            117     17     85%   25–27, 56, 59, 62, 88–89, 113, 131–132, 155–158, 166–167
mindsdb/integrations/handlers/mssql_handler
   __init__.py                 14      3     79%   10–12
   mssql_handler.py           284     41     86%   39–58, 79, 99, 106, 110, 121, 196, 244, 290–294, 306–308, 336, 363, 365–368, 402–407, 421, 444, 482, 524, 568, 600, 671, 710
mindsdb/integrations/handlers/mysql_handler
   __init__.py                 14      3     79%   10–12
   mysql_handler.py           201     11     95%   33–37, 79–84, 100, 164
   settings.py                 81     37     54%   30, 37–42, 49, 57, 71–81, 87–111, 117
mindsdb/integrations/handlers/oracle_handler
   __init__.py                 14      3     79%   10–12
   oracle_handler.py          267     47     82%   79–80, 113, 115, 117, 194, 205–206, 229–235, 239, 242–245, 253–255, 261–263, 295–297, 303, 361–378, 516–517, 622
mindsdb/integrations/handlers/postgres_handler
   __init__.py                 14      3     79%   8–10
   postgres_handler.py        326     25     92%   96–102, 196, 240, 278–279, 313–318, 345, 389–392, 455, 514, 545–546, 695
mindsdb/integrations/handlers/redshift_handler
   __init__.py                 14      3     79%   8–10
mindsdb/integrations/handlers/salesforce_handler
   __init__.py                 14      3     79%   10–12
   salesforce_handler.py      122      5     96%   102–104, 146, 341
   salesforce_tables.py        88     12     86%   140, 184–185, 231, 249–267
mindsdb/integrations/handlers/slack_handler
   __init__.py                 13      3     77%   10–12
   slack_handler.py           136     11     92%   57, 59, 122–124, 310, 314, 319, 323, 327, 342
   slack_tables.py            270     65     76%   58, 64–65, 71–72, 101–103, 156, 163–165, 251–265, 285, 293–295, 310, 340–344, 375–377, 384–387, 394, 404–408, 438–440, 447–450, 459–463, 549–551, 556, 577, 585–587, 599, 630–634, 698, 705–707
mindsdb/integrations/handlers/snowflake_handler
   __init__.py                 14      3     79%   8–10
   auth_types.py               44     12     73%   12, 38, 44, 68–77
   snowflake_handler.py       324     42     87%   33–34, 111, 116, 118, 122, 157, 159–163, 210, 294, 328, 383–399, 432, 577–594, 620–623, 637, 658, 693
mindsdb/integrations/handlers/timescaledb_handler
   __init__.py                 13      3     77%   7–9
TOTAL                        2724    376     86%

Tests Skipped Failures Errors Time
549 53 💤 0 ❌ 0 🔥 9.072s ⏱️

sejubar merged commit 16eb2e5 into releases/25.14.0 on Jan 8, 2026
18 checks passed
sejubar deleted the faiss_ds2 branch on January 8, 2026 21:59
github-actions (Bot) locked and limited conversation to collaborators on Jan 8, 2026

5 participants