fix: skip live table scan for deleted-only queries #2459
Open
Nao-ris wants to merge 1 commit into couchbase:master from
When the WHERE clause guarantees only deleted documents can match (e.g. WHERE _._deleted), route the query to kv_del_<collection> directly instead of the all_<collection> UNION view. This avoids an expensive full scan of the live docs table. Resolves the FIXME at QueryTranslator.cc:144 (Support kDeletedDocs).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
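As a rough illustration of the routing described above, here is a minimal sketch. The `Expr` node type, the `node` helper, `guaranteesDeleted`, and `sourceTable` are hypothetical names invented for this example and are not the translator's actual AST or API. The `deletedTableComplete` flag stands in for the `isDeletedTableComplete()` check described under "Safety: default collection migration". Treating `IS TRUE` / `IS NOT FALSE` as the `Eq` / `Ne` shapes is also an assumption of this toy model.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified WHERE-clause node (not the real translator AST).
enum class Kind { DeletedProp, BoolLiteral, Eq, Ne, Or, Other };

struct Expr {
    Kind kind = Kind::Other;
    bool value = false;              // meaningful for BoolLiteral only
    std::shared_ptr<Expr> lhs, rhs;  // meaningful for Eq / Ne / Or
};
using ExprPtr = std::shared_ptr<Expr>;

// Convenience constructor for the toy nodes above.
ExprPtr node(Kind k, bool v = false, ExprPtr l = nullptr, ExprPtr r = nullptr) {
    auto e = std::make_shared<Expr>();
    e->kind = k;
    e->value = v;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

bool isDeletedProp(const ExprPtr& e) { return e && e->kind == Kind::DeletedProp; }
bool isLiteral(const ExprPtr& e, bool v) {
    return e && e->kind == Kind::BoolLiteral && e->value == v;
}

// True only for: bare `_deleted`, `_deleted = true`, `_deleted != false`.
// Every other shape (OR, `= false`, unrelated predicates) returns false.
bool guaranteesDeleted(const ExprPtr& where) {
    if (!where) return false;
    switch (where->kind) {
        case Kind::DeletedProp: return true;
        case Kind::Eq: return isDeletedProp(where->lhs) && isLiteral(where->rhs, true);
        case Kind::Ne: return isDeletedProp(where->lhs) && isLiteral(where->rhs, false);
        default: return false;
    }
}

// Choose the source table: the deleted-docs table only when the WHERE clause
// guarantees deleted docs AND that table is known to be complete; otherwise
// fall back to the UNION view.
std::string sourceTable(const std::string& collection, const ExprPtr& where,
                        bool deletedTableComplete) {
    if (deletedTableComplete && guaranteesDeleted(where)) return "kv_del_" + collection;
    return "all_" + collection;
}
```

The sketch is deliberately conservative: any shape it does not recognize falls back to the UNION view, so a missed pattern costs performance but never changes which rows the query returns.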
Contributor
Hi @Nao-ris, have you signed the CLA? See https://developer.couchbase.com/open-source-projects/ (Steps for Contributing to a Project).
Context
Hello 👋
Some of our users have large cblite databases (many > 2GB).
To keep their size in check we periodically query for tombstones to purge them after a successful push replication.
We run the query
SELECT _._id FROM _ WHERE _._deleted LIMIT 100
and we noticed it is very slow even with a LIMIT. After investigation, it appears the query generates a UNION scan of both the live docs table (kv_default) and the deleted docs table (kv_del_default). The live table scan finds zero matches but still reads every row.
We see two potential solutions:
couchbase-lite-core & couchbase-lite-C to retrieve the ids of all deleted documents
This PR is my proposal to tackle solution 1.
Feel free to tell me if you see issues with it, or if you would prefer solution 2.
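To illustrate why the LIMIT does not help in the scenario above, here is a toy in-memory model. It is only a sketch and not how SQLite actually plans or executes the query. The names `Table`, `ScanResult`, and `scanForDeleted` are made up for this example, and it assumes the live table is visited before the deleted table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a table: each entry is one row's "deleted" flag.
using Table = std::vector<bool>;

struct ScanResult {
    std::size_t matches;   // rows that satisfied the predicate
    std::size_t rowsRead;  // rows the scan had to visit
};

// Visit the tables in order and stop once `limit` deleted rows are found.
// A table with no matching rows never advances `matches`, so it is read in
// full before the scan can move on to the next table.
ScanResult scanForDeleted(const std::vector<const Table*>& tables, std::size_t limit) {
    ScanResult r{0, 0};
    for (const Table* t : tables) {
        for (bool deleted : *t) {
            if (r.matches == limit) return r;
            ++r.rowsRead;
            if (deleted) ++r.matches;
        }
    }
    return r;
}
```

With 1,000,000 live rows (none deleted) and 500 tombstones, `scanForDeleted({&live, &tombstones}, 100)` reads 1,000,100 rows in this model, while `scanForDeleted({&tombstones}, 100)` reads only 100.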
Summary
This change detects when the WHERE clause guarantees only deleted documents can match, and routes the query to kv_del_<collection> directly instead of the all_<collection> UNION view. This resolves the //FIXME: Support kDeletedDocs at QueryTranslator.cc:144.
Patterns optimized
WHERE _._deleted (bare boolean)
WHERE _._deleted = true / IS TRUE
WHERE _._deleted != false / IS NOT FALSE
Patterns that reference _deleted without guaranteeing it (OR, SELECT, = false, etc.) correctly fall back to the UNION view.
Safety: default collection migration
The default collection may have deleted docs in kv_default if the database was upgraded from a version prior to 3.1 (when kv_del_ tables were introduced) and the background migration has not yet completed. A new isDeletedTableComplete() delegate method checks this: non-default collections always return true; the default collection checks kMaxRowidWithDeletedInDefault. When migration is incomplete, the query falls back to the UNION view.
Known limitation
Live queries (change tracking) are not registered for deleted-only queries, consistent with the existing behavior for queries using the all_* UNION view.
Test plan