fix: skip live table scan for deleted-only queries #2459

Open
Nao-ris wants to merge 1 commit into couchbase:master from Nao-ris:fix/skip-live-table-scan-on-deleted-only-queries

@Nao-ris Nao-ris commented Apr 2, 2026

Context

Hello 👋

Some of our users have large cblite databases (many over 2 GB).
To keep their size in check, we periodically query for tombstones and purge them after a successful push replication.
We run the query SELECT _._id FROM _ WHERE _._deleted LIMIT 100 and noticed that it is very slow even with the LIMIT. After investigation, it appears that the query generates a UNION scan of both the live docs table (kv_default) and the deleted docs table (kv_del_default). The live table scan finds zero matches but still reads every row.
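To illustrate why the LIMIT does not help here, the following is a toy in-memory model, not SQLite's actual query planner. It assumes the live table is scanned before the deleted table; the names Row, ScanResult, and scanForDeleted are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy row: only the fields needed for the illustration.
struct Row {
    std::string id;
    bool deleted;
};

struct ScanResult {
    std::vector<std::string> ids;  // ids of matching (deleted) rows, at most `limit`
    std::size_t rowsRead = 0;      // rows touched while searching
};

// Scan the given tables in order and collect ids of deleted rows
// until `limit` matches have been found.
ScanResult scanForDeleted(const std::vector<const std::vector<Row>*>& tables,
                          std::size_t limit) {
    ScanResult result;
    for (const std::vector<Row>* table : tables) {
        assert(table != nullptr);
        for (const Row& row : *table) {
            if (result.ids.size() >= limit) return result;  // LIMIT satisfied
            ++result.rowsRead;
            if (row.deleted) result.ids.push_back(row.id);
        }
    }
    return result;
}
```

In this model, a live table with no deleted rows never contributes a match, so the limit cannot be reached until every live row has been read. Passing only the deleted table stops after `limit` rows.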

We see two potential solutions:

  1. skip live table scan for deleted-only queries
  2. create a new API endpoint in couchbase-lite-core & couchbase-lite-C to retrieve the ids of all deleted documents

This PR is my proposal to tackle solution 1.
Feel free to tell me if you see issues with it, or if you would prefer solution 2.

Summary

This change detects when the WHERE clause guarantees only deleted documents can match, and routes the query to kv_del_<collection> directly instead of the all_<collection> UNION view. This resolves the //FIXME: Support kDeletedDocs at QueryTranslator.cc:144.

Patterns optimized

  • WHERE _._deleted (bare boolean)
  • WHERE _._deleted = true / IS TRUE
  • WHERE _._deleted != false / IS NOT FALSE
  • All of the above inside AND chains

Patterns that reference _deleted without guaranteeing it (OR, SELECT, = false, etc.) correctly fall back to the UNION view.
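The pattern list above can be sketched as a small recursive check over a WHERE-clause tree. This is a simplified illustration, not the code in this PR: Op, Expr, node, comparesDeletedTo, and guaranteesDeleted are hypothetical names, the tree is not LiteCore's parse representation, and treating an AND as guaranteed when at least one conjunct is guaranteed is my reading of "inside AND chains".

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified expression node kinds.
enum class Op { DeletedProp, BoolLiteral, Eq, Ne, And, Or, Other };

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
    Op op = Op::Other;
    bool value = false;         // meaningful for BoolLiteral only
    std::vector<ExprPtr> args;  // operands
};

// Convenience constructor for building small trees.
ExprPtr node(Op op, std::vector<ExprPtr> args = {}, bool value = false) {
    auto e = std::make_shared<Expr>();
    e->op = op;
    e->value = value;
    e->args = std::move(args);
    return e;
}

// True if `e` compares _deleted with the boolean literal `literal`,
// in either operand order.
bool comparesDeletedTo(const Expr& e, bool literal) {
    if (e.args.size() != 2) return false;
    assert(e.args[0] != nullptr && e.args[1] != nullptr);
    const Expr& a = *e.args[0];
    const Expr& b = *e.args[1];
    auto isLit = [literal](const Expr& x) {
        return x.op == Op::BoolLiteral && x.value == literal;
    };
    return (a.op == Op::DeletedProp && isLit(b)) ||
           (b.op == Op::DeletedProp && isLit(a));
}

// True only when every row matching `e` must be a deleted document.
bool guaranteesDeleted(const Expr& e) {
    switch (e.op) {
        case Op::DeletedProp: return true;                         // WHERE _._deleted
        case Op::Eq:          return comparesDeletedTo(e, true);   // = true / IS TRUE
        case Op::Ne:          return comparesDeletedTo(e, false);  // != false / IS NOT FALSE
        case Op::And:  // one guaranteeing conjunct restricts the whole AND
            return std::any_of(e.args.begin(), e.args.end(), [](const ExprPtr& arg) {
                assert(arg != nullptr);
                return guaranteesDeleted(*arg);
            });
        default:              return false;  // OR, = false, anything else: fall back
    }
}
```

Anything the check does not recognize returns false, so an unrecognized pattern costs performance (the UNION view is used) rather than correctness.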

Safety: default collection migration

The default collection may have deleted docs in kv_default if the database was upgraded from a version prior to 3.1 (when kv_del_ tables were introduced) and the background migration has not yet completed. A new isDeletedTableComplete() delegate method checks this: for non-default collections it always returns true; for the default collection it checks kMaxRowidWithDeletedInDefault. When migration is incomplete, the query falls back to the UNION view.
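The fallback rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: CollectionState and sourceTableFor are hypothetical, the signature of isDeletedTableComplete() is invented for the sketch, and I assume a value of 0 for the kMaxRowidWithDeletedInDefault marker means no deleted docs remain in kv_default (the real semantics may differ).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical snapshot of the state the delegate would consult.
struct CollectionState {
    std::string name;        // e.g. "default"
    bool isDefault = false;
    // Assumption: 0 means the migration is complete; a positive value means
    // kv_default may still contain deleted docs up to that rowid.
    std::int64_t maxRowidWithDeletedInDefault = 0;
};

// Non-default collections never held deleted docs in their live table,
// so only the default collection needs the migration check.
bool isDeletedTableComplete(const CollectionState& c) {
    if (!c.isDefault) return true;
    return c.maxRowidWithDeletedInDefault == 0;
}

// Choose the table or view a query should read from.
std::string sourceTableFor(const CollectionState& c, bool whereGuaranteesDeleted) {
    assert(!c.name.empty());
    if (whereGuaranteesDeleted && isDeletedTableComplete(c)) {
        return "kv_del_" + c.name;  // deleted-docs table only
    }
    return "all_" + c.name;         // UNION view over live and deleted docs
}
```

Both conditions must hold before the direct route is taken, so a database that is still migrating keeps the current behavior.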

Known limitation

Live queries (change tracking) are not registered for deleted-only queries, consistent with the existing behavior for queries using the all_* UNION view.

Test plan

  • CppTests: 685,963 assertions, 553 cases — all passed
  • C4Tests: 543,111 assertions, 178 cases — all passed
  • clang-format applied

When the WHERE clause guarantees only deleted documents can match
(e.g. WHERE _._deleted), route the query to kv_del_<collection>
directly instead of the all_<collection> UNION view. This avoids
an expensive full scan of the live docs table.

Resolves the FIXME at QueryTranslator.cc:144 (Support kDeletedDocs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Nao-ris Nao-ris marked this pull request as ready for review April 2, 2026 10:06
@jianminzhao
Contributor

Hi @Nao-ris, have you signed the CLA? See https://developer.couchbase.com/open-source-projects/ (Steps for Contributing to a Project).
Thanks!

2 participants