Problem
The dolt_tags system table does not have an index on tag_name, causing queries to perform a full table scan. This becomes a significant performance bottleneck when there are many tags.
Reproduction
-- Create many tags (we have ~14,000 tags for block tracking)
-- Then query by tag_name:
EXPLAIN ANALYZE SELECT tag_hash FROM dolt_tags WHERE tag_name = '0x00027A9A615216A898DC3F2DE7973573D4A6C628910FDE665BC8BAF2903F178B';
Query plan shows full table scan:
Project
├─ columns: [dolt_tags.tag_hash]
└─ Filter
├─ (dolt_tags.tag_name = '...')
└─ Table
└─ name: dolt_tags
Performance Impact
With 14,299 tags:
- Query time: ~235ms per lookup
- Expected with index: <5ms
We tried adding an index manually but system tables are not indexable:
CREATE INDEX idx_tag_name ON dolt_tags(tag_name(66));
-- Error 1105 (HY000): the table is not indexable
Use Case
We use Dolt for blockchain indexing where each block gets a tag (0x{block_hash}). When checking if a block has been processed, we query dolt_tags by tag_name. With thousands of blocks, this becomes the main bottleneck.
Suggested Solution
Add an internal index on tag_name for the dolt_tags system table, or provide a more efficient lookup mechanism for tags by name.
Environment
- Dolt version: latest (docker image)
- OS: Linux
- Tag count: ~14,000
Workaround
For now, we're considering maintaining a separate user table to track processed blocks instead of relying on dolt_tags lookups.
Problem
The
dolt_tagssystem table does not have an index ontag_name, causing queries to perform a full table scan. This becomes a significant performance bottleneck when there are many tags.Reproduction
Query plan shows full table scan:
Performance Impact
With 14,299 tags:
We tried adding an index manually but system tables are not indexable:
Use Case
We use Dolt for blockchain indexing where each block gets a tag (
0x{block_hash}). When checking if a block has been processed, we querydolt_tagsbytag_name. With thousands of blocks, this becomes the main bottleneck.Suggested Solution
Add an internal index on
tag_namefor thedolt_tagssystem table, or provide a more efficient lookup mechanism for tags by name.Environment
Workaround
For now, we're considering maintaining a separate user table to track processed blocks instead of relying on
dolt_tagslookups.