Skip to content

dolt_tags system table missing index on tag_name - causes O(n) lookup #10365

@niconiconi

Description

@niconiconi

Problem

The dolt_tags system table does not have an index on tag_name, causing queries to perform a full table scan. This becomes a significant performance bottleneck when there are many tags.

Reproduction

-- Create many tags (we have ~14,000 tags for block tracking)
-- Then query by tag_name:

EXPLAIN ANALYZE SELECT tag_hash FROM dolt_tags WHERE tag_name = '0x00027A9A615216A898DC3F2DE7973573D4A6C628910FDE665BC8BAF2903F178B';

Query plan shows full table scan:

Project
 ├─ columns: [dolt_tags.tag_hash]
 └─ Filter
     ├─ (dolt_tags.tag_name = '...')
     └─ Table
         └─ name: dolt_tags

Performance Impact

With 14,299 tags:

  • Query time: ~235ms per lookup
  • Expected with index: <5ms

We tried adding an index manually but system tables are not indexable:

CREATE INDEX idx_tag_name ON dolt_tags(tag_name(66));
-- Error 1105 (HY000): the table is not indexable

Use Case

We use Dolt for blockchain indexing where each block gets a tag (0x{block_hash}). When checking if a block has been processed, we query dolt_tags by tag_name. With thousands of blocks, this becomes the main bottleneck.

Suggested Solution

Add an internal index on tag_name for the dolt_tags system table, or provide a more efficient lookup mechanism for tags by name.

Environment

  • Dolt version: latest (docker image)
  • OS: Linux
  • Tag count: ~14,000

Workaround

For now, we're considering maintaining a separate user table to track processed blocks instead of relying on dolt_tags lookups.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions