fix: prevent segfault when input exceeds batch size #21

Open
vlasky wants to merge 1 commit into asg017:main from vlasky:fix-batch-overflow-segfault

Conversation


@vlasky vlasky commented Dec 4, 2025

Summary

  • Fix buffer overflow segfault when input tokenizes to more than 512 tokens (the hardcoded batch capacity)
  • Size batch dynamically to actual token count instead of fixed 512
  • Add bounds check with actionable error message when token count exceeds model's context size
  • Fix memory leak: free tokens array after use

Details

The batch was initialized with a fixed capacity of 512 tokens, but the loop populating it had no bounds check. When an input tokenized to more than 512 tokens, this caused a buffer overflow and segmentation fault.

Resolves #20

Test plan

  • Verify extension builds successfully
  • Test with inputs that tokenize to more than 512 tokens; the call should return an error instead of crashing
  • Existing tests continue to pass

The batch was initialized with a fixed capacity of 512 tokens, but the
loop populating it had no bounds check. When processing documents with
more than 512 tokens, this caused a buffer overflow and segmentation
fault.

Changes:
- Size batch to actual token_count instead of fixed 512
- Add bounds check: error if token_count > context size
- Return actionable error message with token count and limit
- Fix memory leak: free tokens array after use
- Add specific error messages for decode/embedding failures

Fixes asg017#20

Co-Authored-By: Claude <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Segmentation fault core dumped computing lembed for certain values
