
Update llama.cpp and support arm64 #19

Open
rodydavis wants to merge 3 commits into asg017:main from rodydavis:update-llama-cpp

Conversation

@rodydavis

  • Updates the llama.cpp submodule to the latest version.
  • Adapts the code to the new llama.cpp API.
  • Fixes the build process.
  • Updates the tests to reflect the changes in the embeddings.
  • Adds support for arm64.
  • Updates the .gitignore file.

@rodydavis
Author

Fixes #18

@O-J1

O-J1 commented Sep 5, 2025

Worked for me on Ubuntu in a minimal test script. If anyone else sees the size difference and is confused, that seems(?) to be expected (2 MB -> 76 KB).

I wasn't able to figure out cross-compiling it to Windows, sadly.

Thanks for this @rodydavis

@vlasky

vlasky commented Dec 4, 2025

Excellent work on this PR, @rodydavis! The API adaptations are spot-on.

One thing worth noting: the llama.cpp update changes the embedding values produced by BERT models. The test uses "alex garcia" with all-MiniLM-L6-v2:

Old llama.cpp (2b33896): first float ≈ -0.092
New llama.cpp (4fd1242): first float ≈ +0.005

I investigated the cause. It appears to be due to ggml-org/llama.cpp@6562e5a ("context: allow cache-less context for embeddings") which optimizes BERT models to skip KV cache allocation. A side effect is that llama_decode() now redirects to llama_encode() for these models, which returns the [CLS] token embedding instead of the last token embedding.
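To make the pooling change concrete, here is a toy sketch (the numbers are made up, not real model outputs) of why switching from last-token to [CLS] pooling changes the returned vector even when the per-token hidden states are identical:

```python
# Hypothetical per-token embeddings for a 3-token sequence (made-up data).
# Row 0 is the [CLS] token; the final row is the last token.
token_embeddings = [
    [0.005, 0.10, -0.20, 0.30],   # [CLS]
    [0.50, 0.60, 0.70, 0.80],
    [-0.092, 0.01, 0.02, 0.03],   # last token
]

# Old llama.cpp behavior for these models: return the last token's row.
old_style = token_embeddings[-1]
# New behavior (llama_decode() redirecting to llama_encode()): the [CLS] row.
new_style = token_embeddings[0]

print(old_style[0])  # -0.092
print(new_style[0])  # 0.005
```

Same model state, different row selected, hence entirely different absolute values in the stored embedding.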

Claude advises me that "this new behavior is actually more correct for BERT models; [CLS] is the designated sentence-level representation". Absolute values have changed, but semantic similarity is preserved. Nonetheless, embeddings from old and new versions aren't directly comparable.

The lesson is: users need to be aware that updates to llama.cpp have the potential to affect embeddings, which may require stored embeddings to be regenerated.
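If you do regenerate stored embeddings, a cosine-similarity check is a simple way to sanity-check that neighbor rankings are preserved within one llama.cpp version. A minimal plain-Python sketch (the function name is mine, not part of sqlite-lembed):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors with the same direction but different scale: similarity ~ 1.0,
# even though every absolute component differs.
v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]
print(round(cosine_similarity(v1, v2), 6))  # 1.0
```

The point is that similarity is about direction, not magnitude, so comparisons are only meaningful between embeddings produced by the same llama.cpp version and pooling behavior.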

Here is a full breakdown

@vlasky

vlasky commented Dec 4, 2025

@rodydavis you might also be interested in checking out PR #21

vlasky added a commit to vlasky/sqlite-lembed that referenced this pull request Dec 4, 2025
Updates the llama.cpp submodule and adapts code to the new API:
- llama_tokenize() now takes vocab from llama_model_get_vocab()
- llama_n_embd() -> llama_model_n_embd()
- llama_kv_cache_clear() -> llama_memory_clear(llama_get_memory(), false)
- llama_token_get_score() -> llama_vocab_get_score()
- llama_token_to_piece() now takes vocab and additional parameter
- llama_load_model_from_file() -> llama_model_load_from_file()
- llama_new_context_with_model() -> llama_init_from_model()
- llama_free_model() -> llama_model_free()
- ggml_static -> ggml in CMakeLists.txt
- Remove seed from context_options (no longer supported)

Based on PR asg017#19 by @rodydavis.

Co-Authored-By: Rody Davis <rody.davis.jr@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
vlasky added a commit to vlasky/sqlite-lembed that referenced this pull request Dec 4, 2025
- Add Darwin arm64/x86_64 architecture detection in Makefile
- Add tests/__pycache__/ to .gitignore

From PR asg017#19 by @rodydavis.

Co-Authored-By: Rody Davis <rody.davis.jr@gmail.com>
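The Darwin arm64/x86_64 detection mentioned in that commit can be sketched as follows (a hypothetical Python mirror of `uname -s` / `uname -m` style Makefile checks; the real logic lives in the referenced commit):

```python
import platform

# Equivalent of a Makefile's `uname -s` / `uname -m` probes.
system = platform.system()    # e.g. "Darwin" or "Linux"
machine = platform.machine()  # e.g. "arm64" or "x86_64"

# Pick a build target string based on the detected platform
# (target names here are illustrative, not from the Makefile).
if system == "Darwin" and machine == "arm64":
    target = "macos-arm64"
elif system == "Darwin" and machine == "x86_64":
    target = "macos-x86_64"
else:
    target = f"{system.lower()}-{machine}"

print(target)
```

On Apple Silicon this selects the arm64 path; on Intel Macs the x86_64 path; everything else falls through to a generic label.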
