Commit 22046e9

docs: refresh README references to upstream b9628
- README.upstream.md: resync to current upstream README
- README.md: update byte-for-byte and last-verified refs b8802 -> b9628 (re-verified locally 2026-06-13: Xcode 27.0, CMake 4.3.3, Ninja 1.13.2)

Assisted-by: Claude Opus 4.8
1 parent e063207 commit 22046e9

2 files changed

Lines changed: 11 additions & 5 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@ Three commits on top of upstream (plus this README):
 2. **`build-xcframework.sh` → Ninja generator** — see "Why Ninja" below. All 7 `cmake -B` invocations in the script now use `-G Ninja` instead of `-G Xcode`. The Xcode-only `-- -quiet` build argument was dropped. `combine_static_libraries` call sites now pass `.` as the `release_dir` because Ninja is single-config and emits archives directly under `src/`, not `src/Release-<sdk>/`.
 3. **Fork-focused README** — this file; upstream README moved to [README.upstream.md](README.upstream.md).
 
-No C/C++/Objective-C source has been touched. No APIs added, removed, or renamed. No ggml backend modifications. Library behavior is byte-for-byte identical to upstream `b8802` for the same inputs.
+No C/C++/Objective-C source has been touched. No APIs added, removed, or renamed. No ggml backend modifications. Library behavior is byte-for-byte identical to upstream `b9628` for the same inputs.
 
 ### Why Ninja
 
@@ -90,7 +90,7 @@ This is **not** a Metal fallback. It fires only when x86_64 is part of the archi
 - CMake 4.x (`brew install cmake`)
 - Ninja (`brew install ninja`)
 
-Last verified: 2026-04-15 against upstream tag `b8802` with Xcode 26.4, CMake 4.2.3, Ninja 1.13.2.
+Last verified: 2026-06-13 against upstream tag `b9628` with Xcode 27.0, CMake 4.3.3, Ninja 1.13.2.
 
 ## Verifying Metal works
 
README.upstream.md

Lines changed: 9 additions & 3 deletions
@@ -1,10 +1,12 @@
 # llama.cpp
 
-![llama](https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png)
+![llama](https://raw.githubusercontent.com/ggml-org/llama.brand/refs/heads/master/cover/llama-cpp/cover-llama-cpp-dark.svg)
 
 [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![Release](https://img.shields.io/github/v/release/ggml-org/llama.cpp)](https://github.com/ggml-org/llama.cpp/releases)
 [![Server](https://github.com/ggml-org/llama.cpp/actions/workflows/server.yml/badge.svg)](https://github.com/ggml-org/llama.cpp/actions/workflows/server.yml)
+[![Docker](https://github.com/ggml-org/llama.cpp/actions/workflows/docker.yml/badge.svg)](https://github.com/ggml-org/llama.cpp/actions/workflows/docker.yml)
+[![Winget](https://github.com/ggml-org/llama.cpp/actions/workflows/winget.yml/badge.svg)](https://github.com/ggml-org/llama.cpp/actions/workflows/winget.yml)
 
 [Manifesto](https://github.com/ggml-org/llama.cpp/discussions/205) / [ggml](https://github.com/ggml-org/ggml) / [ops](https://github.com/ggml-org/llama.cpp/blob/master/docs/ops.md)
 
@@ -27,6 +29,7 @@ LLM inference in C/C++
 - Vim/Neovim plugin for FIM completions: https://github.com/ggml-org/llama.vim
 - Hugging Face Inference Endpoints now support GGUF out of the box! https://github.com/ggml-org/llama.cpp/discussions/9669
 - Hugging Face GGUF editor: [discussion](https://github.com/ggml-org/llama.cpp/discussions/9268) | [tool](https://huggingface.co/spaces/CISCai/gguf-editor)
+- WebGPU support is now available in the browser, see a blog/demo introducing it [here](https://reeselevine.github.io/llamas-on-the-web/).
 
 ----
 
@@ -142,6 +145,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 - [x] [LFM2 models](https://huggingface.co/collections/LiquidAI/lfm2-686d721927015b2ad73eaa38)
 - [x] [Hunyuan models](https://huggingface.co/collections/tencent/hunyuan-dense-model-6890632cda26b19119c9c5e7)
 - [x] [BailingMoeV2 (Ring/Ling 2.0) models](https://huggingface.co/collections/inclusionAI/ling-v2-68bf1dd2fc34c306c1fa6f86)
+- [x] [Mellum models](https://huggingface.co/JetBrains/models?search=mellum)
 
 #### Multimodal
 
@@ -172,6 +176,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 - JavaScript/Wasm (works in browser): [tangledgroup/llama-cpp-wasm](https://github.com/tangledgroup/llama-cpp-wasm)
 - Typescript/Wasm (nicer API, available on npm): [ngxson/wllama](https://github.com/ngxson/wllama)
 - Ruby: [yoshoku/llama_cpp.rb](https://github.com/yoshoku/llama_cpp.rb)
+- Ruby: [docusealco/rllama](https://github.com/docusealco/rllama)
 - Rust (more features): [edgenai/llama_cpp-rs](https://github.com/edgenai/llama_cpp-rs)
 - Rust (nicer API): [mdrokz/rust-llama.cpp](https://github.com/mdrokz/rust-llama.cpp)
 - Rust (more direct bindings): [utilityai/llama-cpp-rs](https://github.com/utilityai/llama-cpp-rs)
@@ -279,7 +284,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 | [Metal](docs/build.md#metal-build) | Apple Silicon |
 | [BLAS](docs/build.md#blas-build) | All |
 | [BLIS](docs/backend/BLIS.md) | All |
-| [SYCL](docs/backend/SYCL.md) | Intel and Nvidia GPU |
+| [SYCL](docs/backend/SYCL.md) | Intel GPU |
 | [OpenVINO [In Progress]](docs/backend/OPENVINO.md) | Intel CPUs, GPUs, and NPUs |
 | [MUSA](docs/build.md#musa) | Moore Threads GPU |
 | [CUDA](docs/build.md#cuda) | Nvidia GPU |
@@ -289,7 +294,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 | [CANN](docs/build.md#cann) | Ascend NPU |
 | [OpenCL](docs/backend/OPENCL.md) | Adreno GPU |
 | [IBM zDNN](docs/backend/zDNN.md) | IBM Z & LinuxONE |
-| [WebGPU [In Progress]](docs/build.md#webgpu) | All |
+| [WebGPU](docs/build.md#webgpu) | All |
 | [RPC](https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc) | All |
 | [Hexagon [In Progress]](docs/backend/snapdragon/README.md) | Snapdragon |
 | [VirtGPU](docs/backend/VirtGPU.md) | VirtGPU APIR |
@@ -529,6 +534,7 @@ To learn more about model quantization, [read this documentation](tools/quantize
 - [How to build](docs/build.md)
 - [Running on Docker](docs/docker.md)
 - [Build on Android](docs/android.md)
+- [Multi-GPU usage](docs/multi-gpu.md)
 - [Performance troubleshooting](docs/development/token_generation_performance_tips.md)
 - [GGML tips & tricks](https://github.com/ggml-org/llama.cpp/wiki/GGML-Tips-&-Tricks)
 