readme: document findings — Ninja rationale, Metal verification, warning interpretation
Adds information that was produced during the Apple-platform build investigation
but hadn't made it into the repo:
- Full text of the -G Xcode failure on CMake 4.x + iOS/tvOS/visionOS, with
verification date, so future readers can search for the exact error string
- Note that the Xcode-specific -- -quiet build flag was dropped alongside the
generator switch, and that -DCMAKE_XCODE_ATTRIBUTE_* args were kept as
harmless Ninja no-ops
- Concrete commands to verify Metal shader embedding (nm | grep ggml_metallib)
- Explanation of the "Unknown CPU architecture" CMake warning — it's the x86_64
CPU backend falling back to generic kernels, not a Metal fallback
- Why the smoke-test recipe uses llama-bench (llama-cli is coupled to
LLAMA_BUILD_SERVER=ON in this upstream) and what good output looks like,
with reference throughput numbers from the 2026-04-15 verification
- Post-rebase spot-check commands for future upstream syncs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
README.md: 59 additions & 8 deletions
@@ -6,12 +6,25 @@ The goal of this fork is narrow: keep a shippable xcframework building on a mode
 ## What this fork changes

-Two commits on top of upstream:
+Three commits on top of upstream (plus this README):

-1. **`cmake_minimum_required` bump** (`CMakeLists.txt`, `ggml/CMakeLists.txt`) — widen the accepted version range to `3.5...4.2` so CMake 4.x stops warning about removed policies.
-2. **`build-xcframework.sh` → Ninja generator** — the upstream script uses `-G Xcode`, which fails on CMake 4.x when cross-compiling for iOS/tvOS/visionOS (`The C compiler identification is unknown`). Switching to `-G Ninja` resolves it. `combine_static_libraries` call sites updated to drop the `Release-<sdk>/` subpath that Ninja (single-config) doesn't produce.
+1. **`cmake_minimum_required` bump** (`CMakeLists.txt`, `ggml/CMakeLists.txt`) — widens the accepted version range to `3.5...4.2` so CMake 4.x stops warning about removed policies. Upstream still pins to `3.14...3.28`.
+2. **`build-xcframework.sh` → Ninja generator** — see "Why Ninja" below. All 7 `cmake -B` invocations in the script now use `-G Ninja` instead of `-G Xcode`. The Xcode-only `-- -quiet` build argument was dropped. `combine_static_libraries` call sites now pass `.` as the `release_dir` because Ninja is single-config and emits archives directly under `src/`, not `src/Release-<sdk>/`.
+3. **Fork-focused README** — this file; upstream README moved to [README.upstream.md](README.upstream.md).

-Everything else is vanilla upstream.
+No C/C++/Objective-C source has been touched. No APIs added, removed, or renamed. No ggml backend modifications. Library behavior is byte-for-byte identical to upstream `b8802` for the same inputs.
+
+### Why Ninja
+
+On CMake 4.x with Xcode 26, the Xcode generator fails when cross-compiling to iOS/tvOS/visionOS SDKs:
+
+```
+-- The C compiler identification is unknown
+CMake Error at ggml/src/ggml-cpu/CMakeLists.txt:57 (target_compile_features):
+target_compile_features no known features for C compiler "" version .
+```
+
+The failure reproduces against `upstream/master`, verified 2026-04-15. Ninja bypasses it entirely because it does not rely on Xcode's toolchain detection for cross-SDK builds. The resulting xcframework is equivalent — the Xcode-specific `-DCMAKE_XCODE_ATTRIBUTE_*` arguments in `COMMON_CMAKE_ARGS` are harmless no-ops under Ninja, so they were left alone rather than stripped.
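To illustrate the generator switch described above, here is a minimal sketch of what a Ninja-based configure line for one Apple SDK could look like. The function name `ninja_configure_cmd`, the `build-<sdk>` directory naming, and the exact flag set are assumptions for illustration and are not taken from `build-xcframework.sh`. The function prints the command rather than running it, so it can be inspected on a machine without Xcode.

```shell
# Print (do not run) a Ninja-based CMake configure command for one Apple SDK.
# Hypothetical helper: directory naming and flag selection are assumptions,
# not the contents of build-xcframework.sh.
ninja_configure_cmd() {
    system_name=$1   # CMAKE_SYSTEM_NAME value, e.g. iOS, tvOS, visionOS
    sysroot=$2       # SDK name, e.g. iphoneos, iphonesimulator
    archs=$3         # semicolon-separated list, e.g. "arm64;x86_64"
    printf 'cmake -B build-%s -G Ninja' "$sysroot"
    printf ' -DCMAKE_SYSTEM_NAME=%s' "$system_name"
    printf ' -DCMAKE_OSX_SYSROOT=%s' "$sysroot"
    printf " -DCMAKE_OSX_ARCHITECTURES='%s'" "$archs"
    # Ninja is single-config, so the build type is chosen at configure time.
    printf ' -DCMAKE_BUILD_TYPE=Release\n'
}

# Example:
#   ninja_configure_cmd iOS iphoneos arm64
```

Because the build type is fixed at configure time under a single-config generator, no per-configuration `Release-<sdk>/` directory appears in the build tree, which matches the `combine_static_libraries` change noted in the commit.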
 Mac Catalyst is **not** in the xcframework — CMake's cross-compile flags conflict when combining both Catalyst architectures in a single configure step. See [APPLE-PLATFORMS-BUILD.md](APPLE-PLATFORMS-BUILD.md) for the manual lipo workflow.

-Every slice links `Metal.framework` and `Accelerate.framework`, and embeds the full Metal shader library (110 MSL kernels) via `GGML_METAL_EMBED_LIBRARY=ON`. No external `.metallib` file is required at runtime.
+Every slice links `Metal.framework` and `Accelerate.framework`, and embeds the full Metal shader library (110 MSL kernels) via `GGML_METAL_EMBED_LIBRARY=ON`. No external `.metallib` file is required at runtime. You can verify this on any slice:
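A minimal sketch of such a check, based on the `nm | grep ggml_metallib` approach named in the commit message. The helper name `has_embedded_metallib` and the slice path in the usage comment are placeholders; the symbol names are the `_ggml_metallib_start`/`_end` pair mentioned elsewhere in this README.

```shell
# Succeed only if the symbol listing on stdin contains both markers that
# bracket the embedded Metal library. Hypothetical helper name.
has_embedded_metallib() {
    listing=$(cat)
    printf '%s\n' "$listing" | grep -qF 'ggml_metallib_start' &&
        printf '%s\n' "$listing" | grep -qF 'ggml_metallib_end'
}

# Usage on a real slice (the path is a placeholder, adjust to your build):
#   nm path/to/slice/binary | has_embedded_metallib && echo "metallib embedded"
```

Reading from stdin keeps the check independent of where the slice lives, so the same function works for any binary or static archive that `nm` can list.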
+During configuration of simulator/multi-arch slices you will see:
+
+```
+CMake Warning at ggml/src/ggml-cpu/CMakeLists.txt:558 (message):
+Unknown CPU architecture. Falling back to generic implementations.
+```
+
+This is **not** a Metal fallback. It fires only when x86_64 is part of the architecture list (iOS sim, macOS, visionOS sim, tvOS sim) and means the x86_64 **CPU backend** slice uses generic scalar kernels instead of AVX/AVX2. The arm64 CPU backend and the Metal backend are unaffected. For shipping on Apple Silicon devices this warning is cosmetic — no one runs production inference on an x86_64 simulator.
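As a rough aid for interpreting the warning, the sketch below reports whether a slice's architecture list includes x86_64, which is the condition the README gives for the warning to fire. The helper name `slice_has_x86_64` is hypothetical; the input is assumed to be a space-separated list such as the one `lipo -archs` prints.

```shell
# Return 0 if the space-separated architecture list contains x86_64,
# i.e. the generic-kernel warning is expected for this slice.
slice_has_x86_64() {
    case " $1 " in
        *" x86_64 "*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Usage (the binary path is a placeholder):
#   if slice_has_x86_64 "$(lipo -archs path/to/slice/binary)"; then
#       echo "warning expected: x86_64 CPU backend uses generic kernels"
#   fi
```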

 ### Requirements

@@ -45,7 +76,7 @@ Last verified: 2026-04-15 against upstream tag `b8802` with Xcode 26.4, CMake 4.

 ## Verifying Metal works

-A quick smoke test using `llama-bench` against a host macOS build:
+The xcframework is a library — it doesn't ship a runnable binary. To prove Metal is functional end-to-end against the same source the xcframework was built from, do a parallel host-macOS build of `llama-bench`:
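A minimal sketch of such a host build, under stated assumptions: the `build-host` directory name, the `run` and `host_bench_smoke_test` helper names, and the model path are placeholders, and `-p 64 -n 32` is chosen only to line up with the `pp64`/`tg32` figures quoted in this README. Setting `DRY_RUN` prints the commands instead of executing them.

```shell
# Run a command, or only print it when DRY_RUN is set (useful off macOS).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Configure, build, and run llama-bench on the host with Metal enabled.
# $1: path to a GGUF model file (placeholder, supply your own).
host_bench_smoke_test() {
    model=$1
    run cmake -B build-host -G Ninja -DCMAKE_BUILD_TYPE=Release \
        -DGGML_METAL=ON -DGGML_METAL_EMBED_LIBRARY=ON &&
    run cmake --build build-host --target llama-bench &&
    run ./build-host/bin/llama-bench -m "$model" -p 64 -n 32
}

# Usage:
#   host_bench_smoke_test path/to/model.gguf
```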
-Look for `ggml_metal_library_init: using embedded metal library` and the `MTL,BLAS` backend column. The same code path runs in the xcframework slices.
+> **Note:** `llama-cli` is only built when `LLAMA_BUILD_SERVER=ON` in this upstream (`tools/CMakeLists.txt`). `llama-bench` is always available and is a more informative smoke test anyway — it prints tokens/sec per backend.
+
+Look for:
+
+- `ggml_metal_library_init: using embedded metal library` — the embedded metallib loaded, not a disk `.metallib`.
+- `GPU family: MTLGPUFamilyApple*` — real Apple Silicon GPU detected.
+- Backend column `MTL,BLAS` — Metal is the compute backend.
+- tg (token generation) rates in the hundreds of t/s on a small model; CPU-only would be 10× slower.
+
+Reference numbers from the 2026-04-15 verification (SmolLM2-135M-Instruct Q4_K_M on an M-series Mac): `pp64 ≈ 8098 t/s`, `tg32 ≈ 403 t/s`, backend `MTL,BLAS`, family `MTLGPUFamilyApple9`.
+
+The xcframework slices contain identical Metal backend code — same `_ggml_metallib_start`/`_end` symbols, same 110 kernels — so a working host Metal build is a reliable proxy for the framework slices.
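The textual markers listed above can be checked mechanically against a saved `llama-bench` log. This is a sketch only: the helper name `check_bench_log` is hypothetical, and it covers the three string markers, not the throughput figures, which vary by machine and model.

```shell
# Check a saved llama-bench log for the three Metal markers.
# Prints one line per marker; returns non-zero if any marker is missing.
check_bench_log() {
    log=$1
    status=0
    for marker in \
        'ggml_metal_library_init: using embedded metal library' \
        'GPU family: MTLGPUFamilyApple' \
        'MTL,BLAS'
    do
        if grep -qF -- "$marker" "$log"; then
            echo "ok:      $marker"
        else
            echo "MISSING: $marker"
            status=1
        fi
    done
    return "$status"
}

# Usage (file name is a placeholder):
#   ./llama-bench ... > bench.log 2>&1 && check_bench_log bench.log
```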

 ## Syncing with upstream

@@ -67,7 +109,16 @@ git rebase "$LATEST_TAG"
 ./build-xcframework.sh # re-verify
 ```

-The fork's two commits rebase cleanly onto upstream tags. `ggml/CMakeLists.txt` occasionally conflicts when upstream moves code near `cmake_minimum_required`; resolve by keeping both the version bump and whatever upstream added.
+The fork's commits rebase cleanly onto upstream tags with one known pinch point: `ggml/CMakeLists.txt` conflicts whenever upstream adds code near `cmake_minimum_required` (e.g. the CMP0194 policy block added around tag `b8802`). Resolve by keeping **both** the fork's version-range bump and whatever upstream added adjacent to it.
+
+After rebasing, run `./build-xcframework.sh` and spot-check one slice before force-pushing: