Commit f62eb3c

apocryphx and claude committed
readme: document findings — Ninja rationale, Metal verification, warning interpretation
Adds information that was produced during the Apple-platform build investigation but hadn't made it into the repo:

- Full text of the -G Xcode failure on CMake 4.x + iOS/tvOS/visionOS, with verification date, so future readers can search for the exact error string
- Note that the Xcode-specific -- -quiet build flag was dropped alongside the generator switch, and that -DCMAKE_XCODE_ATTRIBUTE_* args were kept as harmless Ninja no-ops
- Concrete commands to verify Metal shader embedding (nm | grep ggml_metallib)
- Explanation of the "Unknown CPU architecture" CMake warning: it's the x86_64 CPU backend falling back to generic kernels, not a Metal fallback
- Why the smoke-test recipe uses llama-bench (llama-cli is coupled to LLAMA_BUILD_SERVER=ON in this upstream) and what good output looks like, with reference throughput numbers from the 2026-04-15 verification
- Post-rebase spot-check commands for future upstream syncs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent e38e6ad commit f62eb3c

1 file changed

Lines changed: 59 additions & 8 deletions

README.md

@@ -6,12 +6,25 @@ The goal of this fork is narrow: keep a shippable xcframework building on a mode
66

77
## What this fork changes
88

9-
Two commits on top of upstream:
9+
Three commits on top of upstream (plus this README):
1010

11-
1. **`cmake_minimum_required` bump** (`CMakeLists.txt`, `ggml/CMakeLists.txt`) — widen the accepted version range to `3.5...4.2` so CMake 4.x stops warning about removed policies.
12-
2. **`build-xcframework.sh` → Ninja generator** — the upstream script uses `-G Xcode`, which fails on CMake 4.x when cross-compiling for iOS/tvOS/visionOS (`The C compiler identification is unknown`). Switching to `-G Ninja` resolves it. `combine_static_libraries` call sites updated to drop the `Release-<sdk>/` subpath that Ninja (single-config) doesn't produce.
11+
1. **`cmake_minimum_required` bump** (`CMakeLists.txt`, `ggml/CMakeLists.txt`) — widens the accepted version range to `3.5...4.2` so CMake 4.x stops warning about removed policies. Upstream still pins to `3.14...3.28`.
12+
2. **`build-xcframework.sh` → Ninja generator** — see "Why Ninja" below. All 7 `cmake -B` invocations in the script now use `-G Ninja` instead of `-G Xcode`. The Xcode-only `-- -quiet` build argument was dropped. `combine_static_libraries` call sites now pass `.` as the `release_dir` because Ninja is single-config and emits archives directly under `src/`, not `src/Release-<sdk>/`.
13+
3. **Fork-focused README** — this file; upstream README moved to [README.upstream.md](README.upstream.md).
1314

14-
Everything else is vanilla upstream.
15+
No C/C++/Objective-C source has been touched. No APIs added, removed, or renamed. No ggml backend modifications. Library behavior is byte-for-byte identical to upstream `b8802` for the same inputs.
16+
17+
### Why Ninja
18+
19+
On CMake 4.x with Xcode 26, the Xcode generator fails when cross-compiling to iOS/tvOS/visionOS SDKs:
20+
21+
```
22+
-- The C compiler identification is unknown
23+
CMake Error at ggml/src/ggml-cpu/CMakeLists.txt:57 (target_compile_features):
24+
target_compile_features no known features for C compiler "" version .
25+
```
26+
27+
The failure reproduces against `upstream/master`, verified 2026-04-15. Ninja bypasses it entirely because it does not rely on Xcode's toolchain detection for cross-SDK builds. The resulting xcframework is equivalent — the Xcode-specific `-DCMAKE_XCODE_ATTRIBUTE_*` arguments in `COMMON_CMAKE_ARGS` are harmless no-ops under Ninja, so they were left alone rather than stripped.
1528
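
The generator rationale above can be sketched as a small version check. This is only an illustration: the fork's script hardcodes `-G Ninja`, and the helper names `pick_generator` and `installed_cmake_version` are hypothetical and not part of `build-xcframework.sh`. It assumes the first line of `cmake --version` has the form `cmake version X.Y.Z`.

```bash
# Hypothetical helper, not part of build-xcframework.sh.
# Picks a generator from a CMake version string: Ninja for 4.x and later
# (where -G Xcode fails for iOS/tvOS/visionOS cross-builds), Xcode otherwise.
pick_generator() {
    pg_version="$1"
    pg_major="${pg_version%%.*}"   # text before the first dot, e.g. "4" from "4.1.2"
    if [ "$pg_major" -ge 4 ]; then
        echo "Ninja"
    else
        echo "Xcode"
    fi
}

# Reads the installed CMake version, assuming the first line of
# `cmake --version` looks like "cmake version X.Y.Z".
installed_cmake_version() {
    cmake --version | head -n 1 | awk '{print $3}'
}
```

A configure step could then pass `-G "$(pick_generator "$(installed_cmake_version)")"`, although always using Ninja, as the fork does, is simpler and avoids maintaining two code paths.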

1629
## Building the xcframework
1730

@@ -33,7 +46,25 @@ Output: `build-apple/llama.xcframework/` containing 7 slices:
3346

3447
Mac Catalyst is **not** in the xcframework — CMake's cross-compile flags conflict when combining both Catalyst architectures in a single configure step. See [APPLE-PLATFORMS-BUILD.md](APPLE-PLATFORMS-BUILD.md) for the manual lipo workflow.
3548

36-
Every slice links `Metal.framework` and `Accelerate.framework`, and embeds the full Metal shader library (110 MSL kernels) via `GGML_METAL_EMBED_LIBRARY=ON`. No external `.metallib` file is required at runtime.
49+
Every slice links `Metal.framework` and `Accelerate.framework`, and embeds the full Metal shader library (110 MSL kernels) via `GGML_METAL_EMBED_LIBRARY=ON`. No external `.metallib` file is required at runtime. You can verify this on any slice:
50+
51+
```bash
52+
nm build-apple/llama.xcframework/ios-arm64/llama.framework/llama \
53+
| grep ggml_metallib
54+
# 000000000032cc20 S _ggml_metallib_start
55+
# 00000000003bf6d3 S _ggml_metallib_end
56+
```
57+
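
To run the same check across every slice rather than one, a loop along these lines may help. It is a sketch under one stated assumption: each slice keeps its binary at `<slice>/llama.framework/llama`, as in the `ios-arm64` path above. Slices with a different framework layout are reported as skipped rather than failed. The function names are hypothetical.

```bash
# Reads `nm` output on stdin; succeeds only if both embedded-metallib
# boundary symbols are present.
has_embedded_metallib() {
    hm_output="$(cat)"
    printf '%s\n' "$hm_output" | grep -q '_ggml_metallib_start' &&
        printf '%s\n' "$hm_output" | grep -q '_ggml_metallib_end'
}

# Checks every slice directory under an xcframework.
# Assumption: the binary lives at <slice>/llama.framework/llama.
check_all_slices() {
    cs_root="$1"
    cs_rc=0
    for cs_slice in "$cs_root"/*/; do
        cs_binary="${cs_slice}llama.framework/llama"
        if [ ! -e "$cs_binary" ]; then
            echo "SKIP $cs_slice (no binary at the assumed path)"
        elif nm "$cs_binary" | has_embedded_metallib; then
            echo "OK   $cs_slice"
        else
            echo "FAIL $cs_slice"
            cs_rc=1
        fi
    done
    return "$cs_rc"
}
```

Usage would be `check_all_slices build-apple/llama.xcframework`, which returns non-zero if any inspected slice lacks either symbol.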
58+
### Expected build-time warnings
59+
60+
During configuration of simulator/multi-arch slices you will see:
61+
62+
```
63+
CMake Warning at ggml/src/ggml-cpu/CMakeLists.txt:558 (message):
64+
Unknown CPU architecture. Falling back to generic implementations.
65+
```
66+
67+
This is **not** a Metal fallback. It fires only when x86_64 is part of the architecture list (iOS sim, macOS, visionOS sim, tvOS sim) and means the x86_64 **CPU backend** slice uses generic scalar kernels instead of AVX/AVX2. The arm64 CPU backend and the Metal backend are unaffected. For shipping on Apple Silicon devices this warning is cosmetic — no one runs production inference on an x86_64 simulator.
3768
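
The rule described above (the warning appears only when x86_64 is in the architecture list) can be written as a tiny predicate. This is a sketch that assumes the list is semicolon-separated, the format CMake uses for `CMAKE_OSX_ARCHITECTURES`. The function name is hypothetical.

```bash
# Succeeds if the given semicolon-separated architecture list contains x86_64,
# i.e. if the "Unknown CPU architecture" warning is expected for that slice.
expects_generic_cpu_warning() {
    case ";$1;" in
        *";x86_64;"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `expects_generic_cpu_warning "arm64;x86_64"` succeeds, while `expects_generic_cpu_warning "arm64"` does not, matching the device-only slices where the warning should be absent.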

3869
### Requirements
3970

@@ -45,7 +76,7 @@ Last verified: 2026-04-15 against upstream tag `b8802` with Xcode 26.4, CMake 4.
4576

4677
## Verifying Metal works
4778

48-
A quick smoke test using `llama-bench` against a host macOS build:
79+
The xcframework is a library — it doesn't ship a runnable binary. To prove Metal is functional end-to-end against the same source the xcframework was built from, do a parallel host-macOS build of `llama-bench`:
4980

5081
```bash
5182
cmake -B build-host -G Ninja \
@@ -56,7 +87,18 @@ cmake --build build-host --target llama-bench -j
5687
./build-host/bin/llama-bench -m <model>.gguf -p 64 -n 32 -ngl 99
5788
```
5889

59-
Look for `ggml_metal_library_init: using embedded metal library` and the `MTL,BLAS` backend column. The same code path runs in the xcframework slices.
90+
> **Note:** `llama-cli` is only built when `LLAMA_BUILD_SERVER=ON` in this upstream (`tools/CMakeLists.txt`). `llama-bench` is always available and is a more informative smoke test anyway — it prints tokens/sec per backend.
91+
92+
Look for:
93+
94+
- `ggml_metal_library_init: using embedded metal library` — the embedded metallib loaded, not a disk `.metallib`.
95+
- `GPU family: MTLGPUFamilyApple*` — real Apple Silicon GPU detected.
96+
- Backend column `MTL,BLAS` — Metal is the compute backend.
97+
- tg (token generation) rates in the hundreds of t/s on a small model; a CPU-only run would be roughly 10× slower.
98+
99+
Reference numbers from the 2026-04-15 verification (SmolLM2-135M-Instruct Q4_K_M on an M-series Mac): `pp64 ≈ 8098 t/s`, `tg32 ≈ 403 t/s`, backend `MTL,BLAS`, family `MTLGPUFamilyApple9`.
100+
101+
The xcframework slices contain identical Metal backend code — same `_ggml_metallib_start`/`_end` symbols, same 110 kernels — so a working host Metal build is a reliable proxy for the framework slices.
60102
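
The first three markers in the checklist above can be checked mechanically against a captured log. The sketch below assumes both stdout and stderr of `llama-bench` were redirected into one file, and it only searches for the strings quoted above. It does not parse throughput numbers. The function name is hypothetical.

```bash
# Succeeds only if the captured llama-bench log contains all three Metal markers:
# the embedded-library message, an Apple GPU family, and the MTL,BLAS backend.
check_metal_log() {
    ml_file="$1"
    grep -q 'ggml_metal_library_init: using embedded metal library' "$ml_file" &&
        grep -q 'GPU family: MTLGPUFamilyApple' "$ml_file" &&
        grep -q 'MTL,BLAS' "$ml_file"
}
```

After running the smoke test with its output redirected (for instance by appending `> bench.log 2>&1` to the `llama-bench` command), `check_metal_log bench.log` returns zero only when every marker is present.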

61103
## Syncing with upstream
62104

@@ -67,7 +109,16 @@ git rebase "$LATEST_TAG"
67109
./build-xcframework.sh # re-verify
68110
```
69111

70-
The fork's two commits rebase cleanly onto upstream tags. `ggml/CMakeLists.txt` occasionally conflicts when upstream moves code near `cmake_minimum_required`; resolve by keeping both the version bump and whatever upstream added.
112+
The fork's commits rebase cleanly onto upstream tags with one known pinch point: `ggml/CMakeLists.txt` conflicts whenever upstream adds code near `cmake_minimum_required` (e.g. the CMP0194 policy block added around tag `b8802`). Resolve by keeping **both** the fork's version-range bump and whatever upstream added adjacent to it.
113+
114+
After rebasing, run `./build-xcframework.sh` and spot-check one slice before force-pushing:
115+
116+
```bash
117+
lipo -info build-apple/llama.xcframework/ios-arm64_x86_64-simulator/llama.framework/llama
118+
# Architectures in the fat file: ... are: x86_64 arm64
119+
nm build-apple/llama.xcframework/ios-arm64/llama.framework/llama | grep ggml_metallib
120+
# Expect _ggml_metallib_start and _ggml_metallib_end symbols.
121+
```
71122
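
The `lipo` half of the spot-check can be turned into a pass/fail test instead of an eyeball comparison. This is a sketch that assumes the `lipo -info` output ends with `are: <arch> <arch> ...`, as in the comment above. The function name is hypothetical.

```bash
# Reads `lipo -info` output on stdin; succeeds if every architecture passed
# as an argument appears in the list after the final "are: ".
has_archs() {
    ha_output="$(cat)"
    ha_list=" ${ha_output##*are: } "   # pad with spaces so whole names match
    for ha_arch in "$@"; do
        case "$ha_list" in
            *" $ha_arch "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}
```

Piping the `lipo -info` command above into `has_archs x86_64 arm64` gives a non-zero exit status if either architecture is missing from the simulator slice.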

72123
## Further reading
73124
