build: macOS 26 SDK arm64-macos shim for Zig 0.15.x #2499

Open
navidemad wants to merge 3 commits into lightpanda-io:main from navidemad:build/darwin-sdk-shim

@navidemad
Contributor

What this fixes

make test / make build-dev / zig build fail on macOS 26+ (Apple silicon) with ~30 undefined libSystem / CoreFoundation / SystemConfiguration symbols at the build-runner link stage, before any project code is touched.

Root cause: macOS 26's CommandLineTools SDK dropped arm64-macos from every system .tbd library stub; only arm64e-macos remains. Zig 0.15.x bundles a lib/libc/darwin/libSystem.tbd that does export arm64-macos, but its SDKSettings.json is pinned to macOS 15.5. Zig's host detection sees the higher-numbered system SDK at /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk and picks it; the arm64 link then can't resolve symbols that Apple now ships only under arm64e-macos.

Concretely (without this PR, on a stock macOS 26.5 + Zig 0.15.2 host):

$ zig build -Dprebuilt_v8_path=… test
error: undefined symbol: _abort
    note: referenced by build_zcu.o:_posix.abort
error: undefined symbol: _fcopyfile
    note: referenced by build_zcu.o:_fs.File.Writer.sendFile
error: undefined symbol: __availability_version_check
    note: referenced by libcompiler_rt.a(libcompiler_rt_zcu.o):___isPlatformVersionAtLeast
… 25+ more …

The failure is in the build runner itself (build_zcu.o), so it precedes anything build.zig could detect and can't be fixed from inside build.zig.

What this PR changes

  • scripts/darwin-sdk-shim.sh (new, +198) — assembles a hybrid macOS SDK shim under .lp-cache/darwin-sdk-shim/:
    • sdk/usr/include, sdk/usr/share, sdk/Library, sdk/System/iOSSupport symlinked from the system SDK (headers don't need patching, and mirroring the system layout keeps clang etc. working)
    • sdk/usr/lib/libSystem.tbd copied from Zig's bundled libc/darwin/libSystem.tbd (the one with arm64-macos exports)
    • every other .tbd under usr/lib/ and System/Library/(Private)Frameworks/ rewritten to also export arm64-macos. The patch is a single sed: append , arm64-macos to every occurrence of arm64e-macos. The .tbd format groups symbols by target list, so the linker resolves arm64-macos lookups using the same symbol definitions Apple ships for arm64e-macos.
    • bin/xcrun — wrapper script that returns the shim path for xcrun --show-sdk-path / --show-sdk-version when --sdk is macosx (or unspecified). Everything else is exec /usr/bin/xcrun "$@" so non-Zig tooling (xcrun clang, xcrun --find swiftc, …) keeps working untouched. The shim path is read from LIGHTPANDA_DARWIN_SDK_SHIM at runtime, not baked into the script, so moving .lp-cache/ doesn't break the wrapper.
  • Makefile — detects when the system libSystem.tbd's first targets: block lacks arm64-macos. When the gap is present, test, build-dev, and build-v8-snapshot depend on a marker file that runs the shim script once; PATH and LIGHTPANDA_DARWIN_SDK_SHIM are exported so the zig invocation inside each recipe resolves the shim SDK. New make darwin-sdk-shim and make darwin-sdk-shim-clean targets expose the cache explicitly. make clean also wipes the shim.
  • CONTRIBUTING.md — new "macOS 26+ (Apple silicon)" section explaining what the shim does, how to rebuild after Xcode CommandLineTools / Zig version bumps, and how to export the env vars for bare zig build invocations outside make.
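
The .tbd rewrite described in the list above (appending , arm64-macos to every arm64e-macos) can be sketched as follows. This is a minimal illustration and not the contents of scripts/darwin-sdk-shim.sh: the function name patch_tbd and the sample input are hypothetical, and the input is a made-up excerpt in the style of a .tbd target list.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): add arm64-macos next to every
# arm64e-macos target. Reads a .tbd stub on stdin, writes the patched
# stub to stdout.
patch_tbd() {
    sed 's/arm64e-macos/arm64e-macos, arm64-macos/g'
}

# Made-up two-line excerpt in the .tbd target-list style.
printf '%s\n' \
    'targets: [ x86_64-macos, arm64e-macos ]' \
    '  - targets: [ x86_64-macos, arm64e-macos ]' | patch_tbd
# prints:
# targets: [ x86_64-macos, arm64e-macos, arm64-macos ]
#   - targets: [ x86_64-macos, arm64e-macos, arm64-macos ]
```

The substitution is not idempotent: a second pass over an already patched file would append a duplicate target, so the sketch assumes each run starts from an unpatched copy of the system stub.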

Hosts that don't need the shim — macOS 15 and earlier, Linux, any future Apple SDK that puts arm64-macos back — skip the entire ifeq ($(NEEDS_DARWIN_SDK_SHIM), yes) block. The Makefile and contributor workflow are identical to before for them.

Notable design choices

Why a Makefile-level workaround rather than something in build.zig. The link failure is in build_zcu.o — the compiled build.zig itself, before any project logic runs. There's no Zig-level way to intercept that compilation. The same goes for zig build --sysroot … and --libc <paths-file>: both flags are parsed by the build runner after it links, so they only affect child zig invocations, not the build runner's own linking. The fix has to run before zig build, which means either make, a wrapper script, or a contributor-managed environment.

Why patch every .tbd instead of only libSystem.tbd. The first link failure is libSystem, but as soon as that's resolved the project pulls in CoreFoundation, SystemConfiguration, Foundation, and a few usr/lib/system/*.tbd siblings. Their .tbd files have the same arm64-macos gap. Patching them all once at shim-build time takes ~5 s on an M3, versus discovering each new framework gap one CI failure at a time.

Why [^e]arm64-macos in the detection awk. arm64-macos is a substring of arm64e-macos, so a naive grep "arm64-macos" always matches. The Makefile reads only the FIRST targets: block in libSystem.tbd — that's the main libSystem.B.dylib entry the linker resolves -lSystem against. Sub-library entries lower in the file may still list arm64-macos even when the main one doesn't, so an unfiltered scan reports "no shim needed" when in fact the link will fail.
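
A first-block-only check along these lines could look like the sketch below. It is an illustration under stated assumptions, not the Makefile's actual awk: the function name first_block_has_arm64_macos and the sample stub are hypothetical, the pattern is the one quoted in the heading above, and the sketch assumes the main entry starts with an unindented targets: line whose list ends at the first ].

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): exit 0 when the FIRST targets:
# block of a .tbd stub lists arm64-macos, exit 1 otherwise.
first_block_has_arm64_macos() {
    awk '
        /^targets:/ { in_block = 1 }      # first unindented targets: line
        in_block {
            if ($0 ~ /[^e]arm64-macos/) found = 1
            if ($0 ~ /\]/) exit           # list closed, stop reading
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Made-up stub: the main entry lacks arm64-macos, a sub-entry still has it.
stub=$(mktemp)
cat > "$stub" <<'EOF'
--- !tapi-tbd
targets: [ x86_64-macos, arm64e-macos ]
install-name: '/usr/lib/libSystem.B.dylib'
reexported-libraries:
  - targets: [ x86_64-macos, arm64-macos, arm64e-macos ]
EOF
if first_block_has_arm64_macos "$stub"; then
    echo "no shim needed"
else
    echo "shim needed"
fi
rm -f "$stub"
# prints: shim needed
```

Scanning the whole file instead would find arm64-macos in the sub-entry and report that no shim is needed, which is the false negative the paragraph above describes.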

Why LIGHTPANDA_DARWIN_SDK_SHIM instead of baking the path into the wrapper. Keeps the generated wrapper portable across repo locations and developers. The wrapper is regenerated whenever the script changes (Make tracks it as a dep on the marker), so any wrapper protocol change ships atomically.
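
The dispatch such a wrapper performs might be sketched like this. It is not the generated bin/xcrun: the function name xcrun_wrapper is hypothetical, the sketch assumes LIGHTPANDA_DARWIN_SDK_SHIM holds the exact path to print, and it handles only --show-sdk-path. The --show-sdk-version case is left out because the description does not say what value the real wrapper returns for it.

```shell
#!/bin/sh
# Hypothetical sketch of the wrapper's dispatch (not the PR's script).
# Assumption: LIGHTPANDA_DARWIN_SDK_SHIM contains the SDK path to report.
xcrun_wrapper() {
    sdk=macosx          # an unspecified --sdk is treated as macosx
    want_path=no
    prev=
    for arg in "$@"; do
        # The argument that follows --sdk names the requested SDK.
        if [ "$prev" = "--sdk" ]; then sdk=$arg; fi
        if [ "$arg" = "--show-sdk-path" ]; then want_path=yes; fi
        prev=$arg
    done

    if [ "$want_path" = yes ] && [ "$sdk" = macosx ] \
        && [ -n "${LIGHTPANDA_DARWIN_SDK_SHIM:-}" ]; then
        # Intercepted query: answer with the path taken from the environment.
        printf '%s\n' "$LIGHTPANDA_DARWIN_SDK_SHIM"
        return 0
    fi

    # Everything else goes to the real tool unchanged. A standalone
    # wrapper script would use exec here, as the PR describes.
    /usr/bin/xcrun "$@"
}
```

Because the path comes from the environment at call time, the same wrapper text works from any checkout location, which is the portability argument made above.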

Why this isn't a Zig version bump. Zig 0.16.0 ships SDK 26.4 in its bundled libc and would fix this natively. But it breaks build.zig (std.fs.File.stderr() was renamed, process.SpawnOptions.StdIo.Ignore removed) and probably the rest of the project as well. Bumping the toolchain is its own conversation; this PR keeps the project on Zig 0.15.2 and unblocks the macOS 26 host.

Test plan

  • On macOS 26.5 + Zig 0.15.2 (host that previously failed):
    • make darwin-sdk-shim builds the shim in ~5 s, prints darwin-sdk-shim: 5534 .tbd files patched at …/sdk.
    • make darwin-sdk-shim-clean removes it; rerun of make darwin-sdk-shim rebuilds from scratch.
    • make build-dev produces a working arm64 Mach-O zig-out/bin/lightpanda. Verified lightpanda version prints the expected build hash.
    • zig build $V8 test with PATH and LIGHTPANDA_DARWIN_SDK_SHIM exported runs the full suite — 693/693 tests pass.
    • make clean wipes the shim cache.
  • On hosts that don't need the shim (verified by inspection — NEEDS_DARWIN_SDK_SHIM evaluates to empty when libSystem.tbd's first targets: block already lists arm64-macos), the entire shim block is skipped; test/build-dev recipes are unchanged.
  • shellcheck scripts/darwin-sdk-shim.sh clean.

Caveats

  • The shim is host-specific (every symlink is absolute) and not portable across machines. It's deliberately built in .lp-cache/ (already in .gitignore) rather than committed.
  • Bare zig build outside make doesn't pick up PATH / env automatically. Documented in CONTRIBUTING.md with the manual exports.
  • The Makefile's pre-existing script -q /dev/null … wrapper around zig build test interferes with cargo's jobserver for the html5ever crate — independent of this PR, present before and after. Workaround: zig build $V8 test directly (matches AGENTS.md). I haven't touched that wrapper here.
  • Re-running the shim build is manual (make darwin-sdk-shim-clean && make …). Xcode CommandLineTools updates and Zig version bumps both invalidate the shim contents, but neither is detected automatically; CONTRIBUTING.md spells this out.
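
For the bare zig build caveat above, the manual exports might look like the following sketch. The authoritative commands are the ones in CONTRIBUTING.md; here the function name use_darwin_sdk_shim is hypothetical, and pointing LIGHTPANDA_DARWIN_SDK_SHIM at the sdk/ subdirectory is an assumption based on the cache layout described in this PR.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): export the two variables the
# Makefile would otherwise set, for use with a bare `zig build`.
use_darwin_sdk_shim() {
    # Default location assumes the current directory is the repo root.
    shim_root=${1:-$PWD/.lp-cache/darwin-sdk-shim}

    # Assumption: the variable points at the shim SDK directory itself.
    LIGHTPANDA_DARWIN_SDK_SHIM=$shim_root/sdk

    # Put the wrapper directory first so its xcrun shadows /usr/bin/xcrun.
    PATH=$shim_root/bin:$PATH

    export LIGHTPANDA_DARWIN_SDK_SHIM PATH
}
```

After calling use_darwin_sdk_shim from the repository root in the current shell, an invocation such as zig build $V8 test should resolve the shim SDK, provided the shim has already been built with make darwin-sdk-shim.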

macOS 26's CommandLineTools SDK ships .tbd library stubs that export
arm64e-macos only — arm64-macos was dropped. Zig 0.15.x bundles a
libSystem.tbd that still has arm64-macos but is pinned to macOS 15.5;
its auto-detection picks the higher-numbered system SDK and the arm64
link fails with ~30 undefined libSystem / CoreFoundation /
SystemConfiguration symbols.

scripts/darwin-sdk-shim.sh assembles a hybrid SDK at
.lp-cache/darwin-sdk-shim/sdk/: usr/include is symlinked from the
system SDK, libSystem.tbd is swapped for Zig's bundled copy, every
other .tbd is rewritten to also export arm64-macos. The script also
emits an xcrun wrapper at .lp-cache/darwin-sdk-shim/bin/xcrun that
returns the shim path for --show-sdk-path queries and passes
everything else through to /usr/bin/xcrun.

The Makefile detects when the system libSystem.tbd lacks arm64-macos
(via awk on the first targets: block) and makes test / build-dev /
build-v8-snapshot depend on a marker target that runs the shim
script once. PATH and LIGHTPANDA_DARWIN_SDK_SHIM are exported so
nested zig invocations resolve the shim SDK. Hosts that don't need
the shim — older macOS, Linux, any future Apple SDK that puts
arm64-macos back — skip the entire block; the Makefile is identical
to before for them.

ZIGFLAGS="$V8" make darwin-sdk-shim rebuilds the shim explicitly;
ZIGFLAGS="$V8" make darwin-sdk-shim-clean removes it;
ZIGFLAGS="$V8" make clean also wipes it.
CONTRIBUTING.md documents the cache directory and the manual env
export needed for bare zig build $V8 invocations.
@karlseguin
Collaborator

My preference is to wait until we upgrade to Zig 0.16 and ask people to install Xcode 26.3 in the meantime. (I realize that's a big ask, but I feel less guilty about asking that than about running a bash script that modifies their system in a way I don't really understand and that doesn't seem trivial.)

@navidemad
Contributor Author

navidemad commented May 20, 2026

The main hesitation seems to be the script modifying contributors' systems — and it doesn't.

Everything lands under .lp-cache/ (already gitignored): symlinks to the system SDK plus patched .tbd copies.
The xcrun wrapper is only on PATH inside the make recipe, never installed globally.
Hosts that don't need it (macOS < 26, Linux) skip the whole block, so nothing changes for them at all.

On macOS 26 it's a one-time ~5 s build into the cache, and make clean wipes it.

Given that nothing outside the repo is touched, would you reconsider or would you still rather wait for Zig 0.16?

If you'd rather wait, do you have a rough ETA for the 0.16 bump?
That'd just tell me whether to keep the shim local on my end for a while.

Address review concern about the script modifying the contributor's
system. The shim only reads the system SDK (via symlinks) and writes
patched .tbd copies into .lp-cache/ (gitignored); nothing global is
touched and it is fully reverted by ZIGFLAGS= make clean. State this explicitly
in CONTRIBUTING.md (a [!NOTE] callout) and at the top of the script
header instead of leaving it implicit.
@navidemad
Contributor Author

I also baked that reassurance into the diff rather than leaving it in a comment (4c56ece): a > [!NOTE] callout in CONTRIBUTING.md and a Scope / safety stanza at the top of scripts/darwin-sdk-shim.sh, both stating plainly that it only reads the system SDK and writes solely into .lp-cache/ — no sudo, no global install, fully reverted by make clean.

State explicitly that the shim becomes unnecessary once the project
upgrades to Zig 0.16+ (whose bundled libc ships the macOS 26 SDK and
links arm64-macos natively), and that it should be removed deliberately
at that point — it does not self-disable, since detection keys off the
system SDK, which still lacks arm64-macos on macOS 26.
@navidemad
Contributor Author

On the 0.16 question specifically (f743ea3): I've documented the shim as an explicit, temporary stopgap. Both CONTRIBUTING.md and the script header now say to delete it — script, Makefile block, and docs — once the project moves to Zig 0.16+, since that's the point it stops being needed. So even if you'd rather wait, this won't quietly become permanent: its removal trigger is written down. (One caveat I called out honestly: it won't self-disable on the bump, because detection keys off the system SDK, which still lacks arm64-macos on macOS 26 — so it's a deliberate one-line cleanup at upgrade time.)

2 participants