Commit caae9bf

Refresh README and docs for recent changes
README
- List the 213 syscalls, guest-internal FUSE, sysv-ipc, GDB stub,
  MAP_SHARED overlay, APFS fclonefileat CoW, case-fold sidecar).
- Tighten Limitations

usage.md
- Document --no-rosetta, --create-sysroot, and the --gdb rejection for
  x86_64 guests.
- Add worked examples (interactive bash via sysroot, jq against a host
  JSON file, sqlite3 against a host db, x86_64 binary).

testing.md
- Expand the make check description to its real stages (coverage check,
  TLBI encoder, proctitle, busybox, sysroot-procfs, FUSE, timeout=0,
  rosetta-cli, hot-syscall guardrail).

internals.md
- Add Lifecycle In Five Steps (load -> boot -> run -> translate ->
  return) so readers get a mental model before the deep dives.
- Rewrite MAP_SHARED notes. Previous text claimed MAP_SHARED is treated
  as MAP_PRIVATE; src/syscall/mem.c installs real host
  MAP_FIXED|MAP_SHARED overlays for aligned file-backed mappings and
  promotes MAP_SHARED|MAP_ANONYMOUS to memfd-style overlays across fork.
- Extend the TLBI wire encoding with X8 == 4 / TLBI_RANGE_LARGE
  (single-shot TLBI RVAE1IS for 17..64 pages via FEAT_TLBIRANGE) and the
  X11 icache-flush hint (set on W^X transitions to executable). Same
  updates in the HVC #5 row of the protocol table.
- Rewrite the X8 == 2 description as the generic drop-saved-frame
  marker. signal_deliver() also sets X8 = 2 on the syscall-return path
  (src/syscall/signal.c), not just execve / rt_sigreturn.
- Correct the stack-alignment arithmetic against src/core/stack.c. The
  35 covers 30 base auxv words + AT_NULL + envp/argv nulls + argc; extra
  starts at 4 for the always-present AT_EXECFN + AT_BASE.
- Document the x86_64 path including both static and dynamic via
  --sysroot.
- Add a Guest-Internal FUSE section. Source map gains rosetta.c,
  sysroot.c, path.c, sidecar.c, fuse.c, inotify.c, sysvipc.c,
  shim-globals.c, vdso.c, proctitle.c, plus the split net files.
- Normalize memory-layout hex padding to 8-digit for within-32-bit
  values and 12-digit for above-32-bit values.
1 parent 5fcc9c0 commit caae9bf
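
Reviewer note on the stack-alignment item in the commit message: the figure of 35 can be checked by hand. The sketch below is not taken from `src/core/stack.c`. All identifier names are hypothetical, and it assumes the `AT_NULL` terminator is counted as a two-word key/value pair like every other auxv entry. It also assumes the count is used to decide whether one padding word is needed so the initial stack pointer stays 16-byte aligned, which the AArch64 ABI requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the real constants in src/core/stack.c may differ. */
enum {
    BASE_AUXV_WORDS  = 30, /* 15 key/value pairs, per the commit message */
    AT_NULL_WORDS    = 2,  /* assumption: terminator is a key/value pair */
    ENVP_NULL_WORDS  = 1,  /* NULL that ends envp[] */
    ARGV_NULL_WORDS  = 1,  /* NULL that ends argv[] */
    ARGC_WORDS       = 1,  /* argc itself */
    FIXED_WORDS      = BASE_AUXV_WORDS + AT_NULL_WORDS + ENVP_NULL_WORDS +
                       ARGV_NULL_WORDS + ARGC_WORDS,
    EXTRA_AUXV_WORDS = 4   /* AT_EXECFN + AT_BASE, two words each */
};

/* 30 + 2 + 1 + 1 + 1 = 35, matching the commit message. */
static_assert(FIXED_WORDS == 35, "fixed word count should be 35");

/* Total 8-byte words for the initial stack image, excluding string data. */
static size_t initial_stack_words(size_t argc, size_t envc)
{
    return (size_t)FIXED_WORDS + (size_t)EXTRA_AUXV_WORDS + argc + envc;
}

/* One padding word is needed when the word count is odd, because
 * 8 bytes * odd count is not a multiple of 16. */
static size_t alignment_pad_words(size_t argc, size_t envc)
{
    return initial_stack_words(argc, envc) % 2;
}
```

Under these assumptions, a guest started with one argument and an empty environment needs 40 words and no padding, while two arguments need 41 words plus one padding word.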

6 files changed

Lines changed: 523 additions & 160 deletions

README.md

Lines changed: 87 additions & 44 deletions
@@ -1,24 +1,49 @@
 # elfuse

-`elfuse` runs aarch64-linux ELF binaries on macOS Apple Silicon through
-Apple's Hypervisor.framework. It is a process-scoped Linux user-space runtime:
-guest code executes on the CPU inside a lightweight VM, while Linux syscalls
-are intercepted and translated to macOS behavior in host-side handlers.
-
-This is not a container engine and not a general-purpose Linux kernel. It is a
-focused compatibility layer for running Linux user-space workloads directly from
-the macOS shell, with support for static binaries, dynamic loaders via
-`--sysroot`, guest threads, process management, signals, `/proc` emulation, and
-guest debugging through a built-in GDB RSP stub.
-
-## Highlights
-
+Run Linux ELF binaries directly from the macOS shell -- no Docker, no
+full VM image, no daemon. `elfuse` is a process-scoped Linux user-space
+runtime: each guest runs inside a lightweight Hypervisor.framework VM
+owned by the `elfuse` process itself, and Linux syscalls are translated
+to macOS behavior in host-side handlers rather than served by a real
+Linux kernel.
+
+Native aarch64-linux executes directly on the CPU. x86_64-linux
+executes through Apple's embedded Rosetta translator hosted inside the
+same VM; the architecture is auto-detected from the ELF header. Both
+static and dynamically linked guests are supported, with the dynamic
+linker resolved against an external sysroot via `--sysroot`.
+
+## Features
+
+- Single native macOS binary (~560 KiB signed), no daemon and no disk
+image
+- Millisecond-scale VM startup; per-syscall overhead is microseconds
 - Native Apple Silicon execution through Hypervisor.framework
 - Static and dynamically linked `aarch64-linux` ELF binaries
-- Linux-style processes, threads, signals, timers, futexes, and polling
+- Static and dynamically linked `x86_64-linux` ELF binaries via Apple
+Rosetta (auto-detected from the ELF header, opt out with
+`--no-rosetta`)
+- Linux-style processes, threads (1:1 with HVF vCPUs, up to 64),
+signals, timers, futexes (incl. PI ops), and polling
+- Guest reads and writes the macOS filesystem directly; no overlay or
+volume mount layer
 - Synthetic `/proc` and selected `/dev` emulation for user-space probes
+- Guest-internal FUSE: `/dev/fuse` and `mount("fuse")` work without
+macFUSE / FUSE-T / FSKit
 - Built-in GDB Remote Serial Protocol stub usable from `gdb` or `lldb`
 - Self-contained test matrix that cross-checks elfuse against QEMU
+and exercises a separate Rosetta acceptance suite
+
+## Positioning
+
+`elfuse` is intentionally narrow. It runs single Linux binaries (and
+their `fork`/`exec` children) with minimal overhead; it does not host a
+Linux kernel, namespaces, cgroups, or kernel modules. For workloads
+that need full kernel features, container orchestration, or systemd,
+prefer a full VM tool (Lima, UTM, OrbStack) or Docker Desktop. For
+single-binary tooling, language runtimes, test harnesses, and
+debugger-driven workflows, `elfuse` removes the disk-image and
+boot-time overhead those tools impose.

 ## Requirements

@@ -48,33 +73,46 @@ make elfuse
 make test-busybox
 build/elfuse build/busybox
 ```
-Replace `build/busybox` with Arm64/Linux executable files.
+Replace `build/busybox` with an aarch64-linux or x86_64-linux executable.
+The guest architecture is auto-detected from the ELF header.

 For dynamically linked guests:

 ```sh
 build/elfuse --sysroot /path/to/sysroot ./path/to/program
 ```

+For x86_64-linux guests, Rosetta is on by default. To disable:
+
+```sh
+build/elfuse --no-rosetta ./path/to/aarch64-only-binary
+```
+
 For early debugging:

 ```sh
 build/elfuse --gdb 1234 --gdb-stop-on-entry ./path/to/program
 ```

+`--gdb` is rejected for x86_64 guests because the stub serves the
+aarch64 view Rosetta produces, not the original x86_64 architectural
+state.
+
 The build signs `build/elfuse` before use. Override the signing identity with
 `SIGN_IDENTITY="Developer ID ..."` when needed.

 ## Documentation

-- [docs/usage.md](docs/usage.md): command-line options, dynamic linking via
-`--sysroot`, and attaching `gdb` / `lldb` to the built-in stub.
-- [docs/testing.md](docs/testing.md): build prerequisites, the `make check`
-flow, the QEMU cross-check matrix, and fixture handling.
-- [docs/internals.md](docs/internals.md): canonical technical reference --
-HVF constraints, EL1 shim and HVC protocol, page-table splitting, syscall
-translation tables, threads/futex, fork/clone IPC, signals, ptrace, and
-the GDB stub.
+- [docs/usage.md](docs/usage.md): command-line options, x86_64 via
+Rosetta, dynamic linking via `--sysroot`, and attaching `gdb` /
+`lldb` to the built-in stub.
+- [docs/testing.md](docs/testing.md): build prerequisites, the
+`make check` flow, the QEMU and Rosetta cross-check matrices, and
+fixture handling.
+- [docs/internals.md](docs/internals.md): canonical technical
+reference -- runtime lifecycle, HVF constraints, EL1 shim and HVC
+protocol, page-table splitting, syscall translation tables, threads
+/ futex, fork / clone IPC, signals, ptrace, and the GDB stub.

 ## Build And Validation

@@ -90,30 +128,35 @@ make lint # clang-tidy

 `make check` is the recommended pre-commit gate. `make test-matrix` is the
 recommended gate for changes touching procfs, dynamic linking, networking,
-or process semantics. See [docs/testing.md](docs/testing.md) for the full
+or process semantics. `make test-rosetta-all` covers the x86_64 acceptance
+suites in isolation. See [docs/testing.md](docs/testing.md) for the full
 target list, fixture flow, and validation-by-change-type guidance.

-## Scope And Limitations
-
-`elfuse` targets pragmatic Linux user-space compatibility. Supported areas
-include ELF and dynamic-loader bootstrap, sysroot-aware path translation,
-Linux-style FD semantics, `fork` / `clone` / `execve` / `wait*` / ptrace,
-signals and timers, polling families (`epoll`, `eventfd`, `signalfd`,
-`timerfd`, `inotify`), sockets and netlink, and synthetic `/proc`, `/dev`,
-and `/proc/net/*` views sufficient for tools such as BusyBox `ps`, `uptime`,
-and `top`.
-
-Boundaries to be aware of:
-
-- The target is Linux user-space ABI compatibility, not kernel
-virtualization. `/proc`, `/dev`, and mount data are compatibility views.
-- HVF allows one VM per host process, so Linux-style `fork` is implemented
-via `posix_spawn` plus state transfer (a fast CoW path is used when
-available -- see [docs/internals.md](docs/internals.md)).
-- `MAP_SHARED` is treated as `MAP_PRIVATE`; this matches single-process
-guest semantics and unblocks tools that expect file-backed mappings.
-- Unsupported syscalls return Linux-style errors rather than silently
+## Limitations
+
+`elfuse` runs single Linux user-space processes (and their `fork` /
+`exec` children). It is not a Linux kernel.
+That framing shapes both what it does and what it explicitly will not
+do.
+- Linux kernel features that have no user-space-syscall analog:
+namespaces, cgroups, kernel modules, eBPF, `io_uring`, KVM, perf
+events.
+- Intel Macs. Apple Silicon only (M1 and later).
+- Hosting a VM from inside a guest. The guest cannot use HVF or KVM.
+- One guest process tree per `elfuse` host process. HVF allows one VM
+per host process; Linux-style `fork` is implemented by
+`posix_spawn`-ing a fresh `elfuse` host process and transferring
+state (see [docs/internals.md](docs/internals.md)).
+- Up to 64 concurrent guest threads per VM (`MAX_THREADS = 64`).
+- Around 213 syscalls implemented; anything outside
+`src/syscall/dispatch.tbl` returns `-ENOSYS` rather than silently
 succeeding.
+- `FUTEX_LOCK_PI` and friends behave as plain mutex acquire / release;
+true priority-inheritance scheduling is not modeled.
+- `sched_setaffinity` is honored as a no-op (returns the all-CPUs
+mask); the host scheduler picks the actual CPU.
+- `/proc`, `/dev`, and mount data are synthetic compatibility views,
+not host pass-throughs.

 ## License

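
Reviewer note on "auto-detected from the ELF header" in the README diff above: the diff does not show how `elfuse` performs the check, so the following is only a sketch of how such detection is commonly done. It reads the `e_machine` field of a 64-bit little-endian ELF header and compares it with the standard `EM_AARCH64` (183) and `EM_X86_64` (62) values. The `guest_arch` enum and both function names are hypothetical and do not come from the elfuse source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type; not an elfuse identifier. */
enum guest_arch { GUEST_UNKNOWN, GUEST_AARCH64, GUEST_X86_64 };

/* Classify a guest from the first bytes of its ELF file.
 * Layout per the ELF specification: e_ident occupies bytes 0..15,
 * e_type bytes 16..17, e_machine bytes 18..19. */
static enum guest_arch detect_guest_arch(const uint8_t *hdr, size_t len)
{
    if (len < 20)
        return GUEST_UNKNOWN;
    /* The literal is split so the hex escape does not swallow the 'E'. */
    if (memcmp(hdr, "\x7f" "ELF", 4) != 0)
        return GUEST_UNKNOWN;
    if (hdr[4] != 2 /* ELFCLASS64 */ || hdr[5] != 1 /* ELFDATA2LSB */)
        return GUEST_UNKNOWN;

    uint16_t machine = (uint16_t)(hdr[18] | (hdr[19] << 8));
    switch (machine) {
    case 183: return GUEST_AARCH64; /* EM_AARCH64 */
    case 62:  return GUEST_X86_64;  /* EM_X86_64 */
    default:  return GUEST_UNKNOWN;
    }
}

/* Small self-check with hand-built headers. */
static void detect_guest_arch_selftest(void)
{
    uint8_t hdr[20] = { 0x7f, 'E', 'L', 'F', 2, 1 };

    hdr[18] = 183;
    assert(detect_guest_arch(hdr, sizeof hdr) == GUEST_AARCH64);
    hdr[18] = 62;
    assert(detect_guest_arch(hdr, sizeof hdr) == GUEST_X86_64);
    hdr[0] = 0;
    assert(detect_guest_arch(hdr, sizeof hdr) == GUEST_UNKNOWN);
}
```

A launcher built along these lines would route `GUEST_X86_64` to the Rosetta path unless `--no-rosetta` was given, and would reject `GUEST_UNKNOWN` before creating the VM.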