| 1 | +# Copy Fail: AF_ALG + splice page-cache overwrite (CVE-2026-31431) |
| 2 | + |
| 3 | +{{#include ../../../banners/hacktricks-training.md}} |
| 4 | + |
| 5 | +This page documents **Copy Fail**: a Linux kernel local privilege escalation where **`AF_ALG` + `splice()`** turns **readable file page-cache pages** into part of a **writable AEAD destination scatterlist**, and `authencesn` then performs a **deterministic 4-byte write past the contractual output boundary**. |
| 6 | + |
| 7 | +- Affected component: `crypto/algif_aead.c` in-place decrypt path + `crypto/authencesn.c` |
| 8 | +- Primitive: controlled **4-byte page-cache write** into any file readable by the attacker |
| 9 | +- Reachability: unprivileged local user, `AF_ALG` available, `algif_aead` loaded |
| 10 | +- Impact: immediate system-wide corruption of the page-cache copy used by `read()`, `mmap()`, and `execve()` |
| 11 | + |
| 12 | +This behaves like **Dirty Pipe / Dirty COW style page-cache abuse** (a write into a file the attacker can only read) rather than a classic memory-corruption race: |
| 13 | + |
| 14 | +- no race window |
| 15 | +- no repeated retries |
| 16 | +- no on-disk file modification |
| 17 | +- same exploit flow across many distros because the primitive is structural, not offset-dependent |
| 18 | + |
| 19 | +## Core idea |
| 20 | + |
| 21 | +`splice()` moves data between a file descriptor and a pipe **by reference** instead of copying it. If a readable file is spliced into a pipe and then from the pipe into an `AF_ALG` AEAD socket, the crypto input scatterlist can reference the **same page-cache pages** backing that file. |
| 22 | + |
| 23 | +For AEAD decrypt, `algif_aead` historically optimized the request into an **in-place** layout: |
| 24 | + |
| 25 | +- **AAD** and **ciphertext** were copied into the user RX buffer |
| 26 | +- the final **authentication tag** was **not copied** |
| 27 | +- instead, tag scatterlist entries were appended to the destination with `sg_chain()` |
| 28 | +- `req->src = req->dst`, so those appended tag pages became part of a **writable destination chain** |
| 29 | + |
| 30 | +If the tag pages come from spliced file data, the writable destination chain now includes **page-cache pages of a read-only file**. |
| 31 | + |
| 32 | +## The bug in `authencesn` |
| 33 | + |
| 34 | +`authencesn` is an AEAD wrapper used for IPsec Extended Sequence Numbers (ESN). During decrypt it uses the destination scatterlist as scratch space and writes **4 bytes past the legitimate decrypt output**: |
| 35 | + |
| 36 | +```c |
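| 37 | +scatterwalk_map_and_copy(tmp, dst, 0, 8, 0);                       /* read dst[0..7] (start of AAD) into tmp[0..1] */ |
| 38 | +scatterwalk_map_and_copy(tmp, dst, 4, 4, 1);                       /* write tmp[0] back at dst[4..7] */ |
| 39 | +scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1); /* write tmp[1] at dst[assoclen + cryptlen] */ |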
| 40 | +``` |
| 41 | + |
| 42 | +The last write stores **`seqno_lo`** (attacker-controlled AAD bytes `4..7`) at `dst[assoclen + cryptlen]`, which is **after the tag** and therefore **outside the contract for AEAD decrypt output**. |
| 43 | + |
| 44 | +When `algif_aead` has chained page-cache-backed tag pages into `dst`, that write crosses out of the RX buffer and lands in the victim file's page cache. |
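The effect of the three copies can be illustrated with a toy model. This is not kernel code: a flat `bytearray` stands in for the chained `dst` scatterlist, the name `esn_scratch_moves` and the sample values are hypothetical, and `assoclen` / `cryptlen` are used exactly as they appear in the snippet above:

```python
def esn_scratch_moves(dst: bytearray, assoclen: int, cryptlen: int) -> None:
    """Model the three scratch copies on a flat buffer standing in for dst."""
    end = assoclen + cryptlen
    assert end + 4 <= len(dst), "model buffer must cover the final write"
    tmp = bytes(dst[0:8])        # copy 1: read dst[0..7] into tmp
    dst[4:8] = tmp[0:4]          # copy 2: write tmp[0] at offset 4
    dst[end:end + 4] = tmp[4:8]  # copy 3: write tmp[1] at assoclen + cryptlen

aad = b"SPI_" + b"\xde\xad\xbe\xef"         # bytes 4..7 are the value that gets written
ciphertext = bytes(16)                      # placeholder ciphertext in the RX buffer
chained_pages = bytearray(b"FILEDATA" * 2)  # stands in for page-cache-backed tag pages
dst = bytearray(aad + ciphertext) + chained_pages

esn_scratch_moves(dst, assoclen=len(aad), cryptlen=len(ciphertext))
print(dst[24:28].hex())  # prints "deadbeef": AAD bytes 4..7 landed in the chained region
```

In the model the first four bytes of `chained_pages` are replaced by AAD bytes `4..7`, mirroring how the real write leaves the RX buffer and lands in whatever pages were chained behind it.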
| 45 | + |
| 46 | +## Why this becomes a useful primitive |
| 47 | + |
| 48 | +The attacker controls: |
| 49 | + |
| 50 | +- **Which file**: any file readable by the attacker |
| 51 | +- **Which offset**: via splice offset/length and AEAD `assoclen` |
| 52 | +- **Which value**: the 4 bytes written come from attacker-controlled AAD bytes `4..7` |
| 53 | + |
| 54 | +Even if authentication fails and `recvmsg()` returns an error, the **page-cache overwrite persists** because the scratch write already happened. |
| 55 | + |
| 56 | +The corrupted page is **not marked dirty for writeback**, so: |
| 57 | + |
| 58 | +- the on-disk file remains unchanged |
| 59 | +- checksum comparisons against the on-disk contents miss the attack |
| 60 | +- all later `read()`, `mmap()`, and `execve()` users consume the modified in-memory page |
| 61 | + |
| 62 | +## Typical LPE path |
| 63 | + |
| 64 | +The public write-up targets a **setuid-root binary** such as `/usr/bin/su`: |
| 65 | + |
| 66 | +1. Open `AF_ALG` and bind to `authencesn(hmac(sha256),cbc(aes))` |
| 67 | +2. Send AAD where bytes `4..7` contain the 4-byte chunk to write |
| 68 | +3. `splice()` target file data into the AEAD input so the final tag region references the target file's page-cache pages |
| 69 | +4. Trigger `recv()` / `recvmsg()` to force decrypt |
| 70 | +5. Repeat until the page-cache copy of the setuid binary is patched |
| 71 | +6. Execute the binary so the kernel loads the modified cached image and runs attacker code as root |
| 72 | + |
| 73 | +Conceptual PoC skeleton: |
| 74 | + |
| 75 | +```python |
| 76 | +a = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) |
| 77 | +a.bind(("aead", "authencesn(hmac(sha256),cbc(aes))")) |
| 78 | +# set key (omitted), then accept the request socket: u, _ = a.accept() |
| 79 | +u.sendmsg([b"A"*4 + payload_chunk], [cmsg_headers], socket.MSG_MORE)  # AAD bytes 4..7 = value to write |
| 80 | +os.splice(target_fd, pipe_wr, length, offset_src=offset)  # file -> pipe, by reference |
| 81 | +os.splice(pipe_rd, u.fileno(), length)  # pipe -> AEAD request socket |
| 82 | +u.recv(...) # triggers decrypt -> page-cache write |
| 83 | +``` |
| 84 | + |
| 85 | +## How the bug became exploitable |
| 86 | + |
| 87 | +- **2011**: `authencesn` introduced for IPsec ESN handling (`a5079d084f8b`) |
| 88 | +- **2015**: `authencesn` converted to the new AEAD interface and kept the out-of-contract scratch write (`104880a6b470`) |
| 89 | +- **2017**: `algif_aead` switched decrypt to an in-place design and chained tag pages into the destination (`72548b093ee3`) |
| 90 | + |
| 91 | +That 2017 change is what turned an internal scratch write into a **page-cache write primitive** reachable from unprivileged userspace. |
| 92 | + |
| 93 | +## Fix and mitigations |
| 94 | + |
| 95 | +Mainline fixed this by reverting `algif_aead` back to **out-of-place** operation (`a664bf3d603d`), so page-cache pages can remain in the source scatterlist but no longer become part of the writable destination chain. |
| 96 | + |
| 97 | +Useful mitigations: |
| 98 | + |
| 99 | +- patch to a kernel carrying `a664bf3d603d` or a distro backport |
| 100 | +- block `AF_ALG` socket creation with seccomp for untrusted workloads |
| 101 | +- disable `algif_aead` if you need an immediate stopgap |
| 102 | + |
| 103 | +Example emergency mitigation: |
| 104 | + |
| 105 | +```bash |
| 106 | +echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif-aead.conf |
| 107 | +rmmod algif_aead 2>/dev/null || true |
| 108 | +``` |
| 109 | + |
| 110 | +For containerized environments, `AF_ALG` should be treated as a **kernel attack surface**. Even without this specific CVE, it is a good candidate for seccomp denial in CI runners, sandboxes, and multi-tenant containers. |
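To check whether a given workload can still reach this surface, a small probe can be run inside the sandbox or container. This is a sketch under assumptions: the function name `af_alg_exposure` is hypothetical, creating the probe socket may itself autoload the base `af_alg` module, a seccomp filter that kills rather than returns an error will terminate the probe, and a built-in `algif_aead` may not show up in either location checked here:

```python
import os
import socket

def af_alg_exposure() -> dict:
    """Report whether this process can create AF_ALG sockets and whether algif_aead looks loaded."""
    can_create = False
    if hasattr(socket, "AF_ALG"):
        try:
            with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0):
                can_create = True
        except OSError:
            pass  # denied by seccomp/LSM policy or not supported by this kernel

    loaded = False
    try:
        with open("/proc/modules") as f:
            loaded = any(line.split()[0] == "algif_aead" for line in f if line.strip())
    except OSError:
        pass  # /proc not mounted or not readable in this sandbox

    return {
        "af_alg_socket": can_create,
        "algif_aead_module": loaded or os.path.isdir("/sys/module/algif_aead"),
    }

print(af_alg_exposure())
```

A result with `af_alg_socket` set to `False` inside the sandbox suggests the seccomp or policy denial is effective for that workload.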
| 111 | + |
| 112 | +## Detection / review notes |
| 113 | + |
| 114 | +- A page-cache-only patch means the suspicious effect may be visible only in memory, not on disk. |
| 115 | +- Look for unusual `AF_ALG` use on systems that do not intentionally expose kernel crypto sockets to workloads. |
| 116 | +- When auditing zero-copy kernel interfaces, treat any path that combines **`splice()`-backed page references** with **scatterlists reused as destinations** as high risk. |
| 117 | +- A useful reviewer rule is: if an algorithm writes beyond its documented output length, any caller that chains foreign pages into `dst` may turn it into a write primitive. |
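One way to look for a cache-only modification is to hash a file as currently cached, ask the kernel to drop its clean cached pages, and hash it again. This is a hedged sketch, not a reliable detector: `cached_vs_reread` is a hypothetical helper, `POSIX_FADV_DONTNEED` is only advisory (pages that are mapped or dirty may stay cached), and a successful drop also evicts the modified page, so record the first hash before relying on the second:

```python
import hashlib
import os

def _sha256_fd(fd: int) -> str:
    """Hash the whole file through the normal read() path (served from the page cache)."""
    h = hashlib.sha256()
    os.lseek(fd, 0, os.SEEK_SET)
    while chunk := os.read(fd, 1 << 20):
        h.update(chunk)
    return h.hexdigest()

def cached_vs_reread(path: str) -> tuple[str, str]:
    """Return (hash as cached, hash after asking the kernel to drop clean cached pages)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        before = _sha256_fd(fd)
        # Advisory request: clean, unmapped pages are usually dropped and reread from disk.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        after = _sha256_fd(fd)
        return before, after
    finally:
        os.close(fd)
```

For example, `before, after = cached_vs_reread("/usr/bin/su")`; a mismatch suggests the cached copy had diverged from the on-disk file, while a match proves little if the pages could not be dropped.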
| 118 | + |
| 119 | +## References |
| 120 | + |
| 121 | +- [Xint write-up: Copy Fail: 732 Bytes to Root on Every Major Linux Distributions](https://xint.io/blog/copy-fail-linux-distributions) |
| 122 | +- [Copy Fail advisory / mitigation page](https://copy.fail/) |
| 123 | +- [Linux fix: `crypto: algif_aead - Revert to operating out-of-place` (`a664bf3d603d`)](https://github.com/torvalds/linux/commit/a664bf3d603dc3bdcf9ae47cc21e0daec706d7a5) |
| 124 | +- [Linux commit: `crypto: algif_aead - copy AAD from src to dst` (`72548b093ee3`)](https://github.com/torvalds/linux/commit/72548b093ee3) |
| 125 | +- [Linux commit: `crypto: authencesn - Convert to new AEAD interface` (`104880a6b470`)](https://github.com/torvalds/linux/commit/104880a6b470) |
| 126 | +- [Linux commit: `crypto: authencesn - Add algorithm to handle IPsec extended sequence numbers` (`a5079d084f8b`)](https://github.com/torvalds/linux/commit/a5079d084f8b) |
| 127 | + |
| 128 | +{{#include ../../../banners/hacktricks-training.md}} |