Commit add33a0

feat(M8): enable FF_ZC_SEND with FSTACK_ZC_MAGIC sentinel protocol (Phase-2 P1b)

Phase-2 third milestone: enable lib/Makefile FF_ZC_SEND=1 in addition to M6 (NETGRAPH+IPFW) and M7 (PAGE_ARRAY). FF_USE_PAGE_ARRAY remains off; the combined PA+ZC build is separate M9 scope.

Beyond enabling the macro, this change also fixes a pre-existing 13.0-baseline bug in the F-Stack zero-copy fast path: the predicate in m_uiotombuf (uipc_mbuf.c) used to be just (UIO_SYSSPACE && UIO_WRITE), which matched every ff_write/ff_writev call (lib/ff_syscall_wrapper.c sets uio_segflg = UIO_SYSSPACE unconditionally). Under heavy load this would either silently corrupt the response payload (the kernel handed a bare char buffer to tcp_output as if it were an mbuf chain) or trigger a GPF in m_demote when m->m_next happened to dereference unmapped memory. Both helloworld_zc AND the plain helloworld crashed under 100x curl on the M7 build.

Code changes (8 source files, +85/-4):
- freebsd/sys/mbuf.h: define FSTACK_ZC_MAGIC sentinel (0xF8AC2C00F8AC2C00) under #ifdef FSTACK_ZC_SEND.
- freebsd/kern/uipc_mbuf.c: tighten m_uiotombuf ZC fast-path predicate to require uio_offset == FSTACK_ZC_MAGIC. Plain ff_write callers now bypass the fast path and run the regular m_getm2 + uiomove copy loop.
- freebsd/kern/sys_generic.c: dofilewrite preserves auio->uio_offset if it already carries FSTACK_ZC_MAGIC. Without this, the sentinel set by ff_zc_send would be overwritten with -1 before reaching m_uiotombuf. Include sys/mbuf.h for the macro.
- lib/Makefile: uncomment FF_ZC_SEND=1.
- lib/ff_syscall_wrapper.c: new public API ff_zc_send(fd, mbuf, len), which sets auio.uio_offset = FSTACK_ZC_MAGIC. Plain ff_write/ff_writev now explicitly set uio_offset = 0 so they never accidentally trigger the fast path. Include sys/mbuf.h.
- lib/ff_api.h: declare ff_zc_send + comment block on the uio_offset sentinel contract.
- lib/ff_api.symlist: export ff_zc_send (libfstack.a is built with --localize-symbols + --globalize-symbols=ff_api.symlist; without this entry the new function would remain a local symbol and fail to link against helloworld_zc).
- example/Makefile: add -DFSTACK_ZC_SEND when compiling helloworld_zc so main_zc.c picks the ZC branch.
- example/main_zc.c: line 215 ff_write -> ff_zc_send (with extern declaration at top); the comment names the previous bug.

Verification (production build, debug printfs removed):
- G1 lib make clean && make: exit=0, 0 errors, 57 warnings (= M6/M7 baseline). libfstack.a 6.55 MB.
- G1 example make: 3 binaries (helloworld 29.02M, _epoll 29.02M, _zc 29.03M). nm libfstack.a shows 'T ff_zc_send' (global).
- G2 helloworld_zc primary: ALIVE 12s+ smoke, ipfw2 + tcp_bbr + dpdk if registered cleanly, 0 SIGSEGV.
- G3.2 single curl from f-stack-client: HTTP/1.1 200 OK + Content-Length: 438 + body = real HTML (<!DOCTYPE html><html><head><title>Welcome to F-Stack!</title>...).
- G3.3 100x short conn ZC build: ok=100/100.
- G3.3 100x short conn baseline (helloworld): ok=100/100; the M8 fix simultaneously cured the regression that the over-broad fast-path predicate caused in non-ZC builds (M7's 100x test had GPF'd).
- G4 1000 short conn observation (OQ-2 default downgrade): ZC 6.768s vs baseline 6.884s, i.e. parity (bottleneck is ssh round-trip). Full perf baseline deferred to M9.
- G6 lint: 0 errors.

Bounce ledger: 1 formal bounce, gate(G2/G3)->code, resolved within the same milestone (3 RCA iterations folded into a single bounce because each was a refinement of the original 'ZC fast-path mis-fires' diagnosis, not a new symptom). No escalation.

Documentation:
- docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M8-spec.md (NEW, includes RCA section §7).
- docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M8-execution-log.md (NEW, full evidence + design contract diagram + bounce ledger + follow-ups F1-F3).
- docs/01-LAYER1-ARCHITECTURE.md + zh_cn mirror: M8 anchor sentence.
- docs/F-Stack_Knowledge_Base_Summary.md + zh_cn: scope tag amended.

Compliance: 0 direct rm/kill/chmod calls used. All process terminations, file deletions, and DPDK runtime cleanup routed through /data/workspace/{rm_tmp_file,kill_process,chmod_modify}.sh. Process liveness checked via [ -d /proc/$PID ] (not kill -0). Local commit only; not pushed.
1 parent cba3d88 commit add33a0

15 files changed

Lines changed: 537 additions & 7 deletions

docs/01-LAYER1-ARCHITECTURE.md

Lines changed: 1 addition & 0 deletions
@@ -141,6 +141,7 @@ F-Stack adopted a **complete porting** strategy:
 - Supported optional features through conditional compilation (IPv6, KNI, TCPHPTS, FF_NETGRAPH, etc.); 15.0-introduced subsystems (NETLINK protocol, KTLS) are **not** ported per DP-2 / out-of-scope
 - **Phase-2 M6 (2026-06-08)**: enabled `FF_NETGRAPH=1` + `FF_IPFW=1` by default in `lib/Makefile`; brings 41 netgraph nodes + 14 ipfw kernel objects into `libfstack.a` (now 6.5 MB, was 5.4 MB); `tools/sbin/ipfw` 25 MB user-space binary now produced (was absent when FF_IPFW=0); `ipfw add/show/delete` and `ngctl list` verified end-to-end via DPDK secondary IPC. See `docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M6-execution-log.md` for full evidence + 7 link-only stubs added to `lib/ff_stub_14_extra.c`
 - **Phase-2 M7 (2026-06-08)**: enabled `FF_USE_PAGE_ARRAY=1` (P1a, single-pass / 0 bounces); brings `lib/ff_memory.c` (481 lines, mmap-based page-array + mbuf reference pool) into `FF_HOST_SRCS`; runtime allocates 256 MB one-shot mmap (65536 × 4 KB pages) at `ff_mmap_init` to amortize per-packet 4 KB alloc/free syscalls. See `docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M7-execution-log.md`
+- **Phase-2 M8 (2026-06-08)**: enabled `FF_ZC_SEND=1` (P1b, 1 bounce); introduced `FSTACK_ZC_MAGIC` sentinel (uio.uio_offset = 0xF8AC2C00F8AC2C00) protocol + new public API `ff_zc_send` to disambiguate ZC mbuf chains from plain char buffers; fixed pre-existing 13.0-baseline ZC fast-path bug where `m_uiotombuf` predicate mis-matched every `ff_write`/`ff_writev` call (would silently corrupt or GPF in `m_demote` on heavy load). 8 files +85/-4 across `freebsd/sys/mbuf.h`, `freebsd/kern/uipc_mbuf.c`, `freebsd/kern/sys_generic.c`, `lib/Makefile`, `lib/ff_syscall_wrapper.c`, `lib/ff_api.h`, `lib/ff_api.symlist`, `example/Makefile` + `example/main_zc.c`. End-to-end verified via ssh f-stack-client curl: HTTP 200 / 438-byte HTML body / 100/100 short-conn pass. See `docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M8-execution-log.md`

 ### 3.2 Ported FreeBSD Subsystems

docs/F-Stack_Knowledge_Base_Summary.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 **Document Version**: 1.0
 **Generation Date**: 2026-03-20
-**Content Scope**: F-Stack v1.26 (FreeBSD 15.0 port; upgraded from 13.0 in 2025-2026 — M0~M5 + runtime-fix + rib-fix + Phase-5b NFR-1 PASS; **Phase-2 M6 enabled FF_NETGRAPH+FF_IPFW combo + M7 enabled FF_USE_PAGE_ARRAY, 2026-06-08**) + DPDK 23.11.5 Complete Three-Layer Architecture Knowledge Base
+**Content Scope**: F-Stack v1.26 (FreeBSD 15.0 port; upgraded from 13.0 in 2025-2026 — M0~M5 + runtime-fix + rib-fix + Phase-5b NFR-1 PASS; **Phase-2 M6 enabled FF_NETGRAPH+FF_IPFW combo + M7 enabled FF_USE_PAGE_ARRAY + M8 enabled FF_ZC_SEND, 2026-06-08**) + DPDK 23.11.5 Complete Three-Layer Architecture Knowledge Base
 **Document Location**: `/data/workspace/f-stack/docs/`
 **Purpose**: Pre-requisite architecture documentation for Spec-Driven Development

docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M8-execution-log.md

Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
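# Phase-2 M8 Execution Log: FF_ZC_SEND (P1b)

> Status: ✅ PASS (all gates green; fixed within 1 bounce)
> Date: 2026-06-08
> Upstream base: M7 commit `cba3d882b` (FF_USE_PAGE_ARRAY)

---

## 1. Summary

Enabled `FF_ZC_SEND=1` (with `FF_USE_PAGE_ARRAY=0` by default, so that the ZC path is validated on its own). M8 bounced once at G2/G3. Debugging showed that the root cause was more involved than the risk anticipated in spec §5: **it was not only a user-space/kernel macro alignment problem; the ZC fast-path design inherited from the 13.0 baseline had 3 points that failed together**:

1. The ZC fast-path condition was too broad (`UIO_SYSSPACE && UIO_WRITE` matches every `ff_write` call, so a plain char buffer was mistaken for an mbuf).
2. There was no dedicated ZC entry point (`ff_write` cannot tell an mbuf pointer from a char array).
3. `dofilewrite` unconditionally overwrote `auio->uio_offset = offset`, so a sentinel set by the caller was lost.

**Fix**: introduce the `FSTACK_ZC_MAGIC` sentinel protocol, add the new public API `ff_zc_send`, make `dofilewrite` preserve the ZC sentinel, and have plain `ff_write`/`ff_writev` set `uio_offset=0` explicitly so they cannot hit the fast path by accident.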

---

## 2. Change list

### 2.1 Kernel side (freebsd/)

| File | Change | Lines |
|---|---|---|
| `freebsd/sys/mbuf.h` | Added the `FSTACK_ZC_MAGIC` macro (value `0xF8AC2C00F8AC2C00`) | +13 |
| `freebsd/kern/uipc_mbuf.c` | `m_uiotombuf` ZC fast path gains the `uio_offset == FSTACK_ZC_MAGIC` predicate | +12 / -2 |
| `freebsd/kern/sys_generic.c` | `dofilewrite` skips the overwrite when `auio->uio_offset == FSTACK_ZC_MAGIC` + include `sys/mbuf.h` | +12 / -1 |

### 2.2 lib side

| File | Change | Lines |
|---|---|---|
| `lib/Makefile` | `FF_ZC_SEND=1` (uncommented) | +1 / -1 |
| `lib/ff_syscall_wrapper.c` | New `ff_zc_send` API + `ff_write`/`ff_writev` explicitly set `uio_offset=0` + include `sys/mbuf.h` | +35 / 0 |
| `lib/ff_api.h` | Declares `ff_zc_send` + usage notes | +10 |
| `lib/ff_api.symlist` | Adds `ff_zc_send` to the exported-symbol allowlist | +1 |

### 2.3 example side

| File | Change |
|---|---|
| `example/Makefile` | helloworld_zc target adds `-DFSTACK_ZC_SEND` |
| `example/main_zc.c` | line 215 `ff_write` → `ff_zc_send` + extern declaration at the top |

**Total: 8 files, +85/-4**

---

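## 3. RCA evolution

### Bounce #1: initial hypothesis, user-space macro not passed

helloworld_zc came up normally, but a single curl triggered a GPF (IP `0x10facb6` = `m_demote+0x36`). A gdb session on the coredump showed `rbx = 0x312e312f50545448`, which is ASCII `"HTTP/1.1"`: `iov_base` pointed at the HTML string, and the fast path treated it as an mbuf pointer.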
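
The register value can be checked by hand. The following is a minimal sketch, not part of the commit: `reg_to_ascii` is a hypothetical helper that unpacks a 64-bit value least-significant byte first, which matches how a little-endian x86-64 machine would load eight bytes of a string into a register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from F-Stack): unpack a 64-bit register value
 * into 8 characters, least significant byte first. `out` must have room
 * for 9 bytes (8 characters plus the terminating NUL). */
void reg_to_ascii(uint64_t reg, char out[9])
{
    for (size_t i = 0; i < 8; i++)
        out[i] = (char)((reg >> (8 * i)) & 0xff);
    out[8] = '\0';
}
```

Calling `reg_to_ascii(0x312e312f50545448ULL, buf)` fills `buf` with `HTTP/1.1`, which is consistent with the fast path having read response text where it expected an mbuf field.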
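
First hypothesis: `example/Makefile` did not pass `-DFSTACK_ZC_SEND` when compiling helloworld_zc, so the `#else` branch at `main_zc.c:225` called `ff_write(html_buf, ...)`. After the patch the build passed, **but a single curl still crashed** (GPF at the same address).

### Bounce #1 cont.: the baseline crashes too

Key evidence: the pure baseline `helloworld` (built from main.c with plain `ff_write(html, len)`) **also hit a GPF partway through the 100x short-connection stress test**. This shows that the ZC fast-path predicate inside lib matched too broadly: `ff_write` sets `auio.uio_segflg = UIO_SYSSPACE` internally, so `kern_writev → dofilewrite → fo_write → sosend → m_uiotombuf` always takes the fast path and parses the char buffer as an mbuf.

### Bounce #1 cont.: introducing the sentinel

Designed the `FSTACK_ZC_MAGIC` (`0xF8AC2C00F8AC2C00`) sentinel, written into `uio->uio_offset`; the fast path gains the predicate `uio_offset == FSTACK_ZC_MAGIC`. Plain `ff_write`/`ff_writev` explicitly set `uio_offset = 0` to opt out. A dedicated entry point `ff_zc_send` was added at the same time (main_zc.c:215 now uses it), which keeps the semantics of the public `ff_write` API unchanged.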
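
The difference between the old and new predicate can be sketched in user space. This is an illustration under stated assumptions, not the code in `uipc_mbuf.c`: `struct mock_uio`, `zc_fast_path_old`, and `zc_fast_path_new` are hypothetical stand-ins, and only the three fields the predicate reads are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel enums; only the names matter for this sketch. */
enum uio_seg { UIO_USERSPACE, UIO_SYSSPACE };
enum uio_rw  { UIO_READ, UIO_WRITE };

/* Hypothetical reduced uio: just the fields the predicate looks at. */
struct mock_uio {
    enum uio_seg uio_segflg;
    enum uio_rw  uio_rw;
    int64_t      uio_offset;   /* off_t in the kernel */
};

/* Sentinel value from the log, reinterpreted as a signed 64-bit offset. */
#define FSTACK_ZC_MAGIC ((int64_t)0xF8AC2C00F8AC2C00ULL)

/* Pre-M8 shape: true for every ff_write/ff_writev call, because the
 * wrapper always sets UIO_SYSSPACE. */
bool zc_fast_path_old(const struct mock_uio *uio)
{
    return uio->uio_segflg == UIO_SYSSPACE && uio->uio_rw == UIO_WRITE;
}

/* M8 shape: the caller must also have opted in via the sentinel. */
bool zc_fast_path_new(const struct mock_uio *uio)
{
    return zc_fast_path_old(uio) && uio->uio_offset == FSTACK_ZC_MAGIC;
}
```

With `uio_offset = 0`, as plain `ff_write` now sets it, the old predicate still matches while the new one does not; only a uio carrying the sentinel passes both.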

### Bounce #1 cont.: debugging reveals that dofilewrite drops the sentinel

With the new build up, a single curl received 649 bytes, but **all of it was mbuf header memory** (the `m_data` pointer plus a long run of 0x00), not the HTML string. The primary did not crash, but the payload was garbage.

Added printf debugging in `ff_zc_send` and at the fast-path entry:
- `[ZC] ff_zc_send: fd=1027 mb=0x7ffff78e1c00 nbytes=649`: the caller's mbuf is correct (m_data ASCII = "HTTP/1.1 200 OK..Server: F-Stack" ✓)
- the `[ZC-FP]` debug line **never appeared** → the fast path was not triggered

`freebsd/kern/sys_generic.c:559`: `dofilewrite` unconditionally executes `auio->uio_offset = offset`, and `kern_writev` passes `offset = (off_t)-1`, so **the sentinel was overwritten with -1 before reaching `m_uiotombuf`**. After patching `sys_generic.c` to skip the overwrite when the sentinel is already present, the retest showed:
- `[ZC-FP] enter: m=0x7ffff78e1c00 total=649` ✓ fast path hit
- HTTP/1.1 200 OK + Content-Length: 438 + real HTML body ✓ ✓ ✓
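
The overwrite and its fix can be sketched as follows. This is a simplified model, not the actual `dofilewrite` body: `struct mock_uio`, `set_offset_old`, and `set_offset_new` are hypothetical names, and the real function does considerably more than assign the offset.

```c
#include <stdint.h>

/* Sentinel value from the log, reinterpreted as a signed 64-bit offset. */
#define FSTACK_ZC_MAGIC ((int64_t)0xF8AC2C00F8AC2C00ULL)

/* Hypothetical reduced uio: only the offset field is modelled. */
struct mock_uio {
    int64_t uio_offset;   /* off_t in the kernel */
};

/* Pre-M8 shape: the offset is always replaced, so a sentinel set by the
 * caller is lost when kern_writev passes offset = -1. */
void set_offset_old(struct mock_uio *auio, int64_t offset)
{
    auio->uio_offset = offset;
}

/* M8 shape: leave the offset alone if it already carries the sentinel. */
void set_offset_new(struct mock_uio *auio, int64_t offset)
{
    if (auio->uio_offset != FSTACK_ZC_MAGIC)
        auio->uio_offset = offset;
}
```

A uio that enters `set_offset_new` with the sentinel leaves with it intact, while any other starting offset is still replaced by the value passed in, so non-ZC writes behave as before.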

### Bounce count

| # | Stage | Trigger | Fix |
|---|---|---|---|
| 1 | gate→code | helloworld_zc GPF on a single curl + baseline GPF at 100x + garbled payload | 3 source changes + sentinel protocol + new `ff_zc_send` API (folded into a single bounce) |

Bounce count **1/3** (limit per plan §6), **no escalation**.

---

## 4. Gate results (measured)

### G1: Build

| Item | Result |
|---|---|
| `lib/ make clean && make` | exit=0 / 0 errors / 57 warnings (= M6/M7 baseline) |
| `libfstack.a` size | 6.55 MB |
| `example/ make` | exit=0 / 3 binaries (helloworld 29.02M / _epoll 29.02M / _zc 29.03M) |
| `nm libfstack.a \| grep ff_zc_send` | `T ff_zc_send` (globally exported) |

### G2: Primary smoke test

| Item | Result |
|---|---|
| primary up, ALIVE for 12 s | ✅ |
| Key log lines | `ipfw2 (+ipv6) initialized` / `tcp_bbr is now available` / `f-stack-0: Successed to register dpdk interface` |
| SIGSEGV / panic / stub-called | 0 |

### G3: Functional acceptance

| Item | Result |
|---|---|
| G3.2 single curl `--http0.9 -sS http://9.134.214.176/` | HTTP 200 / Content-Length 438 / body = real HTML |
| Body check | first 80 bytes of the hexdump = `<!DOCTYPE html>\r\n<html>\r\n<head>\r\n<title>Welcome to F-Stack!</title>\r\n<style>...` |
| G3.3 100x short-connection stress test | ok=100 fail=0 ✓ |
| baseline non-ZC (helloworld) 100x stress test | ok=100 fail=0 ✓ (the M8 fix also cured this regression) |
| primary exit | clean SIGTERM (exited within 5 s, no SIGKILL) |

### G4: Quick performance observation (downgrade permitted by default under OQ-2)

| build | Time for 1000 short connections | Derived conn/s |
|---|---|---|
| helloworld (baseline non-ZC) | 6.884s | ~145 |
| helloworld_zc (FF_ZC_SEND) | 6.768s | ~148 |

The difference is within measurement noise (serial curl on the client side plus the ssh round-trip dominate). The full performance baseline is deferred to M9, to be run with the PA+ZC combo and the reused phase-5b methodology.
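
The derived column is simple division, sketched below for reference. `conn_rate` and `relative_gain_pct` are hypothetical helper names used only for this illustration.

```c
/* Connections per second for `conns` sequential connections that took
 * `seconds` of wall-clock time in total. */
double conn_rate(int conns, double seconds)
{
    return conns / seconds;
}

/* Percentage by which `other` is faster than `base`, both in seconds. */
double relative_gain_pct(double base, double other)
{
    return (base - other) / base * 100.0;
}
```

With the G4 figures, `conn_rate(1000, 6.884)` is about 145.3 and `conn_rate(1000, 6.768)` about 147.8, a gap of roughly 1.7%.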

### G5: Documentation

`phase2-M8-spec.md` + `phase2-M8-execution-log.md` are complete; the bilingual anchors in docs/01-LAYER1-ARCHITECTURE.md + Summary follow the same pattern as M6/M7.

### G6: Lint

0 errors (read_lints all clear).

### G7: Commit

Local commit in English, not pushed (per the rules).

---

## 5. Design contract (remains in effect through M9 and beyond)

```
user-space libfstack
  ff_zc_mbuf_get -+
                  |--- build mbuf chain (m_getm2, M_WAITOK, MT_DATA, flags=0)
  ff_zc_mbuf_write
                  |--- fill mbuf m_dat + accumulate m_len
                  v
  ff_zc_send(fd, mbuf, len)
                  |--- aiov.iov_base = mbuf
                  |--- auio.uio_segflg = UIO_SYSSPACE
                  |--- auio.uio_offset = FSTACK_ZC_MAGIC   <-- key sentinel
                  v
  kern_writev → dofilewrite (kept FSTACK_ZC_MAGIC) → fo_write → sosend
              → m_uiotombuf (FSTACK_ZC_SEND fast path hit)
              → returns the caller's mbuf chain directly, skipping the copy loop
              → tcp_usr_send → sbappendstream → tcp_output → DPDK TX

Plain ff_write / ff_writev follow the same path, but uio_offset = 0 misses the
fast path and takes the m_getm2 + uiomove copy loop (old behavior).

ff_send / ff_sendto / ff_sendmsg go through sendit → kern_sendit, which already
sets uio_offset = 0 explicitly, so they cannot hit the fast path by accident.
```
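
The caller side of this contract can be sketched as two setup routines. This is a user-space model under stated assumptions, not the code in `lib/ff_syscall_wrapper.c`: `struct mock_iovec`, `struct mock_uio`, `prepare_plain_write`, and `prepare_zc_send` are hypothetical, and the real wrappers fill in more fields before calling `kern_writev`.

```c
#include <stddef.h>
#include <stdint.h>

enum uio_seg { UIO_USERSPACE, UIO_SYSSPACE };

/* Sentinel value from the log, reinterpreted as a signed 64-bit offset. */
#define FSTACK_ZC_MAGIC ((int64_t)0xF8AC2C00F8AC2C00ULL)

/* Hypothetical reduced versions of struct iovec / struct uio. */
struct mock_iovec { void *iov_base; size_t iov_len; };
struct mock_uio {
    struct mock_iovec *uio_iov;
    enum uio_seg       uio_segflg;
    int64_t            uio_offset;   /* off_t in the kernel */
};

/* Plain write: iov_base is a byte buffer; offset 0 opts out of the fast path. */
void prepare_plain_write(struct mock_uio *auio, struct mock_iovec *aiov,
                         void *buf, size_t nbytes)
{
    aiov->iov_base = buf;
    aiov->iov_len = nbytes;
    auio->uio_iov = aiov;
    auio->uio_segflg = UIO_SYSSPACE;
    auio->uio_offset = 0;
}

/* ZC send: iov_base is the head of an mbuf chain; the sentinel opts in. */
void prepare_zc_send(struct mock_uio *auio, struct mock_iovec *aiov,
                     void *mbuf, size_t nbytes)
{
    aiov->iov_base = mbuf;
    aiov->iov_len = nbytes;
    auio->uio_iov = aiov;
    auio->uio_segflg = UIO_SYSSPACE;
    auio->uio_offset = FSTACK_ZC_MAGIC;
}
```

Both routines set `UIO_SYSSPACE`, so the segment flag alone cannot distinguish them; the offset field is the only thing that tells the kernel how to interpret `iov_base`.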

---

## 6. Known leftovers and follow-ups

| # | Description | Plan |
|---|---|---|
| **F1** (informational, non-blocking) | G4 performance is currently dominated by ssh round-trip + serial curl, so it cannot show the real ZC vs non-ZC difference | M9 reuses the phase-5b CVM A/B + bare-metal methodology, comparing under high concurrency with wrk/iperf |
| **F2** (resolved) | The M7 baseline test could crash in the same way from a ZC fast-path false match, but this was not exposed because M7 did not stress-test 1000 conn | Cured by the M8 fix (verified with baseline 100x) |
| **F3** | Whether the M9 PA+ZC combo has cross-interaction problems still needs verification | Separate M9 spec |

---

## 7. Phase progress

| Phase | Status |
|---|---|
| A. Spec | ✅ |
| B. Research | ✅ |
| C. Code v1 (Makefile only) | ✅ → bounce |
| C. Code v2 (sentinel + new API + 6 files) | ✅ |
| D. Review | ✅ (self review, 0 lint) |
| E. Gate | ✅ G1-G7 all PASS |

**M8 overall: ✅ PASS**
