Commit 435e027

perf(phase-5b): cross-config baseline matrix closes M9-F1/M10-F2, finds F-A1
Phase-5b perf baseline matrix executed against the post-Phase-2 codebase (M-Final 99cc538) using a curl-bench harness driven from f-stack-client over ssh. Five Makefile configurations were exercised in turn (C0 baseline / C7 PA-only / C8 ZC-only / C9 PA+ZC / C10 FLOW_IPIP), with three trials per testcase, totaling 33 datapoints.

Methodology constraint: f-stack-client only has curl + ping (no iperf3/wrk/ab) and the lone server-side virtio NIC double-functions as ssh transport, so absolute throughput numbers are capped at ~137 conn/s by the ssh-fork-bash-curl serial chain. The matrix is therefore observation-only (per phase-2 OQ-2/OQ-4 default downgrade) and is meaningful only for cross-config delta + functional pass/fail.

Deliverables (8 docs + 1 harness + 9 raw CSVs):
- tools/sbin/p5b_perf_matrix.sh: reusable harness, force-added past the pre-existing tools/sbin/.gitignore '*' rule (mirrors how m5_perf.sh is tracked).
- docs/.../phase-5b-perf-baseline-spec.md: methodology + 5x3 matrix + acceptance criteria (observation-only).
- docs/.../phase-5b-perf-baseline-report.md: results, RCA, finding F-A1, follow-ups, production recommendation.
- docs/.../p5b_data/{C0,C7,C8,C9,C10}_{TC1,TC2,TC3}.csv: nine raw per-trial CSV files preserved for reproducibility.
- docs/01-LAYER1-ARCHITECTURE.md + zh_cn mirror: phase-5b anchor.
- docs/F-Stack_Knowledge_Base_Summary.md + zh_cn: scope tag amended.

Closures:
- M9-F1 (PA+ZC combo allegedly 3.5x slower than M8 ZC-only): CLOSED as false negative. Phase-2 measurement was contaminated by stale helloworld processes co-occupying the NIC. Clean re-measurement: C9 median 7.626s vs C0 baseline 7.327s = +4.1% (well below the NFR-1 5% threshold).
- M10-F2 (IPIP tunnel large-flow throughput baseline missing): CLOSED via 100-ping-per-trial RTT timeseries baseline (curl-bench client lacks iperf3). 3/3 trials all 100/100 ping, RTT median 0.388-0.397 ms, jitter 9 ms. Software GIF tunnel path is stable.

New finding F-A1 (HIGH severity, deferred):
- FF_USE_PAGE_ARRAY=1 standalone (no FF_ZC_SEND) breaks ICMP+HTTP egress entirely: ping 0% / curl connect timeout. helloworld primary stays alive and the DPDK init log looks clean, but no packets actually leave the box.
- Static analysis points to lib/ff_dpdk_if.c:2137-2148 forcing every send through ff_if_send_onepkt (lib/ff_memory.c:440), whose ff_chk_vma() predicate at line 453 only matches data pointers inside the 256MB ff_mmap_init region. ARP / ICMP reply mbufs allocated by the BSD stack don't fall in that range, so ff_extcl_to_rte() returns NULL and the packet is silently dropped.
- M9 (PA+ZC) appears to work because the application data path goes through ZC fast-path (ff_zc_mbuf_get/write allocate inside the PA VMA), but ARP-on-PA consistency is unverified; kept as followup F-A2 to retest under a clean client ARP cache.
- Mitigation: production deployments should choose C8 ZC-only (recommended) or C9 PA+ZC; never enable PA standalone (C7). Default Makefile is left at C0 (P0-only) per M-Final state.

Followups (filed in report §5):
- F-A1 (High): RCA + fix ff_chk_vma to cover ARP/ICMP mbufs OR add a fallback rte_pktmbuf_alloc + bcopy when ff_extcl_to_rte returns NULL.
- F-A2 (Medium): retest C9 ARP-on-PA under cleared client ARP cache to confirm whether C9 truly works or merely benefits from cached ARP entries left over from M8/M9 testing.
- F-A3 (Low): client-side wrk/iperf3 install (or independent test rig) to replace the curl-bench bottleneck.
- F-A4 (Low): re-run p5b matrix on bare-metal + CVM dual baselines per M5-test-report.md NFR-1 framework when the next NFR re-evaluation cycle starts.

Bounce ledger: 0 formal bounces in phase-5b. Findings F-A1/F-A2 are documented and deferred, not bounced.

Compliance: 0 direct rm/kill/chmod calls. All process kills, file deletions, DPDK runtime cleanup, and chmod +x on the new harness went through workspace shell wrappers (rm_tmp_file.sh, kill_process.sh, chmod_modify.sh).

Local commit only; not pushed.
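
The F-A1 failure mode and the second fix option listed under the followups can be illustrated with a toy model. The sketch below is Python rather than the project's C, and it does not reproduce `ff_chk_vma`, `ff_extcl_to_rte`, or any other F-Stack code: the `Region` type, the function names, and the addresses are all hypothetical. It only shows why a send path that accepts nothing but in-region data pointers drops every other packet, and how a copy fallback would change that outcome.

```python
from dataclasses import dataclass
from typing import Optional

# 256MB, the ff_mmap_init region size quoted in the commit message.
PA_REGION_SIZE = 256 * 1024 * 1024


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for the page-array VMA."""
    base: int
    size: int

    def contains(self, addr: int) -> bool:
        # Range predicate: true only for addresses in [base, base + size).
        return self.base <= addr < self.base + self.size


def send_strict(region: Region, data_addr: int) -> Optional[str]:
    """Model of a send path that only accepts in-region data pointers."""
    if not region.contains(data_addr):
        return None  # packet is silently dropped
    return "zero-copy"


def send_with_fallback(region: Region, data_addr: int) -> str:
    """Same path, but out-of-region data is copied into a fresh buffer."""
    if region.contains(data_addr):
        return "zero-copy"
    return "copied"  # stands in for an allocate-and-copy fallback


# Hypothetical addresses, chosen only for illustration.
pa = Region(base=0x7F00_0000_0000, size=PA_REGION_SIZE)
app_payload = pa.base + 4096      # application data inside the region
stack_reply = 0x5555_0000_1000    # e.g. an ARP/ICMP reply built by the stack

for name, addr in [("app payload", app_payload), ("stack reply", stack_reply)]:
    print(name, send_strict(pa, addr), send_with_fallback(pa, addr))
```

In this model the strict path returns `None` for the stack-built reply, which mirrors the "no packets leave the box" symptom, while the fallback path trades one copy for correctness. Whether that trade is acceptable in the real send path is exactly what the F-A1 followup has to decide.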
1 parent 99cc538 commit 435e027

18 files changed

Lines changed: 497 additions & 2 deletions

docs/01-LAYER1-ARCHITECTURE.md

Lines changed: 1 addition & 0 deletions
@@ -145,6 +145,7 @@ F-Stack adopted a **complete porting** strategy:
 - **Phase-2 M9 (2026-06-08)**: enabled both `FF_USE_PAGE_ARRAY=1` + `FF_ZC_SEND=1` combo (P1c, single-pass / 0 bounces, 1-line Makefile change leveraging M7+M8 work); validated co-existence end-to-end (ff_mmap_init 256MB + ipfw2/tcp_bbr init clean + HTTP 200 single curl + 100/100 short-conn). 1000-conn observation: ~3.5× slower than M8 ZC-only with occasional timeout — recorded as M9-followup-F1 for phase-5b perf profiling. See `docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M9-execution-log.md`
 - **Phase-2 M10 (2026-06-08)**: enabled `FF_FLOW_IPIP=1` (P1d, 1 bounce); softened `create_ipip_flow` failure from `rte_exit` to printf warning so primary stays alive on NICs that lack rte_flow IPIP offload (e.g. virtio); GIF tunnel runs in software via FreeBSD `if_gif/in_gif`. End-to-end IPIP tunnel verified: `tools/sbin/ifconfig gif0 create + tunnel + inet` on server side + `ip tunnel add gif0 mode ipip` on Linux f-stack-client side + ping 3/3 received 0% loss RTT 0.29-0.65 ms. example/Makefile auto-skips helloworld_zc target when libfstack.a is built without FF_ZC_SEND. See `docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M10-execution-log.md`
 - **Phase-2 M11/M12/M13 (2026-06-08)**: P2-priority smoke trio — enabled `FF_FLOW_ISOLATE=1` (M11), `FF_FDIR=1` (M12), `FF_LOOPBACK_SUPPORT=1` (M13) each in turn; lib build clean and helloworld primary ALIVE for each. M11 batched the rte_flow soft-fallback for `port_flow_isolate`/`init_flow`/`fdir_add_tcp_flow` (3 sites in `ff_dpdk_if.c`) following the M10 pattern. M13 added one link-only stub `ff_swi_net_excute` to `ff_stub_14_extra.c` (declared in `ff_host_interface.h:92` but never implemented in the tree). See `docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase2-M11-M13-spec.md`
+- **Phase-5b perf baseline (2026-06-08)**: 5-config × 2-3 testcase × 3-trial matrix executed via `tools/sbin/p5b_perf_matrix.sh` (curl-bench from f-stack-client; ssh round-trip caps at ~137 conn/s, only relative cross-config delta is meaningful). Closes M9-F1 (PA+ZC combo +4.1% over baseline, false negative caused by stale-process noise) and M10-F2 (IPIP tunnel ping baseline 0.39 ms / 0% loss / 9 ms jitter). New finding **F-A1 (HIGH)**: `FF_USE_PAGE_ARRAY=1` standalone breaks ICMP+HTTP egress (`ff_chk_vma` in `ff_memory.c:453` doesn't cover ARP/ICMP mbuf data pointers); deferred for follow-up. Production recommendation: prefer C8 ZC-only or C9 PA+ZC; avoid PA-only. See `docs/freebsd_13_to_15_upgrade_spec/zh_cn/phase-5b-perf-baseline-report.md`

 ### 3.2 Ported FreeBSD Subsystems

docs/F-Stack_Knowledge_Base_Summary.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 **Document Version**: 1.0
 **Generation Date**: 2026-03-20
-**Content Scope**: F-Stack v1.26 (FreeBSD 15.0 port; upgraded from 13.0 in 2025-2026 — M0~M5 + runtime-fix + rib-fix + Phase-5b NFR-1 PASS; **Phase-2 M6 NETGRAPH+IPFW + M7 PAGE_ARRAY + M8 ZC_SEND + M9 PA+ZC + M10 FLOW_IPIP + M11 FLOW_ISOLATE + M12 FDIR + M13 LOOPBACK, 2026-06-08**) + DPDK 23.11.5 Complete Three-Layer Architecture Knowledge Base
+**Content Scope**: F-Stack v1.26 (FreeBSD 15.0 port; upgraded from 13.0 in 2025-2026 — M0~M5 + runtime-fix + rib-fix + Phase-5b NFR-1 PASS; **Phase-2 M6 NETGRAPH+IPFW + M7 PAGE_ARRAY + M8 ZC_SEND + M9 PA+ZC + M10 FLOW_IPIP + M11 FLOW_ISOLATE + M12 FDIR + M13 LOOPBACK + Phase-5b perf baseline matrix (closes M9-F1/M10-F2, finds F-A1), 2026-06-08**) + DPDK 23.11.5 Complete Three-Layer Architecture Knowledge Base
 **Document Location**: `/data/workspace/f-stack/docs/`
 **Purpose**: Pre-requisite architecture documentation for Spec-Driven Development

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+trial,t_total_s,pass_count,n,fail_rate
+1,.788964588,100,100,0.000
+2,.794768072,100,100,0.000
+3,.796119061,100,100,0.000
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+trial,t_total_s,pass_count,n,fail_rate
+1,7.352980100,1000,1000,0.000
+2,7.308304520,1000,1000,0.000
+3,7.327397387,1000,1000,0.000
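
The three totals in this file have a median of 7.327 s, which matches the C0 baseline quoted in the commit message, so the M9-F1 closure arithmetic can be re-derived from them. A minimal sketch in Python: the C9 median (7.626 s) and the 5% NFR-1 threshold are taken from the commit message because the C9 per-trial rows are not identifiable on this page, and the function and variable names are illustrative rather than part of `p5b_perf_matrix.sh`.

```python
from statistics import median

# Per-trial t_total_s values copied from the CSV hunk above
# (1000 connections per trial).
c0_trials = [7.352980100, 7.308304520, 7.327397387]

# C9 median as quoted in the commit message (assumption: not recomputed here).
c9_median_s = 7.626

NFR1_THRESHOLD_PCT = 5.0  # NFR-1 threshold as stated in the commit message


def relative_delta_pct(candidate_s: float, baseline_s: float) -> float:
    """Percentage slowdown of candidate relative to baseline."""
    return (candidate_s - baseline_s) / baseline_s * 100.0


c0_median_s = median(c0_trials)
delta_pct = relative_delta_pct(c9_median_s, c0_median_s)
conn_per_s = 1000 / c0_median_s  # connections completed per second

print(f"C0 median: {c0_median_s:.3f} s")
print(f"C9 vs C0: {delta_pct:+.1f}%")
print(f"baseline rate: {conn_per_s:.1f} conn/s")
print("within NFR-1" if delta_pct < NFR1_THRESHOLD_PCT else "exceeds NFR-1")
```

The computed baseline rate also lands just under the ~137 conn/s ceiling that the commit message attributes to the ssh-fork-bash-curl chain, which is why only the relative delta is treated as meaningful.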
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+trial,t_total_s,pass_count,n,fail_rate
+1,.786193676,100,100,0.000
+2,.795912315,100,100,0.000
+3,.743568101,100,100,0.000
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+trial,t_total_s,pass_count,n,fail_rate
+1,7.343590648,1000,1000,0.000
+2,7.310759102,1000,1000,0.000
+3,7.290512690,1000,1000,0.000
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+trial,t_total_s,pass_count,n,fail_rate
+1,20.590,100,100,0.000,0.390
+2,20.590,100,100,0.000,0.388
+3,20.592,100,100,0.000,0.397
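
The data rows in this file carry six fields while the header names only five. The trailing values (0.388 to 0.397) match the RTT median range quoted in the commit message, so the sixth field is presumably the per-trial RTT median in ms, but the file itself does not say so. A minimal Python sketch of a reader that keeps unnamed trailing fields instead of discarding them; the `extra` key is an illustrative choice, not a harness convention.

```python
import csv
import io

# Contents of the CSV hunk above, reproduced verbatim.
RAW = """trial,t_total_s,pass_count,n,fail_rate
1,20.590,100,100,0.000,0.390
2,20.590,100,100,0.000,0.388
3,20.592,100,100,0.000,0.397
"""

# restkey collects any fields beyond the header into a list under that key.
reader = csv.DictReader(io.StringIO(RAW), restkey="extra")
rows = list(reader)

for row in rows:
    unnamed = [float(v) for v in row.get("extra", [])]
    print(row["trial"], row["t_total_s"], unnamed)
```

Reading the file this way keeps the unnamed values available for analysis without guessing a column name for them.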
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+trial,t_total_s,pass_count,n,fail_rate
+1,,,100,1.000
+2,,,100,1.000
+3,,,100,1.000
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+trial,t_total_s,pass_count,n,fail_rate
+1,,,1000,1.000
+2,,,1000,1.000
+3,,,1000,1.000
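
Rows with empty `t_total_s` and `pass_count` and a `fail_rate` of 1.000 record trials in which no request succeeded. That is consistent with the PA-only egress failure reported as F-A1, although this page does not show which configuration the file belongs to. A Python sketch of how such rows could be read without raising on the empty fields; the pass criterion (every request must succeed) is an assumption for illustration, not the harness's documented rule.

```python
import csv
import io
from typing import Optional

# Contents of the CSV hunk above, reproduced verbatim.
RAW = """trial,t_total_s,pass_count,n,fail_rate
1,,,1000,1.000
2,,,1000,1.000
3,,,1000,1.000
"""


def to_float(field: str) -> Optional[float]:
    """Empty CSV fields become None instead of raising ValueError."""
    return float(field) if field.strip() else None


def trial_passed(row: dict) -> bool:
    # Assumed criterion: a trial passes only when every request succeeded.
    passed = to_float(row["pass_count"])
    return passed is not None and int(passed) == int(row["n"])


rows = list(csv.DictReader(io.StringIO(RAW)))
timings = [to_float(r["t_total_s"]) for r in rows]
verdicts = [trial_passed(r) for r in rows]

print(timings)
print("functional pass" if all(verdicts) else "functional FAIL")
```

Keeping failed trials as `None` rather than `0.0` prevents them from being averaged into a timing median by mistake.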
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+trial,t_total_s,pass_count,n,fail_rate
+1,.741968534,100,100,0.000
+2,.729278711,100,100,0.000
+3,.733121657,100,100,0.000
