Reference for the Linux-only parts of natra's test harness. The
top-level docs/development.md covers the day-to-day commands; this
document is the deeper "what each layer is, what it asserts, and what
to know when it goes red" reference.
| Layer | Purpose | Build tag |
|---|---|---|
| 1 | Unit + Go-native fuzz + benchmarks | none |
| 2 | CNI protocol: exec the binary in a netns | `integration` |
| 3 | BPF dataplane: `BPF_PROG_RUN` + verifier | `bpf` |
| 4 | k3d end-to-end with iperf assertions | `e2e` |
| 5 | Perf scenarios + synthetic vs-vanilla | `perf` |

Plain `go test ./...` runs only L1.

Every layer runs from `make ci` on macOS (via Docker) and on Linux
(natively or via Docker). The vs-vanilla cluster comparison is
on-demand via `make perf-vs-vanilla`, not part of `make ci`.
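As a rough illustration of how the build tags in the table gate each layer, the sketch below maps a layer number to a `go test` argument list. The helper name `goTestArgs` and the package patterns are assumptions made for this example (the patterns follow the `test/<layer>/` directories described in this document); the real Makefile targets may pass different or additional flags.

```go
package main

import (
	"fmt"
	"strings"
)

// layer describes one row of the layer table: its build tag and the
// package pattern assumed to hold its tests.
type layer struct {
	tag string // empty for Layer 1, which needs no build tag
	pkg string // assumed package pattern, not taken from the Makefile
}

var layers = map[int]layer{
	1: {tag: "", pkg: "./..."},
	2: {tag: "integration", pkg: "./test/cni/..."},
	3: {tag: "bpf", pkg: "./test/bpf/..."},
	4: {tag: "e2e", pkg: "./test/e2e/..."},
	5: {tag: "perf", pkg: "./test/perf/..."},
}

// goTestArgs (hypothetical helper) returns the arguments for the go tool
// for one layer. Without a -tags flag, tag-gated files are not compiled,
// which is why plain `go test ./...` only runs Layer 1.
func goTestArgs(n int) ([]string, error) {
	l, ok := layers[n]
	if !ok {
		return nil, fmt.Errorf("unknown layer %d", n)
	}
	args := []string{"test"}
	if l.tag != "" {
		args = append(args, "-tags", l.tag)
	}
	return append(args, l.pkg), nil
}

func main() {
	for n := 1; n <= 5; n++ {
		args, err := goTestArgs(n)
		if err != nil {
			panic(err)
		}
		fmt.Printf("L%d: go %s\n", n, strings.Join(args, " "))
	}
}
```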
Files under `test/cni/`:

- `cni_linux_test.go`: happy-path ADD/DEL/CHECK + the four explicit attach modes (`tcx-hostside`, `tcx-podside`, `clsact-hostside`, `clsact-podside`) + per-direction specs (ingress only, egress only, both, neither). The `auto` default isn't exercised here because every mode is tested explicitly; the auto strategy logic lives in `resolveAttachStrategy` and is covered by unit tests.
- `chaos_linux_test.go`: malformed stdin, annotation injection on both ingress and egress channels, bad CNI env vars.
- `helpers_linux_test.go`: netns lifecycle, exec, env-var construction, direction-aware `linkPinExists`, `remainingPinsFor`.
- `cni_stub_test.go`: non-Linux skip stub.

Run:

    sudo make test-cni   # native Linux
    make test-cni        # macOS (wrapped in scripts/run-in-docker.sh)

Prerequisites:

- Linux kernel 5.x+ (6.6+ to exercise the tcx attach happy path).
- `sudo` for CAP_NET_ADMIN.
- bpffs at `/sys/fs/bpf`. The test's BeforeSuite mounts it via `unix.Mount` if it is not already mounted; the mount is idempotent.

Tests call `runtime.LockOSThread()` in BeforeSuite so the goroutine
doing netns operations is not moved to a different OS thread
mid-test. The `CNI_NETNS` path uses `/proc/<pid>/fd/<fd>` rather than
`/var/run/netns/<name>`. That is fine for tests, but some real CNI
runtimes use named netns.
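A minimal sketch of the two ideas above: pinning the goroutine to its OS thread (network namespaces are a per-thread property on Linux) and building a `/proc/<pid>/fd/<fd>` style netns path. The helper name `netnsPathForFD` is hypothetical, and the example opens the null device only to obtain an fd; a real test would hold an fd that refers to a network namespace.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// netnsPathForFD (hypothetical helper) builds the /proc/<pid>/fd/<fd>
// form of a netns path from a process ID and an open file descriptor.
func netnsPathForFD(pid int, fd uintptr) string {
	return fmt.Sprintf("/proc/%d/fd/%d", pid, fd)
}

func main() {
	// Keep this goroutine on one OS thread so that a namespace switch
	// made on the thread stays visible to the code that follows it.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	// Stand-in fd: any open file shows the path shape.
	f, err := os.Open(os.DevNull)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	fmt.Println("CNI_NETNS=" + netnsPathForFD(os.Getpid(), f.Fd()))
}
```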
The L2 test exec's the natra binary as a subprocess (matching kubelet's invocation pattern), so arg-parsing and stdin-handling bugs surface here. CNI errors come back as JSON on stdout, not stderr.
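The sketch below shows one way a test could exec a plugin binary and decode an error from stdout. The struct fields follow the error object in the CNI specification (`cniVersion`, `code`, `msg`, `details`); the helper names `runPlugin` and `parseCNIError` and the sample payload are illustrative assumptions, not natra's actual test helpers or output.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"os/exec"
)

// cniError mirrors the error object defined by the CNI specification.
type cniError struct {
	CNIVersion string `json:"cniVersion"`
	Code       uint   `json:"code"`
	Msg        string `json:"msg"`
	Details    string `json:"details,omitempty"`
}

// runPlugin (hypothetical helper) execs a plugin binary with the network
// config on stdin and CNI_* parameters in env, and captures stdout.
func runPlugin(bin string, env []string, stdin []byte) ([]byte, error) {
	cmd := exec.Command(bin)
	cmd.Env = env
	cmd.Stdin = bytes.NewReader(stdin)
	var stdout bytes.Buffer
	cmd.Stdout = &stdout
	err := cmd.Run() // non-nil when the plugin exits non-zero
	return stdout.Bytes(), err
}

// parseCNIError decodes stdout as a CNI error object. It fails when the
// bytes are not JSON or carry no msg field (for example a success result).
func parseCNIError(stdout []byte) (cniError, error) {
	var e cniError
	if err := json.Unmarshal(stdout, &e); err != nil {
		return cniError{}, fmt.Errorf("stdout is not JSON: %w", err)
	}
	if e.Msg == "" {
		return cniError{}, errors.New("no msg field: not a CNI error object")
	}
	return e, nil
}

func main() {
	// Illustrative payload only; it was not captured from natra.
	sample := []byte(`{"cniVersion":"1.0.0","code":7,"msg":"invalid network config"}`)
	e, err := parseCNIError(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("code=%d msg=%q\n", e.Code, e.Msg)
}
```

In a real test, the bytes returned by `runPlugin` would be passed to `parseCNIError` whenever the plugin exits non-zero; asserting on stderr instead would miss the error payload.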
Files under `test/bpf/`:

- `prog_linux_test.go`: placeholder load + sanity.
- `ratelimit_linux_test.go`: token-bucket and CMS classification, table-driven across `natra_ingress` and `natra_egress`. Includes `TestCrossDirectionIsolation`, which configures one direction tight and the other wide-open and asserts no bleed.
- `chaos_linux_test.go`: verifier rejection of intentionally invalid programs, malformed packets, concurrent map updates, CMS saturation.
- `edge_cases_linux_test.go`: packet > burst, ICMP without L4 ports, IPv4 options, zero burst, rapid config change, jumbo packets, counter overflow. Direction-agnostic; runs against `natra_ingress` only (the egress program shares the same code).
- `prog_stub_test.go`: non-Linux skip.
- `testdata/invalid_oob_packet_access.bpf.c`: verifier-rejection fixture.

Run:

    make test-bpf

Prerequisites:

- Linux kernel 5.10+ (`BPF_PROG_RUN` with skb).
- LLVM clang with the `bpf` target. The Makefile sets `BPF_CLANG=/opt/homebrew/opt/llvm/bin/clang` on macOS automatically if Homebrew's LLVM is installed.

Constraints to be aware of when extending L3 tests:

- `BPF_PROG_RUN` with skb caps the input packet size at roughly `PAGE_SIZE - sizeof(struct skb_shared_info)` (~3,772 B on x86_64). 4 KB+ inputs return EINVAL. See `TestEdgeJumboPacket` for the documented constraint.
- BPF programs with atomic adds (CMS uses `__sync_add_and_fetch`) need `-mcpu=v3` or newer. The Makefile sets it.
- Helper calls (`bpf_ktime_get_ns`, etc.) are verifier-rejected inside `bpf_spin_lock`-protected regions. Read the timestamp first, then take the lock.
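For the first constraint, a test can guard its synthetic packet sizes before calling into the kernel. This is a sketch only: `maxProgRunInput` hard-codes the approximate x86_64 figure quoted above rather than deriving it from the running kernel, and `fitsProgRun` is a hypothetical helper name.

```go
package main

import "fmt"

// maxProgRunInput is the approximate skb input cap quoted in this
// document for x86_64. It is an illustrative assumption, not a value
// read from the kernel, and it differs on other page sizes.
const maxProgRunInput = 3772

// fitsProgRun (hypothetical helper) reports whether a synthetic packet
// of n bytes is small enough to pass to BPF_PROG_RUN with an skb context.
func fitsProgRun(n int) bool {
	return n > 0 && n <= maxProgRunInput
}

func main() {
	for _, size := range []int{64, 1500, 3772, 4096, 9000} {
		if fitsProgRun(size) {
			fmt.Printf("%5d B: ok for BPF_PROG_RUN\n", size)
		} else {
			fmt.Printf("%5d B: too large, expect EINVAL\n", size)
		}
	}
}
```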
Files under `test/e2e/`:

- `e2e_test.go::BeforeSuite`: creates a 2-node k3d cluster (1 server + 1 agent) with flannel host-gw forced (VXLAN is ~30 Mbps on colima's LinuxKit kernel, below the rate-limit caps under test).
- `manifests/iperf-server.yaml`: server with `kubernetes.io/ingress-bandwidth: "10M"` (Topology A).
- `manifests/iperf-server-egress.yaml`: egress only (Topology B).
- `manifests/iperf-server-bidi.yaml`: both annotations at 10M (Topologies C and G).
- `manifests/iperf-server-mixed-{a,b,c}.yaml`: three pods on the worker; only mixed-a is annotated (Topology D, also reused by E).
- `manifests/iperf-server-noplugin.yaml`: unannotated, used by the no-plugin regression test (Topology F).
- `manifests/iperf-client.yaml`: client on the control-plane.
- `e2e_test.go`: Topologies A through G, plus a connectivity smoke in Topology A.
- `chaos_test.go`: DaemonSet restart preserves rate-limiting on ingress and egress pods, pod churn, three pending characterization specs (`PIt`).

Topologies asserted:

| Topology | What it pins |
|---|---|
| A | ingress annotation throttles forward iperf3 |
| B | egress annotation throttles reverse iperf3 (-R) |
| C | both annotations throttle forward then reverse, sequential |
| D | mixed: only annotated pods throttled; unannotated pods on the same node free |
| E | no-annotation case: natra in path, no throttling |
| F | no-plugin regression: with-natra delta vs. no-natra baseline < 20% |
| G | proxy-like: both directions throttle independently under concurrent traffic |
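The Topology F check is a relative-delta comparison. A minimal sketch follows, assuming the delta is taken as an absolute difference relative to the no-natra baseline; the function names and the throughput numbers are made up for illustration and are not measurements from the test suite.

```go
package main

import (
	"fmt"
	"math"
)

// relativeDelta returns |measured - baseline| / baseline.
func relativeDelta(baseline, measured float64) (float64, error) {
	if baseline <= 0 {
		return 0, fmt.Errorf("baseline must be positive, got %v", baseline)
	}
	return math.Abs(measured-baseline) / baseline, nil
}

// withinBudget (hypothetical helper) reports whether the with-plugin
// throughput stays strictly within maxDelta of the baseline.
func withinBudget(baseline, measured, maxDelta float64) (bool, error) {
	d, err := relativeDelta(baseline, measured)
	if err != nil {
		return false, err
	}
	return d < maxDelta, nil
}

func main() {
	// Hypothetical Mbps values, chosen only to show both outcomes.
	for _, measured := range []float64{800, 700} {
		ok, err := withinBudget(900, measured, 0.20)
		if err != nil {
			panic(err)
		}
		fmt.Printf("baseline=900 measured=%v within 20%%: %v\n", measured, ok)
	}
}
```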
Run:

    make test-e2e   # default attach=auto, edt=auto
    NATRA_E2E_ATTACH_MODE=tcx-podside make test-e2e
    NATRA_E2E_ATTACH_MODE=clsact-hostside make test-e2e
    NATRA_E2E_ATTACH_MODE=clsact-podside make test-e2e

Prerequisites:

- Docker (colima or Docker Desktop on macOS, dockerd on Linux).
- `k3d` v5.7.4+, `kubectl`.
- `iperf3` (in-pod, image `networkstatic/iperf3:latest`).

Failure-mode dumps: on iperf-Ready timeout, BeforeSuite emits
`kubectl describe pod`, the install init-container log, and the
patched conflist. `NATRA_E2E_KEEP=1 make test-e2e` leaves the k3d
cluster up after the test.
Files under `test/perf/`:

- `perf_linux_test.go`:
  - `TestBPFProgRunThroughput`: placeholder ns/op vs baseline.
  - `TestScenarioOneElephant{,Egress}`: single elephant per direction, expect throttling.
  - `TestScenarioThousandMice`: 1000 short flows on ingress, expect zero `hh_hits`.
  - `TestScenarioMixed`: elephant + mice on ingress, mice survive.
  - `TestScenarioMixedVsVanilla{,Egress}`: head-to-head vs `bpf/vanilla.bpf.o`, both directions.
- `perf_stub_test.go`: non-Linux skip.
- `baselines/local.json`: ns/op ceiling for the synthetic BPF_PROG_RUN tests; the test fails on regression past the recorded value.
- `realworld/vanilla-installer.yaml`: DaemonSet that fetches the upstream `bandwidth` plugin and chains it after flannel (k3d's default CNI), used by `make perf-vs-vanilla` for both ingress and egress phases.

Run:

    make test-perf         # synthetic, in-process, BPF_PROG_RUN
    make perf-vs-vanilla   # real-cluster, ~18-22 min, three k3d phases

The mixed scenario is elephant-first by design: the elephant pre-drains the bucket, then mice arrive into the depleted bucket. Interleaved sequences let the bucket refill between elephant packets and trivially pass under both implementations.
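To see why the ordering matters, the simulation below runs both packet orders through a plain token bucket. It is a simplification for illustration, not natra's BPF code: the 10 Mbit/s rate (1250 B/ms), the 15,000 B burst, the packet sizes, and the 2 ms spacing are all assumed values.

```go
package main

import "fmt"

// bucket is a plain token bucket counted in bytes, refilled per millisecond.
type bucket struct {
	tokens, burst int64
	ratePerMs     int64
	lastMs        int64
}

// allow refills the bucket for the elapsed time, then admits the packet
// only if enough tokens remain.
func (b *bucket) allow(nowMs, size int64) bool {
	b.tokens += (nowMs - b.lastMs) * b.ratePerMs
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.lastMs = nowMs
	if b.tokens < size {
		return false
	}
	b.tokens -= size
	return true
}

type packet struct {
	atMs, size int64
	mouse      bool
}

// droppedMice replays packets (already in time order) through a fresh
// bucket and counts the mouse packets that were refused.
func droppedMice(pkts []packet) int {
	b := &bucket{tokens: 15000, burst: 15000, ratePerMs: 1250}
	dropped := 0
	for _, p := range pkts {
		if !b.allow(p.atMs, p.size) && p.mouse {
			dropped++
		}
	}
	return dropped
}

// elephantFirst: ten 1500 B elephant packets drain the burst at t=0,
// then five 100 B mice arrive into the empty bucket.
func elephantFirst() []packet {
	var pkts []packet
	for i := 0; i < 10; i++ {
		pkts = append(pkts, packet{atMs: 0, size: 1500})
	}
	for i := 0; i < 5; i++ {
		pkts = append(pkts, packet{atMs: 0, size: 100, mouse: true})
	}
	return pkts
}

// interleaved: the same packets spread 2 ms apart, so the bucket
// refills between elephant packets.
func interleaved() []packet {
	var pkts []packet
	for i := int64(0); i < 10; i++ {
		pkts = append(pkts, packet{atMs: 2 * i, size: 1500})
		if i < 5 {
			pkts = append(pkts, packet{atMs: 2 * i, size: 100, mouse: true})
		}
	}
	return pkts
}

func main() {
	fmt.Println("elephant-first mice dropped:", droppedMice(elephantFirst()))
	fmt.Println("interleaved mice dropped:  ", droppedMice(interleaved()))
}
```

With these assumed numbers, the elephant-first order drops every mouse under a plain bucket while the interleaved order drops none, so only the elephant-first order can distinguish an implementation that protects mice from one that does not.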
natra picks an attach mode from an orthogonal cross of
`{tcx, clsact} × {hostside, podside}`, plus an `auto` mode that
expands to an ordered fallback chain:

| Mode | Hook | Veth half | Notes |
|---|---|---|---|
| `auto` | - | - | Default. Tries each combination in order. |
| `tcx-hostside` | TCX | host | Same shape as Cilium / NPA. |
| `tcx-podside` | TCX | pod (eth0) | Lives inside the pod netns. EDT-friendly. |
| `clsact-hostside` | clsact | host | TC filter on the host-side veth. |
| `clsact-podside` | clsact | pod (eth0) | Fallback for kernels < 6.6 / no bpffs. |

`auto` expansion depends on EDT pacing mode (`defaults.edtPacing`):

- `edtPacing: off`: tcx-host → tcx-pod → clsact-host → clsact-pod
- `edtPacing: auto` (default): tcx-pod → clsact-pod → tcx-host → clsact-host (pod-side first so the `fq` install on pod-eth0 sits downstream of the BPF program)
- `edtPacing: on`: tcx-pod → clsact-pod (host-side dropped because EDT requires pod-side)

The mode is selected via the conflist `attachMode` field at the plugin
level, or via `NATRA_ATTACH_MODE` on the install init container, or
via `NATRA_E2E_ATTACH_MODE` / `NATRA_PERF_ATTACH_MODE` on the test rig.
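The `auto` fallback chains listed above can be written down as a small lookup. This is a sketch of the documented ordering only: `expandAuto` is a hypothetical name (natra's own logic lives in `resolveAttachStrategy`), and treating an empty `edtPacing` value as `auto` is an assumption based on `auto` being the default.

```go
package main

import "fmt"

// expandAuto (hypothetical helper) returns the ordered attach modes that
// the auto strategy would try for a given edtPacing setting.
func expandAuto(edtPacing string) ([]string, error) {
	switch edtPacing {
	case "off":
		return []string{"tcx-hostside", "tcx-podside", "clsact-hostside", "clsact-podside"}, nil
	case "auto", "": // empty treated as the default; an assumption
		return []string{"tcx-podside", "clsact-podside", "tcx-hostside", "clsact-hostside"}, nil
	case "on":
		// Host-side modes are dropped because EDT requires pod-side attach.
		return []string{"tcx-podside", "clsact-podside"}, nil
	default:
		return nil, fmt.Errorf("unknown edtPacing value %q", edtPacing)
	}
}

func main() {
	for _, mode := range []string{"off", "auto", "on"} {
		chain, err := expandAuto(mode)
		if err != nil {
			panic(err)
		}
		fmt.Printf("edtPacing=%s: %v\n", mode, chain)
	}
}
```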
Each annotated direction adds one tcx link per pod (or one clsact
filter on the matching HANDLE_MIN_* parent). A bidi-annotated pod
in tcx mode therefore has two link pins under
/sys/fs/bpf/natra/<containerID>-<side>-{ingress,egress}-link.
The <side> field is hostside or podside matching the attach
mode the pod was started with.
bpffs forbids `.` in pin path components: `kernel/bpf/inode.c::bpf_lookup`
returns EPERM on any name containing a dot when the parent has any
`S_IALLUGO` bits set. natra's pin paths use dotless `-link` and `-map`
suffixes accordingly. See `pkg/bpf/loader.go::PinMaps` and
`cmd/natra/main.go::pinPathFor`.
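The pin naming scheme and the dot restriction can be combined into one small builder. This is a hedged sketch, not natra's `pinPathFor`: the function name `linkPinPath`, the validation it performs, and the sample container ID are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// pinRoot is the bpffs directory quoted in this document.
const pinRoot = "/sys/fs/bpf/natra"

// linkPinPath (hypothetical stand-in for pinPathFor) builds
// <root>/<containerID>-<side>-<direction>-link and refuses any name
// containing a dot, since bpffs would reject it with EPERM.
func linkPinPath(containerID, side, direction string) (string, error) {
	if side != "hostside" && side != "podside" {
		return "", fmt.Errorf("unknown side %q", side)
	}
	if direction != "ingress" && direction != "egress" {
		return "", fmt.Errorf("unknown direction %q", direction)
	}
	name := fmt.Sprintf("%s-%s-%s-link", containerID, side, direction)
	if strings.Contains(name, ".") {
		return "", fmt.Errorf("pin name %q contains a dot", name)
	}
	return path.Join(pinRoot, name), nil
}

func main() {
	// A bidi-annotated pod in tcx mode ends up with two link pins.
	for _, dir := range []string{"ingress", "egress"} {
		p, err := linkPinPath("abc123", "podside", dir)
		if err != nil {
			panic(err)
		}
		fmt.Println(p)
	}
}
```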
Triggers: push to any branch + pull_request. No path filters, no
schedule gating. Concurrency-cancel in-progress runs per ref.
| Workflow | Layer | Duration target |
|---|---|---|
| `unit.yml` | 1 (unit + fuzz + bench) | <30s unit, <2m fuzz, <2m bench |
| `cni.yml` | 2 (CNI + chaos) | <3m |
| `bpf.yml` | 3 (BPF, single kernel) | <5m |
| `e2e.yml` | 4 (k3d + chaos) | <8m |
| `perf.yml` | 5 (perf, single kernel) | <5m |
| `license.yml` | go-licenses + scancode | <2m |
| `ci.yml` | aggregator (`needs:`) | reads other jobs |

The aggregator gives branch protection a single status to read.
`make ci` runs every layer + lint + license-scan in sequence, keeps going past failures, prints a per-layer pass/fail summary, and exits non-zero if any failed. macOS without Docker skips L2-L5 with a clear message; Linux runs everything.

- New crashing inputs land in `pkg/cni/config/testdata/fuzz/<FuzzName>/<sha>` and are committed to the repo so CI replays them on every push.
- Reproduce a crash: `go test -run=FuzzParseBandwidthAnnotation/<sha> ./pkg/cni/config/...`
- The default `-fuzztime=30s` is for the agent feedback loop. For release validation, raise it: `go test -fuzz=Fuzz... -fuzztime=1h`.
- The fuzz job has `-test.timeout=2m` to give the GH runner wind-down headroom. Without it, slow runners hit "context deadline exceeded" on the last in-flight iteration after fuzztime fires.
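For orientation, a file under `testdata/fuzz/<FuzzName>/` uses Go's text corpus encoding: a `go test fuzz v1` header followed by one Go literal per fuzz argument. The sketch below renders such an entry under the assumption that the fuzz target takes a single `string` argument; the real signature of `FuzzParseBandwidthAnnotation` is not shown in this document, and `corpusEntry` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
)

// corpusEntry (hypothetical helper) renders one corpus file body for a
// fuzz target that takes a single string argument.
func corpusEntry(input string) string {
	return "go test fuzz v1\nstring(" + strconv.Quote(input) + ")\n"
}

func main() {
	// "10M" mirrors the annotation value used elsewhere in this document.
	fmt.Print(corpusEntry("10M"))
}
```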
- Multi-kernel matrix. The lvh image registry was unreliable (`manifest unknown` on the kernel tags). L3 and L5 currently run against the runner's host kernel only.
- Real-veth in L3. `BPF_PROG_RUN`'s ~3,772 B input cap rules out jumbo behavior at the BPF unit level. Real-veth coverage is currently in L4 only.
- Single-kernel L4/L5 topology (k3d); cross-kernel now covered by vm-rig. k3d's "nodes" are containers sharing one Linux kernel and one docker bridge. The cross-kernel signal is provided by the lima-based vm-rig (two VMs, two real kernels, real inter-VM vmnet wire): `make test-vm` for the natra throttle + fast-pass assertions, `make perf-vs-vanilla-vm` for the baseline/natra/upstream comparison (fresh cluster per phase). Both pass; results are in `docs/perf-vs-vanilla.md` under "Two-kernel (vm-rig) results". Still not reached: real hardware NICs, switch queueing, real cross-AZ latency. See `docs/test-environments.md` for the cloud-VM / bare-metal escalation.
- Cilium / AWS NPA coexistence. natra composes via bpf_mprog at the TCX hook by construction; no end-to-end rig with a loaded cilium / NPA cluster has been run yet. Validation needs a real EKS-or-similar cluster.
- `linux/arm64` in CI. Local Mac dev runs arm64; CI runs amd64.
- Bystander cost from EDT preservation. Resolved by 273a99f: bounded EDT delay at 50 ms, fall through to ECN-mark above. Measured on perf-vs-vanilla Workload 2: bystander p99 61 → 27 ms, annotated mice p99 69 → 28 ms, egress 9.16 → 10.03 Mbps (closer to cap, not under). Egress stays in the 5% envelope.