Commit 9b28de2

Author: Simon Morley
feat(bpf): add runtime XDP/TC health checks before each scan
Long-lived processes (e.g. limpet-timing) create Engine once at startup and reuse it for days. If XDP/TC detach after init (interface bounce, admin removal, another XDP program), the BPF map silently returns None for everything and all ports become "Filtered". This adds ~5ms pre-scan verification via `ip link show` + `tc filter show` to fail hard instead of producing silently wrong results.

- Add `BpfTimingCollector::verify_attached()` wrapping existing verify fns
- Add pre-scan health check in `ScanEngine::discover_bpf()`
- Add pre-timing health check in `collect_timing_samples_raw()`
- Fix README: remove stale userspace fallback references
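To make the failure mode concrete, here is a minimal sketch. It is not limpet's actual code: `PortState`, `classify`, `scan`, and the `Option<u64>` standing in for a BPF map lookup are all hypothetical. It illustrates why a detached program can look exactly like a filtered host when a missing map entry is read as "no reply", and how an up-front health check turns that into an explicit error.

```rust
/// Hypothetical port states; limpet's real enum may differ.
#[derive(Debug, PartialEq)]
enum PortState {
    Open,
    Filtered,
}

/// Hypothetical classifier: `rtt_ns` stands in for a BPF map lookup.
/// `None` means the kernel side recorded no reply for this port.
fn classify(rtt_ns: Option<u64>) -> PortState {
    match rtt_ns {
        Some(_) => PortState::Open,
        None => PortState::Filtered,
    }
}

/// Scan with an up-front health check. `attached` stands in for the
/// result of a `verify_attached()`-style call made before the scan.
fn scan(
    attached: Result<(), String>,
    lookups: &[Option<u64>],
) -> Result<Vec<PortState>, String> {
    // Fail hard instead of classifying lookups that cannot be trusted.
    attached.map_err(|e| format!("BPF health check failed: {e}"))?;
    Ok(lookups.iter().map(|l| classify(*l)).collect())
}
```

Without the check, a detached program would make every lookup `None`, so every port would come back as `Filtered` with no indication that anything was wrong.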
1 parent 868771f commit 9b28de2

4 files changed

Lines changed: 52 additions & 9 deletions

README.md

Lines changed: 9 additions & 9 deletions
@@ -21,7 +21,7 @@ What does TRS stand for? Temporal Resonance Scanner. You heard it here first.
 ## Features
 
 - **SYN scanner** — raw socket sender, no connection established
-- **XDP/BPF timing** — per-packet timestamps from the kernel bypass path; falls back to userspace `gettimeofday` when XDP is unavailable
+- **XDP/BPF timing** — per-packet timestamps from the kernel bypass path; requires Linux with BPF capabilities
 - **Stealth pacing** — configurable inter-packet delay to avoid triggering rate limits
 - **ML-ready output** — timing samples (not just a single RTT), mean/p50/p90 stats, and 64-dim embedding vectors for each port
 - **JSON output** — machine-readable results for pipelines
@@ -34,13 +34,13 @@ What does TRS stand for? Temporal Resonance Scanner. You heard it here first.
 
 | Requirement | Notes |
 |-------------|-------|
-| Linux kernel ≥ 5.11 | For BPF ring buffers; XDP timing degrades gracefully to userspace on older kernels |
+| Linux kernel ≥ 5.11 | For BPF ring buffers; fails hard if BPF unavailable |
 | `NET_RAW` + `NET_ADMIN` capabilities | Required for raw socket SYN scanning |
-| `CAP_BPF` + `CAP_SYS_ADMIN` | Required for XDP/BPF timing (not needed for userspace fallback) |
-| Bare-metal or KVM VM | AF_XDP requires a real NIC driver; Docker Desktop (macOS/Windows) will fall back to userspace timing |
+| `CAP_BPF` + `CAP_SYS_ADMIN` | Required for XDP/BPF timing (no unprivileged fallback) |
+| Bare-metal or KVM VM | AF_XDP requires a real NIC driver; cannot load BPF — limpet will not run |
 | Root or `sudo` | Easiest path; or grant caps with `setcap` |
 
-**Does not work on:** macOS, Windows, Docker Desktop (for BPF features — CLI builds but timing falls back to userspace).
+**Does not work on:** macOS, Windows, Docker Desktop (BPF programs cannot load — scanning unavailable).
 
 ---
 
@@ -259,9 +259,9 @@ curl -X POST http://localhost:8888/v1/timing \
 ## Limitations
 
 **Platform**
-- Linux only. The BPF/XDP path requires kernel ≥ 5.11 for ring buffers. Older kernels fall back to userspace timing automatically.
+- Linux only. The BPF/XDP path requires kernel ≥ 5.11 for ring buffers. Older kernels fail at BPF program load — no fallback.
 - AF_XDP requires a NIC driver with XDP support. Virtio-net (KVM/QEMU) works. VMware vmxnet3 and some cloud hypervisor NICs do not.
-- Docker Desktop on macOS/Windows: the CLI builds and runs but the BPF programs cannot load — timing falls back to userspace.
+- Docker Desktop on macOS/Windows: the CLI builds but the BPF programs cannot load — limpet will exit with an error.
 
 **Scanning**
 - **No service detection** — limpet identifies open ports and collects RTT samples. The `banner` field contains raw bytes from the server's first response packet, but there is no protocol parsing.
@@ -272,7 +272,7 @@ curl -X POST http://localhost:8888/v1/timing \
 
 **Timing precision**
 - XDP timestamps are recorded at NIC receive time, not in application code. This gives you real wire latency including NIC driver overhead, not software scheduling jitter.
-- Userspace fallback (`precision_class: "userspace"`) has ±50–200µs jitter under load — accurate enough for coarse fingerprinting, not for sub-millisecond jitter analysis.
+- No userspace timing fallback — all timing uses XDP kernel timestamps.
 - RTT samples include the full TCP handshake (SYN → SYN-ACK). This is intentional: handshake latency is the fingerprinting signal.
 
 **Permissions**
@@ -304,7 +304,7 @@ limpet/
 │   │   └── mod.rs
 │   └── timing/
 │       ├── xdp.rs          # BpfTimingCollector — XDP kernel-bypass timing
-│       ├── userspace.rs    # Fallback: connect(2) + gettimeofday timing
+│       ├── userspace.rs    # Raw SYN probe timing collection (BPF-backed)
 │       ├── embeddings.rs   # 64-dim feature vector extraction
 │       ├── stats.rs        # Mean / std / percentile helpers
 │       └── mod.rs

src/engine.rs

Lines changed: 19 additions & 0 deletions
@@ -282,6 +282,25 @@ impl ScanEngine {
         ports: &[u16],
         start: Instant,
     ) -> ScanResult {
+        // Pre-scan BPF health check: verify XDP + TC are still attached.
+        // Catches detachment between engine creation and scan execution —
+        // the critical gap for long-lived processes (e.g. limpet-timing).
+        {
+            let bpf_guard = self.collector.lock().await;
+            if let Err(e) = bpf_guard.verify_attached() {
+                return ScanResult {
+                    request_id: request.request_id,
+                    target_ip,
+                    target_hostname,
+                    ports: vec![],
+                    duration_ms: start.elapsed().as_millis() as u64,
+                    backend: self.backend.as_str().to_string(),
+                    scanned_at: Utc::now(),
+                    error: Some(format!("BPF health check failed: {e}")),
+                };
+            }
+        }
+
         let batch_size = request.pacing.batch_size();
         let timeout_ms = request.timeout_ms;
         let timeout = Duration::from_millis(timeout_ms as u64);
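
The health check in this hunk sits in its own `{ ... }` block, which drops the lock guard before the scan proper begins. Below is a minimal sketch of that scoping, using `std::sync::Mutex` as a stand-in for the async mutex in the diff. `Collector`, its `attached` field, and `run_scan` are hypothetical names used only for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `BpfTimingCollector`.
struct Collector {
    attached: bool,
}

impl Collector {
    fn verify_attached(&self) -> Result<(), String> {
        if self.attached {
            Ok(())
        } else {
            Err("XDP program not attached".to_string())
        }
    }
}

/// Runs the health check in its own scope, then does the "scan" work.
fn run_scan(collector: &Arc<Mutex<Collector>>) -> Result<usize, String> {
    {
        // The guard lives only inside this block.
        let guard = collector.lock().map_err(|_| "lock poisoned".to_string())?;
        guard
            .verify_attached()
            .map_err(|e| format!("BPF health check failed: {e}"))?;
    } // guard dropped here, so the lock is free again

    // Later code can take the lock again without blocking on the check above.
    let guard = collector.lock().map_err(|_| "lock poisoned".to_string())?;
    Ok(if guard.attached { 1 } else { 0 })
}
```

If the first guard were still alive at the second `lock()` call, a non-reentrant mutex would not make progress on the same thread, which is one reason to keep the check's guard tightly scoped.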

src/timing/userspace.rs

Lines changed: 11 additions & 0 deletions
@@ -89,6 +89,17 @@ pub async fn collect_timing_samples_raw(
     bpf: Arc<Mutex<BpfTimingCollector>>,
     scanner: Arc<Mutex<SynScanner>>,
 ) -> TimingResult {
+    // Pre-timing BPF health check: verify XDP + TC are still attached.
+    {
+        let bpf_ref = bpf.lock().await;
+        if let Err(e) = bpf_ref.verify_attached() {
+            return TimingResult::error(
+                request,
+                format!("BPF health check failed: {e}"),
+            );
+        }
+    }
+
     let addr = match resolve_address(&request.target_host, request.target_port) {
         Ok(addr) => addr,
         Err(e) => return TimingResult::error(request, format!("DNS resolution failed: {}", e)),

src/timing/xdp.rs

Lines changed: 13 additions & 0 deletions
@@ -351,6 +351,19 @@ impl BpfTimingCollector {
     }
 }
 
+impl BpfTimingCollector {
+    /// Verify both XDP ingress and TC egress are still attached.
+    ///
+    /// Runs `ip link show` + `tc filter show` (~5ms overhead). Use this as a
+    /// pre-scan health check for long-lived processes where BPF programs may
+    /// detach after initial setup (interface bounce, admin removal, etc.).
+    pub fn verify_attached(&self) -> Result<(), BpfTimingError> {
+        verify_xdp_attached(&self.interface)?;
+        verify_tc_attached(&self.interface)?;
+        Ok(())
+    }
+}
+
 impl Drop for BpfTimingCollector {
     fn drop(&mut self) {
         // TC hooks persist after userspace exits, so we must explicitly detach
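
The existing `verify_xdp_attached` and `verify_tc_attached` functions are not shown in this diff. As a rough sketch of what the parsing half of such checks might look like, the code below assumes that `ip link show` prints a `prog/xdp` entry when an XDP program is attached and that `tc filter show` lists BPF filters with the filter kind `bpf`. The function names, the `AttachError` type, and the sample strings are hypothetical. Running the commands themselves (for example via `std::process::Command`) is left out.

```rust
/// Hypothetical error type; the real `BpfTimingError` is not shown in the diff.
#[derive(Debug, PartialEq)]
enum AttachError {
    XdpMissing,
    TcMissing,
}

/// Assumption: `ip link show dev <if>` output contains `prog/xdp`
/// when an XDP program is attached to the interface.
fn xdp_attached(ip_link_output: &str) -> bool {
    ip_link_output.contains("prog/xdp")
}

/// Assumption: `tc filter show dev <if> egress` lists BPF filters with
/// `bpf` as a whitespace-separated token on the filter line.
fn tc_attached(tc_filter_output: &str) -> bool {
    tc_filter_output
        .lines()
        .any(|line| line.split_whitespace().any(|tok| tok == "bpf"))
}

/// Mirrors the shape of `verify_attached()`: both checks must pass,
/// and the XDP check is evaluated first.
fn verify_outputs(ip_link: &str, tc_filter: &str) -> Result<(), AttachError> {
    if !xdp_attached(ip_link) {
        return Err(AttachError::XdpMissing);
    }
    if !tc_attached(tc_filter) {
        return Err(AttachError::TcMissing);
    }
    Ok(())
}
```

Keeping the parsing separate from the command execution makes the check testable against captured output without needing root or a real interface.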
