Skip to content

Commit c122d21

Browse files
committed
Add request-backed cgroups lookup builders
1 parent 44a5bef commit c122d21

17 files changed

Lines changed: 529 additions & 133 deletions

File tree

.agents/sow/current/SOW-0021-20260613-netipc-at-scale.md

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1187,6 +1187,116 @@ Current interpretation:
11871187
- Go remains materially slower than C/Rust in these codec+dispatch microbenchmarks. The obvious avoidable algorithmic and allocation issues found so far are fixed, but the remaining gap still needs full-suite benchmark confirmation and, if required, profiling before declaring it inherent Go/runtime overhead.
11881188
- This checkpoint does not close the performance gate. Full POSIX and Windows benchmark regeneration still needs to be run cleanly after commit/push.
11891189

1190+
## 8192-Item Profiling Checkpoint - 2026-06-15
1191+
1192+
Context:
1193+
1194+
- Netdata can call lookup APIs with arrays of `8192` entries on large HPC systems.
1195+
- The `8192` case must be treated as normal scale, not as an exceptional or degraded path.
1196+
- The benchmarked loop is relevant to Level 2 consumers: it encodes the request, dispatches through the typed builder, and decodes/validates the response.
1197+
1198+
Profiling evidence before this checkpoint's additional fixes:
1199+
1200+
- `perf stat` on `build-bench-posix/bin/bench_posix_go lookup-method-bench 5 cgroups-lookup-mixed-8192 0` reported `cgroups-lookup-mixed-8192,go,go,1185,...,99.0` and about `99%` CPU. This is CPU-bound, not wait-bound.
1201+
- Isolated Go pprof benchmark before the allocation fix:
1202+
- `BenchmarkProfileCgroupsLookupMixed8192`: about `967462 ns/op`, `73920 B/op`, `4 allocs/op`.
1203+
- `BenchmarkProfileAppsLookupMixed8192`: about `782743 ns/op`, `73920 B/op`, `4 allocs/op`.
1204+
- Go memory profiles showed `makePayloadExceededSuffixBytes` accounting for about `97%-99%` of allocated bytes in both apps and cgroups lookup dispatch.
1205+
- Go CPU profiles showed the remaining cgroups cost concentrated in:
1206+
- `CgroupsLookupBuilder.Add`, about `40%` cumulative;
1207+
- string/NUL scanning through `bytes.IndexByte`/`indexbytebody`;
1208+
- response decode validation, about `25%` cumulative;
1209+
- defensive checked arithmetic and directory/payload slice validation.
1210+
- Go CPU profiles showed the remaining apps cost concentrated in:
1211+
- `AppsLookupBuilder.Add`, about `47%` cumulative;
1212+
- `validateAppsLookupItem`, about `29%` cumulative;
1213+
- `validateAppsLookupSemantics`, about `13%` cumulative before the safe status-switch cleanup.
1214+
1215+
Fixes made in this checkpoint:
1216+
1217+
- Go:
1218+
- APPS_LOOKUP dispatch-owned overflow suffix fitting now uses a fixed-size formula instead of allocating an item-count suffix table.
1219+
- CGROUPS_LOOKUP suffix bytes now use `uint32`, matching the protocol/C representation and halving the table size on 64-bit Go.
1220+
- APPS_LOOKUP semantic validation now uses a status switch while preserving the existing invalid cgroup-status rejection behavior.
1221+
- Rust:
1222+
- APPS_LOOKUP dispatch-owned overflow suffix fitting now uses a fixed-size formula instead of allocating a suffix vector.
1223+
- Lookup suffix bytes now use `u32` instead of `usize`, matching the protocol/C representation.
1224+
1225+
Validation evidence:
1226+
1227+
- `cd src/go && go test -count=1 ./pkg/netipc/protocol` passed.
1228+
- `cargo test --manifest-path src/crates/netipc/Cargo.toml protocol::lookup -- --test-threads=1` passed.
1229+
- `cd src/go && go test -count=1 ./pkg/netipc/...` passed.
1230+
- `cargo test --manifest-path src/crates/netipc/Cargo.toml -- --test-threads=1` passed with `374 passed`.
1231+
- `cmake --build build --target test_protocol -j24 && build/bin/test_protocol` passed with `514 passed, 0 failed`.
1232+
- Isolated Go benchmark after fixes:
1233+
- `BenchmarkProfileCgroupsLookupMixed8192`: about `883683 ns/op`, `41152 B/op`, `4 allocs/op`.
1234+
- `BenchmarkProfileAppsLookupMixed8192`: about `689551 ns/op`, `208 B/op`, `3 allocs/op`.
1235+
- Rebuilt release benchmark sample after fixes:
1236+
- `cgroups-lookup-mixed-8192`: C about `1917`, Rust about `1426`, Go about `1081` requests/s.
1237+
- `apps-lookup-mixed-8192`: C about `2174`, Rust about `1928`, Go about `1422` requests/s.
1238+
1239+
Current interpretation:
1240+
1241+
- The avoidable Go allocation problem is fixed for APPS_LOOKUP and materially reduced for CGROUPS_LOOKUP.
1242+
- Rust now matches the protocol-sized suffix representation, but its `8192` cgroups gap remains CPU-bound rather than allocation-bound in the sampled benchmark.
1243+
- Go remains materially slower than C/Rust at `8192`, especially for CGROUPS_LOOKUP.
1244+
- The main remaining cgroups-specific Go cost is duplicated string validation: request decode validates each path, then handlers commonly pass the validated `CStringView.Bytes()` back to `builder.Add`, which validates the same path bytes again because the public builder API accepts raw `[]byte`.
1245+
- Avoiding that duplicated cgroups scan safely requires an explicit design/API choice, such as an internal dispatch helper or a public builder path that accepts already-validated `CStringView` data. This was not changed in this checkpoint because silently trusting raw `[]byte` would weaken corruption detection.
1246+
- The performance gate remains open.
1247+
1248+
Decision - validated cgroups builder path:
1249+
1250+
- The duplicated cgroups path scan is a code-organization problem, not a Go-specific runtime issue.
1251+
- The accepted design is:
1252+
- keep the existing raw public builder methods safe and validating;
1253+
- split validation from item layout/wire writing internally;
1254+
- add/use an already-validated cgroups path flow for request-derived paths;
1255+
- preserve corruption detection for raw application input and peer-decoded response data;
1256+
- apply the same organization in Go, Rust, and C.
1257+
- This design must not silently trust arbitrary raw byte slices. Any validated path entry point must be backed by a decoder-produced view or an internal request item path whose provenance is known.
1258+
1259+
Implementation update after the decision:
1260+
1261+
- Added request-backed cgroups response builder entry points in all three SDKs:
1262+
- C: `nipc_cgroups_lookup_builder_add_request_item()`
1263+
- Rust: `CgroupsLookupBuilder::add_request_item()`
1264+
- Go: `CgroupsLookupBuilder.AddRequestItem()`
1265+
- Kept the raw builder entry points validating raw application bytes.
1266+
- Split internal builder logic so item layout/wire writing can be shared while path validation is skipped only for decoded request-view paths.
1267+
- Updated C, Rust, and Go benchmark cgroups handlers to use the request-backed builder path.
1268+
- Added direct C/Rust/Go protocol tests for request-backed builder success and invalid-index rejection.
1269+
- Updated `docs/codec-cgroups-lookup.md` and `docs/netipc-integrator-skill.md` so future cgroups-lookup handlers use the request-backed builder only when echoing decoded request paths.
1270+
1271+
Validation evidence after the request-backed builder update:
1272+
1273+
- `gofmt -w bench/drivers/go/main.go src/go/pkg/netipc/protocol/cgroups_lookup.go src/go/pkg/netipc/protocol/apps_lookup.go src/go/pkg/netipc/protocol/lookup_common.go src/go/pkg/netipc/protocol/lookup_guard_test.go src/go/pkg/netipc/protocol/lookup_test.go` passed.
1274+
- `cargo fmt --manifest-path src/crates/netipc/Cargo.toml --all` passed.
1275+
- `git diff --check` passed.
1276+
- `cmake --build build --target test_protocol -j24 && build/bin/test_protocol` passed with `525 passed, 0 failed`.
1277+
- `cd src/go && go test -count=1 -timeout=300s ./pkg/netipc/protocol` passed.
1278+
- `cargo test --manifest-path src/crates/netipc/Cargo.toml protocol::lookup -- --test-threads=1` passed with `17 passed`.
1279+
- `cd src/go && go test -count=1 -timeout=300s ./pkg/netipc/...` passed.
1280+
- `cargo test --manifest-path src/crates/netipc/Cargo.toml -- --test-threads=1` passed with `375 passed`.
1281+
- `/usr/bin/ctest --test-dir build --output-on-failure` passed with `48/48` tests. The unqualified `ctest` command failed first because the workstation's `~/.local/bin/ctest` Python wrapper could not import `cmake`; the system CTest binary was then used successfully.
1282+
- `cmake --build build-bench-posix --target bench_posix_c bench_posix_go bench_posix_rs -j24` passed.
1283+
1284+
Focused POSIX release benchmark samples after the request-backed builder update:
1285+
1286+
- `build-bench-posix/bin/bench_posix_c lookup-method-bench 10 cgroups-lookup-mixed-8192 0`: `2079 req/s`, p50 `440 us`, p95 `708 us`, p99 `871 us`.
1287+
- `src/crates/netipc/target/release/bench_posix lookup-method-bench 10 cgroups-lookup-mixed-8192 0`: `1655 req/s`, p50 `572 us`, p95 `814 us`, p99 `965 us`.
1288+
- `build-bench-posix/bin/bench_posix_go lookup-method-bench 10 cgroups-lookup-mixed-8192 0`: `1204 req/s`, p50 `767 us`, p95 `1261 us`, p99 `1479 us`.
1289+
- `build-bench-posix/bin/bench_posix_c lookup-method-bench 10 apps-lookup-mixed-8192 0`: `2175 req/s`, p50 `424 us`, p95 `655 us`, p99 `782 us`.
1290+
- `src/crates/netipc/target/release/bench_posix lookup-method-bench 10 apps-lookup-mixed-8192 0`: `1802 req/s`, p50 `517 us`, p95 `763 us`, p99 `846 us`.
1291+
- `build-bench-posix/bin/bench_posix_go lookup-method-bench 10 apps-lookup-mixed-8192 0`: `1440 req/s`, p50 `633 us`, p95 `1096 us`, p99 `1190 us`.
1292+
1293+
Current interpretation:
1294+
1295+
- The request-backed cgroups builder removes the unsafe/code-smelly reason for double-validating decoded request paths in C, Rust, and Go.
1296+
- Go cgroups throughput improved in the focused sample and no longer carries the avoidable request-path revalidation penalty in the benchmark handler.
1297+
- Go remains slower than C/Rust in these microbenchmarks. That residual gap is now more likely runtime/allocation/implementation overhead outside the specific duplicated path scan, but this is a working theory until a fresh full benchmark/profile pass is reviewed.
1298+
- These focused samples do not replace full POSIX and Windows benchmark regeneration before SOW close.
1299+
11901300
## Downstream Vendoring Plan
11911301

11921302
Purpose:

bench/drivers/c/bench_posix.c

Lines changed: 5 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1136,20 +1136,15 @@ static bool cgroups_lookup_bench_handler(void *user,
11361136
};
11371137

11381138
for (uint32_t i = 0; i < request->item_count; i++) {
1139-
nipc_cgroups_lookup_req_item_t item;
1140-
if (nipc_cgroups_lookup_req_item(request, i, &item) != NIPC_OK)
1141-
return false;
1142-
11431139
if (lookup_variant_is_known(ctx->variant, i)) {
1144-
if (nipc_cgroups_lookup_builder_add(
1145-
builder, NIPC_CGROUP_LOOKUP_KNOWN, NIPC_ORCHESTRATOR_K8S,
1146-
item.path.ptr, item.path.len,
1140+
if (nipc_cgroups_lookup_builder_add_request_item(
1141+
builder, request, i, NIPC_CGROUP_LOOKUP_KNOWN,
1142+
NIPC_ORCHESTRATOR_K8S,
11471143
"bench-pod", 9, labels, 2) != NIPC_OK)
11481144
return false;
11491145
} else {
1150-
if (nipc_cgroups_lookup_builder_add(
1151-
builder, NIPC_CGROUP_LOOKUP_UNKNOWN_RETRY_LATER, 0,
1152-
item.path.ptr, item.path.len,
1146+
if (nipc_cgroups_lookup_builder_add_request_item(
1147+
builder, request, i, NIPC_CGROUP_LOOKUP_UNKNOWN_RETRY_LATER, 0,
11531148
"", 0, NULL, 0) != NIPC_OK)
11541149
return false;
11551150
}

bench/drivers/go/main.go

Lines changed: 6 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1421,15 +1421,12 @@ func runLookupMethodBench(durationSec int, scenario string, targetRPS uint64) in
14211421
if err == nil {
14221422
respLen, err = protocol.DispatchCgroupsLookup(reqBuf[:reqLen], respBuf, func(req *protocol.CgroupsLookupRequestView, builder *protocol.CgroupsLookupBuilder) bool {
14231423
for i := uint32(0); i < req.ItemCount; i++ {
1424-
path, ierr := req.Item(i)
1425-
if ierr != nil {
1426-
return false
1427-
}
14281424
if lookupVariantIsKnown(variant, int(i)) {
1429-
if berr := builder.Add(
1425+
if berr := builder.AddRequestItem(
1426+
req,
1427+
i,
14301428
protocol.CgroupLookupKnown,
14311429
protocol.OrchestratorK8s,
1432-
path.Bytes(),
14331430
[]byte("bench-pod"),
14341431
[]struct{ Key, Value []byte }{
14351432
{Key: []byte("namespace"), Value: []byte("bench")},
@@ -1438,10 +1435,11 @@ func runLookupMethodBench(durationSec int, scenario string, targetRPS uint64) in
14381435
); berr != nil {
14391436
return false
14401437
}
1441-
} else if berr := builder.Add(
1438+
} else if berr := builder.AddRequestItem(
1439+
req,
1440+
i,
14421441
protocol.CgroupLookupUnknownRetryLater,
14431442
0,
1444-
path.Bytes(),
14451443
nil,
14461444
nil,
14471445
); berr != nil {

bench/drivers/rust/src/main.rs

Lines changed: 6 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1540,19 +1540,17 @@ mod posix_only {
15401540
&mut resp_buf,
15411541
|req, builder: &mut CgroupsLookupBuilder| {
15421542
for i in 0..req.item_count {
1543-
let Ok(path) = req.item(i) else {
1544-
return false;
1545-
};
15461543
if lookup_variant_is_known(variant, i as usize) {
15471544
let labels: [(&[u8], &[u8]); 2] = [
15481545
(&b"namespace"[..], &b"bench"[..]),
15491546
(&b"image"[..], &b"bench:latest"[..]),
15501547
];
15511548
if builder
1552-
.add(
1549+
.add_request_item(
1550+
req,
1551+
i,
15531552
CGROUP_LOOKUP_KNOWN,
15541553
ORCHESTRATOR_K8S,
1555-
path.as_bytes(),
15561554
b"bench-pod",
15571555
&labels,
15581556
)
@@ -1561,10 +1559,11 @@ mod posix_only {
15611559
return false;
15621560
}
15631561
} else if builder
1564-
.add(
1562+
.add_request_item(
1563+
req,
1564+
i,
15651565
CGROUP_LOOKUP_UNKNOWN_RETRY_LATER,
15661566
0,
1567-
path.as_bytes(),
15681567
b"",
15691568
&[],
15701569
)

docs/codec-cgroups-lookup.md

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -238,6 +238,24 @@ The wire decoder validates structure. The typed client must also verify:
238238
An echoed-path mismatch is a server bug and the typed client must fail
239239
the response.
240240

241+
## Response Builder Guidance
242+
243+
Servers often echo the request path into each response item. In that case,
244+
prefer the request-backed response builder entry point in each language:
245+
246+
- C: `nipc_cgroups_lookup_builder_add_request_item()`
247+
- Rust: `CgroupsLookupBuilder::add_request_item()`
248+
- Go: `CgroupsLookupBuilder.AddRequestItem()`
249+
250+
These APIs read the path from a decoded `CGROUPS_LOOKUP` request view. The
251+
request decoder has already validated the path string, so the builder can
252+
avoid scanning the same path bytes again while still validating all raw
253+
application-owned fields such as the returned name and labels.
254+
255+
Use the raw builder `add` / `Add` entry point when the response path did not
256+
come from the decoded request view. Raw entry points must continue validating
257+
the supplied bytes.
258+
241259
## Security Considerations
242260

243261
Request paths are opaque lookup keys. A server must not resolve,

docs/netipc-integrator-skill.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -269,6 +269,18 @@ Wire-format footguns for lookup methods:
269269
- C code must use explicit wire-size constants; the 60-byte
270270
`APPS_LOOKUP` item header naturally pads to 64 bytes as a C struct
271271

272+
Lookup handler builder rule:
273+
274+
- When a `cgroups-lookup` handler echoes a path from the decoded request into
275+
the response, use the request-backed builder API:
276+
- C: `nipc_cgroups_lookup_builder_add_request_item()`
277+
- Rust: `CgroupsLookupBuilder::add_request_item()`
278+
- Go: `CgroupsLookupBuilder.AddRequestItem()`
279+
- Use the raw `add` / `Add` APIs only for application-owned path bytes that did
280+
not come from the decoded request view.
281+
- This preserves corruption checks for raw inputs while avoiding a second scan
282+
of already-decoded request paths in hot lookup paths.
283+
272284
## Netdata Integration Model
273285

274286
The intended Netdata pattern is:

src/crates/netipc/src/protocol/lookup/apps_lookup.rs

Lines changed: 28 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -446,7 +446,8 @@ pub struct AppsLookupBuilder<'a> {
446446
data_offset: usize,
447447
error: Option<NipcError>,
448448
payload_exceeded_suffix: bool,
449-
payload_exceeded_suffix_bytes: Vec<usize>,
449+
payload_exceeded_suffix_bytes: Vec<u32>,
450+
payload_exceeded_fixed_item_len: Option<usize>,
450451
}
451452

452453
impl<'a> AppsLookupBuilder<'a> {
@@ -468,6 +469,7 @@ impl<'a> AppsLookupBuilder<'a> {
468469
error: None,
469470
payload_exceeded_suffix: false,
470471
payload_exceeded_suffix_bytes: Vec::new(),
472+
payload_exceeded_fixed_item_len: None,
471473
}
472474
}
473475

@@ -477,7 +479,10 @@ impl<'a> AppsLookupBuilder<'a> {
477479

478480
pub fn set_payload_exceeded_item_lens(&mut self, item_lens: Vec<usize>) {
479481
match payload_exceeded_suffix_bytes_from_lens(item_lens) {
480-
Ok(suffix_bytes) => self.payload_exceeded_suffix_bytes = suffix_bytes,
482+
Ok(suffix_bytes) => {
483+
self.payload_exceeded_suffix_bytes = suffix_bytes;
484+
self.payload_exceeded_fixed_item_len = None;
485+
}
481486
Err(err) => self.error = Some(err),
482487
}
483488
}
@@ -575,13 +580,7 @@ impl<'a> AppsLookupBuilder<'a> {
575580
Some(v) if v <= self.buf.len() => v,
576581
_ => return self.note_item_overflow(pid),
577582
};
578-
if !payload_exceeded_suffix_fits(
579-
self.buf.len(),
580-
item_end,
581-
&self.payload_exceeded_suffix_bytes,
582-
self.item_count + 1,
583-
self.max_items,
584-
) {
583+
if !self.payload_exceeded_suffix_fits(item_end) {
585584
return self.note_item_overflow(pid);
586585
}
587586
if item_start > self.data_offset {
@@ -648,6 +647,25 @@ impl<'a> AppsLookupBuilder<'a> {
648647
Ok(())
649648
}
650649

650+
fn payload_exceeded_suffix_fits(&self, data_offset: usize) -> bool {
651+
if let Some(item_len) = self.payload_exceeded_fixed_item_len {
652+
return payload_exceeded_fixed_suffix_fits(
653+
self.buf.len(),
654+
data_offset,
655+
item_len,
656+
self.item_count + 1,
657+
self.max_items,
658+
);
659+
}
660+
payload_exceeded_suffix_fits(
661+
self.buf.len(),
662+
data_offset,
663+
&self.payload_exceeded_suffix_bytes,
664+
self.item_count + 1,
665+
self.max_items,
666+
)
667+
}
668+
651669
fn add_non_known_item(&mut self, status: u16, pid: u32) -> Result<(), NipcError> {
652670
if self.item_count >= self.max_items {
653671
self.error = Some(NipcError::Overflow);
@@ -779,14 +797,7 @@ where
779797
}
780798
let mut builder = AppsLookupBuilder::new(resp, request.item_count, 0);
781799
if request.item_count > 0 {
782-
let item_count = request.item_count as usize;
783-
let capacity = item_count.checked_add(1).ok_or(NipcError::Overflow)?;
784-
let mut item_lens = Vec::with_capacity(capacity);
785-
item_lens.resize(item_count, APPS_LOOKUP_ITEM_HDR_SIZE + 3);
786-
builder.set_payload_exceeded_item_lens(item_lens);
787-
if let Some(err) = builder.error {
788-
return Err(err);
789-
}
800+
builder.payload_exceeded_fixed_item_len = Some(APPS_LOOKUP_ITEM_HDR_SIZE + 3);
790801
}
791802
if !handler(&request, &mut builder) {
792803
return Err(builder.error().unwrap_or(NipcError::HandlerFailed));

0 commit comments

Comments
 (0)