ci: run full barbacane-test integration suite (repair rotted tests + fix host-I/O epoch trap) #98

Merged: ndreno merged 8 commits into main from ci/broaden-integration-binaries on Jul 3, 2026
@ndreno ndreno commented Jul 2, 2026


What

Broadens CI to run the entire barbacane-test integration suite (every tests/*.rs binary), not just the crate's lib tests, and repairs the tests this exposed. The suites had been excluded from CI, so they had quietly rotted. Fixing them also surfaced one real data-plane bug.

CI wiring

  • Integration Tests job now runs the lib tests plus every integration binary, auto-discovered from tests/*.rs (the security target keeps its own Postgres job).
  • Added --no-fail-fast so one failing binary doesn't hide failures in the binaries scheduled after it.
  • Installed the wasm32-unknown-unknown target on the job so build.rs can compile the fixture WASM plugins (they were failing with exit 101 and silently skipping the streaming/body suites).
  • Widened the TestGateway readiness window to 120s: under the full suite, two CEL-heavy gateways can cold-boot at once on a shared runner and the loser exceeded the old 60s window.
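The target discovery described above can be sketched in Rust as follows. This is an illustration, not the workflow's actual script: the helper name `integration_test_args` and the idea of passing file names in as a slice are assumptions; only the `cargo test` flags (`-p`, `--lib`, `--test`, `--no-fail-fast`) are real.

```rust
/// Hypothetical helper: build `cargo test` arguments that run the lib tests
/// plus one `--test <name>` per discovered `tests/*.rs` file, skipping any
/// excluded target (for example `security`, which has its own job).
fn integration_test_args(test_files: &[&str], excluded: &[&str]) -> Vec<String> {
    // Keep only `*.rs` files and turn `proxy.rs` into the target name `proxy`.
    let mut targets: Vec<&str> = test_files
        .iter()
        .filter_map(|file| file.strip_suffix(".rs"))
        .filter(|target| !excluded.contains(target))
        .collect();
    // Sort so the generated command is stable from run to run.
    targets.sort_unstable();

    let mut args: Vec<String> = vec![
        "test".into(),
        "-p".into(),
        "barbacane-test".into(),
        // Report failures from every binary, not only the first one.
        "--no-fail-fast".into(),
        "--lib".into(),
    ];
    for target in targets {
        args.push("--test".into());
        args.push(target.to_string());
    }
    args
}
```

Because targets come from the directory listing rather than a fixed list, a new `tests/foo.rs` would be picked up without touching the CI configuration.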

Test repairs (21 tests across 5 binaries)

All were test-side issues, and several confirm the security hardening is doing its job:

  • auth JWT (5): the tests used a fake signature and relied on skip_signature_validation, which the production WASM correctly ignores (it is honored only under the plugin's own cfg(test)). They now sign tokens with a real ES256 key (new p256 dev-dep) and embed the matching public JWK in a generated spec.
  • auth secrets (1): file:// secrets now require BARBACANE_SECRETS_DIR (fail-closed). Injected via a new TestGateway::from_spec_with_env, which sets per-instance child env vars instead of racing on process globals.
  • mcp (1): a non-initialize method now needs a session; the test initializes first.
  • proxy (3): the tests hit the real httpbin.org; they now point at a local wiremock so they are hermetic and fast.
  • plugins/redirect (5): reqwest follows redirects by default, turning the 3xx into a followed 404/200; now uses a non-redirect client.
  • plugins/ip-restriction (2): X-Forwarded-For is only trusted from declared trusted_proxies (anti-spoofing default); the fixture now declares loopback so the XFF cases are exercised.
  • streaming (3): the streaming-echo fixture declared capabilities with the wrong syntax (http_stream = true) instead of host_functions = ["http_call"], and its plugin.toml was not beside the wasm where the compiler reads it. Fixed the manifest and made build.rs copy plugin.toml next to the built fixture wasm.
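The per-instance environment idea behind TestGateway::from_spec_with_env can be illustrated with the standard library alone. This is a minimal sketch under assumptions: `gateway_command`, the program name, and the single spec-path argument are hypothetical and do not reflect the real TestGateway internals or the gateway's CLI.

```rust
use std::process::Command;

/// Hypothetical helper: prepare a child-process command whose environment
/// additions apply to that child only. Nothing is written to the parent
/// process environment, so parallel tests cannot race on a shared global.
fn gateway_command(program: &str, spec_path: &str, env: &[(&str, &str)]) -> Command {
    let mut cmd = Command::new(program);
    // Assumption: the child takes the spec path as its only argument.
    cmd.arg(spec_path);
    // `Command::envs` records the variables on this command alone.
    cmd.envs(env.iter().copied());
    cmd
}
```

Two gateways started this way can each receive a different BARBACANE_SECRETS_DIR, which calling `std::env::set_var` in the test process could not guarantee under parallel test threads.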

Data-plane fix (the one real bug)

The plugins/kafka broker-unavailable test returned 500, while NATS returned 502 for the same scenario. Root cause: the WASM epoch deadline (max_execution_ms, default 100ms) is meant to bound plugin CPU time, but the epoch clock keeps advancing while a plugin is blocked in a host function doing network I/O. A slow or unavailable upstream therefore made the plugin resume past its deadline and trap (500). Kafka's connect retries for ~5s (blowing the 100ms budget), whereas NATS fails fast.

Fix: store max_execution_ms in PluginState and refresh the store's epoch deadline once the blocking call returns, in host_kafka_publish, host_nats_publish, host_http_call and host_http_stream. Time spent waiting on native I/O no longer counts against the plugin's CPU budget, so a slow/unreachable upstream yields a clean error instead of a trap. The CPU guard on actual WASM execution is unchanged.
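The effect of refreshing the deadline can be shown with a dependency-free toy model. It does not use the real WASM runtime: `EpochClock`, `refresh_deadline`, `host_publish`, and `dispatch` are hypothetical names, one epoch tick is assumed to equal 1ms, and the refresh is assumed to grant a full fresh max_execution_ms window (the source does not say whether the real fix grants the full or the remaining budget).

```rust
/// Toy stand-in for the engine-wide epoch counter (assumption: 1 tick = 1 ms).
struct EpochClock {
    now_ms: u64,
}

/// Toy plugin state: remembers the budget so the deadline can be re-armed.
struct PluginState {
    max_execution_ms: u64,
    deadline_ms: u64,
}

impl PluginState {
    fn new(clock: &EpochClock, max_execution_ms: u64) -> Self {
        PluginState { max_execution_ms, deadline_ms: clock.now_ms + max_execution_ms }
    }

    /// Re-arm the deadline relative to the current epoch.
    fn refresh_deadline(&mut self, clock: &EpochClock) {
        self.deadline_ms = clock.now_ms + self.max_execution_ms;
    }

    /// Models the epoch check performed when guest code runs.
    fn check(&self, clock: &EpochClock) -> Result<(), &'static str> {
        if clock.now_ms > self.deadline_ms {
            Err("epoch deadline exceeded (trap)")
        } else {
            Ok(())
        }
    }
}

/// Models a blocking host function: the clock advances while the guest waits.
fn host_publish(state: &mut PluginState, clock: &mut EpochClock, io_wait_ms: u64, refresh_after_io: bool) {
    clock.now_ms += io_wait_ms;
    if refresh_after_io {
        state.refresh_deadline(clock);
    }
}

/// One dispatch: a little guest work, one host call, a little more guest work.
fn dispatch(io_wait_ms: u64, refresh_after_io: bool) -> Result<(), &'static str> {
    let mut clock = EpochClock { now_ms: 0 };
    let mut state = PluginState::new(&clock, 100);
    clock.now_ms += 1; // guest CPU time before the host call
    state.check(&clock)?;
    host_publish(&mut state, &mut clock, io_wait_ms, refresh_after_io);
    clock.now_ms += 1; // guest CPU time after the host call returns
    state.check(&clock)
}
```

In this model `dispatch(5000, false)` traps, matching the slow Kafka connect, while `dispatch(5000, true)` completes; a fast failure such as `dispatch(10, false)` never reaches the deadline either way, which matches the NATS behavior.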

Verification

All 21 previously-failing tests pass, verified locally one binary at a time (no sweeps): plugins 59, auth 13, proxy 5, mcp, streaming 3, plus barbacane-wasm unit 187. Clippy clean on the modified library crates.

ndreno added 7 commits July 2, 2026 17:35
The Integration Tests job only ran the crate's lib tests (cargo test
-p barbacane-test --lib), leaving every tests/*.rs integration binary
(proxy, plugins, streaming, validation, workload, routing, auth, cache,
etc.) out of CI. Those suites drifted before (stale assertions surfaced
in #95/#97 once they finally ran).

Broaden the job to compile the crate once and run the lib tests plus all
integration binaries. The security target is excluded here (it needs
PostgreSQL + the control-plane binary and has its own security-suite
job). Targets are discovered from tests/*.rs rather than hard-coded, so
new suites are picked up automatically.

No new services required: the kafka/nats tests are broker-unavailable
negative tests, and the only Postgres-dependent suite is security.

…t window

Two issues surfaced once the full barbacane-test suite ran in CI:

1. The Integration Tests job installed the toolchain without the
   wasm32-unknown-unknown target, so barbacane-test's build.rs could not
   compile the streaming-echo/body-echo fixture WASM plugins (exit 101).
   The streaming/body suites then had no real coverage (they panic on the
   missing .wasm, or lean on a stale cached artifact). Add the target so
   the fixtures build, matching the build and security-suite jobs.

2. ai_gateway::cel_selected_strict_profile_blocks_prompt failed twice with
   StartupFailed("gateway did not become ready in time"). It boots the
   same spec as a sibling test that passes, so it is not a spec/code bug:
   with --test-threads=2 two CEL-heavy gateways cold-boot at once on a
   shared CI runner and the loser exceeds the 60s health window. Widen the
   TestGateway readiness timeout to 120s; a real boot hang still fails.

The shipped-fragment test's /v1/models step exercised the real
api.openai.com / api.anthropic.com hosts (the fragment omits base_url for
those providers). Locally they fast-fail (401) so the aggregator returns
the expected partial 200; in CI they hang, and the accumulated wall-clock
trips the ai-proxy plugin's epoch deadline (added in the WASM sandbox
hardening), so the dispatcher traps and the gateway returns 502.

Redirect all three providers at the wiremock by injecting
base_url: env://<PROVIDER>_BASE_URL into the copied fragment's
OpenAI/Anthropic routes, and mock GET /v1/models -> 500 so the partial
path fires deterministically and fast. No outbound network, no epoch
pressure. The partial-aggregation contract is also covered by the
dedicated hermetic models tests.

cargo test stops at the first test binary that fails, which hides
failures in every binary scheduled after it. Broadening CI surfaced one
new failing binary per run (ai_gateway -> ai_proxy -> auth) instead of
all at once. Run with --no-fail-fast so a single CI run reports the full
set of failures across all integration binaries.

The WASM epoch deadline (max_execution_ms, default 100ms) bounds plugin
CPU time, but the epoch clock keeps advancing while a plugin is blocked
in a host function doing network I/O (broker publish, HTTP call/stream).
A slow or unavailable upstream therefore made the plugin resume past its
deadline and trap, surfacing as a 500 instead of a clean gateway error.

This was observable as an inconsistency: an unavailable NATS broker
returned 502 (it fails fast), while an unavailable Kafka broker returned
500 (rskafka retries the connect for ~5s, blowing the 100ms budget, so
the dispatch trapped).

Store max_execution_ms in PluginState and, in host_kafka_publish,
host_nats_publish, host_http_call and host_http_stream, refresh the
store's epoch deadline once the blocking call returns. Time spent waiting
on native I/O no longer counts against the plugin's CPU-execution budget,
so a slow/unreachable upstream yields a clean error response rather than a
trap. The CPU guard on actual WASM execution is unchanged.

Broadening CI to run every barbacane-test integration binary surfaced 21
tests that had rotted while excluded. All are test-side issues (the
production behavior they now exercise is correct, and several confirm the
security hardening works):

- auth JWT (5): tests used a fake signature relying on
  skip_signature_validation, which the production WASM correctly ignores
  (it is honored only under the plugin's own cfg(test)). Sign tokens with
  a real ES256 key (new p256 dev-dep) and embed the matching public JWK in
  a generated spec.
- auth secrets (1): file:// secrets now require BARBACANE_SECRETS_DIR
  (fail-closed). Inject it via the new TestGateway::from_spec_with_env,
  which sets per-instance child env vars instead of racing on process
  globals.
- mcp (1): a non-initialize method now needs a session; initialize first.
- proxy (3): tests hit the real httpbin.org; point them at a local
  wiremock so they are hermetic and fast.
- plugins/redirect (5): reqwest follows redirects by default, turning the
  3xx into a followed 404/200; use a non-redirect client.
- plugins/ip-restriction (2): X-Forwarded-For is only trusted from
  declared trusted_proxies (anti-spoofing); declare loopback so the
  XFF-based cases are exercised.
- plugins/kafka (1): now passes via the host-I/O epoch fix.
- streaming (3): the streaming-echo fixture declared capabilities with the
  wrong key syntax (http_stream = true) instead of host_functions =
  ["http_call"], and its plugin.toml was not beside the wasm where the
  compiler reads it. Fix the manifest and have build.rs copy plugin.toml
  next to the built fixture wasm.
@ndreno ndreno changed the title ci: run full barbacane-test integration suite ci: run full barbacane-test integration suite (repair rotted tests + fix host-I/O epoch trap) Jul 3, 2026
…ases

The previous fix added trusted_proxies to /allowlist and /denylist in the
shared ip-restriction.yaml fixture, which broke the security suite's
forged_xff_does_not_bypass_ip_restriction test: that test forges an XFF on
/denylist and asserts it is ignored (anti-spoofing), which only holds when
trusted_proxies is empty.

Revert /allowlist and /denylist to their (empty trusted_proxies) defaults so
the anti-spoofing guarantee is preserved, and add separate
/allowlist-trusted-proxy and /denylist-trusted-proxy endpoints that trust the
loopback peer. The plugins.rs XFF tests use those, so they exercise the
'honored when behind a trusted proxy' path without weakening the endpoints the
security suite depends on.
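
The anti-spoofing rule these endpoints rely on can be sketched as a small pure function. This is an illustration only: `effective_client_ip` is a hypothetical name, trusted proxies are modelled as exact IP addresses rather than whatever form the plugin accepts, and taking the leftmost X-Forwarded-For entry is an assumption about parsing, not a statement of the plugin's behavior.

```rust
use std::net::IpAddr;

/// Hypothetical helper: choose the address an IP restriction should evaluate.
/// X-Forwarded-For is honored only when the direct peer is a declared
/// trusted proxy; otherwise the header is ignored and the peer is used.
fn effective_client_ip(peer: IpAddr, xff: Option<&str>, trusted_proxies: &[IpAddr]) -> IpAddr {
    // Empty or non-matching trusted_proxies: a forged header has no effect.
    if !trusted_proxies.contains(&peer) {
        return peer;
    }
    // Assumption: use the first (leftmost) entry, falling back to the peer
    // when the header is missing or does not parse as an IP address.
    xff.and_then(|header| header.split(',').next())
        .and_then(|first| first.trim().parse::<IpAddr>().ok())
        .unwrap_or(peer)
}
```

With an empty trusted list the forged header is ignored, which is the case the security suite asserts on /denylist; with loopback declared as trusted, the forwarded address is used, which is the path the plugins.rs XFF tests exercise.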
@ndreno ndreno marked this pull request as ready for review July 3, 2026 09:39
@ndreno ndreno merged commit 764e972 into main Jul 3, 2026
13 checks passed
@ndreno ndreno deleted the ci/broaden-integration-binaries branch July 3, 2026 09:58