ci: run full barbacane-test integration suite (repair rotted tests + fix host-I/O epoch trap)#98
Merged
The Integration Tests job only ran the crate's lib tests (cargo test -p barbacane-test --lib), leaving every tests/*.rs integration binary (proxy, plugins, streaming, validation, workload, routing, auth, cache, etc.) out of CI. Those suites drifted before (stale assertions surfaced in #95/#97 once they finally ran). Broaden the job to compile the crate once and run the lib tests plus all integration binaries. The security target is excluded here (it needs PostgreSQL + the control-plane binary and has its own security-suite job). Targets are discovered from tests/*.rs rather than hard-coded, so new suites are picked up automatically. No new services required: the kafka/nats tests are broker-unavailable negative tests, and the only Postgres-dependent suite is security.
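The target discovery described above can be sketched as a small pure function. This is only an illustration of the idea, not the job's actual script: the function name `integration_test_args` and the exact argument order are assumptions, and the real CI step may well be a shell loop instead.

```rust
use std::path::Path;

/// Build `cargo test` arguments from the file names found under `tests/`.
/// `excluded` lists target names that run in a different CI job.
/// (Hypothetical helper; the real job may assemble these differently.)
fn integration_test_args(files: &[&str], excluded: &[&str]) -> Vec<String> {
    let mut targets: Vec<String> = files
        .iter()
        .map(|f| Path::new(*f))
        // Only `*.rs` files are integration test targets.
        .filter(|p| p.extension().map_or(false, |ext| ext == "rs"))
        // The target name is the file name without its extension.
        .filter_map(|p| p.file_stem().and_then(|s| s.to_str()))
        .filter(|name| !excluded.contains(name))
        .map(|name| name.to_string())
        .collect();
    // Sort so the command line is stable from run to run.
    targets.sort();

    // Lib tests and integration binaries go into one invocation,
    // so the crate is compiled once.
    let mut args: Vec<String> = ["test", "-p", "barbacane-test", "--lib"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    for target in targets {
        args.push("--test".to_string());
        args.push(target);
    }
    args
}
```

Because the list is derived from the directory contents, adding a new `tests/foo.rs` changes the generated command without touching the workflow.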
…t window
Two issues surfaced once the full barbacane-test suite ran in CI:
1. The Integration Tests job installed the toolchain without the
wasm32-unknown-unknown target, so barbacane-test's build.rs could not
compile the streaming-echo/body-echo fixture WASM plugins (exit 101).
The streaming/body suites then had no real coverage (they panic on the
missing .wasm, or lean on a stale cached artifact). Add the target so
the fixtures build, matching the build and security-suite jobs.
2. ai_gateway::cel_selected_strict_profile_blocks_prompt failed twice with
StartupFailed("gateway did not become ready in time"). It boots the
same spec as a sibling test that passes, so it is not a spec/code bug:
with --test-threads=2 two CEL-heavy gateways cold-boot at once on a
shared CI runner and the loser exceeds the 60s health window. Widen the
TestGateway readiness timeout to 120s; a real boot hang still fails.
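The readiness window in item 2 amounts to polling a health check against a deadline. A minimal sketch of that pattern follows; `wait_until_ready` and its parameters are hypothetical names, and the real TestGateway presumably polls an HTTP health endpoint rather than a closure.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `is_ready` until it returns true or `timeout` elapses.
/// Returns how long the wait took, or the startup error message.
/// (Illustrative only; not the actual TestGateway implementation.)
fn wait_until_ready<F>(
    mut is_ready: F,
    timeout: Duration,
    poll_every: Duration,
) -> Result<Duration, String>
where
    F: FnMut() -> bool,
{
    let started = Instant::now();
    loop {
        if is_ready() {
            return Ok(started.elapsed());
        }
        // A real boot hang still fails: the window is wider, not unbounded.
        if started.elapsed() >= timeout {
            return Err("gateway did not become ready in time".to_string());
        }
        sleep(poll_every);
    }
}
```

Widening the window from 60s to 120s changes only the `timeout` argument. A gateway that boots quickly still returns as soon as the first successful poll lands.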
The shipped-fragment test's /v1/models step exercised the real api.openai.com / api.anthropic.com hosts (the fragment omits base_url for those providers). Locally they fast-fail (401) so the aggregator returns the expected partial 200; in CI they hang, and the accumulated wall-clock trips the ai-proxy plugin's epoch deadline (added in the WASM sandbox hardening), so the dispatcher traps and the gateway returns 502. Redirect all three providers at the wiremock by injecting base_url: env://<PROVIDER>_BASE_URL into the copied fragment's OpenAI/Anthropic routes, and mock GET /v1/models -> 500 so the partial path fires deterministically and fast. No outbound network, no epoch pressure. The partial-aggregation contract is also covered by the dedicated hermetic models tests.
cargo test stops at the first test binary that fails, which hides failures in every binary scheduled after it. Broadening CI surfaced one new failing binary per run (ai_gateway -> ai_proxy -> auth) instead of all at once. Run with --no-fail-fast so a single CI run reports the full set of failures across all integration binaries.
The WASM epoch deadline (max_execution_ms, default 100ms) bounds plugin CPU time, but the epoch clock keeps advancing while a plugin is blocked in a host function doing network I/O (broker publish, HTTP call/stream). A slow or unavailable upstream therefore made the plugin resume past its deadline and trap, surfacing as a 500 instead of a clean gateway error. This was observable as an inconsistency: an unavailable NATS broker returned 502 (it fails fast), while an unavailable Kafka broker returned 500 (rskafka retries the connect for ~5s, blowing the 100ms budget, so the dispatch trapped). Store max_execution_ms in PluginState and, in host_kafka_publish, host_nats_publish, host_http_call and host_http_stream, refresh the store's epoch deadline once the blocking call returns. Time spent waiting on native I/O no longer counts against the plugin's CPU-execution budget, so a slow/unreachable upstream yields a clean error response rather than a trap. The CPU guard on actual WASM execution is unchanged.
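The failure mode and the fix can be illustrated with a toy model of epoch-based interruption. Everything here is a simplified stand-in: `EpochStore`, `host_publish` and the "one tick equals one millisecond" convention are assumptions made for the sketch, and the refresh is modeled as granting a fresh full budget, which may differ from the exact amount the real host functions restore.

```rust
/// Toy stand-in for a WASM store with epoch-based interruption.
struct EpochStore {
    /// Advanced by a background ticker; one tick is 1 ms in this model.
    current_epoch: u64,
    /// The guest traps if it resumes with current_epoch >= deadline_epoch.
    deadline_epoch: u64,
}

impl EpochStore {
    fn new() -> Self {
        Self { current_epoch: 0, deadline_epoch: 0 }
    }

    /// Set the deadline relative to the current epoch.
    fn set_epoch_deadline(&mut self, ticks_beyond_current: u64) {
        self.deadline_epoch = self.current_epoch + ticks_beyond_current;
    }

    /// Simulate the epoch clock advancing by `n` ticks.
    fn tick(&mut self, n: u64) {
        self.current_epoch += n;
    }

    /// Would the guest trap if it resumed right now?
    fn would_trap(&self) -> bool {
        self.current_epoch >= self.deadline_epoch
    }
}

/// Per-plugin state that carries the configured CPU budget.
struct PluginState {
    max_execution_ms: u64,
}

/// Model of a host function that blocks on native I/O for `io_ms`.
/// With `refresh = true` it re-arms the deadline after the call returns.
fn host_publish(store: &mut EpochStore, state: &PluginState, io_ms: u64, refresh: bool) {
    // The epoch clock keeps advancing while the plugin is blocked.
    store.tick(io_ms);
    if refresh {
        // Waiting on I/O should not consume the CPU budget.
        store.set_epoch_deadline(state.max_execution_ms);
    }
}
```

In this model, a 5000 ms blocked call against a 100 ms budget leaves `would_trap()` true without the refresh and false with it, while a further 100 ticks of guest execution still trips the guard.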
Broadening CI to run every barbacane-test integration binary surfaced 21 tests that had rotted while excluded. All are test-side issues (the production behavior they now exercise is correct, and several confirm the security hardening works):
- auth JWT (5): tests used a fake signature relying on skip_signature_validation, which the production WASM correctly ignores (it is honored only under the plugin's own cfg(test)). Sign tokens with a real ES256 key (new p256 dev-dep) and embed the matching public JWK in a generated spec.
- auth secrets (1): file:// secrets now require BARBACANE_SECRETS_DIR (fail-closed). Inject it via the new TestGateway::from_spec_with_env, which sets per-instance child env vars instead of racing on process globals.
- mcp (1): a non-initialize method now needs a session; initialize first.
- proxy (3): tests hit the real httpbin.org; point them at a local wiremock so they are hermetic and fast.
- plugins/redirect (5): reqwest follows redirects by default, turning the 3xx into a followed 404/200; use a non-redirect client.
- plugins/ip-restriction (2): X-Forwarded-For is only trusted from declared trusted_proxies (anti-spoofing); declare loopback so the XFF-based cases are exercised.
- plugins/kafka (1): now passes via the host-I/O epoch fix.
- streaming (3): the streaming-echo fixture declared capabilities with the wrong key syntax (http_stream = true) instead of host_functions = ["http_call"], and its plugin.toml was not beside the wasm where the compiler reads it. Fix the manifest and have build.rs copy plugin.toml next to the built fixture wasm.
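The "per-instance child env vars instead of racing on process globals" point in the auth secrets item comes down to setting variables on the child process rather than on the test process itself. A minimal sketch using only the standard library is below; `gateway_command` and the binary name passed to it are hypothetical, and this is not the actual body of TestGateway::from_spec_with_env.

```rust
use std::process::Command;

/// Prepare a child-process command with its own environment overrides.
/// (Hypothetical helper; the binary path is supplied by the caller.)
fn gateway_command(binary: &str, env: &[(&str, &str)]) -> Command {
    let mut cmd = Command::new(binary);
    for (key, value) in env {
        // `Command::env` affects only the child that this command spawns.
        // `std::env::set_var` would instead mutate the whole test process,
        // which races when tests run on several threads.
        cmd.env(key, value);
    }
    cmd
}
```

Two tests can then build commands with different BARBACANE_SECRETS_DIR values at the same time without observing each other's settings.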
…ases
The previous fix added trusted_proxies to /allowlist and /denylist in the shared ip-restriction.yaml fixture, which broke the security suite's forged_xff_does_not_bypass_ip_restriction test: that test forges an XFF on /denylist and asserts it is ignored (anti-spoofing), which only holds when trusted_proxies is empty. Revert /allowlist and /denylist to their (empty trusted_proxies) defaults so the anti-spoofing guarantee is preserved, and add separate /allowlist-trusted-proxy and /denylist-trusted-proxy endpoints that trust the loopback peer. The plugins.rs XFF tests use those, so they exercise the 'honored when behind a trusted proxy' path without weakening the endpoints the security suite depends on.
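The two behaviors the fixtures now separate (XFF ignored by default, XFF honored behind a trusted proxy) can be sketched as one decision function. This is an assumption-laden illustration: `effective_client_ip` is a hypothetical name, trusted proxies are modeled as exact addresses rather than CIDR ranges, and taking the leftmost XFF entry is a simplification that the real plugin may not share.

```rust
use std::net::IpAddr;

/// Decide which address an IP restriction should evaluate.
/// (Illustrative model, not the ip-restriction plugin's code.)
fn effective_client_ip(peer: IpAddr, xff: Option<&str>, trusted_proxies: &[IpAddr]) -> IpAddr {
    // Anti-spoofing default: unless the direct peer is a declared
    // trusted proxy, any X-Forwarded-For header is ignored.
    if !trusted_proxies.contains(&peer) {
        return peer;
    }
    // Behind a trusted proxy, take the first XFF entry if it parses;
    // otherwise fall back to the peer address.
    xff.and_then(|header| header.split(',').next())
        .and_then(|first| first.trim().parse().ok())
        .unwrap_or(peer)
}
```

With an empty `trusted_proxies` slice, a forged header on a loopback connection still evaluates as loopback, which is the guarantee the security suite asserts. Declaring loopback as trusted makes the same header take effect, which is what the -trusted-proxy endpoints exercise.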
What
Broadens CI to run the entire barbacane-test integration suite (every tests/*.rs binary), not just the crate's lib tests, and repairs the tests this exposed. The suites had been excluded from CI, so they had quietly rotted. Fixing them also surfaced one real data-plane bug.

CI wiring
- Runs the lib tests plus every integration binary discovered from tests/*.rs (the security target keeps its own Postgres job).
- Runs with --no-fail-fast so one failing binary doesn't hide failures in the binaries scheduled after it.
- Installs the wasm32-unknown-unknown target on the job so build.rs can compile the fixture WASM plugins (they were failing with exit 101 and silently skipping the streaming/body suites).
- Widens the TestGateway readiness window to 120s: under the full suite, two CEL-heavy gateways can cold-boot at once on a shared runner and the loser exceeded the old 60s window.

Test repairs (21 tests across 5 binaries)
All were test-side issues, and several confirm the security hardening is doing its job:
- auth JWT (5): tests used a fake signature relying on skip_signature_validation, which the production WASM correctly ignores (it is honored only under the plugin's own cfg(test)). Now signs tokens with a real ES256 key (new p256 dev-dep) and embeds the matching public JWK in a generated spec.
- auth secrets (1): file:// secrets now require BARBACANE_SECRETS_DIR (fail-closed). Injected via a new TestGateway::from_spec_with_env, which sets per-instance child env vars instead of racing on process globals.
- mcp (1): a non-initialize method now needs a session; the test initializes first.
- proxy (3): tests hit the real httpbin.org; pointed at a local wiremock so they are hermetic and fast.
- plugins/redirect (5): reqwest follows redirects by default, turning the 3xx into a followed 404/200; the tests now use a non-redirect client.
- plugins/ip-restriction (2): X-Forwarded-For is only trusted from declared trusted_proxies (anti-spoofing default); the fixture now declares loopback so the XFF cases are exercised.
- plugins/kafka (1): now passes via the host-I/O epoch fix described below.
- streaming (3): the streaming-echo fixture declared capabilities with the wrong syntax (http_stream = true) instead of host_functions = ["http_call"], and its plugin.toml was not beside the wasm where the compiler reads it. Fixed the manifest and made build.rs copy plugin.toml next to the built fixture wasm.

Data-plane fix (the one real bug)
The plugins/kafka broker-unavailable test returned 500 while NATS returned 502 for the same scenario. Root cause: the WASM epoch deadline (max_execution_ms, default 100ms) bounds plugin CPU time, but the epoch clock keeps advancing while a plugin is blocked in a host function doing network I/O. A slow or unavailable upstream made the plugin resume past its deadline and trap (500). Kafka's connect retries for ~5s (blowing the 100ms budget), while NATS fails fast.

Fix: store max_execution_ms in PluginState and refresh the store's epoch deadline once the blocking call returns, in host_kafka_publish, host_nats_publish, host_http_call and host_http_stream. Time spent waiting on native I/O no longer counts against the plugin's CPU budget, so a slow or unreachable upstream yields a clean error instead of a trap. The CPU guard on actual WASM execution is unchanged.

Verification
All 21 previously-failing tests pass, verified locally one binary at a time (no sweeps): plugins 59, auth 13, proxy 5, mcp, streaming 3, plus barbacane-wasm unit 187. Clippy clean on the modified library crates.