Commit ed64a6a
test(compliance): MCP 2025-11-25 protocol compliance harness (#4301)
* test(compliance): add MCP 2025-11-25 protocol compliance harness
Build a FastMCP-based test harness that probes ContextForge against the
MCP 2025-11-25 specification across three targets: reference stdio,
gateway proxy (federated), and gateway virtual server. Includes a new
reference server (mcp-servers/python/compliance_reference_server) with
tools/resources/prompts covering lifecycle, transport, pagination,
notifications, subscriptions, roots, sampling, elicitation, and logging.
Key additions:
- tests/protocol_compliance/: parametrized test suite with XPASS-capture,
  drift detection across targets, and gap-tracking workflow
- scripts/compliance_matrix.py: orchestrator that runs the suite across
  engine modes (python/shadow/edge/full) and aggregates results
- Runtime-mutable MCP mode tests covering shadow<->edge flip, boot-mode
  rails, publish/audit response contract, and data-plane runtime header
- OAuth 2.1 / Authorization tests against Keycloak (tier-gated skip)
- Security Best Practices probes (origin validation, bind address,
  confused-deputy)
- Rewrite tests/e2e/test_mcp_cli_protocol.py as test_mcp_protocol_e2e.py
  using FastMCP Client directly (no external CLI deps)
- Makefile: test-protocol-compliance* targets; testing-up folds in sso
  profile so Keycloak comes up with the gateway stack
- pre-commit: scope name-tests-test hook to tests/unit/ (integration,
  e2e, and compliance suites legitimately host non-test Python modules)
Closes #4273 (runtime-mutable MCP mode observability coverage).
Signed-off-by: Jonathan Springer <jps@s390x.com>
* refactor(tests): group test helpers under tests/*/helpers/ subdirs
Move loose non-test Python modules into conventional helpers/ subdirs
so the test-naming hook can exclude a single pattern instead of
enumerating each file.
Moves:
- tests/e2e/mcp_test_helpers.py -> tests/e2e/helpers/mcp_test_helpers.py
- tests/protocol_compliance/_helpers.py ->
  tests/protocol_compliance/helpers/compliance.py
- tests/protocol_compliance/_drift.py ->
  tests/protocol_compliance/helpers/drift.py
Updates all import sites across tests/e2e, tests/e2e_rust, and
tests/protocol_compliance.
Config cleanup:
- name-tests-test now excludes tests/*/(helpers|fixtures|targets|pages)/
  as a single verbose-regex alternative instead of file-by-file.
- Top-level pre-commit exclude and detect-secrets excludes converted
  to verbose-regex with inline comments.
- Dropped non-existent excludes (scripts/sign_image.sh, scripts/zap,
  sonar-project.properties) and the redundant .secrets.baseline
  entry (detect-secrets skips its own baseline automatically).
- detect-secrets hook in pre-commit now carries its own exclude list,
  kept in sync with the Makefile invocation.
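The single verbose-regex alternative described above can be illustrated with Python's `re.VERBOSE` flag. The pattern below is a hypothetical reconstruction from the directory names in this message, not the committed config, and it assumes the support directories sit exactly one level below `tests/`. The helper name `is_excluded` is invented for the example.

```python
import re

# Hypothetical reconstruction of the exclude, not the committed config.
# In verbose mode, whitespace and "#" comments inside the pattern are ignored.
SUPPORT_DIRS = re.compile(
    r"""
    ^tests/[^/]+/          # any suite directory directly under tests/
    (?:
        helpers            # shared non-test helper modules
      | fixtures           # fixture definitions
      | targets            # compliance-harness target modules
      | pages              # page-object modules
    )/
    """,
    re.VERBOSE,
)


def is_excluded(path: str) -> bool:
    """Return True when the path lies in one of the support directories."""
    return SUPPORT_DIRS.search(path) is not None
```

One alternation keeps the hook config stable when new helper modules are added, since only a new directory kind requires touching the pattern.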
Signed-off-by: Jonathan Springer <jps@s390x.com>
* fix(keycloak): enable service-accounts on the mcp-gateway dev realm client
The local Keycloak dev realm disables the client_credentials grant, so
compliance tests that exercise OAuth 2.1 token issuance against the
preconfigured `mcp-gateway` client fail with
`unauthorized_client: Client not enabled to retrieve service account`.
Flip `serviceAccountsEnabled` to `true` so fresh `docker compose` volume
imports come up with the grant available. Existing stacks can apply the
same change via the Keycloak Admin API against the running realm; the
stored volume overrides the import file once the realm exists, which is
why the change has to land here to stick across rebuilds.
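As a rough sketch of the flip described above, the function below sets `serviceAccountsEnabled` on one client inside a realm document that has already been loaded into a dict. It assumes the import file follows the usual Keycloak realm-export layout (a top-level `clients` array whose entries carry `clientId`); the function name and the trimmed sample realm are made up for illustration.

```python
import json


def enable_service_accounts(realm: dict, client_id: str = "mcp-gateway") -> bool:
    """Set serviceAccountsEnabled on the matching client; return True if found."""
    for client in realm.get("clients", []):
        if client.get("clientId") == client_id:
            client["serviceAccountsEnabled"] = True
            return True
    return False


# Trimmed, hypothetical realm document for illustration only.
realm = json.loads(
    '{"realm": "dev", "clients":'
    ' [{"clientId": "mcp-gateway", "serviceAccountsEnabled": false}]}'
)
found = enable_service_accounts(realm)
```

Returning a boolean lets a caller fail loudly when the client is missing instead of silently writing back an unchanged file.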
Signed-off-by: Jonathan Springer <jps@s390x.com>
* test(compliance): PR-review follow-ups — silent-failure fixes, behavioral coverage, type design, stream attribution
Bundles the fixes surfaced by running the pr-review-toolkit agents
(code-reviewer, pr-test-analyzer, comment-analyzer,
silent-failure-hunter, type-design-analyzer) plus two iterations of
Codex stop-review feedback.
Silent-failure fixes:
- _parse_junit / _run_slice / stale-junit unlink /
  _wait_data_plane_converges return-value check close the
  "0/0/0/0 row" silent path in the matrix orchestrator.
- Reference server stdout+stderr captured to a session tmp log; tail
  dumped on readiness timeout instead of swallowed.
- flip_runtime_mode / _delete_if_exists / _wait_for_federation_sync
  / _probe_state / Keycloak token fetch surface last-status
  diagnostics instead of collapsing to None.
- conftest fixture-resolution narrows catch-all so ImportError /
  NameError / SyntaxError from a broken fixture definition
  propagate; only runtime unreachability skips.
- pytest_runtest_logreport wraps sidecar write in try/except so an
  unwriteable log doesn't poison the run.
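The "0/0/0/0 row" failure mode can be illustrated with a small JUnit parser that raises instead of returning zeros. This is a minimal sketch, not the orchestrator's `_parse_junit`: the names `parse_junit` and `JUnitReportError` are invented here, and it assumes pytest-style reports whose `<testsuite>` elements are not nested.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


class JUnitReportError(RuntimeError):
    """Raised when a JUnit report is missing, malformed, or empty."""


def parse_junit(path: Path) -> dict:
    """Sum the counters of every <testsuite>; raise rather than return zeros."""
    if not path.is_file():
        raise JUnitReportError(f"no JUnit report at {path}")
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as exc:
        raise JUnitReportError(f"unparseable JUnit report {path}: {exc}") from exc
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # iter() also yields the root element, so a bare <testsuite> root and a
    # <testsuites> wrapper are both handled.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    if totals["tests"] == 0:
        raise JUnitReportError(f"{path} reports zero tests")
    return totals
```

Treating an empty report as an error is the point of the sketch: a slice that produced no results is indistinguishable from a slice that never ran unless the parser refuses to report it.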
Type design:
- ComplianceTarget: __init_subclass__ enforces name and
  supported_transports on every concrete subclass; base client()
  validates transport support and dispatches to abstract
  _open_client(); per-subclass boilerplate deleted.
- SliceResult: stored `ran` field removed (derive from
  skip_reason); __post_init__ rejects negative counts and the
  inconsistent "skipped slice with non-zero counts" state.
- ReferenceUpstream: frozen, no raw Popen field — process handle
  stays in fixture closure so tests can't race teardown.
- KeycloakConfig: token_endpoint is now a property; __repr__
  redacts client_secret.
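The first two type-design points can be sketched as follows. This is an illustration under assumptions rather than the harness code: the field set of `SliceResult` is reduced to two counters, transports are plain strings, and the `__init_subclass__` check here applies to every subclass (it does not exempt intermediate abstract classes as the real one may).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


class ComplianceTarget(ABC):
    """Base class: every subclass must declare name and supported_transports."""

    name: str
    supported_transports: frozenset

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for attr in ("name", "supported_transports"):
            if not getattr(cls, attr, None):
                raise TypeError(f"{cls.__name__} must define a non-empty {attr}")

    def client(self, transport: str):
        # Validate once in the base class, then hand off to the subclass hook.
        if transport not in self.supported_transports:
            raise ValueError(f"{self.name} does not support {transport!r}")
        return self._open_client(transport)

    @abstractmethod
    def _open_client(self, transport: str):
        """Open a client for an already-validated transport."""


@dataclass(frozen=True)
class SliceResult:
    passed: int = 0
    failed: int = 0
    skip_reason: Optional[str] = None

    def __post_init__(self):
        counts = (self.passed, self.failed)
        if any(c < 0 for c in counts):
            raise ValueError("counts must be non-negative")
        if self.skip_reason is not None and any(counts):
            raise ValueError("a skipped slice cannot carry non-zero counts")

    @property
    def ran(self) -> bool:
        # Derived rather than stored, so it cannot disagree with skip_reason.
        return self.skip_reason is None
```

Both checks fire at definition or construction time, so a malformed target or result is rejected before any test consumes it.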
Stream attribution:
- MCP 2025-11-25 § Listening for Messages is explicit:
  server→client *requests* (roots/list, sampling/createMessage,
  elicitation/create) MUST NOT ride the standalone GET /mcp/
  stream; they travel on the POST-correlated stream. GAP-002..005
  and seven xfail reasons wrongly blamed #4205 (standalone-stream
  closure). Rewrote the Why sections and reasons to point at
  POST-correlated-stream relay instead.
- GAP-011 (resource-subscription updates) is the only gap that
  legitimately points at #4205.
- Added a canonical "Stream-attribution note" to COMPLIANCE_GAPS.md.
- Filed GAP-011 (resource subscriptions) and GAP-012 (gateway
  returns -32000 where spec requires -32601).
Behavioral coverage (6 new tests + 4 reference-server witnesses):
- notifications/cancelled end-to-end delivery.
- logging/setLevel filter-threshold round-trip.
- JSON-RPC reserved codes -32700 / -32601 / -32602.
- tools/call argument-type validation (narrowed to McpError).
- initialize-twice lifecycle rejection.
- MCP-Protocol-Version mismatch handling.
- Reference-server side-effect witnesses for mutate_resource_list /
  mutate_prompt_list / bump_subscribable notifications and the
  subscribe/unsubscribe handler round-trip.
Each raw-httpx test fails on 5xx and accepts 4xx as HTTP-level
rejection; no false-green paths where a server crash would read as
compliance.
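The 5xx-fails / 4xx-accepted rule for the raw-HTTP tests can be expressed as a small classifier. This is a sketch under assumptions, not the suite's code: `classify_response` and `NAMED_CODES` are invented names, the body is assumed to be an already-decoded JSON-RPC object, and the request under test is assumed to be deliberately invalid.

```python
from typing import Optional

# JSON-RPC 2.0 error codes named in the coverage list above.
NAMED_CODES = {
    -32700: "Parse error",
    -32601: "Method not found",
    -32602: "Invalid params",
}


def classify_response(status: int, body: Optional[dict]) -> str:
    """Classify a raw HTTP reply to a deliberately invalid request."""
    if status >= 500:
        # A server crash must never read as compliance.
        raise AssertionError(f"server error {status}")
    if 400 <= status < 500:
        return "http-rejection"
    error = (body or {}).get("error")
    if error is None:
        raise AssertionError("invalid request was accepted without an error")
    return NAMED_CODES.get(error.get("code"), "other error")
```

Raising on 5xx rather than returning a label keeps the false-green path closed even if a caller forgets to inspect the result.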
Bug fixes:
- Reference server: mutator tools now use a dedicated
  _mutation_counter to prevent ephemeral_0 name-collisions.
- test_drift._collect narrows to pytest.skip.Exception; other probe
  errors are recorded so drift diffs show the root cause.
- Makefile PYTEST_IGNORE excludes tests/protocol_compliance.
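The dedicated-counter fix can be pictured with the sketch below. The commit message does not say how the colliding names were derived, so this assumes they came from something that can repeat, such as the current list length; `EphemeralRegistry` and its methods are hypothetical.

```python
import itertools


class EphemeralRegistry:
    """Hands out names from a dedicated counter, not from the current list size."""

    def __init__(self):
        # Monotonic: never reset, so a removed name is never reissued.
        self._mutation_counter = itertools.count()
        self.names = []

    def add(self) -> str:
        name = f"ephemeral_{next(self._mutation_counter)}"
        self.names.append(name)
        return name

    def remove_last(self) -> str:
        return self.names.pop()
```

With a length-derived name, add / remove / add would produce `ephemeral_0` twice; the counter yields `ephemeral_0` and then `ephemeral_1`.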
Matrix rust_full slice: 58 passed / 0 failed / 40 xfailed / 0 xpassed
/ 11 skipped (was 53/0/34/0/11). Zero unexpected failures.
Signed-off-by: Jonathan Springer <jps@s390x.com>
* fix(e2e): align test_oauth_jwks_e2e JWT secret with gateway minimum
Test defaulted its JWT secret to "my-test-key" (11 chars) when
JWT_SECRET_KEY was unset; the gateway's validator requires ≥32 chars
so every admin-API call the fixtures make would fail with
401 "Invalid authentication credentials" on default deployments.
Import JWT_SECRET from helpers.mcp_test_helpers — the 40-char default
the rest of the e2e suite already shares — so this test self-configures
without requiring the operator to export JWT_SECRET_KEY. Overriding
via the env var still works.
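The self-configuring behaviour described above amounts to "environment override, else a shared default that already satisfies the minimum". A minimal sketch follows; `DEFAULT_JWT_SECRET` is a placeholder standing in for the real shared value (which is not reproduced here), and `resolve_jwt_secret` is a hypothetical helper name.

```python
import os

MIN_SECRET_LENGTH = 32  # minimum stated in this commit message

# Placeholder for the shared 40-char default; the real value lives in
# helpers.mcp_test_helpers and is not reproduced here.
DEFAULT_JWT_SECRET = "x" * 40


def resolve_jwt_secret(env=None) -> str:
    """Prefer JWT_SECRET_KEY from the environment, else the shared default."""
    env = os.environ if env is None else env
    secret = env.get("JWT_SECRET_KEY") or DEFAULT_JWT_SECRET
    if len(secret) < MIN_SECRET_LENGTH:
        raise ValueError(
            f"JWT secret has {len(secret)} chars; need at least {MIN_SECRET_LENGTH}"
        )
    return secret
```

Checking the length in the test helper surfaces a too-short secret as a clear configuration error instead of a later 401 from the gateway.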
Signed-off-by: Jonathan Springer <jps@s390x.com>
* ci(pre-commit): exclude tests/**/targets/ from name-tests-test in the lite config
The main pre-commit config already excludes these POM-style support
directories, but CI runs the lite config (Makefile:4175 —
`pre-commit run --config .pre-commit-lite.yaml`) which was a
near-duplicate that hadn't been kept in sync. Add `targets` to its
excluded-dir alternation so the four compliance-harness target modules
stop tripping the hook in CI.
Signed-off-by: Jonathan Springer <jps@s390x.com>
* chore(deps): pin authlib>=1.7.0 in the dev group via exclude-newer-package override
authlib is a transitive dep of fastmcp, which the compliance harness
uses as a test client — both live in the dev dependency group.
mcpgateway itself imports nothing from authlib, so the constraint goes
in `[dependency-groups] dev` rather than main project deps.
Release 1.7.0 (2026-04-18) ships security fixes but was blocked by the
repo-wide `exclude-newer = "10 days"` newness filter, leaving the
lockfile on 1.6.9. Adds an `exclude-newer-package` entry for authlib
(2026-04-19) so the filter allows the new release.
Signed-off-by: Jonathan Springer <jps@s390x.com>
---------
Signed-off-by: Jonathan Springer <jps@s390x.com>
1 parent a0ad03e commit ed64a6a
78 files changed
Lines changed: 5947 additions & 804 deletions
File tree
- crates/mcp_runtime
- docs/docs
  - architecture
  - development
  - testing
- infra/keycloak
- mcp-servers/python/compliance_reference_server
  - src/compliance_reference_server
  - tests
- scripts
- tests
  - e2e_rust
  - e2e
    - helpers
  - protocol_compliance
    - fixtures
    - helpers
    - targets
  - unit/mcpgateway
    - services
    - transports