[TESTING]: Automated end-to-end tests for PII filter plugin via plugin management API #4221

@jonpspri

Summary

Build an automated end-to-end test suite for the PII filter plugin (cpex-pii-filter, see #3965) that drives the plugin management API to activate, configure, and deactivate the plugin for a given user/tenant, and verifies that PII in prompts and tool arguments is handled correctly across the full matrix of configuration settings.

Motivation

Scope

Test Harness

  • Stand up a ContextForge instance (test container or in-process app) with observability enabled.
  • Create a test user/tenant, mint a JWT, and drive all plugin state changes through the plugin management / bindings API — no static YAML edits.
  • Wrap common actions (enable plugin, set config, bind to tool, invoke tool, assert on response) in reusable pytest fixtures / helpers.
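The helper layer described above could look roughly like the sketch below. It is an illustration only: the endpoint paths, request bodies, and the `PluginHarness` name are assumptions, not the actual plugin management API, and would need to be replaced with the real routes. The transport is injected as a callable so that the real suite can pass a function that issues HTTP requests against the running ContextForge instance.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Transport signature: (method, path, json_body, headers) -> decoded response.
# In the real suite this would wrap an HTTP client pointed at the test instance.
Transport = Callable[[str, str, Optional[dict], dict], Any]


@dataclass
class PluginHarness:
    """Wraps the plugin-management calls that every scenario repeats."""

    send: Transport
    token: str  # JWT minted for the test user/tenant
    plugin: str = "cpex-pii-filter"

    def _call(self, method: str, path: str, body: Optional[dict] = None) -> Any:
        headers = {"Authorization": f"Bearer {self.token}"}
        return self.send(method, path, body, headers)

    # NOTE: every path and body shape below is HYPOTHETICAL.
    def enable(self, config: dict) -> Any:
        return self._call("PUT", f"/plugins/{self.plugin}",
                          {"enabled": True, "config": config})

    def disable(self) -> Any:
        return self._call("PUT", f"/plugins/{self.plugin}", {"enabled": False})

    def bind_to_tool(self, tool: str) -> Any:
        return self._call("POST", f"/plugins/{self.plugin}/bindings",
                          {"scope": "tool", "target": tool})

    def invoke_tool(self, tool: str, arguments: dict) -> Any:
        return self._call("POST", f"/tools/{tool}/invoke",
                          {"arguments": arguments})
```

A pytest fixture in `conftest.py` could construct one `PluginHarness` per test user, so individual tests only call `enable`, `bind_to_tool`, `invoke_tool`, and `disable`.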

Configuration Matrix

At minimum, each scenario should vary and assert on:

  • Plugin state: disabled → enabled → disabled (confirm state transitions take effect on subsequent invocations without restart).
  • Mode: block, redact, flag-only (or whatever modes the plugin exposes).
  • Categories enabled: e.g. emails, phone numbers, SSNs, credit-card numbers, names, addresses — tested individually and in combinations.
  • Scope / binding: plugin bound globally vs. per-tool vs. per-tenant; confirm non-bound tools/tenants are unaffected.
  • Hook coverage: prompt_pre_fetch, tool_pre_invoke, and (if supported) tool_post_invoke — verify PII is handled on both inbound args and outbound responses as configured.
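One way to keep the matrix above manageable is to generate scenarios rather than hand-write them. The sketch below is a minimal example; the mode, category, and scope identifiers are placeholders taken from the list above and may not match the names the plugin actually exposes.

```python
from itertools import combinations, product

# Placeholder values; replace with the identifiers the plugin really accepts.
MODES = ["block", "redact", "flag-only"]
CATEGORIES = ["email", "phone", "ssn"]
SCOPES = ["global", "tool"]


def category_sets(categories, max_size=2):
    """Yield each category alone, then every combination up to max_size."""
    for size in range(1, max_size + 1):
        yield from combinations(categories, size)


def build_matrix():
    """Cross modes, category sets, and binding scopes into scenario dicts."""
    return [
        {"mode": mode, "categories": list(cats), "scope": scope}
        for mode, cats, scope in product(MODES, category_sets(CATEGORIES), SCOPES)
    ]


def scenario_id(scenario):
    """Readable test ID so a CI failure names the exact configuration."""
    cats = "+".join(scenario["categories"])
    return f"{scenario['mode']}-{cats}-{scenario['scope']}"
```

The list can be fed to `pytest.mark.parametrize("scenario", build_matrix(), ids=scenario_id)`. Hook coverage and the enable/disable transitions are left out here and could be added as further axes or as separate tests.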

Assertions

For each scenario:

  • The response / tool-call payload matches the expected redaction / block / flag behavior.
  • Violations (when applicable) are recorded with the expected category and confidence.
  • Disabling the plugin mid-test restores pass-through behavior on the next invocation.
  • Other users / tools with different bindings are unaffected (isolation check).
  • Observability signals (spans, structured logs) reflect plugin activity, which doubles as a smoke test that the plugin ran at all.
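The per-mode checks above can be centralized in one helper. The sketch below assumes a response shape (`blocked`, `payload`, and a `violations` list with `category` and `confidence` keys) that is hypothetical; the real field names depend on what the gateway and plugin return.

```python
def check_outcome(mode, pii_value, result, expected_category, min_confidence=0.0):
    """Return a list of failure messages; an empty list means the scenario passed.

    `result` is the decoded response. Its keys are ASSUMED for illustration.
    """
    failures = []
    payload = str(result.get("payload", ""))

    if mode == "block":
        if not result.get("blocked"):
            failures.append("expected the call to be blocked")
    elif mode == "redact":
        if pii_value in payload:
            failures.append("raw PII leaked through redaction")
    elif mode == "flag-only":
        if pii_value not in payload:
            failures.append("flag-only should leave the payload unchanged")
    else:
        raise ValueError(f"unknown mode: {mode}")

    # A violation must be recorded with the expected category and confidence.
    recorded = any(
        v.get("category") == expected_category
        and v.get("confidence", 0.0) >= min_confidence
        for v in result.get("violations", [])
    )
    if not recorded:
        failures.append(f"no {expected_category} violation recorded")
    return failures
```

Returning messages rather than asserting inline lets a test report every mismatch for a scenario at once, for example with `assert not failures, failures`.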

Proposed Location

  • tests/e2e/plugins/test_pii_filter_e2e.py (new), with shared fixtures in tests/e2e/plugins/conftest.py.
  • A dedicated make test-e2e-plugins target (or extension of existing e2e target) so the suite can be run independently of unit tests.

Acceptance Criteria

  • End-to-end test file exercises activate → configure → invoke → assert → deactivate flow entirely through the plugin management API.
  • Test matrix covers at least: all supported modes, 3+ PII categories individually, 2+ binding scopes, and state-transition (enable/disable) cases.
  • Isolation check confirms changes to one user/tenant/tool binding do not affect others.
  • Suite runs in CI (on a schedule or gated target if runtime is long); failure output identifies the failing scenario (mode, category, binding scope) so that it is actionable.
  • Tests use real JWTs and real API calls — no monkey-patching of the plugin manager internals.
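For the activate → configure → invoke → assert → deactivate flow, a context manager can guarantee that deactivation runs even when an assertion fails, so one failing scenario does not leave the plugin enabled for the next. This is a minimal sketch: `enable` and `disable` are assumed to be whatever callables the suite uses to reach the plugin management API.

```python
from contextlib import contextmanager


@contextmanager
def plugin_enabled(enable, disable, config):
    """Enable the plugin with `config`, then always disable it on exit.

    `enable(config)` and `disable()` are caller-supplied callables that
    perform the real API requests.
    """
    enable(config)
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises (e.g. a failed assert).
        disable()
```

Inside the `with` block a test invokes the tool and asserts on the response; a follow-up invocation after the block can then confirm that pass-through behavior is restored.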

References

Metadata

Labels

plugins, security, testing, triage
