Fix PII result copy regression#89
Conversation
3db74a4 to
34056e9
Compare
Signed-off-by: lucarlig <luca.carlig@ibm.com>
34056e9 to
682905a
Compare
gandhipratik203
left a comment
There was a problem hiding this comment.
What's good
- Real bug, clear root-cause story. The
PyDict::iter()→PyMapping::items()swap is exactly the right level. CPEX'sCopyOnWriteDictwrapper inherits fromdict, so the C-levelcast::<PyDict>()succeeds, butdict.iter()only walks the underlying PyDict storage table — which is empty for an unmodified COW wrapper because everything lives inself._original. Going through the mapping protocol picks upitems()from the subclass. Surgical fix. - Test coverage is layered well. Rust unit tests pin the
max_collection_itemsboundary (allows at limit, rejects over) against the newmapping.len()path. Python integration tests cover the actual user-visible behavior at two levels — direct plugin call on MCP-shaped content, and end-to-end throughwrap_payload_for_isolation+apply_policy({"result"}). The second test is the real regression sentinel; without it this exact bug could come back silently. - Scoping is honest. The description explicitly calls out that this only fixes the PII filter and that #90 tracks the broader audit for other plugins under policy enforcement. Avoids overreach.
- Manual end-to-end check noted ("Gateway reproduction with latest
mcp-context-forgemain and localcpex-pii-filter:modified True,filtered True,Contact [REDACTED]"). Good given the stub-vs-real-CPEX caveat below. - Version bump is appropriate (patch bump on Cargo.toml, plugin-manifest, and Cargo.lock).
Findings
F1 (low) — Mapping protocol overhead vs raw PyDict iteration on a hot path.
mapping.items()?.iter() allocates a list of tuples and dispatches each entry through cast::<PyTuple>()? + two get_item(...)? calls, where the old dict.iter() had a C-level fast path with no allocation. For typical tool results (a few dozen entries) this is almost certainly noise, but tool_post_invoke runs on every successful tool call, so worth measuring once on a representative payload — particularly nested content arrays — to make sure the safety win doesn't show up as a p99 regression. Not a blocker; just worth a criterion data point if you have one handy or want to add one.
F2 (low) — Test-stub CopyOnWriteDict differs slightly from "real" CPEX.
The stub in plugins/tests/plugin_hooks.py doesn't override __contains__, so if key in self falls back to the underlying empty dict storage — works for the current tests because they only read originals, but it's fragile if a future test writes a new key via super().__setitem__ (the override of __getitem__ would route correctly, but key in self would still miss it). Manual validation against real CPEX in the gateway covers this for the bug-fix path; just flagging that the stub is not a full COW. Worth a one-line comment in the stub noting "minimal stub for these tests, not a full COW".
F3 (nit) — Could the Rust unit tests cover a non-PyDict mapping directly?
The new Rust tests construct PyDict and exercise the mapping.len()/items() paths through it. That validates the boundary, but doesn't actually exercise the bug scenario (a mapping whose underlying PyDict storage is empty). The Python integration test is what really pins the regression — but if you want belt-and-braces in Rust, a small test that builds a CopyOnWriteDict-style wrapper in Python and feeds it to process_nested would lock the Rust side too. Optional.
Verdict
Solid PR. F1 is the only one I'd be tempted to follow up on, and only if it turns out the hot path is hot enough to care. F2/F3 are tidiness asks.
msureshkumar88
left a comment
There was a problem hiding this comment.
Review: Approve with Findings
Core fix is correct — switching to PyMapping protocol is the right call for CPEX copy-on-write wrappers.
Findings (non-blocking)
Rust test gap: Both new Rust tests (allows_collection_limit_boundary, rejects_over_collection_limit) pass plain PyDict, not a dict subclass with hidden entries. They test boundary logic but don't exercise the actual regression path. A follow-up test creating a Python dict subclass with overridden __iter__ would close this gap.
CopyOnWriteDict missing __contains__: The test stub doesn't override __contains__, so key in cow_dict falls through to C-level dict which checks only empty underlying storage — always returns False for keys from _original. Doesn't break current tests, but fragile for future ones.
apply_policy equality: new_value == old_value works coincidentally because CopyOnWriteDict compares as {} via dict.__eq__. Implementation-detail dependency, not a semantic guarantee.
Failing CI check: 34/35 pass — what's failing? Confirm it's unrelated before merge.
mapping.items() materializes to PyList: Minor extra allocation vs lazy PyDict::iter(). Acceptable given the max_collection_items guard.
Strengths
test_tool_post_invoke_survives_cpex_policy_with_isolated_payloadcorrectly exercises the full gateway enforcement path- Version bump hits all three required locations
- Follow-up issue #90 scoped correctly for broader audit
Summary
tool_post_invokeresult maskingpii_filterfrom 0.3.1 to 0.3.2Root Cause
CPEX hook payload policies now enforce which top-level fields plugins may mutate. For
tool_post_invoke, policy allowsresult, so PII masking should survive if the plugin returns a modified result.When policy is active, CPEX isolates payloads with copy-on-write mapping wrappers. The Rust PII scanner treated mapping input as a raw
PyDict, which missed entries exposed only through the wrapper protocol. The plugin then returnedmodified_payload=None, soapply_policy()had no allowedresultupdate to merge and gateway kept the original tool result.Fix
The PII scanner now handles Python mappings through the mapping protocol instead of only reading raw
PyDictstorage. This makes CPEX copy-on-write wrappers visible to nested scanning and causes PII masking to return a policy-visible modified payload.The new regression test uses real CPEX isolation and
apply_policy()so it catches the gateway-enforcement path, not only direct plugin method behavior.Follow-Up
This PR fixes the PII filter case only. Issue #90 tracks the broader audit needed for all plugins that mutate hook payloads under CPEX policy enforcement.
Validation
make test-unitmake test-integrationmcp-context-forgemain and localcpex-pii-filter:modified True,filtered True,Contact [REDACTED]Note
Run the standard detailed code review workflow before marking this PR ready for review.