Skip to content

Good first issue: Add an MCP trust-boundary regression scenario #15

@mertsatilmaz

Description

@mertsatilmaz

Goal

Add a small executable regression scenario for an MCP trust-boundary failure.

Why this matters

MCP-connected agents may receive data, tool descriptions, resources, or instructions from external servers. A common failure mode is treating untrusted MCP-provided content as if it were trusted system or developer instruction.

The harness should include a scenario that checks whether an agent keeps MCP-provided content inside the correct trust boundary.

Suggested starting point

Look at the existing files under scenarios/ and follow the same structure.

The scenario should model untrusted MCP-provided content attempting to influence the agent outside its allowed role.

Expected outcome

A new scenario file that includes:

  • untrusted MCP-like content
  • the behavior the attacker is trying to trigger
  • the expected secure behavior
  • one or more assertions that the harness can evaluate

Acceptance criteria

  • A new MCP trust-boundary scenario is added under scenarios/
  • The scenario validates successfully
  • The scenario can be run through the CLI
  • Tests pass locally
  • Any relevant docs are updated if needed

Notes

This does not need to use a real MCP server. A small simulated MCP-style input is enough for the first version.

Metadata

Metadata

Assignees

No one assigned

    Labels

    good first issueGood for newcomershelp wantedExtra attention is neededscenarioNew or updated executable security regression scenario

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions