
feat: Cross-trace violation detection (Meerkat) #194

@acailic

Paper Reference

  • Title: Detecting Safety Violations Across Many Agent Traces (Meerkat)
  • Authors: Adam Stein, Davis Brown, Hamed Hassani, Mayur Naik, Eric Wong
  • Year: 2026
  • URL: https://arxiv.org/abs/2604.11806
  • Venue: arXiv preprint

Paper Summary

Meerkat combines clustering with agentic search to uncover safety violations specified in natural language. It detects sparse failures that become visible only when multiple traces are analyzed together and that cannot be seen in any single trace.

Proposed Feature

Implement cross-trace analysis for detecting rare safety violations:

Core Capabilities

  • Trace Clustering: Group similar agent sessions and identify outlier behaviors within clusters
  • Cross-Trace Violation Search: Search across many sessions for patterns matching natural language violation descriptions
  • Sparse Failure Detection: Find failures that appear only when at least N traces are compared (not visible in any single trace)
  • Violation Dashboard: Show detected violations with supporting evidence from multiple traces
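
As a rough illustration of the Trace Clustering capability, the sketch below groups sessions by the similarity of their action counts and flags sessions that end up alone in a cluster. Everything here is an assumption made for illustration: the bag-of-actions embedding, the greedy clustering, the `0.8` threshold, and the function names are not taken from the Meerkat paper or from the Peaky Peek codebase. A real implementation would likely use learned embeddings and a proper clustering library.

```python
from collections import Counter
from math import sqrt


def embed(actions):
    """Hypothetical embedding: count how often each action name occurs."""
    return Counter(actions)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(traces, threshold=0.8):
    """Greedy clustering: a trace joins the first cluster whose seed
    vector is at least `threshold` similar, else it starts a new cluster."""
    clusters = []  # list of (seed_vector, [session_ids])
    for session_id, actions in traces.items():
        vec = embed(actions)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(session_id)
                break
        else:
            clusters.append((vec, [session_id]))
    return [members for _, members in clusters]


def outliers(traces, threshold=0.8):
    """Simplification: treat sessions in singleton clusters as outliers."""
    return [m[0] for m in cluster(traces, threshold) if len(m) == 1]


# Made-up sessions, keyed by session id, each a list of action names.
traces = {
    "s1": ["search", "read_file", "summarize"],
    "s2": ["search", "read_file", "summarize"],
    "s3": ["search", "read_file", "read_file", "summarize"],
    "s4": ["read_file", "send_email", "delete_file"],
}

print(cluster(traces))   # [['s1', 's2', 's3'], ['s4']]
print(outliers(traces))  # ['s4']
```

Flagging singleton clusters is the crudest possible outlier rule. Scoring each member against its cluster centroid would be closer to "outlier behaviors within clusters" as described above.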

Technical Approach

  • Implement session embedding and clustering in the API layer
  • Add natural language violation query interface
  • Build cross-trace comparison algorithms
  • Create violation dashboard with evidence linking
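
The cross-trace comparison and evidence-linking steps above could be sketched as follows. This is a minimal example under stated assumptions, not the paper's method: the `Event` and `Violation` types, the `find_split_reads` function, and the "split read" rule (one user reading different chunks of the same resource across several sessions) are all hypothetical. Turning a natural language violation description into such a rule would require an LLM or agentic search step that is not shown here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical trace event: a user reads one chunk of a resource."""
    session_id: str
    user: str
    resource: str
    chunk: int


@dataclass
class Violation:
    """A detected violation with links to the supporting sessions."""
    user: str
    resource: str
    evidence: list  # session ids that jointly support the finding


def find_split_reads(events, min_sessions=3, min_chunks=3):
    """Flag (user, resource) pairs whose reads span at least `min_sessions`
    sessions and at least `min_chunks` distinct chunks."""
    chunks = defaultdict(set)
    sessions = defaultdict(set)
    for e in events:
        key = (e.user, e.resource)
        chunks[key].add(e.chunk)
        sessions[key].add(e.session_id)
    return [
        Violation(user, resource, sorted(sessions[(user, resource)]))
        for (user, resource) in chunks
        if len(sessions[(user, resource)]) >= min_sessions
        and len(chunks[(user, resource)]) >= min_chunks
    ]


# Made-up events: each session on its own looks benign.
events = [
    Event("s1", "u1", "secrets.db", 0),
    Event("s2", "u1", "secrets.db", 1),
    Event("s3", "u1", "secrets.db", 2),
    Event("s4", "u2", "report.txt", 0),
]

for v in find_split_reads(events):
    print(v.user, v.resource, v.evidence)  # u1 secrets.db ['s1', 's2', 's3']
```

Running the same check on any single session's events returns an empty list, which is the sense in which the failure is sparse. The `evidence` field carries the session ids a dashboard would need to link back to the supporting traces.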

Impact

This feature addresses a critical blind spot: many safety violations are detectable only across sessions, not within a single session. Cross-trace detection would make Peaky Peek particularly valuable for safety-critical agent deployments.

Labels

enhancement, paper-inspired, safety, analytics
