
Explore specialist review passes behind a single reviewer lane #202

@davidruzicka

Summary

Evaluate whether the code review stage should stay externally single-lane (reviewer) while internally supporting specialized review passes (for example security, performance, API/contract, and test adequacy).

Motivation

The current workflow design treats review as a single reviewer lane. That keeps the state machine simple, but a single generic review prompt may not adequately cover the full range of review concerns: code quality, security, performance, compatibility, and test quality.

Recent discussion and the current research direction suggest a useful hybrid shape:

  • keep one public reviewer lane for deterministic workflow semantics,
  • optionally run specialized internal review passes when repo/risk signals justify them,
  • aggregate findings into one normalized review artifact/comment instead of producing multiple noisy public reviewer personas.
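The hybrid shape above can be sketched as follows. This is a minimal illustration, not the workflow's implementation; `run_pass` is a hypothetical stand-in for invoking a specialist model or analyzer, and the artifact fields are assumptions drawn from the "Output shape" section below.

```python
# Sketch of the hybrid shape: several internal passes, one public artifact.

def run_pass(name: str, pr: dict) -> list[dict]:
    # Stub: a real pass would return normalized findings from a prompt or tool.
    return [{"source_pass": name, "category": "demo",
             "blocking": name == "reviewer-security"}]

def aggregate_review(pr: dict, passes: list[str]) -> dict:
    """Fold all internal pass outputs into a single public review artifact."""
    findings = [f for p in passes for f in run_pass(p, pr)]
    return {
        "head_sha": pr["head_sha"],
        "findings": findings,
        # Fail-closed: any blocking finding blocks the merge decision.
        "blocking": any(f["blocking"] for f in findings),
    }
```

Externally, only the aggregated artifact is visible, so the public reviewer lane stays deterministic regardless of how many internal passes ran.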

Why this might help

  • Better coverage for heterogeneous review tasks.
  • Preserve current external workflow semantics (reviewer + merger).
  • Reduce noise compared with multiple public reviewer agents commenting independently.
  • Allow selective, risk-triggered review depth instead of paying full cost on every PR.

Candidate design direction

External contract

  • Keep one public reviewer lane.
  • Keep one merge decision path through merger.

Internal review passes

  • reviewer-general
  • reviewer-security
  • reviewer-performance
  • reviewer-api
  • reviewer-tests

Triggering

Run specialist passes only when repo/risk signals justify them, for example:

  • security-sensitive paths (auth, crypto, shell execution, file handling, deserialization),
  • hot paths or likely performance-sensitive changes,
  • schema / API / contract changes,
  • large test deltas or suspiciously low test impact.
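One cheap way to implement these triggers is path-based heuristics over the changed files. The patterns below are illustrative assumptions only, not a vetted ruleset; a real implementation would also look at diff content and size.

```python
import re

# Hypothetical path-based risk signals; patterns are illustrative only.
TRIGGER_PATTERNS = {
    "reviewer-security": re.compile(r"(auth|crypto|shell|deserializ|secrets?)", re.I),
    "reviewer-api": re.compile(r"(schema|openapi|\.proto$|/api/)", re.I),
    "reviewer-tests": re.compile(r"(^|/)tests?/", re.I),
}

def select_passes(changed_files: list[str]) -> set[str]:
    """Always run the general pass; add specialists only when a path signal fires."""
    passes = {"reviewer-general"}
    for path in changed_files:
        for name, pattern in TRIGGER_PATTERNS.items():
            if pattern.search(path):
                passes.add(name)
    return passes
```

This keeps the default cost at one pass per PR and only pays for specialist depth when a signal fires.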

Output shape

Aggregate all findings into a normalized machine-readable artifact with fields like:

  • category
  • severity
  • confidence
  • evidence
  • blocking
  • head_sha
  • source_pass
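As a sketch of the artifact shape, the fields above map naturally onto a frozen dataclass. The concrete types and example values here are assumptions for illustration, not a settled schema.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class Finding:
    """One normalized review finding; field names mirror the list above."""
    category: str      # e.g. "security", "performance", "tests"
    severity: str      # e.g. "low" / "medium" / "high"
    confidence: float  # self-reported confidence in [0.0, 1.0]
    evidence: str      # file:line reference or snippet backing the finding
    blocking: bool     # whether the finding should block merge
    head_sha: str      # commit the finding was produced against
    source_pass: str   # internal pass that emitted it (e.g. "reviewer-security")
```

`asdict(finding)` then gives the machine-readable form for the aggregated artifact/comment.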

Guardrails

  • avoid duplicate findings across passes,
  • keep fail-closed merge semantics explicit,
  • do not let multiple reviewer personas create contradictory public state,
  • prefer specialist passes + tooling/analyzers over multiple nearly-identical free-text prompts.
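For the first guardrail, a simple deduplication over findings (as dicts) might look like this. Using exact `(category, evidence)` identity is an assumed simplification; a real implementation would likely fuzzy-match evidence across passes.

```python
def dedupe_findings(findings: list[dict]) -> list[dict]:
    """Collapse duplicates that share (category, evidence) across passes,
    keeping the highest-severity instance."""
    rank = {"low": 0, "medium": 1, "high": 2}
    best: dict[tuple, dict] = {}
    for f in findings:
        key = (f["category"], f["evidence"])
        if key not in best or rank[f["severity"]] > rank[best[key]["severity"]]:
            best[key] = f
    return list(best.values())
```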

Open questions

  • Which specialist passes are worth the runtime/cost overhead first?
  • Should specialist findings be blocking by default or only after confidence/severity thresholds?
  • How should the planner/implementor handoff represent specialist findings?
  • Which findings should remain advisory vs merge-blocking?

Relationship to current work

Agent note: this issue body was drafted by an agent from a research-backed design discussion.

Metadata


Labels

  • agent-support: Planning and implementation work for GitHub-native agent development support
  • agent:planned
  • agent:safe: Small, concrete, low-risk, testable work suitable for autonomous implementation
  • architecture: Separation-of-concerns, dependency, or boundary design improvement
  • enhancement: New feature or request
