Skip to content

Add maintainer PR review skill#920

Merged
drbenvincent merged 7 commits into
mainfrom
maintainer-pr-review-skill
May 15, 2026
Merged

Add maintainer PR review skill#920
drbenvincent merged 7 commits into
mainfrom
maintainer-pr-review-skill

Conversation

@drbenvincent
Copy link
Copy Markdown
Collaborator

@drbenvincent drbenvincent commented May 15, 2026

Summary

  • Add a project skill for maintainer-style PR reviews that checks merge readiness, CI, correctness, security, tests, docs, and risks.
  • Keep SKILL.md as a compact router and add focused pr-type-* resource files for feature, bug fix, refactor, docs/notebook, data/dataset, and tests/CI/infrastructure PRs.
  • Add PR-type-specific review output emphasis so the final review foregrounds the right concerns for each category, while keeping the wording concise.
  • Fold in the strongest material from draft PR Add pr-review agent skill #865: concrete gh review workflow, claim verification, no-posting-without-approval rule, recurring review patterns, CausalPy code/docs conventions, comment templates, and skill maintenance guidance.
  • Address maintainer review feedback by removing the local-only babysit dependency, documenting where $CONDA_EXE comes from, adding maintenance ownership/triggers, and grounding SF-4 with a CausalPy example.

Test plan

  • CONDA_EXE="/Users/benjamv/miniforge3/bin/conda" "$CONDA_EXE" run -n CausalPy prek run --files ".github/skills/maintainer-pr-review/SKILL.md" ".github/skills/maintainer-pr-review/resources/pr-type-features.md" ".github/skills/maintainer-pr-review/resources/pr-type-bug-fixes.md" ".github/skills/maintainer-pr-review/resources/pr-type-refactors.md" ".github/skills/maintainer-pr-review/resources/pr-type-docs-notebooks.md" ".github/skills/maintainer-pr-review/resources/pr-type-data-datasets.md" ".github/skills/maintainer-pr-review/resources/pr-type-tests-ci-infra.md" ".github/skills/maintainer-pr-review/resources/workflow.md" ".github/skills/maintainer-pr-review/resources/code-patterns.md" ".github/skills/maintainer-pr-review/resources/docs-patterns.md" ".github/skills/maintainer-pr-review/resources/review-patterns.md" ".github/skills/maintainer-pr-review/resources/review-comments.md" ".github/skills/maintainer-pr-review/resources/maintenance.md"
  • Commit hooks passed for all skill commits.
  • Real-world smoke: used the PR Add maintainer PR review skill #920 maintainer review comment to exercise the skill maintenance path; it identified a local-only skill dependency and missing operational guidance, which this branch now fixes.

Supersedes

Co-authored-by: Cursor <cursoragent@cursor.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.07%. Comparing base (cafd26f) to head (716cc95).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #920   +/-   ##
=======================================
  Coverage   95.07%   95.07%           
=======================================
  Files          87       87           
  Lines       13701    13701           
  Branches      812      812           
=======================================
  Hits        13026    13026           
  Misses        479      479           
  Partials      196      196           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Co-authored-by: Cursor <cursoragent@cursor.com>
@drbenvincent drbenvincent mentioned this pull request May 15, 2026
3 tasks
Co-authored-by: Cursor <cursoragent@cursor.com>
@daimon-pymclabs
Copy link
Copy Markdown

Adversarial but constructive review — PR #920 (maintainer-pr-review skill)

Reviewed by Daimon at Ben's request. This is an agent-authored review.

Context

This PR was reviewed against draft #865 (now closed), which covered similar ground. The merge brought in CausalPy-specific substance that was missing in the first version: code-patterns.md, docs-patterns.md, review-patterns.md, review-comments.md, workflow.md, and maintenance.md. The six PR-type resource files gained "Review Output Emphasis" sections.


Findings

Should-fix

1. Boundary references a nonexistent skill (SKILL.md)

The Boundary section says: "If the user asks to continuously keep a PR merge-ready, use `babysit`." There is no `babysit` skill on main — the existing skills are: `designing-experiments`, `github-issues`, `loading-datasets`, `performing-causal-analysis`, `pr-to-green`, `pr-workflows`, `python-environment`, `research-and-planning`, `running-placebo-analysis`, `working-with-marimo`.

The `pr-to-green` reference is valid. The `babysit` reference sends agents to a dead end. Either add `babysit` before or alongside this PR, remove the reference, or replace it with a note like "planned but not yet implemented."

2. No real-world testing evidence

The test plan runs `prek` on the markdown files, which validates formatting. But a PR review skill should be tested by having an agent use it to review a real PR and evaluating whether the output is better than without the skill. Even one example (e.g., "I had Cursor use this skill to review #842 and here's what it produced") would demonstrate it works in practice and isn't just well-structured theory. This is the difference between "the recipe is well-formatted" and "the recipe produces good food."

3. Maintenance cadence has no owner or trigger (maintenance.md)

The maintenance file says "every six months or after roughly ten PR reviews, check..." but doesn't specify who does this or what triggers it. Without a mechanism (a scheduled issue, a CI reminder, or an owner), this becomes aspirational documentation that drifts. Consider adding a concrete trigger — even "the maintainer who notices drift files an issue" is better than an ownerless cadence.

Nits

4. PR-type "Review Output Emphasis" sections are formulaic

All six PR-type files now have a "Review Output Emphasis" section that follows the same four-bullet template (PR summary → findings → test evidence → open questions). The content varies slightly per type, but the structure is identical. An agent reading the PR-type file already reads the output template from SKILL.md — the per-type emphasis bullets risk being ignored as repetitive noise. Consider whether a single sentence per PR-type ("For bug fixes, foreground root-cause evidence and regression test presence") would be sharper than four bullets that echo the main template.

5. review-patterns.md SF-4 lacks a CausalPy-grounded example

SF-4 ("Default Value Is Silently a No-Op") is described abstractly: "ask what does this default actually do?" The other MF patterns have concrete CausalPy examples (`_clone()` dropping priors, `Literal` for constrained strings). SF-4 could use one — e.g., a sensitivity check parameter that defaults to a value imposing no constraint, or a threshold that defaults to `None` and is silently skipped.

6. $CONDA_EXE not defined in this skill

SKILL.md says "Use `$CONDA_EXE run -n CausalPy `" but doesn't define or source the variable. This presumably comes from `AGENTS.md` or the environment, but an agent encountering this skill cold might not know where to find it. A one-line note ("defined in `AGENTS.md`" or "set by the project environment") would prevent a lookup cycle.

What worked well

  • The code-patterns.md file is genuinely useful — the `_clone()` pattern, `BaseExperiment` contract checklist, and type hint conventions encode real institutional knowledge that a generic reviewer would miss.
  • The review-patterns.md severity tiers (MF/SF/N/P) with "how to spot" prompts are well-structured. They give an agent concrete diagnostic steps rather than vague principles.
  • The workflow.md "Verify Claims" table is excellent — treating contributor claims as hypotheses to test is the right adversarial posture for review.
  • The review-comments.md non-negotiable rules (never post without approval, never approve without instruction) are clear and correctly positioned as hard boundaries.
  • The separation into router (SKILL.md) + reference files is clean. An agent reviewing a CI PR doesn't need to load docs-patterns.md.

Merge Readiness

Verdict: approve with minor changes. The `babysit` dead-end reference (SF item 1) is the only thing I'd want fixed before merge — the rest are improvements that could land as follow-up.

Branch status: not checked locally.
CI status: not checked.

PR Summary

Adds a structured PR review skill for CausalPy with a compact router SKILL.md, six PR-type checklists, and six deep-dive reference files covering code patterns, docs conventions, severity-sorted review patterns, comment templates, review workflow, and skill maintenance. The skill encodes CausalPy-specific contracts (BaseExperiment, _clone(), type hints, custom exceptions) and establishes clear guardrails against autonomous posting or approval.

Open Questions

  1. Should the `babysit` reference be removed, or is there a plan to add that skill soon?
  2. Is there appetite to test this on one real PR before merging, or is the structural review sufficient?

@drbenvincent drbenvincent added the agents Agent related issues specifically for use by developers label May 15, 2026
Co-authored-by: Cursor <cursoragent@cursor.com>
@daimon-pymclabs
Copy link
Copy Markdown

All six items from my review are addressed in d409a89:

  1. babysit dead-end — replaced with self-contained fallback guidance. Good.
  2. No real-world testingmaintenance.md now requires a smoke review for meaningful changes. Good.
  3. Maintenance ownership — new "Owner and Trigger" section with concrete triggers. Good.
  4. Formulaic output emphasis — collapsed to one sentence per PR-type. Tighter.
  5. SF-4 example — concrete placebo-in-time min_gap=1 example added. Good.
  6. $CONDA_EXE — inline pointer to AGENTS.md. Good.

No further changes requested from my side. This is ready to merge.

Co-authored-by: Cursor <cursoragent@cursor.com>
@read-the-docs-community
Copy link
Copy Markdown

read-the-docs-community Bot commented May 15, 2026

Documentation build overview

📚 causalpy | 🛠️ Build #32709288 | 📁 Comparing 716cc95 against latest (293c8ca)

  🔍 Preview build  

2 files changed
± 404.html
± _modules/causalpy/experiments/inverse_propensity_weighting.html

drbenvincent and others added 2 commits May 15, 2026 18:19
@drbenvincent drbenvincent merged commit 390129f into main May 15, 2026
15 checks passed
@drbenvincent drbenvincent deleted the maintainer-pr-review-skill branch May 15, 2026 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent related issues specifically for use by developers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants