This repository was archived by the owner on Mar 21, 2026. It is now read-only.

ralph/pytest harness pr #1

Merged
elasticdotventures merged 20 commits into main from ralph/pytest-harness-pr
Jan 31, 2026
Conversation

@elasticdotventures

Member
  • feat: add FastMCP 3.0 server capabilities to Ralph
  • feat: add pytest harness

b and others added 2 commits January 31, 2026 21:49
- Add FastMCP 3.0.0b1 integration with dual CLI/MCP modes
- Expose MCP tools: run_ralph_iteration, get_ralph_status, get_prd_status
- Expose MCP resources: ralph://prd, ralph://progress
- Fix ralph.sh symlink resolution (cd to script dir before uv run)
- Update default model to gpt-5.2-codex
- Add README-MCP.md with usage examples

Ralph can now be used as both:
1. CLI tool: ./ralph.sh --agent codex 3
2. MCP server: uv run --script ralphython.py --mcp --transport http

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 31, 2026 10:59

Copilot AI left a comment


Pull request overview

This pull request adds FastMCP 3.0 server capabilities to Ralph and introduces a pytest testing harness. The implementation migrates the core logic from a bash script to a Python module (ralphython.py) that can run as both a CLI tool and an MCP server.

Changes:

  • Rewrote Ralph's core logic in Python with FastMCP 3.0 integration, exposing MCP tools for running iterations and checking status, plus resources for accessing PRD and progress files
  • Added pytest test suite covering argument parsing, deprecated flag handling, and PRD ingestion
  • Simplified ralph.sh to a thin wrapper that delegates to the Python implementation via uv

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 18 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `ralphython.py` | Main Python implementation with CLI and MCP server modes, including tools and resources |
| `tests/test_ralphython.py` | Pytest test suite for CLI functionality |
| `tests/conftest.py` | Test configuration for module imports |
| `test_ralph_mcp.py` | Integration test for MCP server capabilities |
| `ralph.sh` | Simplified bash wrapper delegating to Python |
| `requirements-dev.txt` | Development dependencies (pytest, mypy) |
| `pytest.ini` | Pytest configuration |
| `README.md` | Updated usage instructions and added testing section |
| `README-MCP.md` | New documentation for MCP features |
| `AGENTS.md` | Updated agent selection examples |
| `tests/__pycache__/*` | Compiled bytecode (should not be committed) |
| `__pycache__/*` | Compiled bytecode (should not be committed) |
| `ralph_cli.py` | Unclear purpose file containing only "ralphython.py" |
| `=3.0.0b1` | Malformed or incomplete file |
| `ralph.sh~` | Editor backup file (should not be committed) |

Comments suppressed due to low confidence (1)

README.md:258

  • The main README.md does not mention the new MCP functionality at all, despite it being a significant feature addition mentioned in the PR description. Consider adding a reference to README-MCP.md in the main README so users are aware of the MCP capabilities.
# Ralph

![Ralph](ralph.webp)

Ralph is an autonomous AI agent loop that runs AI coding tools ([Amp](https://ampcode.com) or [Claude Code](https://docs.anthropic.com/en/docs/claude-code)) repeatedly until all PRD items are complete. Each iteration is a fresh instance with clean context. Memory persists via git history, `progress.txt`, and `prd.json`.

Based on [Geoffrey Huntley's Ralph pattern](https://ghuntley.com/ralph/).

[Read my in-depth article on how I use Ralph](https://x.com/ryancarson/status/2008548371712135632)

## Prerequisites

- One of the following AI coding tools installed and authenticated:
  - [Amp CLI](https://ampcode.com)
  - [Claude Code](https://docs.anthropic.com/en/docs/claude-code) (`npm install -g @anthropic-ai/claude-code`)
- `jq` installed (`brew install jq` on macOS)
- A git repository for your project

## Setup

### Option 1: Copy to your project

Copy the ralph files into your project:

```bash
# From your project root
mkdir -p scripts/ralph
cp /path/to/ralph/ralph.sh scripts/ralph/

# Copy the prompt template for your AI tool of choice:
cp /path/to/ralph/prompt.md scripts/ralph/prompt.md    # For Amp
# OR
cp /path/to/ralph/CLAUDE.md scripts/ralph/CLAUDE.md    # For Claude Code

chmod +x scripts/ralph/ralph.sh
```

Option 2: Install skills globally (Amp)

Copy the skills to your Amp or Claude config for use across all projects:

For Amp

cp -r skills/prd ~/.config/amp/skills/
cp -r skills/ralph ~/.config/amp/skills/

For Claude Code (manual)

cp -r skills/prd ~/.claude/skills/
cp -r skills/ralph ~/.claude/skills/

Option 3: Use as Claude Code Marketplace

Add the Ralph marketplace to Claude Code:

/plugin marketplace add snarktank/ralph

Then install the skills:

/plugin install ralph-skills@ralph-marketplace

Available skills after installation:

  • /prd - Generate Product Requirements Documents
  • /ralph - Convert PRDs to prd.json format

Skills are automatically invoked when you ask Claude to:

  • "create a prd", "write prd for", "plan this feature"
  • "convert this prd", "turn into ralph format", "create prd.json"

Configure Amp auto-handoff (recommended)

Add to ~/.config/amp/settings.json:

{
  "amp.experimental.autoHandoff": { "context": 90 }
}

This enables automatic handoff when context fills up, allowing Ralph to handle large stories that exceed a single context window.

Workflow

1. Create a PRD

Use the PRD skill to generate a detailed requirements document:

Load the prd skill and create a PRD for [your feature description]

Answer the clarifying questions. The skill saves output to tasks/prd-[feature-name].md.

2. Convert PRD to Ralph format

Use the Ralph skill to convert the markdown PRD to JSON:

Load the ralph skill and convert tasks/prd-[feature-name].md to prd.json

This creates prd.json with user stories structured for autonomous execution.

3. Run Ralph

# Using Amp
./scripts/ralph/ralph.sh [max_iterations]

# Using Claude Code
./scripts/ralph/ralph.sh --agent claude [max_iterations]

Default is 10 iterations. Use --agent amp, --agent claude, or --agent codex to select your AI coding tool.

Ralph will:

  1. Create a feature branch (from PRD branchName)
  2. Pick the highest priority story where passes: false
  3. Implement that single story
  4. Run quality checks (typecheck, tests)
  5. Commit if checks pass
  6. Update prd.json to mark story as passes: true
  7. Append learnings to progress.txt
  8. Repeat until all stories pass or max iterations reached
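Step 2 of the list above can be sketched in Python as follows. This is an illustration only, not Ralph's actual code: the `userStories` and `passes` keys appear in this pull request, but the numeric `priority` field (lower number meaning more urgent) and the `pick_next_story` helper are assumptions.

```python
from typing import Optional


def pick_next_story(prd: dict) -> Optional[dict]:
    """Return the most urgent story that has not passed yet, or None."""
    pending = [s for s in prd.get("userStories", []) if not s.get("passes", False)]
    if not pending:
        return None
    # Assumption: a lower "priority" number means more urgent; stories
    # without the field sort last.
    return min(pending, key=lambda s: s.get("priority", float("inf")))


prd = {
    "userStories": [
        {"id": "US-001", "title": "Add column", "priority": 1, "passes": True},
        {"id": "US-002", "title": "Add filter", "priority": 3, "passes": False},
        {"id": "US-003", "title": "Add badge", "priority": 2, "passes": False},
    ]
}
next_story = pick_next_story(prd)
print(next_story["id"])  # US-003
```

Returning `None` when nothing is pending gives the caller a natural signal that every story already passes.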

Testing

Install dev dependencies:

uv pip install -r requirements-dev.txt

Run tests:

uv run pytest

The pytest harness exercises the ralphython CLI end-to-end (argument parsing,
deprecated --tool handling, and PRD ingestion) without invoking Amp, Claude,
or Codex so you can validate behavior locally before handing off to agents.
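The kind of behavior such a harness checks can be sketched with `argparse`. The sketch below is not the parser from `ralphython.py`. It assumes that `--tool` is accepted as a deprecated alias for `--agent`, that Amp is the default agent, and that the iteration count is an optional positional argument defaulting to 10. The `parse_args` name is hypothetical.

```python
import argparse
import warnings

AGENTS = ["amp", "claude", "codex"]


def parse_args(argv: list) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="ralphython")
    parser.add_argument("--agent", choices=AGENTS, default=None)
    # Assumed behavior: --tool is a deprecated spelling of --agent.
    parser.add_argument("--tool", choices=AGENTS, default=None)
    parser.add_argument("max_iterations", nargs="?", type=int, default=10)
    args = parser.parse_args(argv)

    if args.tool is not None:
        warnings.warn("--tool is deprecated; use --agent", DeprecationWarning, stacklevel=2)
        if args.agent is None:
            # Only fall back to --tool when --agent was not given.
            args.agent = args.tool
    if args.agent is None:
        args.agent = "amp"
    return args


args = parse_args(["--agent", "codex", "3"])
print(args.agent, args.max_iterations)  # codex 3
```

Because the function takes `argv` explicitly, a pytest case can call it directly without spawning a subprocess or touching any agent.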

Key Files

| File | Purpose |
| --- | --- |
| `ralph.sh` | The bash loop that spawns fresh AI instances (supports --agent amp, --agent claude, or --agent codex) |
| `prompt.md` | Prompt template for Amp |
| `CLAUDE.md` | Prompt template for Claude Code |
| `prd.json` | User stories with passes status (the task list) |
| `prd.json.example` | Example PRD format for reference |
| `progress.txt` | Append-only learnings for future iterations |
| `skills/prd/` | Skill for generating PRDs (works with Amp and Claude Code) |
| `skills/ralph/` | Skill for converting PRDs to JSON (works with Amp and Claude Code) |
| `.claude-plugin/` | Plugin manifest for Claude Code marketplace discovery |
| `flowchart/` | Interactive visualization of how Ralph works |

Flowchart

Ralph Flowchart

View Interactive Flowchart - Click through to see each step with animations.

The flowchart/ directory contains the source code. To run locally:

cd flowchart
npm install
npm run dev

Critical Concepts

Each Iteration = Fresh Context

Each iteration spawns a new AI instance (Amp or Claude Code) with clean context. The only memory between iterations is:

  • Git history (commits from previous iterations)
  • progress.txt (learnings and context)
  • prd.json (which stories are done)

Small Tasks

Each PRD item should be small enough to complete in one context window. If a task is too big, the LLM runs out of context before finishing and produces poor code.

Right-sized stories:

  • Add a database column and migration
  • Add a UI component to an existing page
  • Update a server action with new logic
  • Add a filter dropdown to a list

Too big (split these):

  • "Build the entire dashboard"
  • "Add authentication"
  • "Refactor the API"

AGENTS.md Updates Are Critical

After each iteration, Ralph updates the relevant AGENTS.md files with learnings. This is key because AI coding tools automatically read these files, so future iterations (and future human developers) benefit from discovered patterns, gotchas, and conventions.

Examples of what to add to AGENTS.md:

  • Patterns discovered ("this codebase uses X for Y")
  • Gotchas ("do not forget to update Z when changing W")
  • Useful context ("the settings panel is in component X")

Feedback Loops

Ralph only works if there are feedback loops:

  • Typecheck catches type errors
  • Tests verify behavior
  • CI must stay green (broken code compounds across iterations)

Browser Verification for UI Stories

Frontend stories must include "Verify in browser using dev-browser skill" in acceptance criteria. Ralph will use the dev-browser skill to navigate to the page, interact with the UI, and confirm changes work.

Stop Condition

When all stories have passes: true, Ralph outputs <promise>COMPLETE</promise> and the loop exits.
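A minimal sketch of this stop condition, assuming the loop receives each iteration's output as a string. The `run_loop` and `is_complete` helpers are hypothetical names for illustration. Only the marker text comes from the README.

```python
from typing import Callable, Tuple

COMPLETE_MARKER = "<promise>COMPLETE</promise>"


def is_complete(agent_output: str) -> bool:
    """True when the agent signalled that every story passes."""
    return COMPLETE_MARKER in agent_output


def run_loop(run_iteration: Callable[[], str], max_iterations: int = 10) -> Tuple[bool, int]:
    """Run iterations until the marker appears or the budget is used up.

    Returns (completed, iterations_used).
    """
    for i in range(1, max_iterations + 1):
        if is_complete(run_iteration()):
            return True, i
    return False, max_iterations


# Usage with a stand-in for the real agent call.
outputs = iter(["working", "still working", "done <promise>COMPLETE</promise>"])
print(run_loop(lambda: next(outputs)))  # (True, 3)
```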

Debugging

Check current state:

# See which stories are done
cat prd.json | jq '.userStories[] | {id, title, passes}'

# See learnings from previous iterations
cat progress.txt

# Check git history
git log --oneline -10

Customizing the Prompt

After copying prompt.md (for Amp) or CLAUDE.md (for Claude Code) to your project, customize it:

  • Add project-specific quality check commands
  • Include codebase conventions
  • Add common gotchas for your stack

Archiving

Ralph automatically archives previous runs when you start a new feature (different branchName). Archives are saved to archive/YYYY-MM-DD-feature-name/.
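The archive naming can be sketched as below. This is an assumption-laden illustration, not the logic in `ralph.sh` or `ralphython.py`: it assumes a leading `ralph/` on the branch name is dropped to obtain the feature name, and both helper names are hypothetical.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def should_archive(last_branch: str, current_branch: str) -> bool:
    """Archive only when a previous run exists and the branch has changed."""
    return bool(last_branch) and bool(current_branch) and last_branch != current_branch


def archive_dir(branch_name: str, root: Path = Path("archive"),
                today: Optional[date] = None) -> Path:
    """Build archive/YYYY-MM-DD-feature-name for the previous run."""
    today = today or date.today()
    # Assumption: "ralph/" is a branch prefix, not part of the feature name.
    feature = branch_name.removeprefix("ralph/").replace("/", "-")
    return root / f"{today.isoformat()}-{feature}"


print(archive_dir("ralph/pytest-harness", today=date(2026, 1, 31)))
```

`str.removeprefix` requires Python 3.9 or later.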

References

</details>




Comment thread ralphython.py
Comment on lines +79 to +82
try:
    last_branch = last_branch_file.read_text().strip()
except Exception:
    last_branch = ""

Copilot AI Jan 31, 2026


The broad exception handler catches all exceptions without logging them, making it difficult to debug failures when reading the last branch name. Consider catching specific exceptions such as FileNotFoundError or OSError (IOError is an alias of OSError in Python 3), and logging unexpected errors.
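One way to apply this feedback is sketched below. It is not the change that was eventually merged. The `read_last_branch` helper, the `ralph` logger name, and the `.last-branch` file name used in the demo are assumptions for illustration.

```python
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("ralph")


def read_last_branch(last_branch_file: Path) -> str:
    """Return the stored branch name, or "" when none is available."""
    try:
        return last_branch_file.read_text().strip()
    except FileNotFoundError:
        # Expected on a first run: there is no previous branch yet.
        return ""
    except OSError as exc:
        # Permission problems, the path being a directory, and so on.
        logger.warning("Could not read %s: %s", last_branch_file, exc)
        return ""


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / ".last-branch"  # hypothetical file name
    print(repr(read_last_branch(marker)))
    marker.write_text("ralph/pytest-harness-pr\n")
    print(repr(read_last_branch(marker)))
```

Anything that is not an `OSError`, such as a decoding error, now propagates instead of being silently swallowed.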

Copilot uses AI. Check for mistakes.

Member Author

@copilot open a new pull request to apply changes based on this feedback

Comment thread ralphython.py
_codex_prompt_file = Path(
    os.environ.get("CODEX_PROMPT_FILE", str(script_dir / "prompt.md"))
)
codex_model = os.environ.get("CODEX_MODEL", "gpt-5.2-codex")

Copilot AI Jan 31, 2026


The environment variable default "gpt-5.2-codex" appears to reference a model that may not exist. As of my knowledge cutoff (January 2025), there is no publicly documented GPT-5.2 model from OpenAI. Verify that this model name is correct for your use case, or update it to a valid model identifier.

Suggested change
- codex_model = os.environ.get("CODEX_MODEL", "gpt-5.2-codex")
+ codex_model = os.environ.get("CODEX_MODEL", "gpt-4.1")

Comment thread test_ralph_mcp.py Outdated
Comment thread ralphython.py
Comment on lines +53 to +57
def _read_branch_name(prd_file: Path) -> str:
    try:
        data = json.loads(prd_file.read_text())
    except Exception:
        return ""

Copilot AI Jan 31, 2026


The broad exception handler catches all exceptions, which can hide bugs and make debugging difficult. Consider catching specific exceptions like JSONDecodeError, FileNotFoundError, or at minimum logging the exception before returning an empty string so errors are not silently ignored.
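A possible shape for the narrower handler is sketched below, under stated assumptions. The `branchName` key comes from the README. The `read_branch_name` name, the `ralph` logger, and the decision to treat a non-object PRD as "no branch" are illustrative choices, not the merged fix.

```python
import json
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("ralph")


def read_branch_name(prd_file: Path) -> str:
    """Return the PRD's branchName, or "" when it cannot be determined."""
    try:
        data = json.loads(prd_file.read_text())
    except FileNotFoundError:
        return ""
    except (OSError, ValueError) as exc:
        # ValueError covers json.JSONDecodeError and UnicodeDecodeError.
        logger.warning("Could not load %s: %s", prd_file, exc)
        return ""
    if not isinstance(data, dict):
        logger.warning("%s does not contain a JSON object", prd_file)
        return ""
    branch = data.get("branchName", "")
    return branch if isinstance(branch, str) else ""


with tempfile.TemporaryDirectory() as tmp:
    prd = Path(tmp) / "prd.json"
    prd.write_text('{"branchName": "ralph/pytest-harness-pr"}')
    print(read_branch_name(prd))
```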

Comment thread ralphython.py Outdated
Comment thread test_ralph_mcp.py Outdated
Comment on lines +23 to +51
try:
    async with Client("http://localhost:8766/mcp") as client:
        print("✅ Connected to Ralph MCP server\n")

        # Test get_prd_status tool
        print("📋 Testing get_prd_status tool...")
        result = await client.call_tool("get_prd_status", {})
        print(f" PRD Status: {result['project']}")
        print(f" Completed: {result['completed_stories']}/{result['total_stories']} ({result['completion_percentage']}%)\n")

        # Test get_ralph_status tool
        print("📊 Testing get_ralph_status tool...")
        result = await client.call_tool("get_ralph_status", {})
        print(f" Status: {result['status']}")
        print(f" Total lines: {result.get('total_lines', 0)}\n")

        # Test resources
        print("📂 Testing ralph://prd resource...")
        resources = await client.list_resources()
        prd_resources = [r for r in resources if "prd" in r.uri]
        if prd_resources:
            print(f" Found resource: {prd_resources[0].uri}\n")

        print("✅ All tests passed!")

finally:
    proc.terminate()
    proc.wait(timeout=2)
    print("\n🛑 Server stopped")

Copilot AI Jan 31, 2026


The test_ralph_mcp.py file lacks error handling for scenarios where the server fails to start, the connection fails, or the MCP tools return error responses. The test assumes happy-path execution and will fail with unclear errors if the server doesn't start properly or if any tool call fails. Add try-except blocks and assertions to handle and report these failure cases clearly.
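The startup part of this feedback could be handled with a small readiness check before the client connects. The sketch below uses only the standard library and is an assumption about how one might do it, not the code from the follow-up pull request. `wait_for_server` is a hypothetical helper.

```python
import socket
import subprocess
import time
from typing import Optional


def wait_for_server(host: str, port: int, proc: Optional[subprocess.Popen] = None,
                    timeout: float = 10.0, interval: float = 0.1) -> None:
    """Block until host:port accepts TCP connections.

    Raises RuntimeError if the server process exits first, and
    TimeoutError if nothing is listening before the deadline.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc is not None and proc.poll() is not None:
            raise RuntimeError(f"server exited early with code {proc.returncode}")
        try:
            with socket.create_connection((host, port), timeout=interval):
                return  # something is accepting connections
        except OSError:
            time.sleep(interval)  # not up yet; retry until the deadline
    raise TimeoutError(f"nothing listening on {host}:{port} after {timeout}s")
```

In `test_ralph_mcp.py` this would be called as `wait_for_server("localhost", 8766, proc)` before opening the client, so a server that never starts produces a clear error rather than a connection traceback. Tool results would still need their own assertions.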

Member Author

@copilot open a new pull request to apply changes based on this feedback

Comment thread ralph.sh Outdated
Comment thread ralphython.py Outdated
Comment thread ralphython.py Outdated
Comment on lines +306 to +320
total = len(prd_data["userStories"])
completed = sum(1 for story in prd_data["userStories"] if story.get("passes", False))
incomplete = [
    {"id": s["id"], "title": s["title"]}
    for s in prd_data["userStories"]
    if not s.get("passes", False)
]

return {
    "status": "loaded",
    "project": prd_data.get("project", "Unknown"),
    "total_stories": total,
    "completed_stories": completed,
    "completion_percentage": round((completed / total) * 100, 1) if total > 0 else 0,
    "incomplete_stories": incomplete[:5],  # First 5 incomplete

Copilot AI Jan 31, 2026


The function assumes the PRD data structure has a "userStories" key. If this key is missing, the function will raise a KeyError. Add validation to check if required keys exist and return an appropriate error status if the PRD structure is invalid.
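A sketch of the validation this comment asks for, reusing the summary fields from the snippet under review. The `summarize_prd` name and the `{"status": "error", "message": ...}` shape are assumptions. The source does not say what the real error status looks like.

```python
from typing import Any


def summarize_prd(prd_data: Any) -> dict:
    """Summarize PRD progress, returning an error status for malformed input."""
    if not isinstance(prd_data, dict):
        return {"status": "error", "message": "PRD root must be a JSON object"}
    stories = prd_data.get("userStories")
    if not isinstance(stories, list):
        return {"status": "error", "message": "PRD is missing a 'userStories' list"}
    bad = [i for i, s in enumerate(stories)
           if not isinstance(s, dict) or "id" not in s or "title" not in s]
    if bad:
        return {"status": "error", "message": f"stories at indexes {bad} lack 'id' or 'title'"}

    total = len(stories)
    completed = sum(1 for s in stories if s.get("passes", False))
    incomplete = [{"id": s["id"], "title": s["title"]}
                  for s in stories if not s.get("passes", False)]
    return {
        "status": "loaded",
        "project": prd_data.get("project", "Unknown"),
        "total_stories": total,
        "completed_stories": completed,
        "completion_percentage": round((completed / total) * 100, 1) if total > 0 else 0,
        "incomplete_stories": incomplete[:5],  # first 5 incomplete
    }


print(summarize_prd({"project": "Demo"})["status"])  # error
```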

Member Author

@copilot open a new pull request to apply changes based on this feedback

Comment thread ralphython.py
Comment on lines +148 to +150
_codex_prompt_file = Path(
    os.environ.get("CODEX_PROMPT_FILE", str(script_dir / "prompt.md"))
)

Copilot AI Jan 31, 2026


Variable _codex_prompt_file is not used.

Suggested change
- _codex_prompt_file = Path(
-     os.environ.get("CODEX_PROMPT_FILE", str(script_dir / "prompt.md"))
- )

Member Author

where/how should it be used?

b and others added 4 commits January 31, 2026 22:12
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Brian Horakh <35611074+elasticdotventures@users.noreply.github.com>

Copilot AI commented Jan 31, 2026


@elasticdotventures I've opened a new pull request, #2, to work on those changes. Once the pull request is ready, I'll request review from you.

elasticdotventures and others added 2 commits January 31, 2026 22:18
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Brian Horakh <35611074+elasticdotventures@users.noreply.github.com>

Copilot AI commented Jan 31, 2026


@elasticdotventures I've opened a new pull request, #3, to work on those changes. Once the pull request is ready, I'll request review from you.

elasticdotventures and others added 2 commits January 31, 2026 22:19
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Brian Horakh <35611074+elasticdotventures@users.noreply.github.com>

Copilot AI commented Jan 31, 2026


@elasticdotventures I've opened a new pull request, #4, to work on those changes. Once the pull request is ready, I'll request review from you.

elasticdotventures and others added 9 commits January 31, 2026 22:20
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Brian Horakh <35611074+elasticdotventures@users.noreply.github.com>
Co-authored-by: elasticdotventures <35611074+elasticdotventures@users.noreply.github.com>
[WIP] Fix issues based on feedback from review on ralph/pytest harness PR
Co-authored-by: elasticdotventures <35611074+elasticdotventures@users.noreply.github.com>
Co-authored-by: elasticdotventures <35611074+elasticdotventures@users.noreply.github.com>
Co-authored-by: elasticdotventures <35611074+elasticdotventures@users.noreply.github.com>
Add error handling to MCP server test harness
Add PRD structure validation to prevent KeyError in get_prd_status
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Brian Horakh <35611074+elasticdotventures@users.noreply.github.com>
@elasticdotventures elasticdotventures merged commit 364de6f into main Jan 31, 2026
@elasticdotventures elasticdotventures deleted the ralph/pytest-harness-pr branch January 31, 2026 11:28

3 participants