Kasper Junge · Kasper Junge · commit c8001afe72c5 · 2026-04-17T12:19:05.000+02:00
diff --git a/agr.lock b/agr.lock
@@ -11,10 +11,6 @@ handle = "microsoft/playwright-cli/playwright-cli"
 source = "github"
 installed-name = "playwright-cli"
 
-[[skill]]
-path = "skills/release"
-installed-name = "release"
-
 [[skill]]
 handle = "computerlovetech/docs-audit"
 source = "github"
@@ -35,3 +31,18 @@ source = "github"
 commit = "9859f7bceb7a46af8482cabb9aa24e0d38a49413"
 content-hash = "sha256:fa1ce825fa7e11cd5aac55ee7eac5e9c918e3af113b7988fdbd281a319acc110"
 installed-name = "code-review"
+
+[[skill]]
+path = "skills/ralphify-release"
+installed-name = "ralphify-release"
+
+[[ralph]]
+path = "ralphs/bug-hunter"
+installed-name = "bug-hunter"
+
+[[ralph]]
+handle = "computerlovetech/ralphs/improve-codebase"
+source = "github"
+commit = "70a0dabf5a59ae1a3acaf50a9e619c83e174a35f"
+content-hash = "sha256:0afc55d5b2d4061fa82c270a147c66c8bc21589ab62c47d9747ddb5f382c708d"
+installed-name = "improve-codebase"
diff --git a/agr.toml b/agr.toml
@@ -1,11 +1,33 @@
+# Source to fetch skills from
 default_source = "github"
+
+# Tools to install skills to (e.g. "claude", "cursor", "codex")
+tools = ["claude"]
+
+# Primary tool for instruction sync (must be listed in tools)
+# default_tool = "claude"
+
+# Default GitHub owner for short handles (e.g. "skill" resolves to "computerlovetech/skill")
+default_owner = "computerlovetech"
+
+# Default GitHub repo for two-part handles (e.g. "owner/skill" resolves to "owner/skills/skill")
+default_repo = "skills"
+
+# Sync instruction files across tools
+# sync_instructions = false
+
+# Which tool's instruction file is the source of truth
+# canonical_instructions = "CLAUDE.md"
+
 dependencies = [
     {path = "skills/brand-guidelines", type = "skill"},
     {handle = "microsoft/playwright-cli/playwright-cli", type = "skill"},
-    {path = "skills/release", type = "skill"},
     {handle = "computerlovetech/docs-audit", type = "skill"},
     {handle = "computerlovetech/setup-agent-workspace", type = "skill"},
     {handle = "maragudk/code-review", type = "skill"},
+    {path = "ralphs/bug-hunter", type = "ralph"},
+    {path = "skills/ralphify-release", type = "skill"},
+    {handle = "computerlovetech/ralphs/improve-codebase", type = "ralph"},
 ]
 
 [[source]]
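Illustration, not part of this commit: the `default_owner` and `default_repo` comments above describe two expansion steps for short handles. A minimal Python sketch of that resolution follows. The function name `expand_handle` is hypothetical, and chaining the two steps for a one-part handle is an assumption; the comments only describe each step on its own.

```python
def expand_handle(handle: str, default_owner: str, default_repo: str) -> str:
    """Expand a short handle to owner/repo/name (hypothetical helper)."""
    parts = handle.split("/")
    if not all(parts) or len(parts) > 3:
        raise ValueError(f"unsupported handle: {handle!r}")
    if len(parts) == 1:
        # "skill" -> "<default_owner>/skill"
        parts = [default_owner, *parts]
    if len(parts) == 2:
        # "owner/skill" -> "owner/<default_repo>/skill"
        # Assumption: a one-part handle also passes through this step.
        parts = [parts[0], default_repo, parts[1]]
    return "/".join(parts)


# Three-part handles such as the ralph dependency are left untouched.
print(expand_handle("docs-audit", "computerlovetech", "skills"))
print(expand_handle("computerlovetech/ralphs/improve-codebase", "computerlovetech", "skills"))
```

Under these assumptions the first call yields `computerlovetech/skills/docs-audit`, while the second returns its input unchanged.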
diff --git a/docs/blog/posts/the-ralph-format.md b/docs/blog/posts/the-ralph-format.md
@@ -0,0 +1,126 @@
+---
+date: 2026-04-04
+categories:
+  - Standards
+authors:
+  - kasper
+description: A short manifest defining the ralph format — a single-file, skill-like spec for autonomous agent loops.
+keywords: ralph format, ralph spec, RALPH.md, autonomous agent loop format, agent harness spec, ralph standard
+---
+
+# The Ralph Format
+
+A **ralph** is a directory with a `RALPH.md` file in it. `RALPH.md` defines an autonomous agent loop: the agent to run, the commands to run between iterations, and the prompt to pipe in.
+
+That's the whole format.
+
+<!-- more -->
+
+## Minimal example
+
+```markdown
+---
+agent: claude -p
+commands:
+  - name: tests
+    run: uv run pytest -x
+---
+
+Fix the failing tests.
+
+{{ commands.tests }}
+```
+
+Each iteration: run the commands, fill the `{{ commands.<name> }}` placeholders with their output, pipe the assembled prompt to the agent, repeat.
+
+## The spec
+
+`RALPH.md` is a markdown file with YAML frontmatter and a prompt body.
+
+**Frontmatter fields:**
+
+| Field | Required | Description |
+|---|---|---|
+| `agent` | yes | Command to run. Anything that reads a prompt on stdin. |
+| `commands` | no | List of `{name, run}` (plus optional `timeout`). Output fills `{{ commands.<name> }}` placeholders. |
+| `args` | no | List of argument names. Each name becomes a CLI flag, and the value passed to that flag fills the `{{ args.<name> }}` placeholder. |
+
+**Body:** markdown with `{{ placeholders }}`. Unmatched placeholders resolve to empty strings.
+
+**Placeholders:**
+
+- `{{ commands.<name> }}` — output of a command (stdout + stderr, regardless of exit code)
+- `{{ args.<name> }}` — value of a CLI argument
+- `{{ ralph.name }}`, `{{ ralph.iteration }}`, `{{ ralph.max_iterations }}` — runtime metadata
+
+## Directory form
+
+`RALPH.md` on its own is enough. But a ralph is a directory so it can bundle anything the loop needs:
+
+```
+bug-hunter/
+├── RALPH.md              # required
+├── check-coverage.sh     # command script (optional)
+├── coding-guidelines.md  # context for the agent (optional)
+└── test-data.json        # supporting file (optional)
+```
+
+Commands whose `run` starts with `./` resolve relative to the ralph directory, so bundled scripts just work. The directory is the unit of sharing.
+
+## A realistic example
+
+```markdown
+---
+agent: claude -p --dangerously-skip-permissions
+commands:
+  - name: tests
+    run: uv run pytest -x
+  - name: lint
+    run: uv run ruff check .
+  - name: git-log
+    run: git log --oneline -10
+args:
+  - focus
+---
+
+You are an autonomous bug-hunting agent running in a loop.
+Each iteration starts with fresh context.
+
+## Tests
+{{ commands.tests }}
+
+## Lint
+{{ commands.lint }}
+
+## Recent commits
+{{ commands.git-log }}
+
+## Task
+
+Find and fix one real bug in this codebase. {{ args.focus }}
+
+Rules:
+- One bug per iteration
+- Write a failing regression test before fixing
+- Do not change unrelated code
+- Commit with `fix: resolve <description>`
+```
+
+## The runtime
+
+The format is the spec. A runtime executes it. [Ralphify](https://github.com/computerlovetech/ralphify) is the reference runtime:
+
+```bash
+uv tool install ralphify
+ralph run ./bug-hunter --focus "authentication"
+```
+
+Anything that implements the loop — read `RALPH.md`, resolve placeholders, pipe to the agent, repeat — is a conforming runtime.
+
+## Why a format
+
+Everyone writing ralph loops ends up with the same scaffolding: a markdown prompt, a few shell commands that surface state between iterations, a while-loop that ties them together. A format turns that scaffolding into something you can share, version, and install.
+
+Ralphs are to the outer loop what [skills](https://agentskills.io/) are to the inner loop. A skill guides an agent inside a session. A ralph defines what runs *between* sessions.
+
+See also: [RALPH.md — a markdown format for autonomous agent loops](the-ralph-standard.md) — the longer thinking piece behind the format.
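Illustration, not part of this commit: the placeholder rules in the spec section of the post (three namespaces, unmatched placeholders resolving to empty strings) can be sketched in a few lines of Python. This is not Ralphify's implementation. The name `resolve_placeholders` and the tolerance for whitespace inside the braces are assumptions made for the example.

```python
import re

# Matches {{ namespace.name }}; names may contain hyphens (e.g. git-log).
PLACEHOLDER = re.compile(r"\{\{\s*([\w.-]+)\s*\}\}")


def resolve_placeholders(body: str, commands: dict, args: dict, ralph: dict) -> str:
    """Fill placeholders in a prompt body (hypothetical helper)."""
    namespaces = {"commands": commands, "args": args, "ralph": ralph}

    def substitute(match: re.Match) -> str:
        namespace, _, name = match.group(1).partition(".")
        # Unknown namespaces and unknown names both resolve to "".
        return str(namespaces.get(namespace, {}).get(name, ""))

    return PLACEHOLDER.sub(substitute, body)


prompt = resolve_placeholders(
    "Tests:\n{{ commands.tests }}\nFocus: {{ args.focus }}\n"
    "Iter {{ ralph.iteration }}/{{ ralph.max_iterations }}{{ commands.missing }}",
    commands={"tests": "1 failed"},
    args={"focus": "auth"},
    ralph={"iteration": 2, "max_iterations": 5},
)
print(prompt)
```

Here `{{ commands.missing }}` has no matching command, so it disappears from the assembled prompt rather than raising an error.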
diff --git a/ralphs/bug-hunter/RALPH.md b/ralphs/bug-hunter/RALPH.md
@@ -0,0 +1,61 @@
+---
+agent: claude -p --dangerously-skip-permissions
+commands:
+  - name: tests
+    run: uv run pytest -x
+  - name: types
+    run: uv run ty check
+  - name: lint
+    run: uv run ruff check .
+  - name: git-log
+    run: git log --oneline -10
+args:
+  - focus
+---
+
+# Bug Hunter
+
+You are an autonomous bug-hunting agent running in a loop. Each
+iteration starts with a fresh context. Your progress lives in the
+code and git.
+
+## Test results
+
+{{ commands.tests }}
+
+## Type checking
+
+{{ commands.types }}
+
+## Lint
+
+{{ commands.lint }}
+
+## Recent commits
+
+{{ commands.git-log }}
+
+If tests, types, or lint are failing, fix that before hunting for new bugs.
+
+## Task
+
+Find and fix a real bug in this codebase.
+{{ args.focus }}
+
+Each iteration:
+
+1. **Read code** — pick a module and read it carefully. Look for
+   edge cases, off-by-one errors, missing validation, incorrect
+   error handling, race conditions, or logic errors.
+2. **Write a failing test** — prove the bug exists with a test that
+   fails on the current code.
+3. **Fix the bug** — make the test pass with a minimal fix.
+4. **Verify** — all existing tests must still pass.
+
+## Rules
+
+- One bug per iteration
+- The bug must be real — do not invent hypothetical issues
+- Always write a regression test before fixing
+- Do not change unrelated code
+- Commit with `fix: resolve <description>`
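Illustration, not part of this commit: a rough Python sketch of the loop a runtime would perform for a ralph like the one above, under stated assumptions. Each command's stdout and stderr are captured regardless of exit code, the assembled prompt is piped to the agent on stdin, and the cycle repeats. The names `capture`, `pipe_to_agent`, and `run_loop` are hypothetical, and so are the `render` callback and the 1-based iteration counter. The sketch assumes a POSIX shell and leaves out frontmatter parsing and `./` path resolution.

```python
import subprocess
from typing import Callable, Optional


def capture(run: str, timeout: Optional[float] = None) -> str:
    """Run a shell command; return stdout + stderr whatever the exit code."""
    result = subprocess.run(
        run, shell=True, text=True,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    return result.stdout


def pipe_to_agent(agent: str, prompt: str) -> subprocess.CompletedProcess:
    """Send the prompt to the agent command on stdin and capture its output."""
    return subprocess.run(agent, shell=True, input=prompt, text=True,
                          capture_output=True)


def run_loop(agent: str, commands: list, render: Callable[[dict, int], str],
             max_iterations: int) -> list:
    """Run commands, render the prompt, call the agent; repeat."""
    replies = []
    for iteration in range(1, max_iterations + 1):
        outputs = {c["name"]: capture(c["run"], c.get("timeout")) for c in commands}
        prompt = render(outputs, iteration)
        replies.append(pipe_to_agent(agent, prompt).stdout)
    return replies


# Usage with `cat` standing in for the agent, so the reply equals the prompt.
replies = run_loop(
    agent="cat",
    commands=[{"name": "tests", "run": "echo 2 failed"}],
    render=lambda outputs, i: f"iter {i}: {outputs['tests']}",
    max_iterations=2,
)
print(replies)
```

Because commands run fresh at the top of every iteration, the agent always sees the current test, type, and lint state rather than a stale snapshot.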
diff --git a/skills/ralphify-release/SKILL.md b/skills/ralphify-release/SKILL.md
diff --git a/tasks/README.md b/tasks/README.md
@@ -0,0 +1,48 @@
+# Live peek code review — task list
+
+Tasks generated from a competitive review of the uncommitted live-peek feature (the `p` toggle for agent output streaming). Each task is self-contained so a fresh agent context can pick one up without needing the others.
+
+**Pick them up in order.** Some later tasks build on earlier ones (e.g. `medium-01` depends on the capability signal introduced in `critical-01`).
+
+## Critical — must fix before merge
+
+| # | Task | Original IDs |
+|---|---|---|
+| 1 | [Capture strategy three-way branch](critical-01-capture-strategy-three-way-branch.md) | C1 + M1 |
+| 2 | [`_pump_stream` exception containment](critical-02-pump-stream-exception-handling.md) | C2 + M9 |
+| 3 | [`stdin.write` timeout enforcement](critical-03-stdin-write-timeout.md) | C3 |
+| 4 | [Bounded reader-thread joins in `finally`](critical-04-bounded-reader-thread-joins.md) | C4 + C5 + M3 |
+
+## High
+
+| # | Task | Original IDs |
+|---|---|---|
+| 5 | [Echo output + Live spinner coordination](high-01-echo-output-live-spinner-coordination.md) | H1 + M8 |
+| 6 | [`_read_agent_stream` deadline + readahead](high-02-read-agent-stream-deadline-and-buffering.md) | H2 + H3 |
+| 7 | [Test helpers pid=12345 hazard](high-03-test-helpers-pid-hazard.md) | H4 |
+| 8 | [Peek discoverability (`--help`, startup hint)](high-04-peek-discoverability.md) | H5 |
+
+## Medium
+
+| # | Task | Original IDs |
+|---|---|---|
+| 9 | [Filter `AGENT_OUTPUT_LINE` events when no subscriber](medium-01-agent-output-line-event-filtering.md) | M2 |
+| 10 | [`returncode=None` semantic change — document](medium-02-returncode-semantic-change-docs.md) | M4 |
+| 11 | [Keypress listener: EINTR / SIGTSTP / SIGCONT](medium-03-keypress-signal-handling.md) | M5 |
+| 12 | [Peek lock discipline](medium-04-peek-lock-discipline.md) | M6 + M7 |
+| 13 | [Console emitter handler interleaving](medium-05-console-emitter-handler-interleaving.md) | M10 |
+
+## How to use
+
+Each task file contains:
+- **Problem** — what's wrong and where (file:line)
+- **Why it matters** — concrete failure mode
+- **Fix direction** — not a prescription, but the shape the fix should take
+- **Done when** — checklist for completion
+- **Context** — code snippets / surrounding gotchas the agent will need
+
+Run `uv run pytest && uv run ruff check . && uv run ruff format --check . && uv run ty check` after every task. Any new behavior needs a regression test.
+
+## Out of scope
+
+Low-severity / nit findings (adding `.ralphify/` to `.gitignore`, CHANGELOG wording, the stale copy in `docs/llms-full.txt`, dumb-terminal check, import ordering, case-sensitive `p`, slow test markers, free-threaded Python compatibility) are tracked in the review transcript but not in individual task files; they can be swept up in a single follow-up PR.
diff --git a/uv.lock b/uv.lock