docs: refine ralph standard blog post based on feedback
Rewrite skill comparison to emphasize inner/outer loop distinction and
intentional format familiarity. Remove harness engineering and autoresearch
as inspirations. Focus on controlling the outer loop and injecting context
into the inner loop. Add ralph add install instructions and cookbook link.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs/blog/posts/the-ralph-standard.md (33 additions, 31 deletions)
@@ -10,15 +10,15 @@ keywords: RALPH.md format, agent loop standard, autonomous coding format, ralph
 
 # An agent skill-like standard for autonomous agent loops
 
-I've spent the last few weeks messing around with [ralph loops](https://ghuntley.com/ralph/)— running an agent against a prompt in a while loop. The more I used them, the more I wanted a reusable format: deterministic scripts between iterations, their output optionally injected into the prompt, and a way to parametrize the whole thing so one loop definition works across projects.
+I've spent the last few weeks messing around with [ralph loops](https://ghuntley.com/ralph/) - running an agent against a prompt in a while loop. The more I used them, the more I wanted a reusable format: deterministic scripts between iterations, their output optionally injected into the prompt, and a way to parametrize the whole thing so one loop definition works across projects.
 
 So I designed one.
 
 <!-- more -->
 
 ## The format
 
-A ralph is a self-contained directory. The only required file is `RALPH.md`— everything else is optional context:
+A ralph is a self-contained directory. The only required file is `RALPH.md` - everything else is optional context:
 
 ```
 bug-hunter/
@@ -78,18 +78,18 @@ Find and fix a real bug in this codebase.
 Each iteration:
 
-1. **Read code** — pick a module and read it carefully. Look for
+1. **Read code** - pick a module and read it carefully. Look for
-2. **Write a failing test** — prove the bug exists with a test
+2. **Write a failing test** - prove the bug exists with a test
    that fails on the current code.
-3. **Fix the bug** — make the test pass with a minimal fix.
-4. **Verify** — all existing tests must still pass.
+3. **Fix the bug** - make the test pass with a minimal fix.
+4. **Verify** - all existing tests must still pass.
 
 ## Rules
 
 - One bug per iteration
-- The bug must be real — do not invent hypothetical issues
+- The bug must be real - do not invent hypothetical issues
 - Always write a regression test before fixing
 - Do not change unrelated code
 - Commit with `fix: resolve <description>`
@@ -99,28 +99,18 @@ Each iteration:
 The whole format is four things:
 
-1. **`agent`** — the command to run (anything that reads stdin)
-2. **`commands`** — deterministic feedback commands that run between iterations
-3. **`args`** — declared arguments to parametrize the ralph from the command line
-4. **A prompt body** — with `{{ placeholders }}` for command output and arguments
+1. **`agent`** - the command to run (anything that reads stdin)
+2. **`commands`** - deterministic feedback commands that run between iterations
+3. **`args`** - declared arguments to parametrize the ralph from the command line
+4. **A prompt body** - with `{{ placeholders }}` for command output and arguments
 
-Each iteration: run the commands, optionally inject their output into the prompt via `{{ commands.<name> }}`, resolve `{{ args.<name> }}` placeholders for ad-hoc steering, pipe the assembled prompt to the agent, agent does its thing, repeat. Fresh context every cycle. State, progress, strategy — it all lives in the project's filesystem. Git history, markdown docs, plan files, whatever makes sense. The format doesn't prescribe where state goes.
+Each iteration: run the commands, optionally inject their output into the prompt via `{{ commands.<name> }}`, resolve `{{ args.<name> }}` placeholders for ad-hoc steering, pipe the assembled prompt to the agent, agent does its thing, repeat. Fresh context every cycle.
 
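The per-iteration assembly described in that paragraph can be sketched in a few lines of Python. This is a rough illustration, not Ralphify's actual implementation; all names are made up:

```python
import re
import subprocess

def render(template: str, values: dict[str, str]) -> str:
    # Fill {{ commands.<name> }} / {{ args.<name> }} placeholders;
    # unmatched placeholders resolve to the empty string.
    return re.sub(
        r"\{\{\s*([\w.]+)\s*\}\}",
        lambda m: values.get(m.group(1), ""),
        template,
    )

def run_iteration(prompt: str, commands: dict[str, str],
                  args: dict[str, str], agent: str) -> None:
    # 1. Run the deterministic feedback commands and capture their output.
    values = {
        f"commands.{name}": subprocess.run(
            cmd, shell=True, capture_output=True, text=True
        ).stdout
        for name, cmd in commands.items()
    }
    # 2. Resolve declared arguments for ad-hoc steering.
    values.update({f"args.{name}": val for name, val in args.items()})
    # 3. Pipe the assembled prompt to the agent; fresh context every cycle.
    subprocess.run(agent, shell=True, input=render(prompt, values), text=True)
```

A real runner would also handle command failures, logging, and a stop condition, but the placeholder semantics are the interesting part: unmatched placeholders simply disappear.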
 ## Design decisions
 
-**Why a directory, not just a file?** Same reason the [Agent Skills](https://agentskills.io/) format uses a directory. A `RALPH.md` on its own is enough for simple loops, but real-world loops often need a shell script for a custom check (`./check-coverage.sh`), reference docs for progressive disclosure (`coding-guidelines.md`, `architecture.md`), data files, templates. Commands starting with `./` run relative to the ralph directory, so bundled scripts just work. The directory is the unit of sharing — copy it, check it into a repo, and the whole loop travels together.
+**Why a directory, not just a file?** Same reason the [Agent Skills](https://agentskills.io/) format uses a directory. A `RALPH.md` on its own is enough for simple loops, but ralph loops often benefit from being bundled with shell scripts for custom checks and context injection, and with reference docs for progressive disclosure (`coding-guidelines.md`, `architecture.md`). Commands starting with `./` run relative to the ralph directory, so bundled scripts just work. The directory, then, is the unit of sharing.
 
-**Why not just make it a skill?** They look similar on the surface — both are directories with a markdown file and optional bundled resources. But a skill is loaded once when an agent decides it's relevant. It adds knowledge to a single session. A ralph is executed repeatedly — it's the outer loop that launches the agent, feeds it deterministic feedback, and kicks off the next iteration. Skills live inside an agent's context. Ralphs live outside, orchestrating from the outside in. You could use both together — a ralph that runs an agent which has skills installed. Complementary layers, not competing ones.
-
-## What shaped the format
-
-Two things influenced the design more than anything else. OpenAI's [harness engineering](https://openai.com/index/harness-engineering/) post — build deterministic infrastructure around the agent, keep progress as markdown in the codebase, don't try to make the agent smarter. And Karpathy's [autoresearch](https://github.com/karpathy/autoresearch) — one hard metric, ~700 experiments in two days, changes that don't improve the number get reverted.
-
-`commands` in a ralph are the mechanism for all of this. They're not just ground truth — they're the control structure around the loop. Enforce file and directory conventions. Run checks. Inject dynamic context. Gate progress. The deterministic scaffolding that lets you trust the agent to operate autonomously.
-
-The easy wins are tasks with hard metrics — test coverage, validation loss, a reference implementation to compare against. But I think ralph loops have the potential to take on much fuzzier, higher-level work. A PRD. A loose description of a desired outcome. A strategic goal. For that kind of work, how you frame the outcome matters more than how specifically you instruct the agent. Too specific and the agent overfits to your instructions. Too vague and it drifts. There's a weird golden balance, and I've been reaching for Jobs-to-be-Done as a prompting technique — express the outcome to optimize for, not the steps to take.
-
-That's what I want to build towards — a format and a tool that enable increasingly ambitious and fuzzy things to be achieved with ralph loops. Because AI is truly powerful when it surprises you with solutions to problems you didn't anticipate when you kicked off the loop. True discovery. The iterative, fresh-context way of working makes that possible in a way that single-shot prompting doesn't. A good ralph engineer figures out how to get results with agents that are as autonomous as possible — because that means the strategy and outcome definition are good enough for the agent to make decisions you couldn't have predicted. That's the power of it. And honestly why I keep rabbit-holing on all of this.
+**Why not just make it a skill?** They look similar on the surface - both are directories with a markdown file and optional bundled resources. That similarity is intentional - the skill format has become familiar to a lot of people, and borrowing its shape makes ralphs easy to understand at a glance. But they serve different layers. A skill provides knowledge about reusable processes in the inner loop - the agent's session. A ralph steers the outer loop by running code between iterations to deterministically control the environment and optionally inject context into the inner loop before kicking off the next iteration.
 
 ## Try it
@@ -129,23 +119,35 @@ I'm building a tool called [Ralphify](https://github.com/computerlovetech/ralphi
 ```bash
 uv tool install ralphify
 
-# point the bug hunter at a specific area
-ralph run bug-hunter --focus "authentication and session handling"
+# point it at a directory containing a RALPH.md
+ralph run ./ralphs/bug-hunter --focus "authentication and session handling"
 
 # same ralph, different focus
-ralph run bug-hunter --focus "edge cases in the payment flow"
+ralph run ./ralphs/bug-hunter --focus "edge cases in the payment flow"
 
-# or run it without args — unmatched placeholders just resolve to empty
-ralph run bug-hunter
+# or run it without args - unmatched placeholders just resolve to empty
+ralph run ./ralphs/bug-hunter
 ```
 
 Declare `args: [focus]` and you get `--focus` on the CLI. The value fills `{{ args.focus }}` in the prompt. One ralph, many use cases.
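For illustration, a minimal `bug-hunter/RALPH.md` tying the frontmatter keys to the placeholders might look like this (the `agent` value and `./run-tests.sh` script are made-up examples, not prescribed by the format):

```
---
agent: my-agent            # hypothetical: any command that reads the prompt on stdin
commands:
  tests: ./run-tests.sh    # hypothetical bundled script, run between iterations
args: [focus]
---

Find and fix a real bug in this codebase.

Focus area: {{ args.focus }}

Latest test output:

{{ commands.tests }}
```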
-Ralphify is just one implementation though. The format itself is what I care about most — it's just YAML frontmatter and markdown. Any tool could read it and run the loop. I can't predict what will end up being useful here. But I built this, and maybe someone else finds the format interesting enough to build on or take in a direction I haven't thought of. The Agent Skills format started as one team's idea and ended up adopted by dozens of agents. I don't know if the same thing happens here, but the format is simple enough that it could.
+Because ralphs are just directories in a git repo, anyone can share them. If a repo contains a directory with a `RALPH.md`, you can install it with `ralph add`:
+
+```bash
+# install a specific ralph from any GitHub repo
+ralph add owner/repo/ralph-name
+
+# install all ralphs in a repo
+ralph add owner/repo
+```
+
+The [ralphify examples](https://github.com/computerlovetech/ralphify/tree/main/examples) are a good place to start - and the [cookbook](https://ralphify.co/docs/cookbook/) has more.
+
+Ralphify is just one implementation though. The format itself is what I care about most - it's just YAML frontmatter and markdown. Any tool could read it and run the loop. I can't predict what will end up being useful here. But I built this, and maybe someone else finds the format interesting enough to build on or take it in a direction I haven't thought of.
 ## I'd love feedback
 
-This is where my thinking landed, but I'm sure there are blind spots. If you're running agent loops — for coding, research, testing, or something I haven't thought of — I'd genuinely like to hear what you think.
+This is where my thinking landed, but I'm sure there are blind spots. If you're running agent loops - for coding, research, testing, or something I haven't thought of - I'd genuinely like to hear what you think.
 
 - **Share a use case**: [open an issue](https://github.com/computerlovetech/ralphify/issues) describing how you'd use this, or how you already run agent loops. The weird, unexpected ones are the most useful.
 - **Poke holes in the format**: if something feels wrong or missing, I want to know.