| 1 | +--- |
| 2 | +description: How to write effective RALPH.md prompts for autonomous coding loops — structure, patterns, anti-patterns, and real examples. |
| 3 | +--- |
| 4 | + |
| 5 | +# Writing Prompts |
| 6 | + |
| 7 | +Your `RALPH.md` is the single most important file in a ralph loop. It is the prompt the agent receives at the start of every iteration, so everything the agent does follows from what you write here. A good prompt turns an AI coding agent into a productive autonomous worker. A bad one produces noise. |
| 8 | + |
| 9 | +This guide covers the patterns that work, the mistakes that waste iterations, and how to tune your prompt while the loop is running. |
| 10 | + |
| 11 | +## The anatomy of a good prompt |
| 12 | + |
| 13 | +Every effective ralph prompt has three parts: |
| 14 | + |
| 15 | +### 1. Role and orientation |
| 16 | + |
| 17 | +Tell the agent what it is, how it works, and where progress lives. This prevents the agent from trying to have a conversation, ask questions, or wait for input. |
| 18 | + |
| 19 | +```markdown |
| 20 | +You are an autonomous coding agent running in a loop. Each iteration |
| 21 | +starts with a fresh context. Your progress lives in the code and git. |
| 22 | +``` |
| 23 | + |
| 24 | +This framing matters because the agent has **no memory between iterations**. Without it, the agent may try to pick up where it "left off" or reference work it can't see. |
| 25 | + |
| 26 | +### 2. Task source |
| 27 | + |
| 28 | +Point the agent at something concrete to work on. The most common mistake is being vague ("improve the codebase"). Instead, give the agent a **specific place to look** for work: |
| 29 | + |
| 30 | +| Pattern | When to use | |
| 31 | +|---|---| |
| 32 | +| `Read TODO.md and pick the top uncompleted task` | When you maintain a task list | |
| 33 | +| `Read PLAN.md and implement the next step` | For sequential multi-step work | |
| 34 | +| `Find the module with the lowest test coverage` | For coverage-driven testing | |
| 35 | +| `Read the codebase and find the biggest documentation gap` | For open-ended improvement | |
| 36 | +| `Fix the failing tests` | When checks feed failures back | |
| 37 | + |
| 38 | +The key is that the agent can **find work without you telling it what to do each time**. The task source is what makes the loop autonomous. |
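
To make one of these task sources concrete, here is a minimal sketch of what "find the module with the lowest test coverage" amounts to. It assumes the project produces a coverage.py JSON report (for example via `coverage json`), whose `files` entries carry a `summary.percent_covered` field. The function name `lowest_coverage_file` and the sample data are hypothetical, and nothing here is part of ralphify itself.

```python
import json


def lowest_coverage_file(report: dict) -> str:
    """Return the path with the lowest percent_covered in a coverage.py JSON report."""
    files = report["files"]
    return min(files, key=lambda path: files[path]["summary"]["percent_covered"])


# Hypothetical report data, shaped like the output of `coverage json`.
SAMPLE_REPORT = json.loads("""
{
  "files": {
    "src/app/auth.py":   {"summary": {"percent_covered": 91.5}},
    "src/app/billing.py": {"summary": {"percent_covered": 42.0}},
    "src/app/utils.py":  {"summary": {"percent_covered": 77.3}}
  }
}
""")

print(lowest_coverage_file(SAMPLE_REPORT))
```

In practice the agent can work this out on its own from the report. The point is that the task source resolves to one unambiguous target each iteration.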
| 39 | + |
| 40 | +### 3. Rules and constraints |
| 41 | + |
| 42 | +Constraints are more important than instructions. The agent knows how to code — what it doesn't know is your project's conventions, what to avoid, and when to stop. |
| 43 | + |
| 44 | +```markdown |
| 45 | +## Rules |
| 46 | + |
| 47 | +- One task per iteration |
| 48 | +- No placeholder code — full, working implementations only |
| 49 | +- Run tests before committing |
| 50 | +- Commit with a descriptive message like `feat: add X` or `fix: resolve Y` |
| 51 | +- Do not modify existing tests to make them pass |
| 52 | +``` |
| 53 | + |
| 54 | +Rules prevent the agent from: |
| 55 | + |
| 56 | +- Doing too much in one iteration (context window bloat, messy commits) |
| 57 | +- Leaving TODO comments instead of writing real code |
| 58 | +- Breaking things it shouldn't touch |
| 59 | +- Committing without validation |
| 60 | + |
| 61 | +## Patterns that work |
| 62 | + |
| 63 | +### The TODO-driven loop |
| 64 | + |
| 65 | +Maintain a `TODO.md` that the agent reads and updates each iteration. This gives you a clear task queue and visible progress. |
| 66 | + |
| 67 | +```markdown |
| 68 | +# Prompt |
| 69 | + |
| 70 | +You are an autonomous coding agent running in a loop. Each iteration |
| 71 | +starts with a fresh context. Your progress lives in the code and git. |
| 72 | + |
| 73 | +Read TODO.md for the current task list. Pick the top uncompleted task, |
| 74 | +implement it fully, then mark it done. |
| 75 | + |
| 76 | +## Rules |
| 77 | + |
| 78 | +- One task per iteration |
| 79 | +- No placeholder code — full implementations only |
| 80 | +- Run `uv run pytest -x` before committing |
| 81 | +- Commit with a descriptive message |
| 82 | +- Mark the completed task in TODO.md |
| 83 | +``` |
| 84 | + |
| 85 | +**Why it works:** The agent always knows what to do next. You control priority by reordering the list. You can add tasks while the loop runs. |
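
As an illustration of "pick the top uncompleted task", the sketch below assumes `TODO.md` uses Markdown checkboxes (`- [ ]` for open, `- [x]` for done). That convention is an assumption, not something ralphify requires, and `next_open_task` is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumed convention: "- [ ] task" is open, "- [x] task" is done.
TASK_LINE = re.compile(r"^\s*[-*] \[(?P<mark>[ xX])\] (?P<text>.+)$")


def next_open_task(todo_markdown: str) -> Optional[str]:
    """Return the text of the first unchecked task, or None if every task is done."""
    for line in todo_markdown.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group("mark") == " ":
            return match.group("text").strip()
    return None


SAMPLE_TODO = """\
# TODO

- [x] Set up project skeleton
- [ ] Add input validation to the signup form
- [ ] Write integration tests for signup
"""

print(next_open_task(SAMPLE_TODO))
```

Because the first unchecked line wins, moving a line up the file is all it takes to reprioritize.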
| 86 | + |
| 87 | +### The self-healing loop |
| 88 | + |
| 89 | +Rely on [checks](primitives.md#checks) to define "done" and let failures guide the agent. The prompt focuses on the work; the checks handle quality. |
| 90 | + |
| 91 | +```markdown |
| 92 | +--- |
| 93 | +checks: [tests, lint, typecheck] |
| 94 | +contexts: [git-log] |
| 95 | +--- |
| 96 | + |
| 97 | +# Prompt |
| 98 | + |
| 99 | +{{ contexts.git-log }} |
| 100 | + |
| 101 | +You are an autonomous coding agent. Read PLAN.md and implement the |
| 102 | +next incomplete step. If the previous iteration left failing checks, |
| 103 | +fix those first before moving on. |
| 104 | + |
| 105 | +## Rules |
| 106 | + |
| 107 | +- Fix check failures before starting new work |
| 108 | +- One step per iteration |
| 109 | +- Commit each completed step separately |
| 110 | +``` |
| 111 | + |
| 112 | +**Why it works:** Failed checks automatically inject their output into the next iteration. The agent sees exactly what broke, which is usually enough for it to work out the fix. You don't need to write error-handling instructions for every possible failure, because the check output already describes the problem. |
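
The feedback mechanism can be pictured with a small sketch. This is not ralphify's implementation; `CheckResult` and `build_prompt` are hypothetical names, and the heading text is made up. It only shows the general idea of appending the previous iteration's failure output to the base prompt.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str      # e.g. "tests" or "lint"
    passed: bool
    output: str    # captured stdout/stderr of the check command


def build_prompt(base_prompt: str, previous_results: list[CheckResult]) -> str:
    """Append the output of any failed checks to the base prompt."""
    failures = [result for result in previous_results if not result.passed]
    if not failures:
        return base_prompt
    sections = [base_prompt.rstrip(), "", "## Check failures from the previous iteration", ""]
    for result in failures:
        sections.append(f"### {result.name}")
        sections.append(result.output.strip())
        sections.append("")
    return "\n".join(sections)


results = [
    CheckResult("tests", False, "FAILED tests/test_api.py::test_login - assert 401 == 200"),
    CheckResult("lint", True, ""),
]
print(build_prompt("Read PLAN.md and implement the next incomplete step.", results))
```

Only failing checks add text, so a healthy loop keeps the prompt at its base size.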
| 113 | + |
| 114 | +### The edit-while-running loop |
| 115 | + |
| 116 | +The agent re-reads `RALPH.md` every iteration. This means you can steer the loop in real time by editing the prompt while it runs. |
| 117 | + |
| 118 | +```markdown |
| 119 | +# Prompt |
| 120 | + |
| 121 | +You are an autonomous agent improving this project's documentation. |
| 122 | + |
| 123 | +## Current focus |
| 124 | + |
| 125 | +<!-- Edit this section while the loop runs to steer the agent --> |
| 126 | +Focus on the API reference docs. Each endpoint needs a working |
| 127 | +curl example and a description of the response format. |
| 128 | + |
| 129 | +## Rules |
| 130 | + |
| 131 | +- One page per iteration |
| 132 | +- Verify all code examples run correctly |
| 133 | +- Commit with `docs: ...` prefix |
| 134 | +``` |
| 135 | + |
| 136 | +**Why it works:** When the agent does something you don't want, you add a constraint. When you want it to shift focus, you edit the "Current focus" section. The next iteration picks up your changes immediately. |
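
A minimal sketch of why live edits take effect: a loop that re-reads the prompt file at the top of every iteration always sees the latest version. The `run_loop` function and the fake agent below are hypothetical stand-ins, not ralphify's actual code.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_loop(prompt_path: Path, run_agent: Callable[[str], None], iterations: int) -> None:
    """Run the agent repeatedly, re-reading the prompt file before each iteration."""
    for _ in range(iterations):
        prompt = prompt_path.read_text(encoding="utf-8")  # fresh read, no caching
        run_agent(prompt)


seen = []

with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "RALPH.md"
    prompt_file.write_text("Focus on the API reference docs.", encoding="utf-8")

    def fake_agent(prompt: str) -> None:
        seen.append(prompt)
        # Simulate a human editing RALPH.md while the loop is running.
        prompt_file.write_text("Focus on the tutorials.", encoding="utf-8")

    run_loop(prompt_file, fake_agent, iterations=2)

print(seen)
```

The second iteration receives the edited text even though nothing restarted the loop.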
| 137 | + |
| 138 | +### The context-rich loop |
| 139 | + |
| 140 | +Use [contexts](primitives.md#contexts) to give the agent situational awareness without bloating the prompt with static text. |
| 141 | + |
| 142 | +```markdown |
| 143 | +--- |
| 144 | +checks: [tests] |
| 145 | +contexts: [git-log, test-status, open-issues] |
| 146 | +--- |
| 147 | + |
| 148 | +# Prompt |
| 149 | + |
| 150 | +{{ contexts.git-log }} |
| 151 | + |
| 152 | +{{ contexts.test-status }} |
| 153 | + |
| 154 | +{{ contexts.open-issues }} |
| 155 | + |
| 156 | +You are an autonomous bug-fixing agent. Review the open issues and |
| 157 | +test failures above. Pick the most important one and fix it. |
| 158 | + |
| 159 | +## Rules |
| 160 | + |
| 161 | +- One fix per iteration |
| 162 | +- Write a regression test for every bug fix |
| 163 | +- Commit with `fix: resolve #<issue-number>` |
| 164 | +``` |
| 165 | + |
| 166 | +**Why it works:** The agent sees fresh data every iteration — what was recently committed, what tests are failing, what issues are open. It makes informed decisions about what to work on without you updating the prompt. |
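
To show what the `{{ contexts.name }}` placeholders do conceptually, here is a small rendering sketch. It assumes a plain text substitution and is not ralphify's actual template engine. The `render` function and the sample context values are hypothetical.

```python
import re

# Matches placeholders such as "{{ contexts.git-log }}".
PLACEHOLDER = re.compile(r"\{\{\s*contexts\.([A-Za-z0-9_-]+)\s*\}\}")


def render(template: str, contexts: dict[str, str]) -> str:
    """Replace each known context placeholder with that context's current output."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Leave unknown placeholders untouched so a typo stays visible.
        return contexts.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


template = "{{ contexts.git-log }}\n\nPick the most important failure and fix it."
fresh = {"git-log": "a1b2c3d fix: resolve #12"}
print(render(template, fresh))
```

The template stays static while the substituted values are regenerated each iteration, which is what keeps the prompt short and the data current.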
| 167 | + |
| 168 | +## Anti-patterns to avoid |
| 169 | + |
| 170 | +### Too vague |
| 171 | + |
| 172 | +```markdown |
| 173 | +<!-- DON'T --> |
| 174 | +Improve the codebase. Make things better. |
| 175 | +``` |
| 176 | + |
| 177 | +The agent doesn't know what "better" means to you. It might refactor code that works fine, add unnecessary abstractions, or reorganize files in ways that break your workflow. Always point at a concrete task source. |
| 178 | + |
| 179 | +### Too many tasks per iteration |
| 180 | + |
| 181 | +```markdown |
| 182 | +<!-- DON'T --> |
| 183 | +Implement user authentication, add rate limiting, write tests for |
| 184 | +both, update the API docs, and deploy to staging. |
| 185 | +``` |
| 186 | + |
| 187 | +One iteration = one task. If the agent tries to do five things, it tends to do all of them poorly. The context window fills up, the commit is a mess, and if something breaks you can't tell which change caused it. |
| 188 | + |
| 189 | +### No validation step |
| 190 | + |
| 191 | +```markdown |
| 192 | +<!-- DON'T --> |
| 193 | +Read TODO.md, implement the next task, and commit. |
| 194 | +``` |
| 195 | + |
| 196 | +Without a validation step such as "run tests before committing", the agent can commit broken code. The next iteration then starts from a broken codebase, spends time working out what went wrong, and may make it worse. Always include a validation step, either in the prompt or via checks. |
| 197 | + |
| 198 | +### Instructions that fight the checks |
| 199 | + |
| 200 | +```markdown |
| 201 | +<!-- DON'T — the lint check already enforces this --> |
| 202 | +--- |
| 203 | +checks: [lint] |
| 204 | +--- |
| 205 | +Make sure all code passes ruff lint checks before committing. |
| 206 | +Run `ruff check .` and fix any issues. |
| 207 | +``` |
| 208 | + |
| 209 | +If you have a lint check, you don't need lint instructions in the prompt. The check runs automatically, and its failure output tells the agent exactly what to fix. Duplicating the instruction wastes prompt space and can create contradictions if the check command differs from what the prompt says. |
| 210 | + |
| 211 | +### No commit instructions |
| 212 | + |
| 213 | +```markdown |
| 214 | +<!-- DON'T --> |
| 215 | +Fix bugs from the issue tracker. |
| 216 | +``` |
| 217 | + |
| 218 | +Ralphify doesn't commit for the agent — the agent must do it. Without explicit commit instructions, some agents won't commit at all, and progress is lost when the next iteration starts fresh. Always include: |
| 219 | + |
| 220 | +```markdown |
| 221 | +- Commit with a descriptive message like `fix: resolve X that caused Y` |
| 222 | +``` |
| 223 | + |
| 224 | +## Tuning a running loop |
| 225 | + |
| 226 | +The most powerful feature of ralph loops is that you can edit `RALPH.md` while the loop is running. Here's how to use this effectively: |
| 227 | + |
| 228 | +**When the agent does something dumb, add a sign.** This is the core insight from the [Ralph Wiggum technique](https://ghuntley.com/ralph/). If the agent keeps deleting tests instead of fixing them: |
| 229 | + |
| 230 | +```markdown |
| 231 | +## Rules |
| 232 | +- Do NOT delete or skip failing tests — fix the code instead |
| 233 | +``` |
| 234 | + |
| 235 | +**When the agent gets stuck in a loop,** it's usually because the prompt is ambiguous about what to do when something fails. Add explicit fallback instructions: |
| 236 | + |
| 237 | +```markdown |
| 238 | +- If you can't fix a failing test after one attempt, move on to the next task |
| 239 | + and leave a TODO comment explaining the issue |
| 240 | +``` |
| 241 | + |
| 242 | +**When you want to shift focus,** edit the task source. Change "Read TODO.md" to "Focus only on the API module" and the next iteration follows the new direction. |
| 243 | + |
| 244 | +**When the agent is too ambitious,** tighten the scope constraint: |
| 245 | + |
| 246 | +```markdown |
| 247 | +- Touch at most 3 files per iteration |
| 248 | +- Do not refactor code that isn't directly related to the current task |
| 249 | +``` |
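
A scope rule such as "at most 3 files" is easy to verify after the fact. The sketch below assumes you feed it the output of `git diff --name-only` for the iteration's commit. The helper names are hypothetical, and wiring it up as a ralphify check is left out because that depends on your setup.

```python
def changed_files(git_name_only_output: str) -> list[str]:
    """Parse the output of `git diff --name-only` into a list of paths."""
    return [line.strip() for line in git_name_only_output.splitlines() if line.strip()]


def within_file_limit(git_name_only_output: str, limit: int = 3) -> bool:
    """Return True when the number of changed files does not exceed the limit."""
    return len(changed_files(git_name_only_output)) <= limit


# Hypothetical diff output listing four changed paths.
sample = "src/api/routes.py\nsrc/api/models.py\ntests/test_routes.py\ndocs/api.md\n"
print(within_file_limit(sample))  # four paths against a limit of 3
```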
| 250 | + |
| 251 | +## Prompt size and context windows |
| 252 | + |
| 253 | +Keep your prompt focused. A long prompt with every possible instruction eats into the agent's context window, leaving less room for the actual codebase. |
| 254 | + |
| 255 | +Rules of thumb: |
| 256 | + |
| 257 | +- **Core prompt:** 20-50 lines is the sweet spot. Enough to be specific, short enough to leave room for work. |
| 258 | +- **Contexts:** Use `{{ contexts.name }}` placeholders to inject only the data the agent needs. Don't dump everything — pick the 2-3 most useful signals. |
| 259 | +- **Check failure output:** This is injected automatically and can be long. If your checks produce verbose output, consider using scripts that filter to the relevant lines. |
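
One way to keep check output short is a small filter script. The sketch below assumes pytest-style output, where assertion details start with `E` and the summary lines start with `FAILED` or `ERROR`. The function name, the pattern, and the line cap are illustrative choices, not ralphify features.

```python
import re

# Keep pytest summary lines ("FAILED ...", "ERROR ...") and assertion details ("E   ...").
RELEVANT = re.compile(r"^(FAILED|ERROR)\b|^E\s")


def filter_check_output(raw: str, max_lines: int = 40) -> str:
    """Keep only the lines that describe failures, capped at max_lines."""
    kept = [line for line in raw.splitlines() if RELEVANT.search(line)]
    return "\n".join(kept[:max_lines])


SAMPLE_OUTPUT = """\
============ test session starts ============
collected 3 items
tests/test_math.py .F.
    def test_add():
>       assert add(1, 2) == 4
E       assert 3 == 4
FAILED tests/test_math.py::test_add - assert 3 == 4
============ 1 failed, 2 passed in 0.05s ============
"""

print(filter_check_output(SAMPLE_OUTPUT))
```

The cap matters as much as the pattern: a run with hundreds of failures would otherwise still flood the next prompt.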
| 260 | + |
| 261 | +## Next steps |
| 262 | + |
| 263 | +- [Getting Started](getting-started.md) — set up your first loop |
| 264 | +- [Primitives](primitives.md) — full reference for checks, contexts, and ralphs |
| 265 | +- [Cookbook](cookbook.md) — copy-pasteable setups for common use cases |