Commit dadcfe6

Release v0.5.0
1 parent d05a82d commit dadcfe6

13 files changed

Lines changed: 239 additions & 39 deletions

CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -5,6 +5,16 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [0.5.0] - 2026-03-20
+
+### Added
+- Optional explicit `--done-when "TEXT"` success contract for Forge task scopes
+
+### Changed
+- Forge now treats open-text task completion as the primary objective and KPI targets as guardrails
+- Codex driver status and prompt rendering now surface success mode and `done_when` text
+- README and driver docs now describe task-derived completion checks when no explicit success override is provided
+
 ## [0.4.2] - 2026-03-20

 ### Added
@@ -99,6 +109,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Multi-language support in MEASURE phase (Elixir, Python, JavaScript, Ruby, Go)
 - Simultaneous multi-KPI completion gate

+[0.5.0]: https://github.com/DjinnFoundry/forge-loop/releases/tag/v0.5.0
 [0.4.2]: https://github.com/DjinnFoundry/forge-loop/releases/tag/v0.4.2
 [0.4.1]: https://github.com/DjinnFoundry/forge-loop/releases/tag/v0.4.1
 [0.4.0]: https://github.com/DjinnFoundry/forge-loop/releases/tag/v0.4.0

README.md

Lines changed: 38 additions & 12 deletions
@@ -21,17 +21,18 @@
 **Forge Core with first-class drivers for [Claude Code](https://docs.anthropic.com/en/docs/claude-code) and Codex/manual workflows.**

 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
-[![Version](https://img.shields.io/badge/version-0.4.2-green.svg)](CHANGELOG.md)
+[![Version](https://img.shields.io/badge/version-0.5.0-green.svg)](CHANGELOG.md)

-Forge is a protocol plus an adapter. The protocol defines KPI tracking, state, strategy rotation, evaluation cadence, and completion rules. The bundled adapter makes that protocol run inside Claude Code with commands, agents, and a stop hook.
+Forge is a protocol plus adapters. The protocol defines task success, KPI guardrails, state, strategy rotation, evaluation cadence, and completion rules. The bundled drivers make that protocol run inside Claude Code and Codex/manual workflows.

 ```
-You: /forge "API controllers" --coverage 90 --speed -30%
+You: /forge "password reset flow" --done-when "users can request and complete a reset end-to-end" --coverage 90 --speed -30%

 Forge: Measuring baseline... 85.2% coverage, 120s
+Success contract: password reset works end-to-end
 Strategy: coverage-push → 15 tests for edge cases
 85.8% (+0.6%), 118s (-2s) ✓
-...iterates until all targets met simultaneously...
+...iterates until task success and KPI targets are both satisfied...
 ```

 ---
@@ -43,6 +44,7 @@ Forge: Measuring baseline... 85.2% coverage, 120s
 The portable part of the system:

 - iteration protocol (Orient → Measure → Evaluate → Decide → Execute → Verify → Record → Complete)
+- task-driven success contract with optional explicit `done_when`
 - state format and autoregressive memory
 - KPI targets (coverage, speed, quality)
 - strategy selection and stagnation handling
@@ -71,7 +73,7 @@ The bundled Codex/manual adapter in this repo:
 - `.codex/forge/` state layout for per-project sessions
 - shared shell state helpers reused across drivers

-Both drivers are first-class in `v0.4.2`. The difference is automation depth:
+Both drivers are first-class in `v0.5.0`. The difference is automation depth:
 Claude gets hook-driven iteration; Codex gets manual driver scripts that print
 the next prompt and manage session state.

@@ -83,7 +85,7 @@ the next prompt and manage session state.
 | Codex CLI | First-class manual driver | Install script, `forge-init`, `forge-continue`, `forge-cancel`, project-local state |
 | Other agents / plain shell | Protocol-only | Reuse the protocol and state model manually |

-Forge is not claiming native parity across agent runtimes. `v0.4.2` ships two real drivers with different control surfaces.
+Forge is not claiming native parity across agent runtimes. `v0.5.0` ships two real drivers with different control surfaces.

 ---

@@ -105,14 +107,24 @@ Each iteration executes one complete eight-phase cycle:

 | Phase | What happens |
 |-------|-------------|
-| **A. Orient** | Read forge-state file, check position + trends + stagnation count |
+| **A. Orient** | Read forge-state file, check task success contract + KPI trends + stagnation count |
 | **B. Measure** | Run tests with coverage, capture KPIs |
 | **C. Evaluate** | Every 3rd iteration: spawn fresh-context subagent for unbiased audit |
 | **D. Decide** | Pick strategy from KPI gaps + findings + lessons |
 | **E. Execute** | Apply ONE focused transformation |
 | **F. Verify** | Tests must be green, re-measure KPIs |
 | **G. Record** | Update forge-state with deltas + lessons (the autoregressive step) |
-| **H. Complete** | All targets met simultaneously? Done. Otherwise, next iteration. |
+| **H. Complete** | Task success contract satisfied and KPI targets met? Done. Otherwise, next iteration. |
+
+### Success Contract
+
+Forge is built for open-text work, not just KPI chasing.
+
+- The task scope is the primary objective.
+- `--done-when "TEXT"` is an optional explicit success override.
+- If `--done-when` is omitted, Forge derives concrete completion checks from the task scope and records them in Forge state.
+- Coverage, speed, and quality stay as guardrails alongside the task itself.
+- Completion means both the task and the guardrails are satisfied.
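Review note: the completion rule in the Success Contract bullets (task satisfied and every guardrail satisfied) can be read as a single predicate. The sketch below is illustrative only and is not taken from the Forge sources; the function name `forge_is_complete` and its argument order are hypothetical, and the task check is reduced to a `true`/`false` flag supplied by the caller.

```shell
# Hypothetical helper, not part of this commit.
# Returns 0 only when the task contract AND every KPI guardrail hold.
# Args: task_ok coverage coverage_target speed_seconds speed_target_seconds quality_ok
forge_is_complete() {
  [ "$1" = "true" ] || return 1   # task success contract not yet satisfied
  [ "$6" = "true" ] || return 1   # quality gate not yet satisfied
  # awk performs the floating-point comparisons that plain sh cannot
  awk -v c="$2" -v ct="$3" -v s="$4" -v st="$5" \
    'BEGIN { exit !((c + 0 >= ct + 0) && (s + 0 <= st + 0)) }'
}
```

With the README's sample numbers, `forge_is_complete true 85.8 90 118 84 true` returns non-zero because coverage is still below the target, so the loop would continue even if the task itself looked finished.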

 ### Strategies

@@ -183,7 +195,7 @@ Codex support is manual by design, but it is now a real shipped driver.

 Typical flow:

-1. Run `forge-init "scope" ...` in the target project.
+1. Run `forge-init "scope" [--done-when "TEXT"] ...` in the target project.
 2. Paste the printed prompt into Codex.
 3. After each iteration, run `forge-continue` to print the next prompt.
 4. Use `forge-status` to inspect the active session.
@@ -209,15 +221,22 @@ Driver safety:
 /forge "LiveView components" --coverage 95 --speed -20%
 ```

+#### Open-text task with explicit success
+
+```
+/forge "password reset flow" --done-when "users can request, receive, and complete a reset end-to-end" --coverage 90 --quality strict
+```
+
 #### All options

 ```
-/forge "SCOPE" --coverage N --speed -N% --quality strict|moderate|lax --max-iterations N
+/forge "SCOPE" [--done-when "TEXT"] --coverage N --speed -N% --quality strict|moderate|lax --max-iterations N
 ```

 | Option | Default | Description |
 |--------|---------|-------------|
 | `SCOPE` | (required) | What to improve — quoted string |
+| `--done-when "TEXT"` | task-derived | Explicit success contract. If omitted, derive completion checks from the task itself |
 | `--coverage N` | baseline + 2 | Minimum coverage % target |
 | `--speed -N%` | -20% | Speed reduction from baseline |
 | `--quality` | moderate | strict (0 high, 0 med) / moderate (0 high, ≤3 med) / lax (0 high, ≤5 med) |
@@ -253,6 +272,13 @@ Other runtimes can reuse the same format in a different state root. Each iterati
 ---
 session_id: "0320-1430-a3b2"
 scope: "API controllers"
+success:
+  mode: "task-derived"
+  task: "API controllers"
+  done_when: null
+  completion_checks:
+    - "controller edge cases covered and passing"
+    - "no controller path regresses current behavior"
 baseline:
   coverage: 85.2
   speed_seconds: 120
@@ -347,7 +373,7 @@ Distilled from studying autoresearch, Ralph Wiggum, pi-autoresearch, SICA, and a
 | Strategy | Single prompt | 8 named strategies, auto-rotation on stagnation |
 | Evaluation | Self-evaluation (anchoring bias) | Fresh-context audits every 3 iterations |
 | Memory | Context window only | Persistent state file survives compaction |
-| Completion | Manual / hope | Exact completion marker after protocol checks |
+| Completion | Manual / hope | Exact completion marker after task success plus protocol checks |
 | Lessons | Lost between iterations | Accumulated, inform strategy selection |
 | Stagnation | Repeats same approach | Detects + rotates after low-delta iterations |
 | Portability | Rebuild per runtime | Portable protocol, Claude and Codex drivers bundled |
@@ -357,7 +383,7 @@ Distilled from studying autoresearch, Ralph Wiggum, pi-autoresearch, SICA, and a
 ## Claims We Are Willing To Make

 - Forge packages proven loop patterns into a reusable protocol with first-class Claude Code and Codex/manual drivers.
-- Forge improves repeatability versus ad-hoc prompting when you care about KPI targets, iteration memory, and strategy rotation.
+- Forge improves repeatability versus ad-hoc prompting when you care about task success, KPI guardrails, iteration memory, and strategy rotation.
 - Forge does **not** yet provide universal runtime adapter parity beyond the shipped drivers.
 - Forge is more preconfigured than raw hooks. It is not a new primitive.

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.4.2
+0.5.0

commands/forge-status.md

Lines changed: 3 additions & 0 deletions
@@ -22,6 +22,9 @@ Show the current Claude Code driver status for this project.
    - forge-state path if present
 5. If the forge-state file exists, also show:
    - scope
+   - success mode
+   - done_when text, or that it is task-derived if no explicit override exists
    - current strategy
    - stagnation_count
+   - whether completion checks have been recorded yet if the success block is present
 6. Do not mutate any files. This command is read-only.

commands/forge.md

Lines changed: 18 additions & 12 deletions
@@ -1,6 +1,6 @@
 ---
-description: "KPI-driven codebase improvement loop"
-argument-hint: '"SCOPE" --coverage N --speed -N% --quality strict|moderate|lax [--max-iterations N]'
+description: "Task-driven Forge loop with KPI guardrails"
+argument-hint: '"SCOPE" [--done-when "TEXT"] --coverage N --speed -N% --quality strict|moderate|lax [--max-iterations N]'
 ---

 # Forge Command
@@ -15,28 +15,33 @@ This command is the Claude Code driver for Forge Core.

 ## Argument Parsing

-1. **SCOPE**: Quoted string describing what to improve (e.g., "LiveView components", "API controllers")
-2. `--coverage N`: Minimum coverage % target (default: current baseline + 2)
-3. `--speed -N%`: Speed reduction target as percentage (default: -20% from baseline)
-4. `--quality strict|moderate|lax`: Quality gate level (default: moderate)
+1. **SCOPE**: Quoted string describing the primary task or area to improve (e.g., "Build password reset flow", "LiveView components")
+2. `--done-when "TEXT"`: Optional explicit success override. If omitted, derive concrete completion checks from the task scope and persist them in forge-state.
+3. `--coverage N`: Minimum coverage % target (default: current baseline + 2)
+4. `--speed -N%`: Speed reduction target as percentage (default: -20% from baseline)
+5. `--quality strict|moderate|lax`: Quality gate level (default: moderate)
    - strict: 0 high, 0 medium findings
    - moderate: 0 high, <= 3 medium
    - lax: 0 high, <= 5 medium
-5. `--max-iterations N`: Safety limit (default: 20)
+6. `--max-iterations N`: Safety limit (default: 20)
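Review note: the three quality levels in the argument list reduce to one threshold on medium findings plus a hard zero on high findings. The following is a minimal sketch under that reading; `forge_quality_gate` is a hypothetical name, and how Forge actually counts findings is not shown in this diff.

```shell
# Hypothetical helper, not part of this commit.
# Exit status 0 when the findings fit the chosen gate.
# Args: level(strict|moderate|lax) high_findings medium_findings
forge_quality_gate() {
  case "$1" in
    strict)   max_medium=0 ;;
    moderate) max_medium=3 ;;
    lax)      max_medium=5 ;;
    *) echo "unknown quality level: $1" >&2; return 2 ;;
  esac
  # Every level requires zero high-severity findings.
  [ "$2" -eq 0 ] && [ "$3" -le "$max_medium" ]
}
```

For example, `forge_quality_gate moderate 0 3` succeeds while `forge_quality_gate moderate 0 4` fails, matching the `<= 3 medium` rule.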

 If SCOPE is missing, ask what area to focus on.

 ## Launch Sequence

 1. **Measure baseline**: Run test suite with coverage, parse coverage/speed/tests/failures
 2. **Generate session ID**: `MMDD-HHMM-XXXX` format
-3. **Compute targets** from arguments:
+3. **Establish success contract**:
+   - primary task: `SCOPE`
+   - explicit success override: `--done-when "TEXT"` if provided
+   - otherwise: derive concrete completion checks from the task itself during iteration 1 and persist them in forge-state
+4. **Compute targets** from arguments:
    - coverage: `--coverage N` if provided, else `baseline_coverage + 2`
    - speed: `--speed -N%` if provided, compute `baseline_speed * (1 - N/100)`, else `baseline_speed * 0.8`
    - quality: `--quality` value or "moderate"
-4. **Create forge state file**: `.claude/forge-state.SESSION.md` with baseline + targets
-5. **Create loop state file**: `.claude/ralph-loop.SESSION.local.md` with forge prompt
-6. **Report** baseline, targets, and begin first iteration
+5. **Create forge state file**: `.claude/forge-state.SESSION.md` with success contract + baseline + targets
+6. **Create loop state file**: `.claude/ralph-loop.SESSION.local.md` with forge prompt
+7. **Report** baseline, targets, and begin first iteration
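Review note: the target arithmetic in the "Compute targets" step is small enough to check by hand. The sketch below applies the two formulas from the list (`baseline_coverage + 2` and `baseline_speed * (1 - N/100)`); the function name `forge_compute_targets`, its positional arguments, and the one-decimal output format are assumptions made for illustration.

```shell
# Hypothetical helper, not part of this commit.
# Prints "coverage_target speed_target_seconds".
# Args: baseline_coverage baseline_speed_seconds [coverage_arg] [speed_arg, e.g. -30%]
forge_compute_targets() {
  awk -v bc="$1" -v bs="$2" -v cov="${3:-}" -v spd="${4:--20%}" 'BEGIN {
    gsub(/[-%]/, "", spd)                 # "-30%" becomes "30"
    ct = (cov == "") ? bc + 2 : cov + 0   # default: baseline + 2
    st = bs * (1 - spd / 100)             # default -20% gives baseline * 0.8
    printf "%.1f %.1f\n", ct, st
  }'
}

forge_compute_targets 85.2 120 90 -30%   # prints: 90.0 84.0
forge_compute_targets 85.2 120           # prints: 87.2 96.0
```

The first call uses the README's sample baseline (85.2% coverage, 120s) with `--coverage 90 --speed -30%`; the second shows the defaults.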

 ## Loop Prompt (written to state file)

@@ -45,6 +50,7 @@ Read .claude/forge-state.SESSION.md and follow The Forge Protocol (A through H).

 SCOPE: {parsed scope}
 SESSION: {session_id}
+DONE WHEN: {explicit done_when or derive from task and record in forge-state}

 You are in a forge loop. Each iteration:
 A. ORIENT - Read forge-state, check position + trends + stagnation
@@ -54,7 +60,7 @@ D. DECIDE - Pick strategy from KPI gaps + findings + lessons
 E. EXECUTE - ONE focused change using appropriate subagent
 F. VERIFY - Tests must be green, re-measure with coverage
 G. RECORD - Update forge-state with deltas + lessons (autoregressive step)
-H. COMPLETE - ALL targets met simultaneously? → output RALPH_COMPLETE on its own line
+H. COMPLETE - Task success contract satisfied AND KPI targets met? → output RALPH_COMPLETE on its own line

 Refer to the forge skill for the full protocol.

drivers/codex/README.md

Lines changed: 16 additions & 0 deletions
@@ -6,6 +6,20 @@ It reuses Forge Core, but does not depend on Claude Code commands or stop hooks.
 Instead it ships small shell entrypoints that manage loop state and print the
 next prompt to run in Codex.

+Forge is task-driven, not just KPI-driven. Each session stores:
+
+- the open-text task scope
+- either an explicit `--done-when "TEXT"` success contract or a task-derived one
+- the normal KPI guardrails for coverage, speed, quality, and max iterations
+
+Typical flow:
+
+1. `forge-init "scope" --done-when "what finished means"`
+2. Paste the printed prompt into Codex
+3. Record iteration results in Forge state
+4. Run `forge-continue` for the next prompt
+5. Use `forge-status` to inspect scope, success mode, and next iteration
+
 ## Files

 - `bin/forge-init` — create a new Forge session for the current project
@@ -31,3 +45,5 @@ location differs from the Claude Code adapter.
   entries in Forge state instead of blindly incrementing loop metadata.
 - If multiple active Codex sessions exist, `forge-continue` and `forge-cancel`
   require an explicit session id instead of guessing.
+- Open-text `scope` and `--done-when` values are persisted in Forge state and
+  rendered back into prompts and status output.

drivers/codex/bin/forge-continue

Lines changed: 5 additions & 4 deletions
@@ -29,11 +29,12 @@ if [[ ! -f "$loop_state" ]] || [[ ! -f "$forge_state" ]]; then
   exit 1
 fi

-active="$(forge_frontmatter_value "$loop_state" "active")"
-max_iterations="$(forge_frontmatter_value "$loop_state" "max_iterations")"
-scope="$(forge_frontmatter_value "$forge_state" "scope")"
+active="$(forge_strip_quotes "$(forge_frontmatter_value "$loop_state" "active")")"
+max_iterations="$(forge_strip_quotes "$(forge_frontmatter_value "$loop_state" "max_iterations")")"
+scope="$(forge_strip_quotes "$(forge_frontmatter_value "$forge_state" "scope")")"
 recorded_iteration="$(forge_recorded_iteration "$forge_state")"
 iteration=$((recorded_iteration + 1))
+done_when_text="$(forge_done_when_text "$forge_state")"

 if [[ "$active" != "true" ]]; then
   echo "Error: Session ${session_id} is not active." >&2
@@ -50,7 +51,7 @@ if (( iteration > max_iterations )); then
   exit 1
 fi

-prompt_text="$(forge_render_prompt "$PROMPT_TEMPLATE" "$session_id" "$scope" "$iteration")"
+prompt_text="$(forge_render_prompt "$PROMPT_TEMPLATE" "$session_id" "$scope" "$iteration" "$done_when_text")"

 echo "Forge Codex driver session ${session_id}, iteration ${iteration}."
 echo "Forge state: .codex/forge/forge-state.${session_id}.md"
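Review note: the body of `forge_strip_quotes` lives in the shared library and is not part of the hunks shown here. The wrapping matters because a frontmatter value read back as `"true"` (quotes included) would fail the later `[[ "$active" != "true" ]]` comparison. A plausible shape for such a helper, offered only as an assumption and deliberately given a different name, is:

```shell
# Assumed behavior, NOT the actual forge_strip_quotes implementation:
# remove one pair of surrounding double quotes, if present, and print the result.
strip_quotes_sketch() {
  value="$1"
  case "$value" in
    \"*\")
      value="${value#\"}"   # drop the leading quote
      value="${value%\"}"   # drop the trailing quote
      ;;
  esac
  printf '%s\n' "$value"
}
```

Unquoted values such as `20` pass through unchanged, so the helper is safe to apply to every frontmatter read.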

drivers/codex/bin/forge-init

Lines changed: 22 additions & 3 deletions
@@ -10,7 +10,7 @@ source "$LIB_PATH"

 usage() {
   cat <<'EOF'
-Usage: forge-init "SCOPE" [--coverage N] [--speed -N%] [--quality strict|moderate|lax] [--max-iterations N]
+Usage: forge-init "SCOPE" [--done-when "TEXT"] [--coverage N] [--speed -N%] [--quality strict|moderate|lax] [--max-iterations N]
 EOF
 }

@@ -34,6 +34,7 @@ coverage_target=""
 speed_target="-20%"
 quality_target="moderate"
 max_iterations="20"
+done_when=""

 while [[ $# -gt 0 ]]; do
   case "$1" in
@@ -52,6 +53,11 @@ while [[ $# -gt 0 ]]; do
       quality_target="$2"
       shift 2
       ;;
+    --done-when)
+      require_value "$1" "${2:-}"
+      done_when="$2"
+      shift 2
+      ;;
     --max-iterations)
       require_value "$1" "${2:-}"
       max_iterations="$2"
@@ -96,16 +102,27 @@ timestamp="$(date -u +"%m%d-%H%M")"
 rand="$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n' | cut -c1-4)"
 session_id="${timestamp}-${rand}"
 started_at="$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
+success_mode="$(if [[ -n "$done_when" ]]; then printf 'explicit'; else printf 'task-derived'; fi)"
+done_when_yaml="null"
+if [[ -n "$done_when" ]]; then
+  done_when_yaml="$(forge_yaml_quote "$done_when")"
+fi
+scope_yaml="$(forge_yaml_quote "$scope")"

 forge_state="${state_dir}/forge-state.${session_id}.md"
 loop_state="${state_dir}/loop-state.${session_id}.md"

 cat > "$forge_state" <<EOF
 ---
 session_id: "${session_id}"
-scope: "${scope}"
+scope: ${scope_yaml}
 driver: "codex"
 state_root: ".codex/forge"
+success:
+  mode: "${success_mode}"
+  task: ${scope_yaml}
+  done_when: ${done_when_yaml}
+  completion_checks: []
 baseline:
   coverage: null
   speed_seconds: null
@@ -127,6 +144,7 @@ ideas: []
 ## Session
 - Driver: Codex
 - Scope: ${scope}
+- Done when: ${done_when:-Derive concrete completion checks from the task scope and record them during iteration 1.}
 - Started at: ${started_at}
 - Notes: Baseline not measured yet. First iteration must establish it from real test output.
 EOF
@@ -144,7 +162,8 @@ started_at: "${started_at}"
 ---
 EOF

-prompt_text="$(forge_render_prompt "$PROMPT_TEMPLATE" "$session_id" "$scope" "1")"
+done_when_text="${done_when:-Derive concrete completion checks from the task scope and record them in forge-state before EXECUTE.}"
+prompt_text="$(forge_render_prompt "$PROMPT_TEMPLATE" "$session_id" "$scope" "1" "$done_when_text")"

 echo "Initialized Forge Codex driver session ${session_id}."
 echo "Forge state: .codex/forge/forge-state.${session_id}.md"
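Review note: the switch from `scope: "${scope}"` to `scope: ${scope_yaml}` suggests the old form could produce invalid YAML when the scope or `--done-when` text itself contains double quotes. `forge_yaml_quote` is defined outside the hunks shown, so the following is only a guess at its contract: emit a YAML double-quoted scalar with backslashes and double quotes escaped. The name `yaml_quote_sketch` is hypothetical, and the sketch assumes single-line input.

```shell
# Assumed behavior, NOT the actual forge_yaml_quote implementation:
# print the argument as a YAML double-quoted scalar (single-line input only).
yaml_quote_sketch() {
  # Escape backslashes first, then double quotes, so escapes are not doubled.
  escaped="$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
  printf '"%s"\n' "$escaped"
}
```

Under that assumption, a value such as `users see "reset sent"` would be written as `"users see \"reset sent\""`, which a YAML parser reads back as the original text.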
