@@ -204,9 +204,43 @@ function Features(): ReactNode {
204204 </ div >
205205 < div className = { styles . featureRight } >
206206 < p className = { styles . featureCardDesc } >
207- The agent MUST use TDD. A state machine enforces the
208- Red-Green-Refactor cycle and blocks file edits until tests
209- fail first. No shortcuts, no skipping ahead.
207+ < a
208+ href = "https://medium.com/@bethqiang/the-absolute-beginners-guide-to-test-driven-development-with-a-practical-example-c39e73a11631"
209+ target = "_blank"
210+ rel = "noreferrer"
211+ >
212+ Test-Driven Development
213+ </ a > { ' ' }
214+ is an almost universal way to build software. Write failing
215+ tests, watch them fail, write code to make them pass, watch
216+ them pass. You're gradually building a repository of every
217+ decision you ever made. Even better, if a decision is{ ' ' }
218+ < i > un-made</ i > , tests fail. Alarms go off.
219+ </ p >
220+ < p className = { styles . featureCardDesc } >
221+ Forcing the agent through TDD created a repository of all my
222+ micro-decisions - I stopped needing to repeat past decisions
223+ to the agent. In time I found myself{ ' ' }
224+ < i > supervising the TDD process itself</ i > , while the agent
225+ built software according to its plan. I had gotten myself out
226+ of the loop, removing a layer of tedium.
227+ </ p >
228+ < p className = { styles . featureCardDesc } >
229+ But babysitting a TDD process is almost as tedious as doing
230+ TDD! I was constantly stopping the agent - don't do that, you
231+ didn't see the tests pass, roll that back, it's not time to
232+ write code yet. The agent was frequently befuddled by this. So
233+ I asked: can I get myself out of < i > that</ i > loop too?
234+ </ p >
235+ < p className = { styles . featureCardDesc } >
236+ The solution Claude and I hit on is a state machine tracked in
237+ a local log file. You can see the state machine at the top of
238+ the page!
239+ </ p >
240+ < p className = { styles . featureCardDesc } >
241+ With CodeLeash, the agent MUST use TDD. A state machine
242+ enforces the Red-Green-Refactor cycle and blocks file edits
243+ until tests fail first. No shortcuts, no skipping ahead.
210244 </ p >
211245 < p className = { styles . featureCardDesc } >
212246 A test suite in which the agent has seen every test fail -
@@ -277,10 +311,16 @@ function Features(): ReactNode {
277311 < p className = { styles . featureCardDesc } >
278312 CodeLeash is full of examples for your coding agent to crib
279313 from. Some traverse the codebase with ASTs; others with
280- regexes. Most code review feedback can be at least partly
281- automated. By the time your agent stops working, basic issues
282- were already fixed - and you know because the checks passed.
283- Never repeat obvious fixes again.
314+ regexes. A surprising amount of the code review feedback
315+ you've ever given in your software engineering career can be
316+ automated - ask your agent for ideas! With a big enough
317+ library of checks - built by you - by the time your agent
318+ stops working, the basic issues have already been fixed in
319+ response to failing checks, without you watching. You no
320+ longer need to see agent code with obvious flaws.
321+ </ p >
322+ < p className = { styles . featureCardDesc } >
323+ Never repeat obvious fixes to your agent again.
284324 </ p >
285325 < span
286326 className = { `${ styles . featureLearnMore } ${ styles . featureLearnMoreEarth } ` }
@@ -329,16 +369,18 @@ function Features(): ReactNode {
329369 </ div >
330370 < div className = { styles . featureRight } >
331371 < p className = { styles . featureCardDesc } >
332- Each unit test must complete within 10ms. This forces pure
333- business logic - no I/O, no accidental imports, no framework
334- startup. If a test touches the network or spins up a server,
335- it fails.
372+ Each unit test must complete within 10ms. This is a TDD hack:
373+ it forces pure business logic - no I/O, no accidental imports,
374+ no framework startup. That makes it harder for tests to leak
375+ and interfere, and allows parallel test runs. If a test touches
376+ the network or spins up a server, it fails.
336377 </ p >
337378 < p className = { styles . featureCardDesc } >
338379 Auto-retry handles transient performance hiccups like JIT
339- warmup. When a test does time out, a flamegraph SVG is
340- generated automatically so you can see exactly where the time
341- went.
380+ warmup or heavy import chains. When a test times out on retry,
381+ a flamegraph SVG is generated automatically so you can debug.
382+ And I was surprised to learn: agents can fix code based on a
383+ flamegraph!
342384 </ p >
343385 < span
344386 className = { `${ styles . featureLearnMore } ${ styles . featureLearnMoreGreen } ` }