Commit cb35887

feat: update karpathy-guidelines.md to enhance clarity on success criteria and implementation checks
1 parent fb4aaa8 commit cb35887

1 file changed

Lines changed: 22 additions & 2 deletions

templates/skills/karpathy-guidelines.md

@@ -10,8 +10,6 @@ Behavioral guidelines to reduce common LLM coding mistakes.
 
 **Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.
 
-**Internal use:** Apply these guidelines silently. Do not cite this document, its title, or guideline names in user-facing responses.
-
 ## 1. Think Before Coding
 
 **Don't assume. Don't hide confusion. Surface tradeoffs.**
@@ -54,6 +52,15 @@ The test: Every changed line should trace directly to the user's request.
 
 **Define success criteria. Loop until verified.**
 
+Before implementing, define the exact observable acceptance check:
+- Command output
+- Test assertion
+- UI state
+- File diff
+- API response
+
+Do not start implementation if "works" cannot be checked objectively. If the check is unclear and would change the solution, ask before coding using AskUserQuestion tool.
+
 Transform tasks into verifiable goals:
 - "Add validation" → "Write tests for invalid inputs, then make them pass"
 - "Fix the bug" → "Write a test that reproduces it, then make it pass"
@@ -67,3 +74,16 @@ For multi-step tasks, state a brief plan:
 ```
 
 Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
+
+## 5. No Proxy Success
+
+**Passing means the acceptance check passes.**
+
+Don't substitute weaker signals:
+- "No crash" unless that was the goal.
+- "Non-empty output" unless any output is valid.
+- "Looks plausible" unless the task is subjective.
+- "Some tests pass" while the target check fails.
+- "Implementation complete" without verification.
+
+Report exact pass, partial progress, or failure.
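
One way to read the added guidance is as a small reporting rule: the acceptance check is named before any implementation work, and weaker signals never upgrade the result to a pass. The sketch below is illustrative only and is not part of the commit. `Check`, `report`, and `slugify` are hypothetical names invented for this example, and the three-way "pass" / "partial" / "fail" classification is an assumption about how "exact pass, partial progress, or failure" might be encoded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Check:
    """A single observable check (hypothetical helper, not from the commit)."""
    name: str
    run: Callable[[], bool]   # True only if the observable condition holds
    is_target: bool = False   # marks the acceptance check the task is judged by


def report(checks: List[Check]) -> str:
    """Return 'pass', 'partial', or 'fail' without promoting proxy signals."""
    targets = [c for c in checks if c.is_target]
    if not targets:
        # "works" cannot be checked objectively, so refuse to judge.
        raise ValueError("no acceptance check defined")

    results: Dict[str, bool] = {}
    for c in checks:
        try:
            results[c.name] = bool(c.run())
        except Exception:
            results[c.name] = False  # a crashing check counts as a failure

    if all(results[c.name] for c in targets):
        return "pass"      # every acceptance check passed
    if any(results.values()):
        return "partial"   # only weaker signals (or some targets) passed
    return "fail"


def slugify(title: str) -> str:
    # Hypothetical function under test, deliberately incomplete:
    # it lowercases but does not replace spaces with hyphens.
    return title.lower()


checks = [
    Check("does not crash", lambda: slugify("Hello World") is not None),
    Check("non-empty output", lambda: len(slugify("Hello World")) > 0),
    Check("exact slug", lambda: slugify("Hello World") == "hello-world",
          is_target=True),
]

status = report(checks)
print(status)  # partial
```

Here both proxy checks succeed while the target check fails, so the result is reported as partial progress rather than success. Marking exactly which checks are targets up front is the design choice that keeps "no crash" or "non-empty output" from standing in for the real goal.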
