You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: guides/README.md
+11-1Lines changed: 11 additions & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -108,7 +108,7 @@ gd dev <path/to/guide_dir>
108
108
This runs the following pipeline after the grader calibrates successfully:
109
109
110
110
1.**Generate `prompts.md`** if missing — uses Gemini CLI to create a set of developer-facing prompts derived from the guide
111
-
2.**Find or create a task file** in `harness/tasks/` — scans existing tasks for a matching `grader:` field, or creates `<guideName>-task.md` using the first prompt from `prompts.md` (defaults to `daily-grind` base app)
111
+
2.**Find or create a task file** in `harness/tasks/` — scans existing tasks for a matching `grader:` field, or creates `<guideName>-task.md` using the first prompt from `prompts.md` (defaults to `daily-grind` base app). If the task file already exists but its prompt has drifted from `prompts.md`, `gd dev` will warn you. Run with `--sync-task` to force it to synchronize.
112
112
3.**Grade the base app as-is** (pre-score) — establishes a baseline before any agent runs
113
113
4.**Run the agent** in both `unguided` (no guide access) and `guided` (with MCP guide access) modes against the base app
114
114
5.**Grade both outputs** and print a comparison:
@@ -125,6 +125,16 @@ The agent and base app are selected from the [harness config](../harness/config.
125
125
126
126
The generated task file is automatically included in future `gd eval suite` runs — the suite discovers all task files in `harness/tasks/` by default.
127
127
128
+
### Synchronizing All Regular Tasks
129
+
130
+
If you update multiple `prompts.md` files or want to ensure all regular tasks are in sync with their respective guides, you can run:
131
+
132
+
```bash
133
+
gd gen-task-suite
134
+
```
135
+
136
+
This script scans for "eval-ready" guides, reads their `prompts.md`, and updates the corresponding `<guideName>-task.md` in `harness/tasks/` while preserving any custom `base_app` configuration in the task file.
137
+
128
138
### Negative Suite
129
139
130
140
To verify that guides improve agent performance starting from a "bad" implementation, you can run a **Negative Suite**.
Copy file name to clipboardExpand all lines: harness/tasks/animate-scrollbar-color-on-scroll-task.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,4 +2,4 @@
2
2
base_app: daily-grind
3
3
grader: animate-scrollbar-color-on-scroll
4
4
---
5
-
hey can u make the scrollbar color change as i scroll down the page? like it should start as one color and shift to another as you get to the bottom of the coffee site. make it look smooth.
5
+
hey can u make the scrollbar color change as i scroll down the page? like it should start as one color and shift to another as you get to the bottom. make it look smooth.
Copy file name to clipboardExpand all lines: harness/tasks/deprioritize-background-fetches-task.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,4 +2,4 @@
2
2
base_app: empty-app
3
3
grader: deprioritize-background-fetches
4
4
---
5
-
Create an extremely minimal web page with a single button that triggers two concurrent fetch requests: one request to '/api/data' for mission-critical data that must be loaded as quickly as possible, and another to '/api/analytics' that POSTs a `{click: 1}` payload. Write the page to index.html.
5
+
Create an extremely minimal web page with a single button that triggers two concurrent fetch requests: one request to '/api/data' for mission-critical data that must be loaded as quickly as possible, and another to '/api/analytics' that POSTs a `{click: 1}` payload.
Copy file name to clipboardExpand all lines: harness/tasks/optimize-image-priority-task.md
+1-3Lines changed: 1 addition & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,6 +2,4 @@
2
2
base_app: empty-app
3
3
grader: optimize-image-priority
4
4
---
5
-
Create an extremely minimal product landing page optimized for a main hero image 'hero-lcp.jpg' which is the largest contentful element. The page also contains a product image gallery where the first image is visible but the second image 'gallery-alt.jpg' is currently hidden behind a toggle. There is also a secondary 'mega-menu-promo.jpg' image that is part of a navigation menu and initially hidden. Finally, include a 'footer-logo.png' much further down the page below the fold.
6
-
7
-
MANDATORY: Write the page to index.html and ensure that all image sources exactly match the filenames provided. Do not bother downloading stock images, just use the filenames as the src attributes, it's ok if they don't exist.
5
+
Create an extremely minimal product landing page that features a main hero image 'hero-lcp.jpg' which is the largest contentful element. The page also contains a product image gallery where the first image is visible but the second image 'gallery-alt.jpg' is currently hidden behind a toggle. There is also a secondary 'mega-menu-promo.jpg' image that is part of a navigation menu and initially hidden. Finally, include a 'footer-logo.png' much further down the page below the fold.
Copy file name to clipboardExpand all lines: harness/tasks/optimize-preload-priority-task.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,4 +2,4 @@
2
2
base_app: empty-app
3
3
grader: optimize-preload-priority
4
4
---
5
-
Create an extremely minimal video landing page that optimizes for LCP. This includes a video poster image 'poster.jpg' (the LCP element), a custom web font 'brand-font.woff2' that is critical for the header rendering, and a secondary font 'secondary-font.woff2' for less critical UI elements. Write the page to index.html.
5
+
Create an extremely minimal video landing page that optimizes for LCP. This includes a video poster image 'poster.jpg' (the LCP element), a custom web font 'brand-font.woff2' that is critical for the header rendering, and a secondary font 'secondary-font.woff2' for less critical UI elements.
Copy file name to clipboardExpand all lines: harness/tasks/optimize-script-priority-task.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,4 +2,4 @@
2
2
base_app: empty-app
3
3
grader: optimize-script-priority
4
4
---
5
-
Create an extremely minimal web page for a dashboard. It requires a critical interactivity script at '/js/app.js' that should be loaded asynchronously. It also includes an older '/js/legacy-widgets.js' script that is normally parser-blocking. Finally, include an analytics script '/js/tracker.js'. Write the page to index.html.
5
+
Create an extremely minimal web page for a dashboard. It requires a critical interactivity script at '/js/app.js' that should be loaded asynchronously. It also includes an older '/js/legacy-widgets.js' script that is normally parser-blocking. Finally, include an analytics script '/js/tracker.js'.
0 commit comments