feat: write validate skill output directly to files instead of terminal
- Step 4 now writes per-trace results directly to `trace_N_<id>.md` via Write tool instead of printing to terminal first
- Step 5 now writes summary directly to `SUMMARY.md` via Write tool instead of printing to terminal first
- After each file is written, Claude notifies the user with the file path
- Removed Step 6 (log summary append) as it is no longer needed
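
The naming convention described above can be illustrated with a short Python sketch. In the skill itself the agent writes the file with the Write tool; this is only an illustration of the path and notification format, and the helper names `trace_report_path` and `write_trace_report` are hypothetical, not part of the repository.

```python
from pathlib import Path


def trace_report_path(report_dir: str, index: int, trace_id: str) -> Path:
    """Build <report_dir>/trace_<N>_<first 12 chars of id>.md (index is 1-based)."""
    return Path(report_dir) / f"trace_{index}_{trace_id[:12]}.md"


def write_trace_report(report_dir: str, index: int, trace_id: str, content: str) -> str:
    """Write one trace report and return the notification line for the user."""
    path = trace_report_path(report_dir, index, trace_id)
    # Assumption: the report directory may not exist yet, so create it if needed.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return f"> Trace `{trace_id}` result written to `{path}`"
```

Returning the notification string rather than printing it keeps the terminal output limited to the file path, in line with the intent of this change.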
File: packages/opencode/src/skill/validate/SKILL.md
Lines changed: 38 additions & 56 deletions
@@ -130,15 +130,16 @@ When doing this task, first generate a sequence of steps as a plan and execute s

 ---

-### Step 4: Present Per-Trace Results
+### Step 4: Write Per-Trace Results to File

-For EACH trace, present the results in the following format:
+For EACH trace, write the results **directly to a markdown file** inside the report directory. Do NOT print the full trace details to the terminal. Read `report_dir` from the batch_validate.py JSON output. Use the trace index (1-based) and first 12 characters of the trace ID for the filename.

----
+The file content must follow this format:

+```
 ## Trace: `<trace_id>`

-### Criteria Summary Table in markdown table
+### Criteria Summary Table

 | Criteria | Status | Score |
 |---|---|---|
@@ -150,7 +151,7 @@ For EACH trace, present the results in the following format:

 P.S. **Consider 'RIGHT NODE' as 'SUCCESS' and 'WRONG NODE' as 'FAILURE' IF PRESENT.**

-### Per-Criteria Node Results in markdown table
+### Per-Criteria Node Results

 For **Validity**, **Coherence**, and **Utility**, show a node-level breakdown table:

@@ -162,57 +163,60 @@ For **Validity**, **Coherence**, and **Utility**, show a node-level breakdown ta

 #### Groundedness

-Generate a summary of the generated groundedness response detailing strengths and weaknesses.
+<summary of groundedness response detailing strengths and weaknesses>

-Now display **ALL the claims in the **markdown table format** with these columns**:
+ALL claims table:

-| # | Source Tool | Source Data| Input Data | Claim Text | Claimed| Input | Conversion Statement | Calculated | Error | Status | Reason |
REMEMBER to generate each value COMPLETELY. DO NOT TRUNCATE.

 #### Validity
-Generate a summary of the generated validity response detailing strengths and weaknesses.
+<summary detailing strengths and weaknesses>

 #### Coherence
-Generate a summary of the generated coherence response detailing strengths and weaknesses.
+<summary detailing strengths and weaknesses>

 #### Utility
-Generate a summary of the generated utility response detailing strengths and weaknesses.
+<summary detailing strengths and weaknesses>

 #### Tool Validation
-Generate a summary of the generated tool validation response detailing strengths and weaknesses.
+<summary detailing strengths and weaknesses>

-Now display all the tool details in markdown table format:
+All tool details:

 | # | Tool Name | Tool Status |
 |---|---|---|
 | <id> | <tool name> | <tool status> |
+```

-REMEMBER to generate each value completely. NO TRUNCATION.
-
-After presenting each trace result, write it to a markdown file inside the report directory. Read `report_dir` from the batch_validate.py JSON output. Use the trace index (1-based) and first 12 characters of the trace ID for the filename:
+Write the content using the Write tool to `<report_dir>/trace_<N>_<first_12_chars_of_id>.md`.
> Trace `<trace_id>` result written to `<report_dir>/trace_<N>_<first_12_chars_of_id>.md`

 ---

-### Step 5: Cross-Trace Comprehensive Summary (for all evaluations)
+### Step 5: Write Cross-Trace Comprehensive Summary to File

-After presenting all individual trace results, generate a comprehensive summary:
+After processing all individual traces, write a comprehensive summary **directly to `<report_dir>/SUMMARY.md`** using the Write tool. Do NOT print the full summary to the terminal.

-#### Overall Score Summary in markdown table format
+The file content must follow this format:
+
+```
+## Validation Summary
+
+Use the scores AFTER semantic matching corrections from Step 2, and reasons AFTER semantic reason generation from Step 3.
+
+### Overall Score Summary

 | Criteria | Average Score | Min | Max | Traces Evaluated |
 |---|---|---|---|---|
@@ -222,41 +226,19 @@ After presenting all individual trace results, generate a comprehensive summary:
 - **Common Strengths**: Patterns of success observed across traces
 - **Common Weaknesses**: Recurring issues found across traces
 - **Recommendations**: Actionable improvements based on the analysis
-
-After generating the overall summary, write it to `SUMMARY.md` inside the report directory:
-
-```bash
-cat >"<report_dir>/SUMMARY.md"<<'SUMMARY_MD_EOF'
-<full cross-trace summary output from above>
-SUMMARY_MD_EOF
-```
-
----
-
-### Step 6: Log Summary to File
-
-Append the comprehensive summary (with semantic matching corrections and semantic reasons) to the log file. Read the `log_file` path from the batch_validate.py output and append:
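
The Overall Score Summary table that Step 5 writes to `SUMMARY.md` could be computed along these lines. This is a minimal sketch: the function name `score_summary_rows` and the input shape (a mapping from criterion name to the list of per-trace scores) are assumptions for illustration, and the score scale is not specified in the diff.

```python
from statistics import mean


def score_summary_rows(scores_by_criterion: dict[str, list[float]]) -> list[str]:
    """Render the Overall Score Summary as markdown table lines."""
    rows = [
        "| Criteria | Average Score | Min | Max | Traces Evaluated |",
        "|---|---|---|---|---|",
    ]
    for criterion, scores in scores_by_criterion.items():
        if not scores:
            # Assumption: criteria with no evaluated traces are skipped entirely.
            continue
        rows.append(
            f"| {criterion} | {mean(scores):.2f} | {min(scores):.2f} "
            f"| {max(scores):.2f} | {len(scores)} |"
        )
    return rows
```

Joining the returned lines with newlines gives a table body that matches the column layout shown in the diff.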