feat(optimize): enforce CodSpeed CLI as single source of truth for all measurements
The optimize skill now explicitly requires all benchmarks to run through
the CodSpeed CLI, including walltime. Never fall back to raw benchmark
execution — ask the user for help if CodSpeed can't run. This ensures
all results are comparable, trackable, and analyzable with flamegraphs.
Also fix the plugin name casing in plugin.json.
Co-Authored-By: Claude <noreply@anthropic.com>
skills/optimize/SKILL.md (4 additions, 1 deletion)
@@ -7,6 +7,8 @@ description: "Autonomously optimize code for performance using CodSpeed benchmar
 
 You are an autonomous performance engineer. Your job is to iteratively optimize code using CodSpeed benchmarks and flamegraph analysis. You work in a loop: measure, analyze, change, re-measure, compare — and you keep going until there's nothing left to gain or the user tells you to stop.
 
+**All measurements must go through CodSpeed.** Always use the CodSpeed CLI (`codspeed run`, `codspeed exec`) to run benchmarks — never run benchmarks directly (e.g., `cargo bench`, `pytest-benchmark`, `go test -bench`) outside of CodSpeed. The CodSpeed CLI and MCP tools are your single source of truth for all performance data. If you're unable to run benchmarks through CodSpeed (missing auth, unsupported setup, CLI errors), ask the user for help rather than falling back to raw benchmark execution. Results outside CodSpeed cannot be compared, tracked, or analyzed with flamegraphs.
+
 ## Before you start
 
 1. **Understand the target**: What code does the user want to optimize? A specific function, a whole module, a benchmark suite? If unclear, ask.
@@ -213,9 +215,10 @@ You have access to these CodSpeed MCP tools:
 
 ## Guiding principles
 
+- **Everything goes through CodSpeed.** Never run benchmarks outside of the CodSpeed CLI. Never quote timing numbers from raw benchmark output. The CodSpeed MCP tools (`compare_runs`, `query_flamegraph`, `list_runs`) are your source of truth — use them to read results, not terminal output. If CodSpeed can't run, ask the user to fix the setup rather than working around it.
 - **Measure first, optimize second.** Never optimize based on intuition alone — the flamegraph tells you where the time actually goes, and it's often not where you'd guess.
 - **One change at a time.** Isolated changes make it clear what helped and what didn't.
 - **Correctness over speed.** Always run tests. A fast but broken program is useless.
-- **Simulation for iteration, walltime for validation.** Simulation is deterministic and fast for feedback. Walltime is the ground truth.
+- **Simulation for iteration, walltime for validation.** Simulation is deterministic and fast for feedback. Walltime is the ground truth. Both run through CodSpeed.
 - **Know when to stop.** Diminishing returns are real. When gains drop below 1-2%, you're usually done unless the user has a specific target.
 - **Be transparent.** Show the user your reasoning, the numbers, and the tradeoffs. Performance optimization involves judgment calls — the user should be informed.