Commit 70e2ece

art049 and claude committed
feat(optimize): enforce CodSpeed CLI as single source of truth for all measurements
The optimize skill now explicitly requires all benchmarks to run through the CodSpeed CLI, including walltime. Never fall back to raw benchmark execution; ask the user for help if CodSpeed can't run. This ensures all results are comparable, trackable, and analyzable with flamegraphs.

Also fix the plugin name casing in plugin.json.

Co-Authored-By: Claude <noreply@anthropic.com>
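The rule in this commit message can be sketched as a small command guard. This is a minimal illustration, not part of the plugin or of CodSpeed: the helper names `goes_through_codspeed` and `check_benchmark_command` are hypothetical, and only the `codspeed run` and `codspeed exec` subcommand names come from the skill text. The arguments after the subcommand in the usage note are illustrative, not a statement of the CLI's exact syntax.

```python
import shlex

# Subcommands named in the skill text; anything else counts as a raw run.
ALLOWED_SUBCOMMANDS = {"run", "exec"}


def goes_through_codspeed(command: str) -> bool:
    """Return True if the command line starts with `codspeed run` or `codspeed exec`."""
    tokens = shlex.split(command)
    return (
        len(tokens) >= 2
        and tokens[0] == "codspeed"
        and tokens[1] in ALLOWED_SUBCOMMANDS
    )


def check_benchmark_command(command: str) -> str:
    """Hypothetical guard: return the command unchanged, or raise so the
    caller can stop and ask the user for help instead of falling back."""
    if not goes_through_codspeed(command):
        raise ValueError(
            f"refusing to run {command!r} outside CodSpeed; ask the user for help"
        )
    return command
```

Under these assumptions, `check_benchmark_command("codspeed run cargo bench")` returns the command unchanged, while `check_benchmark_command("cargo bench")` raises `ValueError`, which is the point where the skill says to ask the user rather than run the benchmark directly.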
1 parent ce1ad6e commit 70e2ece

3 files changed

Lines changed: 6 additions & 3 deletions

.claude-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
  {
- "name": "CodSpeed",
+ "name": "codspeed",
  "description": "CodSpeed plugin for Claude Code helping with performance measurement and optimization.",
  "version": "1.0.0",
  "author": {

.cursor-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
  {
- "name": "CodSpeed",
+ "name": "codspeed",
  "description": "CodSpeed plugin for Cursor helping with performance measurement and optimization.",
  "version": "1.0.0",
  "author": {

skills/optimize/SKILL.md

Lines changed: 4 additions & 1 deletion
@@ -7,6 +7,8 @@ description: "Autonomously optimize code for performance using CodSpeed benchmar

  You are an autonomous performance engineer. Your job is to iteratively optimize code using CodSpeed benchmarks and flamegraph analysis. You work in a loop: measure, analyze, change, re-measure, compare, and you keep going until there's nothing left to gain or the user tells you to stop.

+ **All measurements must go through CodSpeed.** Always use the CodSpeed CLI (`codspeed run`, `codspeed exec`) to run benchmarks; never run benchmarks directly (e.g., `cargo bench`, `pytest-benchmark`, `go test -bench`) outside of CodSpeed. The CodSpeed CLI and MCP tools are your single source of truth for all performance data. If you're unable to run benchmarks through CodSpeed (missing auth, unsupported setup, CLI errors), ask the user for help rather than falling back to raw benchmark execution. Results outside CodSpeed cannot be compared, tracked, or analyzed with flamegraphs.
+
  ## Before you start

  1. **Understand the target**: What code does the user want to optimize? A specific function, a whole module, a benchmark suite? If unclear, ask.
@@ -213,9 +215,10 @@ You have access to these CodSpeed MCP tools:

  ## Guiding principles

+ - **Everything goes through CodSpeed.** Never run benchmarks outside of the CodSpeed CLI. Never quote timing numbers from raw benchmark output. The CodSpeed MCP tools (`compare_runs`, `query_flamegraph`, `list_runs`) are your source of truth; use them to read results, not terminal output. If CodSpeed can't run, ask the user to fix the setup rather than working around it.
  - **Measure first, optimize second.** Never optimize based on intuition alone: the flamegraph tells you where the time actually goes, and it's often not where you'd guess.
  - **One change at a time.** Isolated changes make it clear what helped and what didn't.
  - **Correctness over speed.** Always run tests. A fast but broken program is useless.
- - **Simulation for iteration, walltime for validation.** Simulation is deterministic and fast for feedback. Walltime is the ground truth.
+ - **Simulation for iteration, walltime for validation.** Simulation is deterministic and fast for feedback. Walltime is the ground truth. Both run through CodSpeed.
  - **Know when to stop.** Diminishing returns are real. When gains drop below 1-2%, you're usually done unless the user has a specific target.
  - **Be transparent.** Show the user your reasoning, the numbers, and the tradeoffs. Performance optimization involves judgment calls; the user should be informed.
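The "know when to stop" principle in the hunk above amounts to a simple threshold check on relative gain. The sketch below is an illustration only: `relative_gain` and `should_stop` are hypothetical names, the 2% default is taken from the upper end of the "1-2%" figure in the skill text, and the handling of a user target is one possible reading of "unless the user has a specific target". The timing values are assumed to come from CodSpeed results, not raw terminal output.

```python
from typing import Optional


def relative_gain(baseline: float, candidate: float) -> float:
    """Fractional improvement of candidate over baseline (lower time is better)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline


def should_stop(
    baseline: float,
    candidate: float,
    threshold: float = 0.02,         # assumed cutoff, from the "1-2%" guidance
    target: Optional[float] = None,  # hypothetical user-specified time target
) -> bool:
    """Decide whether another optimization iteration is worth attempting."""
    if target is not None:
        # Assumption: with an explicit target, keep going until it is met.
        return candidate <= target
    return relative_gain(baseline, candidate) < threshold
```

For example, a step from 100.0 to 99.0 time units is a 1% gain and falls under the default threshold, whereas a step from 100.0 to 90.0 is a 10% gain and suggests there may still be more to find.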
