
Commit 45dc0cd

1.1.1: Treat skill names as illustrative, allow non-binding hints in spec §8
Hardcoded skill names (macos-peekaboo, macos-accessibility-ids, and similar) across /tandemkit:init and the ApplePlatform strategy read as requirements that every user had them installed. Skill names and availability vary per project and setup — every reference is now framed as "scan ~/.claude/skills/ for what's installed, here's a common name as an example". The Planner's blanket ban on skill references in specs is now a mandate-vs-suggestion distinction. Mandates in the spec body remain banned; §8 "Possible Directions & Ideas" is explicitly allowed to surface skills worth considering, files worth reading, and tactical hints, as starting points the Generator can ignore. The Generator's loop is updated to treat §8 suggestions as non-contractual.
1 parent 3fd7a33 commit 45dc0cd
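The commit message's core instruction is to scan `~/.claude/skills/` for what is installed rather than assume a fixed skill name. A minimal Python sketch of that idea follows. It assumes each skill is a folder containing a `SKILL.md` file, which this commit does not state for user-level skills, and the helper names `list_installed_skills` and `find_similar` are hypothetical and not part of TandemKit.

```python
from pathlib import Path


def list_installed_skills(skills_dir: Path) -> list[str]:
    """Return the names of skill folders found directly under skills_dir."""
    if not skills_dir.is_dir():
        return []  # nothing installed is a valid state, not an error
    return sorted(
        entry.name
        for entry in skills_dir.iterdir()
        # Assumption: a skill is a folder that contains a SKILL.md file.
        if entry.is_dir() and (entry / "SKILL.md").is_file()
    )


def find_similar(installed: list[str], keyword: str) -> list[str]:
    """Match by keyword instead of by one exact, hardcoded skill name."""
    return [name for name in installed if keyword.lower() in name.lower()]


if __name__ == "__main__":
    installed = list_installed_skills(Path.home() / ".claude" / "skills")
    print("Installed skills:", installed or "none")
    print("Peekaboo-related:", find_similar(installed, "peekaboo"))
```

Matching on a keyword such as `peekaboo` keeps a name like `macos-peekaboo` as one possible hit rather than a requirement, and an empty result simply means no skill is surfaced.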

6 files changed

Lines changed: 48 additions & 20 deletions

.claude-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "tandemkit",
-  "version": "1.1.0",
+  "version": "1.1.1",
   "description": "Describe your goal, approve the spec, then step away — Claude and Codex loop together until it's right.",
   "author": {
     "name": "Cihat Gündüz",

commands/init.md

Lines changed: 14 additions & 5 deletions
@@ -153,12 +153,12 @@ peekaboo daemon start
 peekaboo daemon status
 ```

-**Tell the user about the companion skills** (both are user-level and reusable across any macOS/iOS project):
+**Check for companion skills the user may have** at `~/.claude/skills/`. Names and availability vary — examples of skills that pair well with Peekaboo workflows:

-- `macos-peekaboo` — full Peekaboo usage catalog (see → act → verify loop, gotchas, troubleshooting).
-- `macos-accessibility-ids` — how to add `.accessibilityIdentifier(…)` / `setAccessibilityIdentifier(_:)` so `peekaboo see` returns fast AX trees instead of hanging at its 25-second cap.
+- A Peekaboo-usage skill (one common name is `macos-peekaboo`) — full Peekaboo usage catalog (see → act → verify loop, gotchas, troubleshooting).
+- An accessibility-identifiers skill (one common name is `macos-accessibility-ids`) — how to add `.accessibilityIdentifier(…)` / `setAccessibilityIdentifier(_:)` so `peekaboo see` returns fast AX trees instead of hanging at its 25-second cap.

-Both skills live at `~/.claude/skills/` if already installed. If they're missing, they're documented in `ApplePlatform.md` — you can create them later or skip for now; Peekaboo still works, you just won't have inline guidance.
+These names are illustrative — scan `~/.claude/skills/` and surface whatever exists. If nothing similar is installed, the CLI still works directly; inline guidance is just nice-to-have. The Evaluator strategy in `ApplePlatform.md` covers the technique fundamentals regardless of which skills are present.

 Ask the user to come back and say "done" when Peekaboo is verified so you can proceed.

@@ -367,7 +367,16 @@ Populate each with project-specific context. Key requirements:
 - **For macOS apps:** primary UI verification is **Peekaboo CLI** on the running app (screenshot, AX tree, click, type, menu navigation). `#Preview` + Xcode MCP `RenderPreview` is still the fastest path for isolated view rendering. Forbid `mcp__computer-use__*` on macOS (unreliable on Tahoe). Include Peekaboo command examples in the role file.
 - **For macOS apps with backends (ASC, DB, API):** verify backend side-effects via the app's CLI (e.g., `asc`, `psql`) — do not trust the UI alone.

-**Generator.md should reference existing project skills** that are relevant (e.g., "Load `swift-code-context` before writing Swift code", "Load `swiftui-code-context` for SwiftUI work", "Use `review-swift-changes` for validation"). **For macOS apps, reference `macos-peekaboo` (runtime UI automation) and `macos-accessibility-ids` (so new/touched SwiftUI/AppKit views are automatable by Peekaboo and XCUITest out of the box).**
+**Generator.md should reference existing project skills** that are relevant. The specific skills depend entirely on what exists in the project's `.claude/skills/` directory — scan that directory and name the ones that apply. Typical categories to look for (names and availability vary by project):
+
+- A style-guide skill for the project's primary language (e.g., something like `<language>-code-context` or `<language>-style-guide`)
+- Framework-specific skills (e.g., a UI-framework style guide if the project has one)
+- A testing-conventions skill if the project documents one
+- A final-review / linter skill if the project has one (e.g., something named like `review-<language>-changes`)
+
+For macOS apps specifically, the project may have skills for runtime UI automation (e.g., something like a Peekaboo CLI wrapper) and accessibility-identifier authoring (so new SwiftUI/AppKit views are automatable). Name whatever the project actually has — do not invent or assume skill names.
+
+Example phrasing in Generator.md (adjust names to match reality): "Load `<project-style-skill>` before writing <language> code" / "Use `<project-review-skill>` for validation at end of every round".

 **Generator.md AND Evaluator.md must both include a top-of-file reminder** pointing at the Signal Protocol in SKILL.md. Suggested exact wording (add verbatim at the top of each file, under the first heading):

skills/evaluator/strategies/ApplePlatform.md

Lines changed: 5 additions & 5 deletions
@@ -132,7 +132,7 @@ peekaboo see --app "MyApp" --window-id <id> --json # re-capture to verify

 #### Accessibility identifiers are a hard prerequisite for `see`

-On SwiftUI apps without `.accessibilityIdentifier(…)` coverage, `peekaboo see` hangs at a 25-second hard cap walking unnamed descendants. **With identifiers the AX walk completes in ~1 second.** Adding identifiers is part of the Generator's responsibility when it touches interactive views — see the `macos-accessibility-ids` skill for naming conventions and the AppKit/UIKit equivalents (`setAccessibilityIdentifier(_:)` and `view.accessibilityIdentifier = …`).
+On SwiftUI apps without `.accessibilityIdentifier(…)` coverage, `peekaboo see` hangs at a 25-second hard cap walking unnamed descendants. **With identifiers the AX walk completes in ~1 second.** Adding identifiers is part of the Generator's responsibility when it touches interactive views. If your Claude setup has a convenience skill for accessibility-identifier conventions (e.g. one named `macos-accessibility-ids`, but actual names vary — scan `~/.claude/skills/`), load it; otherwise follow the standard AppKit/UIKit equivalents (`.accessibilityIdentifier(_:)` on SwiftUI, `setAccessibilityIdentifier(_:)` on AppKit, `view.accessibilityIdentifier = …` on UIKit).

 **Fallback for apps without identifiers (or non-native apps like Electron):**
 1. `peekaboo image` for the screenshot.
@@ -146,12 +146,12 @@ On SwiftUI apps without `.accessibilityIdentifier(…)` coverage, `peekaboo see`
 - Fuzzy `click "label"` can match across apps — always scope with `--window-title` or `--window-id`, or prefer menu-path clicks.
 - Focus stealing: every shell call reactivates the terminal as frontmost. Chain multi-step flows in a single `peekaboo run <script.peekaboo.json>` script, or re-focus the target before every click (`peekaboo app switch --to X --verify`).

-Full pattern catalog and troubleshooting: **`macos-peekaboo` skill** (user-level, installed at `~/.claude/skills/macos-peekaboo/`).
+Full pattern catalog and troubleshooting lives in the Peekaboo CLI's own docs (`peekaboo --help`) and optionally in a user-level convenience skill if your Claude setup has one (a common name is `macos-peekaboo` at `~/.claude/skills/macos-peekaboo/`, but actual names vary — scan `~/.claude/skills/` to see what's installed).

 #### When Peekaboo is NOT enough

 - **SwiftUI `#Preview` screenshots** via Xcode MCP `RenderPreview` are still the fastest verification for pure view rendering — use them first when the change is isolated to a single view's appearance. Peekaboo is for end-to-end flows against the running app.
-- **AppKit with no identifiers** — add them (`setAccessibilityIdentifier(_:)`) per the `macos-accessibility-ids` skill; `see` works on AppKit the moment identifiers are present.
+- **AppKit with no identifiers** — add them (`setAccessibilityIdentifier(_:)`); `see` works on AppKit the moment identifiers are present. If your setup has a helper skill for accessibility-identifier conventions (example name: `macos-accessibility-ids`), it offers faster naming guidance — but the API calls are standard AppKit regardless.
 - **Electron/Chromium-based apps** — AX tree is opaque; fall back to menu paths, hotkeys, and coordinate clicks. Check if the app has an official CLI/extension API first.

 ## Evaluation Checklist
@@ -171,7 +171,7 @@ Full pattern catalog and troubleshooting: **`macos-peekaboo` skill** (user-level

 ### When the Mission Involves UI (macOS)
 5m. **Build and launch** via `xcodebuildmcp macos build` then `xcodebuildmcp macos launch --app-path …` (or `peekaboo app launch`)
-6m. **Read the accessibility tree** via `peekaboo see --app X --window-id <id> --json --timeout-seconds 30` (requires `.accessibilityIdentifier` coverage on interactive views — if missing, add them per the `macos-accessibility-ids` skill)
+6m. **Read the accessibility tree** via `peekaboo see --app X --window-id <id> --json --timeout-seconds 30` (requires `.accessibilityIdentifier` coverage on interactive views — if missing, add them to the relevant views; your Claude setup may have a helper skill for naming conventions, e.g. something named `macos-accessibility-ids`, if not the standard SwiftUI/AppKit APIs work directly)
 7m. **Take screenshots** via `peekaboo image --app X --window-id <id> --path …`
 8m. **Test interaction flows** via `peekaboo click --on <id> --snapshot <snap>`, `peekaboo type`, `peekaboo hotkey`, `peekaboo menu click --path "…"`
 9m. **Verify backend side-effects via the app's own CLI/API** when the UI writes to a remote service (e.g., `asc` CLI for App Store Connect writes, database reads for DB writes) — do not trust the UI alone
@@ -225,7 +225,7 @@ During init, create `TandemKit/Evaluator.md` with:
 - Click / type: `peekaboo click --on <id> --snapshot <snap>`, `peekaboo type "…" --clear`
 - Menu navigation: `peekaboo menu click --app "[AppName]" --path "File > Save"`
 - Hotkeys: `peekaboo hotkey --keys "cmd,s"`
-- **Adding identifiers**: see the `macos-accessibility-ids` skill — unlocks fast `see` and stable targeting.
+- **Adding identifiers**: adding `.accessibilityIdentifier(_:)` (SwiftUI) or `setAccessibilityIdentifier(_:)` (AppKit) unlocks fast `see` and stable targeting. If your Claude setup has a convenience skill for naming conventions (e.g. one named like `macos-accessibility-ids`), it's worth a scan; otherwise the API calls are standard.
 - **DO NOT use `mcp__computer-use__*` on macOS** — unreliable on Tahoe.

 ### Shared

skills/generator/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -101,7 +101,7 @@ The user invokes this skill with `/tandemkit:generator NNN-MissionName`. First r
 1. Read `TandemKit/Config.json` — verify the mission exists and is current
 2. **Read `TandemKit/Generator.md`** for project-specific context — this is mandatory, do not skip
 3. Read the mission's `Spec.md` — this is your source of truth
-4. **Scan `.claude/skills/` for skills relevant to this mission's topic.** List the skill names and descriptions. Load any that seem related — they may contain domain knowledge, conventions, or validation rules critical for correct implementation. If the Spec mentions specific skills, load those too.
+4. **Scan `.claude/skills/` for skills relevant to this mission's topic.** List the skill names and descriptions. Load any that seem related — they may contain domain knowledge, conventions, or validation rules critical for correct implementation. If the Spec's §8 "Possible Directions & Ideas" (or a similarly-named "Context the Generator Might Find Useful" section) lists suggested skills, **treat those as starting points, not contracts** — load what seems relevant to the approach you choose, ignore what doesn't match. A suggestion in that section is never a pass/fail criterion; only Acceptance Criteria and Scope are binding.
 5. Read `State.json`. If `phase` is `"ready-for-execution"` or `"planning"`:
 a. Check `evaluatorStatus`. If already `"watching"` → update State.json: `generatorStatus: "working"`, `phase: "generation"`. Proceed to step 6.
 b. If `evaluatorStatus` is `null` → the Evaluator is not ready yet. Update State.json: `generatorStatus: "researching"`. You may read files, investigate the codebase, and prepare — but do NOT create or modify implementation files.

skills/planner/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -330,7 +330,7 @@ This section is the **most important rule** for keeping the spec requirement-foc
 2. **No complete code blocks** (full functions, full files, full type definitions). A 5-line snippet showing an exact API contract or a tricky edge case is fine. A 50-line block of "here's how the function should look" is not.
 3. **No step-by-step implementation procedures.** "First call `foo()`, then construct `Bar` with these args, then call `baz(...)` inside a `try` block" is HOW. Replace with WHAT: "The created entity must be visible to existing read tools and respect exclusion rules" — and let the Generator figure out the call sequence by reading the code you pointed to.
 4. **No acceptance criteria that prescribe implementation order or specific function calls.** AC is about observable outcomes. (See "What Makes a Good Acceptance Criterion" below.)
-5. **No "Style Guide Reminder" section telling the Generator which skills to load.** The Generator already has its own role file (`TandemKit/Generator.md`) that specifies which skills to load. The spec should not duplicate that or push role-specific instructions.
+5. **No "Style Guide Reminder" / "Skills to Load" section as a mandate.** The Generator already has its own role file (`TandemKit/Generator.md`) that specifies which skills to load. The spec should not duplicate that or push role-specific instructions as requirements. **Allowed exception — non-binding suggestions:** skill names MAY appear in the spec's §8 "Possible Directions & Ideas" (or a similarly-named "Context the Generator Might Find Useful" section) when framed as starting points the Generator can ignore. The distinction is mandate vs. suggestion: "the Generator MUST load `<skill-name>`" is banned as a contract; "`<skill-name>` is worth considering when writing `<feature>`" in the non-binding section is fine as a suggestion. Specific skill names are examples of what could be relevant — actual skills depend on the project. The binding WHAT/WHY stays in Acceptance Criteria + Scope.
 6. **No transcribed file contents.** If `auth_handler.py:42-78` is relevant, write that path and one sentence about WHY it's relevant. Don't paste 50 lines of that file into the spec.

 **The spec IS:**

skills/planner/templates/Spec-Format.md

Lines changed: 26 additions & 7 deletions
@@ -14,7 +14,7 @@ This rule is the difference between a 100-line spec the Generator can execute ag
 2. **No complete code blocks** — no full function bodies, full type definitions, full file contents. A 1–5 line snippet that pins down an exact API contract or shows a tricky edge case is fine. A 20-line block of "here's how the function should look" is not.
 3. **No step-by-step implementation procedures.** "First call `foo()`, then construct `Bar`, then call `baz()` inside a `try` block" is HOW. Replace with WHAT: "the operation must be atomic and respect existing exclusion rules" — and let the Generator find the call sequence by reading the file you pointed to.
 4. **No transcribed file contents.** If `auth_handler.py:42-78` is relevant, write that path and one sentence about WHY. Don't paste the 50 lines into the spec — the Generator will read the file.
-5. **No "Style Guide Reminder" / "Skills to Load" / "Generator MUST..." sections.** Role-specific instructions belong in `TandemKit/Generator.md`, not the spec. The spec is mission-specific; role files are role-specific.
+5. **No "Style Guide Reminder" / "Skills to Load" / "Generator MUST..." sections as mandates.** Role-specific *instructions* belong in `TandemKit/Generator.md`, not the spec. The spec is mission-specific; role files are role-specific. **Allowed exception — non-binding suggestions:** skill names and reference files that may be relevant MAY appear in §8 "Possible Directions & Ideas" (or a similarly-named "Context the Generator Might Find Useful" section), provided they are clearly framed as starting points the Generator can ignore. The distinction is mandate vs. suggestion: *"The Generator MUST load `<style-guide-skill>`"* is a mandate (banned in the spec body). *"`<style-guide-skill>` is worth considering when writing the <relevant feature>"* in a non-binding section is a suggestion (allowed). Whatever example skill names appear in this template are illustrative placeholders — actual skills vary by project. The binding WHAT/WHY stays in Acceptance Criteria + Scope; skill and file hints stay explicitly non-binding.
 6. **No acceptance criteria that prescribe implementation order or specific function calls.** ACs are about observable outcomes. (See §4 below for examples.)

 ### When implementation detail IS acceptable (rare exceptions)
@@ -216,16 +216,33 @@ Explicit boundaries. The Generator must NOT implement these. The Evaluator must

 ### 8. Possible Directions & Ideas (Optional)

-Soft suggestions from the Planner. Non-binding. The Generator can take these or ignore them.
+Soft suggestions from the Planner. Non-binding. The Generator can take these or ignore them. This is also the right place for **skill hints** and **reference files** — content that doesn't fit as a requirement but would save the Generator time if surfaced. Frame everything as "worth considering", not "must do".
+
+Typical contents (all optional, all non-binding — use whichever sub-headings fit the mission):
+- **Skills worth considering** — name the style-guide, testing, or domain skills that exist in this project and may help. The Generator decides whether to load them. Actual skill names depend on the project.
+- **Files and folders worth reading** — documentation, prior missions, adjacent patterns.
+- **Protocol or RFC references** — when the spec references a standard, point at it.
+- **Tactical hints** — possible implementation directions, testing approaches, edge cases to be aware of.
+- **Naming notes** — suggestions for tool names, branch names, enum values, with alternatives when any is fine.
+- **Suggested milestones** — a possible implementation sequence.
+
+Example — in a hypothetical auth-API mission:

 ```markdown
 ## Possible Directions & Ideas

-- Consider using the existing `RateLimiter.swift` middleware pattern as
-  a template for the auth middleware
+**Skills worth considering**
+- `<testing-skill-name>` — patterns for the test suite
+- `<api-style-skill-name>` — HTTP handler conventions
+(replace with the skills that actually exist in your project)
+
+**Files worth reading**
+- `Sources/Middleware/RateLimiter.swift` — reference pattern for the new middleware
+- `Documentation/Toolbox.md` — reusable solutions
+
+**Tactical hints**
 - The `swift-jwt` library already in the project supports RS256 natively
-- A `TokenService` actor might be a clean way to encapsulate token
-  creation/validation/rotation logic
+- A `TokenService` actor might cleanly encapsulate token lifecycle

 **Suggested milestones for the Generator:**
 1. Data model and token service
@@ -234,6 +251,8 @@ Soft suggestions from the Planner. Non-binding. The Generator can take these or
 4. Tests
 ```

+**Key framing**: every item in this section is a starting point. If the Generator finds a better approach by reading the codebase, taking that better approach is not a spec violation — the binding content is only what's in Acceptance Criteria, Scope, and Edge Cases. Skill names shown in examples above are placeholders; actual skill names depend on what exists in the project's `.claude/skills/` directory.
+
 ## Principles

 1. **Constrain deliverables, not implementation** — specify WHAT and WHY, never HOW. Re-read "The One Rule" at the top of this file before drafting any section.
@@ -252,6 +271,6 @@ Most well-scoped missions produce specs in the **150–400 line** range. If your
 - An "Implementation Sketch" section (delete it entirely)
 - Acceptance criteria that prescribe call sequences (rewrite as observable outcomes)
 - Transcribed file contents in the Context section (replace with file path + 1 sentence)
-- A "Style Guide" or "Skills to Load" section (delete — that belongs in `TandemKit/Generator.md`, not the spec)
+- A "Style Guide" or mandatory "Skills to Load" section in the spec body (delete — that belongs in `TandemKit/Generator.md`, not the spec). Non-binding skill/file *suggestions* are fine in §8 "Possible Directions & Ideas" — the distinction is mandate vs. suggestion, not whether skills can be mentioned at all.

 Before finalizing: scan your spec for any code block > 5 lines. For each one, ask "could the Generator have written this themselves after reading the codebase?" If yes — delete it.
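
The mandate-vs-suggestion rule introduced in the planner files could in principle be checked mechanically before a spec is finalized. The Python sketch below is one possible approach, not something this commit ships. It assumes specs use Markdown `#` headings, that the non-binding section is recognized by one of the two titles quoted in the diff, and that a mandate looks like the literal phrase `MUST load`. The function name `find_skill_mandates` is hypothetical.

```python
import re

# Titles of the non-binding section, as quoted in the commit (compared lowercase).
NON_BINDING_TITLES = (
    "possible directions & ideas",
    "context the generator might find useful",
)
# Assumption: a mandate is spelled "MUST load"; real specs may phrase it differently.
MANDATE = re.compile(r"\bMUST\s+load\b")
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def find_skill_mandates(spec_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) for each mandate outside the non-binding section."""
    findings = []
    non_binding_level = None  # heading depth of the non-binding section while inside it
    for number, line in enumerate(spec_text.splitlines(), start=1):
        heading = HEADING.match(line)
        if heading:
            level = len(heading.group(1))
            title = heading.group(2).lower()
            if any(name in title for name in NON_BINDING_TITLES):
                non_binding_level = level
            elif non_binding_level is not None and level <= non_binding_level:
                non_binding_level = None  # a sibling or parent heading ends the section
            continue
        if non_binding_level is None and MANDATE.search(line):
            findings.append((number, line.strip()))
    return findings
```

The sketch only locates mandate phrasing in the binding part of a spec. It does not judge whether wording inside §8 is framed softly enough, and it does not skip `#` lines inside fenced code blocks.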
