Releases: FlineDev/TandemKit
1.4.0
Hand off to fresh Generator and Evaluator sessions, and ask the Developer about committing through a structured prompt.
Changed
- Step 5 hands off to fresh Claude sessions — the Planner no longer tells the Developer "stay in this session and rename to Generator", which dragged investigation + Codex-convergence tokens into the Generator's long implementation loop. Each role now starts in a clean session with its own token budget.
- Drops pre-emptive rename copy-paste blocks at handoff — the Generator and Evaluator skills already print their own rename block as the first thing they output on invocation, so the Planner was duplicating that work. Step 5 now just notes that each role will print its own rename for the Developer to copy-paste.
- Step 4 step 29 →
AskUserQuestion— the commit prompt is no longer a plain-text "Optionally ask: ..." line. It's now a realAskUserQuestioncall with three options: Commit and push (Recommended), Commit only (no push), or Skip and let the Generator bundle the planning artifacts. The message body must follow the project'sgit.commitConventions.
Full Changelog: 1.3.1...1.4.0
1.3.1
Fixed
Mission Completecloseout protocol — restructured as an atomic five-action sequence (writeSummary.md→ flipState.json→ clearConfig.json currentMission→ present in chat → ask about committing) with an explicit trigger gate, a five-item pre-flight checklist, and hardcoded literal field values. Closes a gap where, across four real closed missions,Summary.mdwas skipped 4/4, the commit-question skipped 4/4,currentMissionleft dangling 2/4, and once the Generator self-authorized closure on aPASS_WITH_FINDINGSverdict and wrotecompletedBy: "generator".After Receiving Evaluationclarification — added one line stating thatPASS/PASS_WITH_GAPS/PASS_WITH_FINDINGSverdicts hand off to Review Briefing → user-review, never to Mission Complete. Generator-initiated closure is now explicitly forbidden in the trigger gate.
Full Changelog: 1.3.0...1.3.1
1.3.0
Project name in session renames, a sharper Review Briefing focus, and a hard cap on mission name length.
Added
projectNamein Config.json — auto-detected during/tandemkit:initfrom the git-root basename, with a walk-up for generic umbrella folders (App,Server,Client,iOS,macOS,Android, etc.). Used by Planner, Generator, and Evaluator on every session rename.- New session rename format —
<Project>: <Role> (M-<NNN>)(e.g.MyApp: Generator (M-005)) replaces the previous emoji-prefixed format so TandemKit sessions are recognizable across projects in the session picker. - Operational reality check in the Review Briefing — before declaring a round done, the Generator walks Spec ACs + AGENTS.md operational rules and surfaces honest gaps (release / distribution steps, production deploy, live integration runs).
Changed
- Review Briefing focus — flipped from "What you should test" (assignment back to the user) to "What I confirmed works" + "Gaps". Past-tense declarative bullets with concrete evidence per behavior; the user spot-checks at their discretion rather than receiving a test plan.
- Mission name cap — hard limit of 2–3 PascalCase components, with a counting rule and pre-flight enforcement before the candidate name is presented. Consecutive uppercase letters in an acronym count as one component (
APIinFixAPIEndpoint= 1). Projects can tighten further viaPlanner.mdbut cannot loosen.
Full Changelog: 1.2.0...1.3.0
1.2.0
Shared Assets/ folder for mission verification captures, a three-option git commit policy for TandemKit metadata, and commit/PR text that no longer leaks the tandem process.
Added
Assets/folder per mission — flatTandemKit/NNN-MissionName/Assets/where both Generator and Evaluator save verification captures. Filenames followR{NN}-<Role>-<Slug>.<ext>; locale-specific captures append a BCP-47 2-letter code (R01-Gen-Before-en.webp,R01-Gen-Before-de.webp) — never a spelled-out language name.- Three-option git commit policy in
/tandemkit:initQuestion 7:all,text-only(default), ornone. Stored asgit.tandemKitCommitin Config.json; Step 7 writes the matching.gitignore. namingConventionfield in Config.json — auto-detected during init (PascalCase / camelCase / kebab-case / snake_case); governs mission folder names and Assets filenames.
Changed
- Evaluator reads Generator captures as primary evidence — cites file paths in round reports; only re-captures when the Generator's file is insufficient (covered element, wrong crop, missing post-interaction state).
- Commit and PR text no longer leak process — milestone, final, and PR commits don't mention TandemKit, the role names, missions, rounds, or FAIL/PASS iterations. Single exception: the optional post-mission commit of
TandemKit/NNN-MissionName/text files may reference "mission files". ProjectTandemKit/Generator.mdmay override. - PR description template — reuses
Assets/screenshots for before/after tables; branches ongit.tandemKitCommit(raw URL when committed, drag-drop upload when not).
Full Changelog: 1.1.2...1.2.0
1.1.2
Added
- Codex stall detection — Planner and Evaluator share a five-rule protocol: work on Claude's own investigation in parallel instead of idling; do a 10-minute liveness check against the Agent JSONL transcript mtime under
/private/tmp/claude-501/.../subagents/agent-<id>.jsonl(≥5-minute gap = stalled); abandon Codex after 20 minutes regardless of liveness; validate the target file exists at >500 bytes with an mtime newer than the Agent launch before trusting "completed"; fall back to Claude-only with aCodex-NN.mdplaceholder stating the reason (rate limit, quota, mid-write stall, liveness failure, 20-minute ceiling).
Changed
- Stalled-round fallback — Planner's placeholder lets Step 2 continue with Claude only and tells the user why. Evaluator's placeholder copies Claude's evaluation as the final
Round-NN.mdand signals the Generator so the loop keeps moving. Neither skill retries Codex within the same round or session — stalls don't self-heal in minutes.
Full Changelog: 1.1.0...1.1.2
1.1.1
Changed
- Skill names are illustrative, not mandated —
/tandemkit:initandApplePlatform.mdno longer hardcodemacos-peekaboo/macos-accessibility-idsas universal. Every reference now reads "scan~/.claude/skills/for what's installed". The underlying techniques work regardless of which helper skills exist. - Spec §8 allows non-binding skill and file hints — mandates in the spec body (e.g. "the Generator MUST load
<skill>") remain banned; only Acceptance Criteria and Scope bind. But §8 "Possible Directions & Ideas" may list skills, files, and tactical hints framed as starting points the Generator can ignore. Documented inSpec-Format.mdwith a worked example.
Full Changelog: 1.1.0...1.1.1
1.1.0
Atomic signal protocol and a new unstick script make Generator/Evaluator handoffs deadlock-resistant.
Added
- Atomic SIGNAL protocol — every handoff is now an explicit two-step atomic operation: flip State.json, then immediately launch
wait-for-state.shvia the Bash tool withrun_in_background: true. Both halves must happen before the response ends. Pre-flight checklist, template, and reasoning (only arun_in_background: truetask completion fires a<task-notification>that crosses turn boundaries — foreground polls die with the current response) live in each SKILL.md. scripts/unstick.sh— reads State.json, counts live watchers per side, prints the at-fault side plus the next action.--touchrefreshes theupdatedtimestamp to re-fire any live watcher on the other side.- "Why did you stop?" decision tree — both skills document the four cases (you at fault, other at fault with your watcher alive, other at fault with your watcher dead, your watcher alive but other side stuck) and the right response for each.
Changed
- Role files require the Signal Protocol reminder —
/tandemkit:initmandates every new project'sGenerator.mdandEvaluator.mdinclude a verbatim top-of-file reminder pointing at the Signal Protocol in SKILL.md and atscripts/unstick.shfor recovery. Existing projects can opt in by pasting the block.
Full Changelog: 1.0.8...1.1.0
1.0.8
Changed
- Race-free state watchers — removed the
--round Nflag fromwait-for-state.sh. Both agents reset the other side's status field at the start of each round (Generator writesevaluatorStatus=pendingwhen signallingready-for-eval; next Generator round-start flipsgeneratorStatus=workingbefore returning toready-for-eval), so a field-value match is always the current round's signal. Fixes hangs under never-stop Generator mode where the Generator could race past the expected round between Evaluator wake events.
Added
- Contributor guide — new
AGENTS.mdat the repo root documents the release process (bumpplugin.json, bare-version tag, matching GitHub release), version-pick rules, the per-type past-release style check, and the no-mid-sentence-line-breaks rule for Markdown.
Full Changelog: 1.0.7...1.0.8
1.0.7
Added
- Android and Flutter project detection in
/tandemkit:init— detectsbuild.gradle(.kts),settings.gradle(.kts),AndroidManifest.xml, andpubspec.yaml. Matching projects get a new evaluation strategy centered on Google's official Android CLI and the companionandroid/skillsrepository. Install is user-driven: init explains the tool, links the download, and waits for "done" before runningandroid init. skills/evaluator/strategies/Android.md— build/test/emulator/screenshot workflow, the Android CLI command surface, the companion skill install flow, Flutter integration, and fallbackadb+gradlewrecipes.- Flutter sanity check —
flutterCLI verified for matching projects; iOS-targeting Flutter apps also run the Apple-platform setup.
Changed
- Simplified Peekaboo setup —
ApplePlatform.mdand/tandemkit:initdrop the custom~/.local/binbuild-from-source dance for beta4. Peekaboo is now introduced with a plainbrew install peekaboo, permissions grant, and one-timepeekaboo daemon start. Only evergreen quirks remain (accessibility-identifier requirement, fuzzy-click scoping, focus-stealing).
Full Changelog: 1.0.6...1.0.7
1.0.6
Changed
- Self-healing
latestsymlink — every TandemKit script refreshes the plugin cache'slatestsymlink at the top of each run, and all skills switched to absolute paths under~/.claude/plugins/cache/FlineDev/tandemkit/latest/scripts/. Works whether the skill was loaded via the plugin system or via Codex's~/.agents/skills/symlinks. - Stricter Planner approval — the clickable
file:///link to the latestClaude-NN.mddraft is now a hard requirement at every approval step. The last line beforeAskUserQuestionis always a brief plain-text prompt so the picker overlay can't hide the link.
Fixed
- Codex availability detection — empty output or a zero-byte
Codex-NN.mdno longer triggers the fallback; only explicit error messages do. The launch step also documents thatrun_in_background: trueon the Agent tool is the only correct backgrounding mechanism — passing--backgroundto the Codex CLI produces an instantly-"completed" Agent with an empty findings file.
Full Changelog: 1.0.5...1.0.6