Commit 7ade690

rubenmarcus and claude authored
refactor(loop): consolidated round-2 improvements (#188)
* fix(loop): improve validation for greenfield builds

- Reset circuit breaker when tasks advance (prevents false positives during multi-task greenfield builds where early tasks can't pass tests)
- Add package manager detection: auto-detect pnpm/yarn/bun from lockfiles and packageManager field instead of hardcoding npm
- Add validation warm-up: skip validation until enough tasks are done for greenfield builds (auto-detected, configurable via --validation-warmup)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): add stall detection and improve early termination

- Track file changes across all iterations (not just iteration 1)
- Stop loop after 2 consecutive idle iterations (no file changes)
- Check IMPLEMENTATION_PLAN.md for pending tasks in all modes, not just when task string mentions the plan file
- Lower default max-iterations from 10 to 7 when no plan file exists

Fixes loops running all iterations for simple tasks where the agent finishes early but the loop doesn't detect completion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): dynamic iteration calculation from spec content

When no IMPLEMENTATION_PLAN.md exists, estimate task count from the spec content by analyzing structural elements (headings, bullet points, numbered lists, checkboxes). This replaces the static default of 7 with a data-driven estimate.

For the pet shop issue (#86): 4 headings + 12 bullets → ~5 iterations instead of the old static 10.
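The package-manager detection described in the first entry above (lockfiles plus the `packageManager` field) could look roughly like the sketch below. It is illustrative only and not the code from this commit: the function name `detectPackageManager`, its inputs, and the precedence order among lockfiles are assumptions.

```typescript
type PackageManager = 'npm' | 'pnpm' | 'yarn' | 'bun';

// Lockfile names written by each tool; the order is an assumed precedence.
const LOCKFILES: ReadonlyArray<[string, PackageManager]> = [
  ['pnpm-lock.yaml', 'pnpm'],
  ['yarn.lock', 'yarn'],
  ['bun.lockb', 'bun'],
  ['bun.lock', 'bun'],
  ['package-lock.json', 'npm'],
];

/**
 * Hypothetical helper: prefers the `packageManager` field of package.json
 * (for example "pnpm@9.1.0"), then lockfiles found in the project root,
 * and finally falls back to npm.
 */
function detectPackageManager(
  rootFiles: readonly string[],
  packageManagerField?: string
): PackageManager {
  if (packageManagerField) {
    const name = packageManagerField.split('@')[0];
    if (name === 'npm' || name === 'pnpm' || name === 'yarn' || name === 'bun') {
      return name;
    }
  }
  for (const [file, pm] of LOCKFILES) {
    if (rootFiles.includes(file)) return pm;
  }
  return 'npm';
}
```

Taking the file list and the field as plain arguments keeps the decision logic separate from disk access, which makes it straightforward to unit-test.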
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(loop): add Ralph Playbook prompt engineering to loop context

- Add loop-aware preamble to every iteration with key Ralph Playbook language patterns: "study" not "read", "don't assume not implemented", "no placeholders or stubs", and AGENTS.md self-improvement
- For unstructured specs (no task headers), instruct agent to create IMPLEMENTATION_PLAN.md as first action instead of generic "implement all features" prompt
- Add spec file references in iterations 2+ so agent can re-read requirements from specs/ directory
- Add plan-creation reminder for later iterations without structured tasks
- Use playbook language in structured spec prompt too

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ux): improve loop UX with ASCII art, smart directory, and calm warnings

- Show Ralph ASCII art in run command via showWelcomeCompact() instead of plain text header
- Smart project location: detect existing project markers (package.json, .git, Cargo.toml, etc.) and default to "Current directory" when found
- Fix type:'list' → type:'select' for inquirer v13 compatibility in project location prompt (same bug fixed across 8 files previously)
- Replace scary [WARNING] silence message with calm chalk.dim status: "Agent is thinking..." at 30s, "Still working..." at 60s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): remove iteration delay and fix validation feedback mutation

Speed:
- Remove unnecessary 1-second sleep between loop iterations — saves ~1s per iteration (25s on a 25-iteration loop)

Bug fix:
- Fix validation feedback mutation that defeated context trimming. The executor was appending compressed errors to `taskWithSkills` (line 868), accumulating old validation errors across iterations. Now stores feedback in a separate variable and passes it through the context builder's existing `validationFeedback` parameter, which was previously passed as `undefined` (dead code). The context builder already handles per-iteration compression (2000 chars for iter 2-3, 500 for 4+).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): fix progress status bug, deduplicate completion detection, improve error hashing

- Fix progress entry always recording 'completed' even for non-done iterations (was ternary with identical branches). Now records 'partial' for iterations that didn't complete.
- Merge detectCompletion() and getCompletionReason() into single-pass detectCompletionWithReason() to eliminate duplicate analyzeResponse() calls per iteration.
- Remove unused _validationPassed variable.
- Improve circuit breaker error hashing: only normalize file:line:col locations, timestamps, hex addresses, and stack traces — preserving semantically meaningful content so different errors (e.g. "port 8000 in use" vs "file not found") hash differently.
- Add 'partial' status to ProgressEntry type with status badge.
- Update circuit breaker tests for new normalization behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* perf(loop): memoize parsePlanTasks with mtime cache, parallelize agent detection

- Add mtime-based caching to parsePlanTasks() — the same IMPLEMENTATION_PLAN.md file was being read and regex-parsed 4 times per iteration (init, progress check, completion check, display). The cache returns the stored result if the file's mtimeMs hasn't changed, eliminating ~75 redundant file reads across a 25-iteration loop.
- Parallelize agent detection in detectAvailableAgents() — each agent check spawns an independent subprocess (e.g. `claude --version`). Running them with Promise.all() instead of sequential for/of cuts startup time from ~2-3s to <1s.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(loop): add cost ceiling, run all validators, output size limit, configurable timeout

Safety:
- Add maxCost option to CostTracker and LoopOptions — the loop checks isOverBudget() before each iteration and exits with 'cost_ceiling' reason if exceeded. Prevents unexpected charges on long-running loops.
- Add output size limit (default 50MB) in agent runner — truncates to last 80% of buffer if exceeded, preventing OOM from verbose agent output.

UX:
- Run all validation commands instead of stopping at first failure — the agent now sees lint AND test AND build failures in a single pass, enabling multi-fix iterations instead of fix-one-rerun-fix-another chains.

Configuration:
- Add agentTimeout option to LoopOptions (default: 5 min) — propagated to agent runner's timeoutMs. Complex tasks can set longer timeouts.
- Add 'cost_ceiling' to LoopResult exit reasons.
- Add 'partial' status to ProgressEntry for non-done iterations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(circuit-breaker): normalize timestamps before :line:col pattern

Move timestamp regex before the :\d+:\d+ replacement. Previously, a timestamp like "14:07:39" would match :\d+:\d+ first, mangling it to "14:N:N" so the timestamp regex could never match. This caused same errors with different timestamps to hash differently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(task-counter): add double-stat guard against TOCTOU race in plan cache

The file could change between stat (cache check) and readFileSync. Now stat before and after reading: only cache if both mtimes match, preventing stale content from being cached with a new mtime.
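The mtime cache from the `perf(loop)` entry and the double-stat guard from the `fix(task-counter)` entry above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the actual `parsePlanTasks()` code: the `PlanFs` interface and `createPlanCache` are hypothetical names, introduced so the example runs without touching disk. In Node, `fs.statSync(path).mtimeMs` and `fs.readFileSync(path, 'utf8')` could back the interface.

```typescript
// Minimal file-system surface so the sketch stays self-contained.
interface PlanFs {
  mtimeMs(path: string): number;
  read(path: string): string;
}

interface CacheEntry<T> {
  mtimeMs: number;
  value: T;
}

// Hypothetical cache wrapper around an arbitrary parse function.
function createPlanCache<T>(fs: PlanFs, parse: (content: string) => T) {
  const cache = new Map<string, CacheEntry<T>>();

  return function readPlanCached(path: string): T {
    const before = fs.mtimeMs(path);
    const hit = cache.get(path);
    if (hit && hit.mtimeMs === before) {
      return hit.value; // file unchanged since it was cached
    }

    const value = parse(fs.read(path));
    const after = fs.mtimeMs(path);

    // Double-stat guard: cache only when the file did not change while it
    // was being read, so stale content is never stored under a newer mtime.
    if (before === after) {
      cache.set(path, { mtimeMs: after, value });
    } else {
      cache.delete(path);
    }
    return value;
  };
}
```

When the two stats disagree, the freshly parsed value is still returned but not stored, so the next call simply re-reads the file.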
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make output truncation repeatable and include stderr in byte accounting

Addresses PR #185 review feedback:
- Remove outputTruncated flag so truncation can fire more than once
- Reset outputBytes after truncation to prevent counter drift
- Include stderr data in byte accounting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use full Ralph ASCII art in run command instead of compact version

The compact RALPH_WELCOME_SMALL looked out of place compared to the full RALPH_FULL art used in the wizard. Use showWelcome() consistently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(loop): always-on build validation for greenfield projects

Build validation (build + typecheck) now runs after every iteration regardless of the --validate flag. This catches broken builds early:
- Missing file imports (components that don't exist yet)
- PostCSS/Tailwind misconfiguration
- TypeScript compilation errors

Key changes:
- Add detectBuildCommands() with AGENTS.md > package.json > tsc fallback
- Add runBuildValidation() with 2-min timeout (vs 5-min for full)
- Re-detect build commands per iteration for greenfield projects
- Skip when --validate already covers build/typecheck (no double-run)
- Add preamble rules: "create files before importing" + "verify compilation"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(loop): filesystem change detection, directory anchoring, greenfield skills

- Add filesystem-based change detection as primary method (git-independent)
- Add getHeadCommitHash() and hasIterationChanges() for git-based secondary detection
- Remove hasChanges gate from build/full validation (unconditional after iter 1)
- Relax stall detection threshold (3 idle + i > 3)
- Add directory anchoring rule to preamble (prevent nested project dirs)
- Strengthen Tailwind v4 rules with exact setup instructions
- Enable skills auto-install by default for greenfield projects (no package.json)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(loop): high-value fixes from code review

context-builder.ts:
- Fix wasTrimmed bug (was always true for iterations > 1)
- Replace unsafe prompt.slice() with semantic trimming at paragraph boundaries
- Section-aware feedback compression (keep first complete section, summarize rest)

task-counter.ts:
- Protect cache from consumer mutation via deep-clone
- Extract MAX_ESTIMATED_ITERATIONS constant (was magic number 25)

task-executor.ts:
- Don't cascade previousBranch on failure (prevents branching from broken state)
- Populate result.cost from loop cost stats (was dead field)

executor.ts:
- Task-aware stall detection (reset idle counter on task progress, not just file changes)
- Post-iteration cost ceiling check (prevents starting expensive iteration over budget)
- Reorder completion detection: cheap checks first, expensive semantic analysis last

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(skills): replace broken CLI search with skills.sh HTTP API

npx skills find is an interactive fzf UI that returns garbage when piped programmatically. Replace with skills.sh search API (https://skills.sh/api/search) which returns real repos with install counts. Enable auto-install by default.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): improve resilience and design quality

Loop resilience:
- Increase default iterations 7→10, minimum 3→5, buffer 2→3
- Validation failures no longer count as idle (agent is debugging)
- Relax stall threshold for larger projects (4 idle for 5+ tasks)

Design quality:
- Add anti-AI-aesthetic rules to hard preamble (bans purple gradients, Inter/Roboto fonts, glass morphism)
- Expand skill auto-apply triggers (page, dashboard, app, shop, store)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(skills): skip install when relevant skills exist, filter irrelevant results

- Check installed skills first: if frontend-design is already installed and relevant to the task, show "Using installed skills:" and skip API
- Add negative keyword filtering: react-native, mobile, ios, android, flutter etc. are filtered out for standard web projects
- Use detectClaudeSkills() for comprehensive installed-skill detection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(skills): install complementary skills, expand queries, max 5

- Don't early-return when skills are installed — always search for complementary ones (e.g., react-best-practices alongside frontend-design)
- Add React/Vue/Svelte-specific queries (best practices, composition)
- Auto-add SEO query for landing/marketing pages
- Increase max skills from 2 to 5
- Boost scoring for best-practices, composition, guidelines skills

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): dynamic iteration budget + ban dev server in loop

Dynamic iterations:
- maxIterations now recalculates when agent expands the plan (e.g., spec has 3 tasks → agent creates 8 → budget adjusts to 11)
- Fixes premature "max_iterations" exit on greenfield projects

Ban dev server:
- Preamble now explicitly says "NEVER start a dev server" and to use npm run build instead. Dev servers block forever, create zombie processes, and eat up ports (5173, 5174, 5175...) across iterations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(fix): add fix command with design-aware skill detection and visual verification

- Add `ralph-starter fix` command for autonomous build/design fixing
- Fix skill re-installation bug: normalize skill IDs (spaces vs hyphens) for dedup
- Fix design skills not applied: add CSS/visual keywords to task detection
- Add visual verification instructions for design tasks (web-design-reviewer skill)
- Tiered validation: lint on intermediate iterations, build on final iteration
- Extend loop by 2 iterations when build fails on final iteration
- Fix TOCTOU race condition in task-counter.ts (CodeQL alert #164)
- Ban manual dev server in loop preamble (loop handles validation)
- Export shared WEB_TASK_KEYWORDS to eliminate keyword list divergence

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add fix command to README, CLI docs, and llms.txt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: regenerate SEO/AEO artifacts from docusaurus build

Rebuild docs and sync auto-generated files (llms.txt, llms-full.txt, docs.json, ai-index.json, sidebar.json, sitemap.xml, docs-urls.txt) to include the new fix command documentation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(skills): cap skill installation + filter prompt to top 5 relevant

- Add installation ceiling: skip API search when >=3 relevant skills exist
- Filter formatSkillsForPrompt to only task-relevant skills, capped at 5
- Fix stall detection: lastValidationFeedback is a string (never null), so `!== null` was always true — use `!!` for correct falsy check

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(context): spec-adherent preamble + include spec summary in iterations 2+

The context builder was dropping all spec content after iteration 1, causing "spec amnesia" where the agent lost sight of design requirements. Also, the preamble only had negative design guidance ("NEVER use...") with no positive instruction to follow the spec faithfully.

- Add buildSpecSummary() to read specs/ directory for later iterations
- Rewrite design section: spec is now "FIRST PRIORITY" source of truth
- Include spec summary in iterations 2-3 and truncated hint in 4+
- Add dev server exception clause for visual verification flows

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): pass spec summary to context builder for iterations 2+

Wire up buildSpecSummary() in the executor so the context builder can include abbreviated spec content in later iterations, preventing the agent from losing sight of design requirements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(skills): add maxSkills parameter to formatSkillsForPrompt

Allow callers to cap the number of skills included in the prompt. Used by the --design flag to limit to 3-4 focused design skills instead of the default 5.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(fix): add --design flag for visual-first design fix flow

The fix command was losing all original design context — the agent only saw the custom task string and build errors, never the spec or plan. This caused design fixes to be guesswork rather than spec-adherent.

Changes:
- Include specs/ and IMPLEMENTATION_PLAN.md content in fixTask so the agent knows what "correct" looks like
- Add --design flag: structured screenshot → analyze → plan → fix flow with 3 viewport breakpoints (desktop/tablet/mobile)
- Bump default iterations: 7 for --design, 5 for design keywords, 3 default
- Clarify dev server override for visual verification
- Register --design option in CLI

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(validation): use detected package manager in detectBuildCommands

detectBuildCommands was hardcoding `npm run build` instead of using the project's actual package manager (pnpm/yarn/bun). This caused build validation to fail in projects that enforce a specific pm.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(fix): don't bail early when --design flag is set

The fix command was exiting with "nothing to fix!" when build checks passed and no custom task was given. But --design targets visual issues that build checks can't detect, so it should always proceed to the screenshot/analysis flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(fix): cap skills to 4 in --design mode + screenshot-first prompt

Three issues fixed:
- Skills were showing "25 detected" because maxSkills wasn't threaded through LoopOptions to formatSkillsForPrompt. Now --design caps to 4.
- Startup display now shows "4 active (25 installed)" instead of raw count
- Design prompt now forcefully instructs the agent to start with dev server + screenshots as the VERY FIRST action, ignoring IMPLEMENTATION_PLAN.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(fix): rewrite --design prompt to catch structural issues first

The design fix prompt was too vague — "layout/spacing problems" led the agent to suggest padding tweaks instead of catching obvious structural issues like content not being centered or huge empty gaps.

Rewritten Phase 2 (Issue Identification) to:
- Prioritize page structure (centering, containers, max-width) over cosmetic
- Check for content pinned to edges, broken grid layouts, unbalanced columns
- Require CONCRETE issues visible in screenshots, not generic improvements

Rewritten Phase 3 (Fix Plan) to:
- Require exact file + CSS property for each fix
- Focus on minimal fixes, not redesigning entire components

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(skills): reduce skill bloat in fix command

Pass user's custom task text (not the full generated prompt) to autoInstallSkillsFromTask. The --design prompt contains dozens of CSS/design keywords that triggered excessive skill search queries, causing skills to accumulate globally (25+ after a few runs). Also lower MAX_SKILLS_TO_INSTALL from 5 to 3 to cap accumulation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): skip IMPLEMENTATION_PLAN.md instructions for fix --design

The preamble said "Study IMPLEMENTATION_PLAN.md and work on ONE task" which directly conflicted with the --design prompt's "Ignore IMPLEMENTATION_PLAN.md — this is a visual fix pass." The preamble appeared first and won, confusing the agent. Add skipPlanInstructions option that replaces plan-related rules with "This is a fix/review pass" when active. Set from fix --design.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* perf(fix): reduce default iterations for design tasks

fix --design: 7 → 5 (5-phase structure should complete in 3-4 iters)
isDesignTask: 5 → 4 (visual tasks with keyword detection)

Reduces worst-case wall time from 35min to 25min.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(loop): add iteration log for inter-iteration memory

Each iteration now appends a summary to .ralph/iteration-log.md with status (validation passed/failed), whether files changed, and agent summary text. On iterations 2+, the last 3 entries are included in the prompt as "## Previous Iterations" so the agent knows what was already tried and can avoid repeating failed approaches.

This is a lightweight alternative to full session continuity (--resume) which is deferred to 0.3.1.
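The "last 3 entries" behaviour of the iteration log described above could be assembled along these lines. This is a sketch under stated assumptions: the `IterationLogEntry` shape and the function name are hypothetical, since the commit message does not show the actual on-disk format of `.ralph/iteration-log.md`.

```typescript
// Hypothetical shape of one log entry (status, file changes, summary text).
interface IterationLogEntry {
  iteration: number;
  validationPassed: boolean;
  filesChanged: boolean;
  summary: string;
}

// Builds the "## Previous Iterations" prompt section from the most recent
// entries (three by default). Returns an empty string when the log is empty.
function buildPreviousIterationsSection(
  entries: readonly IterationLogEntry[],
  maxEntries = 3
): string {
  if (entries.length === 0) return '';
  const recent = entries.slice(-maxEntries);
  const lines = recent.map((e) => {
    const status = e.validationPassed ? 'validation passed' : 'validation failed';
    const changes = e.filesChanged ? 'files changed' : 'no file changes';
    return `- Iteration ${e.iteration}: ${status}, ${changes}. ${e.summary}`;
  });
  return ['## Previous Iterations', ...lines].join('\n');
}
```

Capping the section at a few entries keeps the prompt small while still telling the agent which approaches have already failed.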
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(loop): better header labels + subtask tree display

- Loop header shows "Design Fix", "Fix", or agent name based on fixMode instead of always showing "Running Claude Code"
- Subtask tree renders below header when current task has subtasks:
    [x] Create hero component
    [ ] Add responsive styles
- Add fixMode option to LoopOptions ('design' | 'scan' | 'custom')

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(loop): kill orphaned dev servers after design iterations

After each iteration in design mode, check ports 3000/5173/4321/8080 for orphaned dev server processes and SIGTERM them. This prevents resource leaks when the agent crashes or times out without cleaning up the dev server it started for visual verification.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(session): persist validation feedback for resume

Add lastValidationFeedback field to SessionState so that when a session is paused and later resumed, the agent gets the last validation errors as context. The resume command now passes this as initialValidationFeedback to runLoop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(wizard): add uiLibrary field to TechStack interface

Add uiLibrary as an optional field in TechStack to support UI component library selection (shadcn/ui, shadcn-vue, shadcn-svelte, MUI, Chakra). Updated normalizeTechStack, hasTechStack, and the wizard summary display.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(wizard): default to shadcn + tailwind + motion-primitives for web projects

When no UI library/styling is specified, web projects now default to:
- Tailwind CSS for styling
- shadcn/ui (React/Next.js), shadcn-vue (Vue), or shadcn-svelte (Svelte)

Updated REFINEMENT_PROMPT to include uiLibrary field and guidance for the LLM to suggest this default stack. Template fallback also sets these defaults when the LLM is unavailable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(wizard): rich spec + AGENTS.md generation with UI stack details

- Add uiLibrary to spec display (A4)
- Add Tailwind v4 setup instructions to AGENTS.md including cascade layers warning and explicit "no manual CSS resets" guidance (B1)
- Add shadcn/ui + motion-primitives setup instructions to AGENTS.md (B1)
- Add Setup Notes section to spec with Tailwind v4 + UI library details to prevent CSS cascade conflicts (B2)
- Add formatTech entries for shadcn, MUI, Chakra, motion-primitives

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(fix): improve --design loop with DESIGN_VERIFIED token + cascade check

- C1: Require DESIGN_VERIFIED completion token for design mode, disable legacy "All tasks completed" markers via requireExitSignal
- C2: Update Phases 4-5 to instruct agent to emit DESIGN_VERIFIED only after taking verification screenshots
- C3: Increase default design iterations from 5 to 7 for fix+verify cycles
- C4: Add CSS cascade conflict check as priority 0 in Phase 2 — detects the "spacing broken + colors working" pattern caused by unlayered CSS overriding Tailwind v4 @layer utilities

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(loop): design mode stall detection + conditional completion instruction

- D1: Credit screenshot/viewport activity as productive progress in design mode, preventing stall detector from killing analysis iterations
- D2: Suppress "All tasks completed" instruction for design mode (skipPlanInstructions=true), replacing with "Follow the completion instructions in the task below" to avoid conflicting with DESIGN_VERIFIED

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add --design flag, UI defaults, and changelog for beta.17

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
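The ordering problem fixed by the `fix(circuit-breaker)` entry in the message above can be reproduced with a small sketch. The regular expressions and placeholder tokens here are assumptions chosen for illustration; the project's real normalization patterns are not shown in this commit view.

```typescript
// Illustrative patterns only, not the project's actual regexes.
const TIMESTAMP = /\b\d{2}:\d{2}:\d{2}\b/g;
const LINE_COL = /:\d+:\d+/g;

// Fixed order: timestamps are replaced first, then :line:col locations.
function normalizeError(message: string): string {
  return message.replace(TIMESTAMP, 'TIME').replace(LINE_COL, ':N:N');
}

// Old order, kept only for comparison: ":07:39" inside "14:07:39" matches
// LINE_COL first, leaving "14:N:N", which TIMESTAMP can no longer match.
function normalizeErrorOldOrder(message: string): string {
  return message.replace(LINE_COL, ':N:N').replace(TIMESTAMP, 'TIME');
}

const first = '[14:07:39] src/app.ts:12:5 Unexpected token';
const second = '[09:15:02] src/app.ts:40:1 Unexpected token';

// With the fixed order both messages normalize to the same string, so they
// would hash identically; with the old order the leftover hour digits differ.
const sameAfterFix = normalizeError(first) === normalizeError(second);
const sameBeforeFix = normalizeErrorOldOrder(first) === normalizeErrorOldOrder(second);
```

Here `sameAfterFix` is `true` and `sameBeforeFix` is `false`, which is the "same errors with different timestamps hash differently" symptom the entry describes.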
1 parent a8edcc3 commit 7ade690

44 files changed

Lines changed: 2778 additions & 309 deletions

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
+---
+name: code-quality-reviewer
+description: "Use this agent when the user wants to improve code quality, review recently written code for issues, refactor existing code, or get suggestions for better patterns and practices. This includes requests to review a file, clean up code, improve readability, reduce complexity, fix code smells, or apply best practices.\\n\\nExamples:\\n\\n- Example 1:\\n user: \"I just finished implementing the new integration class, can you review it?\"\\n assistant: \"Let me use the code-quality-reviewer agent to analyze your new integration class for quality improvements.\"\\n (The assistant launches the code-quality-reviewer agent via the Task tool to review the recently written integration class.)\\n\\n- Example 2:\\n user: \"This function feels messy, can you help clean it up?\"\\n assistant: \"I'll use the code-quality-reviewer agent to analyze the function and suggest improvements.\"\\n (The assistant launches the code-quality-reviewer agent via the Task tool to review and suggest refactoring for the messy function.)\\n\\n- Example 3:\\n user: \"Can you check my recent changes for any issues?\"\\n assistant: \"I'll launch the code-quality-reviewer agent to review your recent changes for potential issues and improvements.\"\\n (The assistant launches the code-quality-reviewer agent via the Task tool to review the recent diff or modified files.)\\n\\n- Example 4 (proactive usage):\\n Context: The user has just written a substantial block of new code.\\n user: \"Okay, I think that feature is done.\"\\n assistant: \"Great! Now let me use the code-quality-reviewer agent to review the code you just wrote and make sure it's solid before we move on.\"\\n (The assistant proactively launches the code-quality-reviewer agent via the Task tool to review the newly written code.)"
+model: opus
+color: green
+memory: project
+---
+
+You are an elite code quality engineer with deep expertise in software craftsmanship, clean code principles, design patterns, and language-specific best practices. You have decades of experience reviewing production codebases across multiple languages and frameworks, with a particular strength in TypeScript/JavaScript ecosystems. You approach code review with a constructive, educational mindset—your goal is not just to identify issues but to help developers understand *why* something should change and *how* to make it better.
+
+## Core Responsibilities
+
+You review recently written or modified code and provide actionable, prioritized feedback to improve its quality. You focus on code that was recently changed or written, not the entire codebase, unless explicitly asked otherwise.
+
+## Review Methodology
+
+When reviewing code, systematically evaluate these dimensions in order of importance:
+
+### 1. Correctness & Bugs
+- Logic errors, off-by-one errors, race conditions
+- Null/undefined handling and edge cases
+- Error handling completeness (are errors caught, logged, and handled appropriately?)
+- Type safety issues (especially in TypeScript: `any` abuse, missing type guards, unsafe casts)
+
+### 2. Security
+- Input validation and sanitization
+- Secrets or credentials in code
+- Injection vulnerabilities (SQL, command, path traversal)
+- Unsafe deserialization or eval usage
+
+### 3. Architecture & Design
+- Single Responsibility Principle violations
+- Inappropriate coupling between modules
+- Missing abstractions or over-abstraction
+- Consistency with existing codebase patterns
+- Proper separation of concerns
+
+### 4. Readability & Maintainability
+- Naming clarity (variables, functions, classes, files)
+- Function length and complexity (cyclomatic complexity)
+- Code duplication (DRY violations)
+- Comment quality (missing where needed, excessive where code should be self-documenting)
+- Consistent formatting and style
+
+### 5. Performance
+- Unnecessary computations or allocations
+- N+1 query patterns or inefficient data access
+- Memory leaks (event listeners, subscriptions, closures)
+- Algorithmic complexity concerns
+
+### 6. Testing & Testability
+- Is the code structured to be testable?
+- Are there missing test cases for the logic?
+- Are edge cases covered?
+
+## Output Format
+
+Structure your review as follows:
+
+**Summary**: A 1-3 sentence overview of the code's overall quality and the most important finding.
+
+**Critical Issues** (must fix):
+- Each issue with: location, description, why it matters, and a concrete fix
+
+**Improvements** (should fix):
+- Each suggestion with: location, current state, proposed improvement, and rationale
+
+**Minor Suggestions** (nice to have):
+- Style, naming, or minor readability tweaks
+
+**What's Done Well**:
+- Highlight genuinely good patterns to reinforce positive practices
+
+## Review Principles
+
+1. **Be specific**: Always reference exact lines, functions, or patterns. Never give vague feedback like "improve error handling" without saying exactly where and how.
+2. **Provide fixes, not just complaints**: Every issue should include a concrete code suggestion or clear description of the fix.
+3. **Prioritize ruthlessly**: A review with 3 critical findings is more valuable than one with 30 nitpicks. Lead with what matters most.
+4. **Respect existing patterns**: If the codebase has established conventions, suggest improvements that align with them rather than introducing entirely new patterns.
+5. **Be constructive**: Frame feedback as improvements, not criticisms. Use "Consider..." or "This could be improved by..." rather than "This is wrong."
+6. **Context matters**: Consider the purpose of the code. A quick prototype has different quality standards than a production API endpoint.
+
+## Project-Specific Guidelines
+
+When working in projects with specific coding standards (from CLAUDE.md or similar configuration):
+- Always check for and respect project-specific linting rules, formatting standards, and architectural patterns
+- Verify import styles match project conventions (e.g., ESM imports with `.js` extensions in TypeScript projects)
+- Check that the correct package manager is used in any scripts or commands
+- Ensure changes align with the project's stated priorities and patterns
+
+## Self-Verification
+
+Before delivering your review:
+1. Re-read each finding—is it actionable and specific?
+2. Verify your suggested fixes are syntactically correct
+3. Check that you haven't contradicted yourself
+4. Ensure your priority ordering is correct (critical issues first)
+5. Confirm you've looked at the actual changed/new code, not unrelated files
+
+## Edge Cases
+
+- If the code is too short or trivial for meaningful review, acknowledge this and focus on any improvements that would still add value.
+- If you need more context (e.g., related files, the purpose of the code, or the broader architecture), ask for it before proceeding with assumptions.
+- If the code is generally excellent, say so clearly and focus your review on minor polish items.
+
+**Update your agent memory** as you discover code patterns, style conventions, common issues, architectural decisions, and recurring quality concerns in this codebase. This builds up institutional knowledge across conversations. Write concise notes about what you found and where.
+
+Examples of what to record:
+- Recurring code smells or anti-patterns you've seen across reviews
+- Project-specific conventions and style preferences
+- Architectural patterns and module relationships
+- Common error handling approaches used in the codebase
+- Testing patterns and coverage gaps you've identified
+
+# Persistent Agent Memory
+
+You have a Persistent Agent Memory directory at `/Users/ruben/learn/ralph-starter/.claude/agent-memory/code-quality-reviewer/`. Its contents persist across conversations.
+
+As you work, consult your memory files to build on previous experience. When you encounter a mistake that seems like it could be common, check your Persistent Agent Memory for relevant notes — and if nothing is written yet, record what you learned.
+
+Guidelines:
+- `MEMORY.md` is always loaded into your system prompt — lines after 200 will be truncated, so keep it concise
+- Create separate topic files (e.g., `debugging.md`, `patterns.md`) for detailed notes and link to them from MEMORY.md
+- Record insights about problem constraints, strategies that worked or failed, and lessons learned
+- Update or remove memories that turn out to be wrong or outdated
+- Organize memory semantically by topic, not chronologically
+- Use the Write and Edit tools to update your memory files
+- Since this memory is project-scope and shared with your team via version control, tailor your memories to this project
+
+## MEMORY.md
+
+Your MEMORY.md is currently empty. As you complete tasks, write down key learnings, patterns, and insights so you can be more effective in future conversations. Anything saved in MEMORY.md will be included in your system prompt next time.

README.md

Lines changed: 26 additions & 0 deletions
@@ -451,6 +451,7 @@ This creates:
 |---------|-------------|
 | `ralph-starter` | Launch interactive wizard |
 | `ralph-starter run [task]` | Run an autonomous coding loop |
+| `ralph-starter fix [task]` | Fix build errors, lint issues, or design problems |
 | `ralph-starter auto` | Batch-process issues from GitHub/Linear |
 | `ralph-starter integrations <action>` | Manage integrations (list, help, test, fetch) |
 | `ralph-starter plan` | Create implementation plan from specs |
@@ -561,6 +562,31 @@ ralph-starter run --circuit-breaker-failures 2 "build Y"
 | `--output-dir <path>` | Directory to run task in (skips prompt) |
 | `--prd <file>` | Read tasks from markdown |

+## Options for `fix`
+
+| Flag | Description |
+|------|-------------|
+| `--scan` | Force full project scan (build + lint + typecheck + tests) |
+| `--agent <name>` | Specify agent to use (default: auto-detect) |
+| `--commit` | Auto-commit the fix |
+| `--max-iterations <n>` | Max fix iterations (default: 3) |
+| `--output-dir <path>` | Project directory (default: cwd) |
+
+```bash
+# Fix build/lint errors automatically
+ralph-starter fix
+
+# Fix a specific design/visual issue
+ralph-starter fix "fix the paddings and make the colors brighter"
+
+# Full scan with auto-commit
+ralph-starter fix --scan --commit
+```
+
+For design-related tasks (CSS, colors, spacing, etc.), the fix command automatically:
+- Detects and applies installed design skills
+- Instructs the agent to visually verify changes via browser screenshots
+
 ## Config Commands

 ```bash

docs/docs/cli/fix.md

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
---
sidebar_position: 2
title: fix
description: Fix build errors, lint issues, or design problems
keywords: [cli, fix, command, build errors, lint, design, visual, css]
---

# ralph-starter fix

Fix build errors, lint issues, or design problems.

## Synopsis

```bash
ralph-starter fix [task] [options]
```

## Description

The `fix` command runs a focused AI loop to fix project issues. It scans for build, lint, typecheck, and test failures, then orchestrates a coding agent to fix them automatically.

When given a custom task describing a visual or design problem (e.g., "fix the paddings and make the colors brighter"), the fix command detects CSS/design keywords and:

- Auto-applies installed design skills (frontend-design, ui-ux-designer, etc.)
- Instructs the agent to visually verify changes using browser screenshots
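
The keyword detection described above can be pictured with a minimal sketch. The keyword list and the function name `looksLikeDesignTask` are assumptions made for illustration; the vocabulary the command actually uses is not documented here.

```typescript
// Hypothetical keyword list; the real detector's vocabulary is not shown in these docs.
const DESIGN_KEYWORDS: readonly string[] = [
  "css", "padding", "margin", "spacing", "color", "layout", "responsive", "font",
];

/** Returns true when the task text mentions at least one design keyword. */
function looksLikeDesignTask(task: string): boolean {
  const text = task.toLowerCase();
  // Match at the start of a word so plurals such as "paddings" and "colors" also count.
  return DESIGN_KEYWORDS.some((kw) => new RegExp(`\\b${kw}`).test(text));
}
```

Under these assumptions, "fix the paddings and make the colors brighter" is classified as a design task, while "fix lint errors" is not.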

For structured visual fix passes, use the `--design` flag; see [Design Mode](#design-mode) below.

## Arguments

| Argument | Description |
|----------|-------------|
| `task` | Optional description of what to fix. If not provided, scans for build/lint errors. |

## Options

| Option | Description | Default |
|--------|-------------|---------|
| `--scan` | Force full project scan (build + lint + typecheck + tests) | false |
| `--design` | Structured visual fix mode with screenshot verification | false |
| `--agent <name>` | Specify agent (claude-code, cursor, codex, opencode) | auto-detect |
| `--commit` | Auto-commit the fix | false |
| `--max-iterations <n>` | Maximum fix iterations | 3 (scan), 4 (design keywords), 7 (`--design`) |
| `--output-dir <path>` | Project directory | cwd |
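
The three defaults for `--max-iterations` depend on the mode. The sketch below shows one way they could be resolved; the `FixOptions` shape, the function name, and the precedence order (explicit flag first, then `--design`, then detected design keywords) are assumptions, not the command's confirmed logic.

```typescript
// Hypothetical option shape mirroring the flags in the table above.
interface FixOptions {
  design?: boolean;       // --design
  maxIterations?: number; // --max-iterations <n>
}

/** Picks the iteration budget: an explicit value wins, otherwise the mode default applies. */
function resolveMaxIterations(opts: FixOptions, hasDesignKeywords: boolean): number {
  if (opts.maxIterations !== undefined) return opts.maxIterations;
  if (opts.design) return 7;        // structured design mode
  if (hasDesignKeywords) return 4;  // ad-hoc task with CSS/design keywords
  return 3;                         // scan / build-fix default
}
```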

## Examples

### Fix Build Errors

```bash
# Auto-detect and fix build/lint errors
ralph-starter fix

# Force full project scan
ralph-starter fix --scan
```

### Fix Design Issues

```bash
# Structured visual fix pass (recommended for design work)
ralph-starter fix --design

# Design mode with specific notes
ralph-starter fix --design "the hero section spacing is off and colors are too muted"

# Ad-hoc CSS/design fix (auto-detected as design task)
ralph-starter fix "fix the paddings and make the colors brighter"

# Fix responsive layout
ralph-starter fix "make the layout responsive on mobile"
```

### With Options

```bash
# Auto-commit the fix
ralph-starter fix --scan --commit

# Use a specific agent
ralph-starter fix "fix lint errors" --agent claude-code

# Allow more iterations for complex fixes
ralph-starter fix "fix all test failures" --max-iterations 5

# Design fix with more room to iterate
ralph-starter fix --design --max-iterations 10
```
90+
## Behavior
91+
92+
1. **Error Detection**:
93+
- If `task` provided → runs build check for baseline, then fixes the described issue
94+
- If no task and previous failures exist → re-runs failed validations from `.ralph/activity.md`
95+
- If `--scan` → runs full validation suite (build + lint + typecheck + tests)
96+
97+
2. **Skill Detection**:
98+
- Detects installed Claude Code skills relevant to the task
99+
- For CSS/design tasks → auto-applies design skills and adds visual verification instructions
100+
- Searches skills.sh for complementary skills if needed
101+
102+
3. **Fix Loop**:
103+
- Agent works on fixing issues (default: 3 iterations for scan, 7 for `--design`)
104+
- Lint checks run between iterations (fast feedback)
105+
- Full build check runs on final iteration
106+
- If build fails on final iteration → extends loop by 2 extra iterations
107+
108+
4. **Verification**:
109+
- Re-runs original validation commands after the loop
110+
- Reports success only if all checks pass (not just agent completion)
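
The fix loop and verification steps above can be sketched as a small control-flow skeleton. Every name here (`LoopHooks`, `runFixLoop`, `EXTRA_ITERATIONS`) is hypothetical, and the choice to extend the loop only once is an assumption; the docs say only that a failing final build adds 2 iterations.

```typescript
type Check = () => Promise<boolean>; // resolves true when the check passes

// Hypothetical hooks standing in for the agent and the validation commands.
interface LoopHooks {
  runAgent: (iteration: number) => Promise<void>; // one agent pass
  lint: Check;   // fast check between iterations
  build: Check;  // full build check on the final iteration
  verify: Check; // re-run of the original validation commands
}

const EXTRA_ITERATIONS = 2;

async function runFixLoop(hooks: LoopHooks, maxIterations: number): Promise<boolean> {
  let limit = maxIterations;
  let extended = false;
  for (let i = 1; i <= limit; i++) {
    await hooks.runAgent(i);
    if (i < limit) {
      await hooks.lint(); // in a real loop the lint output would be fed back to the agent
      continue;
    }
    // Final iteration: run the full build and extend the loop if it still fails.
    const buildOk = await hooks.build();
    if (!buildOk && !extended) {
      limit += EXTRA_ITERATIONS;
      extended = true; // assumption: the extension is granted only once
    }
  }
  // Success is decided by the validation commands, not by the agent finishing.
  return hooks.verify();
}
```

With a budget of 3 and a build that keeps failing, this sketch runs the agent 5 times and then reports whatever `verify` returns.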

## Design Mode

The `--design` flag enables a structured visual fix workflow specifically designed for CSS, layout, and styling issues. It runs the agent through a 5-phase process:

### Phase 1: Visual Audit

The agent's **first action** is to start the dev server and take screenshots at 3 viewports:
- Desktop (1440px)
- Tablet (768px)
- Mobile (375px)
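
As a rough illustration, the three viewports could be expressed as data that a screenshot tool iterates over. Only the widths come from the list above; the heights, the `Viewport` type, and the file-naming helper are assumptions added for the example.

```typescript
interface Viewport {
  name: "desktop" | "tablet" | "mobile";
  width: number;  // taken from the list above
  height: number; // assumed; the docs specify widths only
}

const VIEWPORTS: readonly Viewport[] = [
  { name: "desktop", width: 1440, height: 900 },
  { name: "tablet", width: 768, height: 1024 },
  { name: "mobile", width: 375, height: 812 },
];

// Hypothetical naming scheme so audit and verification screenshots can be compared.
function screenshotName(v: Viewport, phase: "audit" | "verify"): string {
  return `${phase}-${v.name}-${v.width}.png`;
}
```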

### Phase 2: Issue Identification

The agent analyzes screenshots against the project spec and checks for issues in priority order:

0. **CSS cascade conflicts**: Detects unlayered CSS resets (e.g., `* { margin: 0; padding: 0; }`) that silently override Tailwind v4 utilities. This is the most common cause of "classes are correct but nothing works."
1. **Page structure**: Content centering, max-width wrappers, empty gaps
2. **Layout & positioning**: Grid/flex rendering, column balance, overlaps
3. **Responsive issues**: Viewport breakage, overflow, clipping
4. **Spacing**: Vertical rhythm, abnormal gaps
5. **Typography & colors**: Font loading, readability, consistency
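
The cascade conflict in item 0 arises because styles declared outside any `@layer` take precedence over layered styles, and Tailwind v4 emits its utilities inside layers. A minimal heuristic for spotting such a reset might look like the sketch below. It is an illustrative scanner with hypothetical names, not the check the agent actually performs, and it makes no attempt to parse CSS fully.

```typescript
/** True when any comma-separated part of the selector is the bare universal selector. */
function isUniversalSelector(selector: string): boolean {
  return selector.split(",").some((part) => part.trim() === "*");
}

/**
 * Returns true if `css` contains a universal-selector rule that zeroes
 * margin or padding outside of any `@layer` block.
 */
function hasUnlayeredReset(css: string): boolean {
  const source = css.replace(/\/\*[\s\S]*?\*\//g, ""); // drop comments
  const stack: boolean[] = []; // one entry per open block: true if it is an @layer block
  let prelude = "";            // text seen since the last `{`, `}`, or `;`
  for (let i = 0; i < source.length; i++) {
    const ch = source.charAt(i);
    if (ch === "{") {
      const head = prelude.trim();
      const insideLayer = stack.includes(true);
      if (!insideLayer && isUniversalSelector(head)) {
        const end = source.indexOf("}", i);
        const body = source.slice(i + 1, end === -1 ? source.length : end);
        // Only plain `margin: 0` / `padding: 0` declarations count as a reset here.
        if (/(^|[;\s])(margin|padding)\s*:\s*0(px)?\s*(;|$)/.test(body)) return true;
      }
      stack.push(head.startsWith("@layer"));
      prelude = "";
    } else if (ch === "}") {
      stack.pop();
      prelude = "";
    } else if (ch === ";") {
      prelude = "";
    } else {
      prelude += ch;
    }
  }
  return false;
}
```

The usual remedy is to move the reset into a layer (for example `@layer base { * { margin: 0; padding: 0; } }`), which this heuristic then no longer flags.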

### Phase 3: Fix Plan

The agent creates a `DESIGN_FIX_PLAN.md` with specific issues, exact files, and CSS properties to change.

### Phase 4: Execute & Verify

Fixes are applied in priority order (structural first, cosmetic last). The agent re-screenshots after each structural fix to verify improvement.

### Phase 5: Completion

The loop requires the agent to output `DESIGN_VERIFIED` after taking final verification screenshots. The loop will **not** accept generic completion signals like "All tasks completed"; it accepts only `DESIGN_VERIFIED` after visual confirmation.
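
The completion rule can be sketched as a small predicate. The function name and the list of generic signals are assumptions; only the `DESIGN_VERIFIED` token and the "All tasks completed" phrase appear in this documentation. A text check like this cannot confirm that screenshots were actually taken, so it covers only the token half of the rule.

```typescript
const DESIGN_TOKEN = "DESIGN_VERIFIED";
// Hypothetical list; only "All tasks completed" is named in the docs.
const GENERIC_SIGNALS: readonly string[] = ["All tasks completed"];

/** Decides whether the agent's output may end the loop. */
function isLoopComplete(agentOutput: string, designMode: boolean): boolean {
  if (designMode) {
    // Design mode ignores generic signals and requires the explicit token.
    return agentOutput.includes(DESIGN_TOKEN);
  }
  // Assumption: outside design mode, a generic completion signal is enough.
  return GENERIC_SIGNALS.some((signal) => agentOutput.includes(signal));
}
```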

### Why Design Mode Exists

Without `--design`, agents often:
- Read code and see "correct" Tailwind classes, then declare victory without visual verification
- Add more CSS classes on top of cascade conflicts instead of fixing the root cause
- Complete in 1 iteration without actually verifying the visual result

Design mode forces visual-first debugging and prevents premature exit.

## Exit Codes

| Code | Description |
|------|-------------|
| 0 | All issues fixed |
| 1 | Could not fix all issues automatically |

## See Also

- [ralph-starter run](/docs/cli/run)
- [ralph-starter skill](/docs/cli/skill)
- [Validation](/docs/advanced/validation)
- [Skills System](/docs/guides/skills-system)

docs/docs/cli/skill.md

Lines changed: 4 additions & 10 deletions
@@ -125,17 +125,11 @@ and included in the agent's prompt context when relevant.

 ## Auto Skill Discovery

-Auto skill discovery is opt-in. When enabled, ralph-starter
-queries the skills.sh registry to find and install relevant
-skills automatically.
+Auto skill discovery is enabled by default. ralph-starter
+queries the skills.sh API to find and install relevant
+skills automatically before each run.

-Enable it by setting:
-
-```bash
-RALPH_ENABLE_SKILL_AUTO_INSTALL=1
-```
-
-You can also force-disable it with:
+To disable it, set:

 ```bash
 RALPH_DISABLE_SKILL_AUTO_INSTALL=1

docs/docs/community/changelog.md

Lines changed: 18 additions & 0 deletions
@@ -11,6 +11,24 @@ All notable changes to ralph-starter are documented here. This project follows [

 ---

+## [0.1.1-beta.17] - 2026-02-14
+
+### Added
+- **`fix --design` mode**: Structured 5-phase visual fix workflow with screenshot verification, CSS cascade conflict detection, and `DESIGN_VERIFIED` completion token
+- **Smart UI defaults**: Web projects now default to Tailwind CSS + shadcn/ui + motion-primitives when no styling is specified (framework-aware: shadcn-vue for Vue, shadcn-svelte for Svelte)
+- **`uiLibrary` field** in TechStack for explicit UI component library selection
+- **Rich spec generation**: Specs and AGENTS.md now include Tailwind v4 setup notes, CSS cascade layer warnings, and shadcn component setup instructions
+
+### Fixed
+- Design loop premature exit: `fix --design` now requires an explicit `DESIGN_VERIFIED` token after visual confirmation (prevents 1-iteration false completions)
+- Design loop stall detection: screenshot/viewport analysis no longer falsely triggers idle detection
+- Default design iterations increased from 5 to 7 for more thorough visual fixes
+
+### Changed
+- Completion instruction in agent preamble is now conditional: design mode uses a task-specific completion flow instead of the generic "All tasks completed"
+
+---
+
 ## [0.1.1-beta.16] - 2026-02-07

 ### Added
