| 1 | +# Image Analysis Reference |
| 2 | + |
| 3 | +Advanced prompt patterns for `--analyze` mode. Only read this file when the user needs structured, comparative, or targeted image analysis beyond a simple description. |
| 4 | + |
| 5 | +## Analysis Prompt Patterns |
| 6 | + |
| 7 | +### Plain Text Description |
| 8 | + |
| 9 | +Default behavior: no custom prompt is needed. The built-in default prompt covers subject, style, colors, composition, mood, and visible text. |
| 10 | + |
| 11 | +```bash |
| 12 | +uv run python "${CLAUDE_SKILL_DIR}/scripts/generate-image.py" --analyze -r "image.png" |
| 13 | +``` |
| 14 | + |
| 15 | +### JSON Structured Output |
| 16 | + |
| 17 | +Ask the model to return structured data by passing a prompt like this with `-p`: |
| 18 | + |
| 19 | +``` |
| 20 | +Analyze this image and return a JSON object with these fields: |
| 21 | +- image_type: the type/medium of the image (photo, illustration, screenshot, etc.) |
| 22 | +- subjects: array of objects with {name, position, description} |
| 23 | +- colors: dominant color palette as hex codes |
| 24 | +- text_content: any visible text in the image |
| 25 | +- style: artistic style or visual treatment |
| 26 | +- mood: emotional tone |
| 27 | +- composition: layout and framing description |
| 28 | +``` |
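
A minimal sketch of how a caller might turn such a reply into a Python dict. It assumes the reply text has already been read from the `analysis` field, and that the model may wrap the JSON in a Markdown code fence or add a sentence around it; neither behavior is guaranteed by the script. `extract_json` is a hypothetical helper name, not part of `generate-image.py`.

```python
import json
import re

# Three backticks, built programmatically so this example nests cleanly in Markdown.
FENCE = "`" * 3


def extract_json(analysis_text: str) -> dict:
    """Parse a JSON object out of a model reply (hypothetical helper).

    Tolerates an optional Markdown code fence and surrounding prose.
    """
    # Prefer the contents of a fenced block if the reply contains one.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, analysis_text, re.DOTALL)
    candidate = fenced.group(1) if fenced else analysis_text

    # Fall back to the outermost {...} span to skip any leading or trailing prose.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in analysis text")
    return json.loads(candidate[start : end + 1])
```

If the reply is not valid JSON, `json.loads` raises `json.JSONDecodeError` (a subclass of `ValueError`), so a caller can catch `ValueError` for both failure modes.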
| 29 | + |
| 30 | +### Plain Text + JSON Combined |
| 31 | + |
| 32 | +``` |
| 33 | +Describe this image in two sections: |
| 34 | +1. PLAIN TEXT: A natural-language paragraph describing what the image shows |
| 35 | +2. JSON: A structured JSON object with fields: image_type, subjects, colors, style, mood, composition, text_content |
| 36 | +``` |
| 37 | + |
| 38 | +### Comparison (Multiple Images) |
| 39 | + |
| 40 | +Use multiple `-r` flags to compare images: |
| 41 | + |
| 42 | +```bash |
| 43 | +uv run python "${CLAUDE_SKILL_DIR}/scripts/generate-image.py" \ |
| 44 | + --analyze -r "v1.png" -r "v2.png" \ |
| 45 | + -p "Compare these two images. Describe the differences in composition, color, and style. Which is more suitable for a professional website hero banner?" |
| 46 | +``` |
| 47 | + |
| 48 | +### Targeted Analysis |
| 49 | + |
| 50 | +Focus the model on specific aspects: |
| 51 | + |
| 52 | +| Focus | Prompt Pattern | |
| 53 | +|-------|---------------| |
| 54 | +| Accessibility | "Evaluate this UI screenshot for accessibility: contrast ratios, text readability, color-blind friendliness" | |
| 55 | +| Brand consistency | "Does this image match a modern tech brand aesthetic? Evaluate color palette, typography, and visual style" | |
| 56 | +| Text extraction | "Extract all visible text from this image, preserving layout and hierarchy" | |
| 57 | +| Technical specs | "Describe the technical properties: estimated resolution, aspect ratio, color space, compression artifacts" | |
| 58 | +| Content moderation | "Describe the content of this image objectively. Flag any potentially sensitive content" | |
| 59 | +| UI/UX review | "Analyze this UI screenshot: layout, visual hierarchy, spacing, typography, and potential usability issues" | |
| 60 | + |
| 61 | +### Batch Analysis Workflow |
| 62 | + |
| 63 | +To analyze multiple images individually rather than comparing them, loop over the files and run one analysis per image: |
| 64 | + |
| 65 | +```bash |
| 66 | +for img in screenshots/*.png; do |
| 67 | + echo "=== $img ===" >&2 |
| 68 | + uv run python "${CLAUDE_SKILL_DIR}/scripts/generate-image.py" \ |
| 69 | + --analyze -r "$img" -p "Describe this screenshot in one paragraph" |
| 70 | +done |
| 71 | +``` |
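
The same loop can be sketched in Python when the results need to be collected rather than printed. This is an illustration under stated assumptions: it relies on the `analysis` field described under Output Handling, and `build_command`, `run_cli`, and `analyze_batch` are hypothetical names, not part of the skill. The `run` parameter is injectable so the logic can be exercised without calling a model.

```python
import json
import os
import subprocess
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def build_command(image: str, prompt: str) -> List[str]:
    """Build the CLI invocation for one image (mirrors the bash loop)."""
    # Assumption: CLAUDE_SKILL_DIR is set in the environment, as in the bash examples.
    skill_dir = os.environ.get("CLAUDE_SKILL_DIR", ".")
    script = str(Path(skill_dir) / "scripts" / "generate-image.py")
    return ["uv", "run", "python", script, "--analyze", "-r", image, "-p", prompt]


def run_cli(cmd: List[str]) -> str:
    """Run the command and return stdout; stderr is left attached to the terminal."""
    done = subprocess.run(cmd, check=True, stdout=subprocess.PIPE, text=True)
    return done.stdout


def analyze_batch(
    images: Iterable[str],
    prompt: str,
    run: Callable[[List[str]], str] = run_cli,
) -> Dict[str, str]:
    """Analyze each image separately and map image path -> analysis text."""
    results: Dict[str, str] = {}
    for image in images:
        payload = json.loads(run(build_command(str(image), prompt)))
        results[str(image)] = payload["analysis"]
    return results
```

A call such as `analyze_batch(sorted(map(str, Path("screenshots").glob("*.png"))), "Describe this screenshot in one paragraph")` would then return one entry per file.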
| 72 | + |
| 73 | +## Model Recommendations |
| 74 | + |
| 75 | +| Model | Best For | |
| 76 | +|-------|----------| |
| 77 | +| `gemini` (default) | General analysis, fast and cost-effective. Good at text extraction and structured output | |
| 78 | +| `gpt5` | Nuanced descriptions, creative interpretation, detailed comparisons | |
| 79 | + |
| 80 | +## Output Handling |
| 81 | + |
| 82 | +With `--analyze`, the script writes a JSON object to stdout, and its `analysis` field contains the model's text response. To extract just the analysis text in a script: |
| 83 | + |
| 84 | +```bash |
| 85 | +uv run python "${CLAUDE_SKILL_DIR}/scripts/generate-image.py" \ |
| 86 | + --analyze -r "image.png" | python -c "import sys,json; print(json.load(sys.stdin)['analysis'])" |
| 87 | +``` |
| 88 | + |
| 89 | +Status messages go to stderr, so piping stdout gives clean JSON. |