---
description: Extract frames from video around timestamp and select best illustration
arguments:
  - name: video
    description: Path to video file
    required: true
  - name: timestamp
    description: Timestamp in MM:SS or HH:MM:SS format (estimated)
    required: true
  - name: name
    description: "Illustration name (used for directory: _temp/frames/{name}/)"
    required: true
  - name: description
    description: What content we're looking for (helps selection)
    required: true
  - name: context
    description: Article text around the illustration (for verification)
    required: true
---

# Extract Illustration Frames from Video

## Dependencies

- **ffmpeg** - For extracting keyframes from video
- **ImageMagick** (convert command) - For cropping and JPEG conversion

Install if needed:
```bash
# Windows (with Chocolatey)
choco install ffmpeg imagemagick

# macOS (with Homebrew)
brew install ffmpeg imagemagick

# Linux (Ubuntu/Debian)
sudo apt install ffmpeg imagemagick
```

## Process Overview

Extract keyframes in a narrow window around the timestamp, remove duplicates, then select the best frame for the article illustration.

## Steps

### 0. Set Up Aliases

```bash
# Arguments: video=$1, timestamp=$2, name=$3, description=$4, context=$5
video="$1"
timestamp="$2"
name="$3"
description="$4"
context="$5"
```

### 1. Create Output Directories (Parallel-Safe)

```bash
# Create dedicated directory for this illustration extraction
mkdir -p "_temp/frames/$name"
mkdir -p _temp/illustrations
```

### 2. Extract Frames Using FFmpeg

**First, try keyframes (I-frames, which encoders often place at scene changes):**

```bash
# Start 5 seconds before the timestamp (clamped at 0), then extract keyframes
# from the following 10 seconds, i.e. the ±5 second window around $timestamp
secs=$(echo "$timestamp" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; print s }')
start=$(( secs > 5 ? secs - 5 : 0 ))
ffmpeg -ss "$start" -i "$video" -t 10 \
  -vf "select=eq(pict_type\,I)" -vsync 0 \
  "_temp/frames/$name/keyframe_%04d.png"
```

This produces files like `keyframe_0001.png`, numbered in order of appearance within the window. The `%04d` counter is sequential and does not encode the video time.

**Check if we got any keyframes:**
```bash
ls "_temp/frames/$name/" | wc -l
```
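
The same check can be made from Python if the later steps are scripted. This is a minimal sketch: `needs_fallback` is a hypothetical helper name, and it counts only `.png` files so that stray files in the directory do not mask an empty extraction.

```python
import tempfile
from pathlib import Path


def needs_fallback(frames_dir) -> bool:
    """Return True when the directory holds no extracted PNG frames."""
    return not any(Path(frames_dir).glob("*.png"))


# Demonstration on a throwaway directory (not the real _temp/frames/<name>)
with tempfile.TemporaryDirectory() as tmp:
    empty = needs_fallback(tmp)  # no frames yet
    (Path(tmp) / "keyframe_0001.png").touch()
    populated = needs_fallback(tmp)  # one frame present

print(empty, populated)
```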

**If 0 keyframes found, extract 7 specific frames at exact offsets:**

```bash
# Extract frames at: -2, -1, -0.5, 0, +0.5, +1, +2 seconds
# Example: for timestamp 10:00, extract at 9:58, 9:59, 9:59.5, 10:00, 10:00.5, 10:01, 10:02

# Calculate timestamps based on $timestamp - adjust these values for your timestamp
# For $timestamp = 10:00:
ffmpeg -ss 00:09:58 -i "$video" -vframes 1 "_temp/frames/$name/00_09_58.000.png"
ffmpeg -ss 00:09:59 -i "$video" -vframes 1 "_temp/frames/$name/00_09_59.000.png"
ffmpeg -ss 00:09:59.5 -i "$video" -vframes 1 "_temp/frames/$name/00_09_59.500.png"
ffmpeg -ss 00:10:00 -i "$video" -vframes 1 "_temp/frames/$name/00_10_00.000.png"
ffmpeg -ss 00:10:00.5 -i "$video" -vframes 1 "_temp/frames/$name/00_10_00.500.png"
ffmpeg -ss 00:10:01 -i "$video" -vframes 1 "_temp/frames/$name/00_10_01.000.png"
ffmpeg -ss 00:10:02 -i "$video" -vframes 1 "_temp/frames/$name/00_10_02.000.png"
```

### 3. Remove Duplicate Frames (keyframes only)

First pass: Remove exact duplicates (same file size and same content hash):

```python
import hashlib
import sys
from pathlib import Path

name = sys.argv[1]  # Illustration name, passed as the first command-line argument
frames_dir = Path("_temp/frames") / name
seen = {}  # (size, sha256) -> first file with that content
for f in sorted(frames_dir.glob("*.png")):
    # Equal size alone does not prove equal content, so hash the bytes as well
    key = (f.stat().st_size, hashlib.sha256(f.read_bytes()).hexdigest())
    if key in seen:
        f.unlink()  # Remove byte-identical duplicate
    else:
        seen[key] = f

print(f"After dedup: {len(seen)} unique frames")
```

Second pass: Remove visually similar frames (optional, if still too many):

```python
import sys
from pathlib import Path

import numpy as np
from PIL import Image

name = sys.argv[1]  # Illustration name, passed as the first command-line argument
frames = sorted((Path("_temp/frames") / name).glob("*.png"))
to_remove = set()
last_kept = None  # Pixel data of the most recent frame we decided to keep

for frame in frames:
    img = np.array(Image.open(frame).convert("RGB"), dtype=float)
    if last_kept is not None:
        # Simple difference: mean absolute error against the last kept frame
        diff = np.abs(img - last_kept).mean()
        # Threshold: if difference < 5% of pixel range, consider duplicate
        if diff < 12.75:  # 255 * 0.05
            to_remove.add(frame)
            continue
    last_kept = img

for f in to_remove:
    f.unlink()

print(f"After visual dedup: {len(frames) - len(to_remove)} frames remaining")
```

### 4. Review Remaining Frames and Verify Match

Read each frame and evaluate based on:

**Selection Criteria:**
- **Clarity**: Text is readable, not motion-blurred
- **Completeness**: Full content visible (no cut-off elements)
- **Relevance**: Shows exactly what the description asks for
- **Visual Quality**: Good contrast, no visual artifacts
- **UI State**: Buttons/menus in clear, useful state

**Verification Step (CRITICAL - MUST use analyze_image tool):**

The `Read` tool alone CANNOT reliably verify image content. You MUST use the `analyze_image` tool with a detailed prompt.

**What is analyze_image?**
- A tool that analyzes images and returns detailed text descriptions
- Can read text, identify UI elements, describe layouts, and understand content
- Takes two parameters:
  - `imageSource`: URL of the image to analyze
  - `prompt`: What question to ask about the image

**Verification Process:**

1. **Use analyze_image with a dynamic prompt based on description and context:**

```
We expect this image to show: $description

Context from article: $context

Please analyze:
1. What does this image actually show? (describe type of page, main text, content)
2. Does this match what we expect? If not, what DOES it show?
3. For cropping: any browser chrome at top (how many pixels to remove)? Sidebars to crop?
```

2. **Compare the analyze_image output:**
   - The prompt tells the tool what we EXPECT (from $description)
   - The tool tells us what it ACTUALLY sees
   - Compare: Does the actual content match the expected content?

3. **Decision based on comparison:**
   - If the content matches → Proceed to save
   - If the content does NOT match → Wrong timestamp. Search transcript for keywords to find correct time.
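
When verification fails, the transcript search from step 3 can be sketched as follows. This is an illustration under assumptions: the original does not specify a transcript format, so the transcript is taken to be a list of `(start_seconds, text)` pairs, and `find_candidate_timestamps` is a hypothetical helper that ranks segments by how many keywords from the description they contain. The sample transcript is made up.

```python
import re


def find_candidate_timestamps(transcript, description, top_n=3):
    """Rank transcript segments by keyword overlap with the description.

    transcript: list of (start_seconds, text) pairs (assumed format).
    Returns up to top_n (start_seconds, score) pairs, best first.
    """
    # Keywords: lowercase words of 4+ letters, a crude stop-word filter
    keywords = set(re.findall(r"[a-z]{4,}", description.lower()))
    scored = []
    for start, text in transcript:
        words = set(re.findall(r"[a-z]{4,}", text.lower()))
        score = len(keywords & words)
        if score > 0:
            scored.append((start, score))
    # Highest score first; the earlier segment wins ties
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


transcript = [
    (575, "Let's look at the questions in Slido"),
    (742, "To get the certificate you complete the final project"),
    (760, "and you participate in peer reviews of other projects"),
]
hits = find_candidate_timestamps(
    transcript,
    "Certificate example showing requirements: final project, peer reviews",
)
print(hits)
```

The best-scoring start time becomes the new `timestamp` for another extraction pass, which must then go through the same `analyze_image` verification.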

**Example verification using analyze_image:**

Prompt expects: "Certificate example showing requirements: complete final project successfully, participate in peer reviews"

analyze_image result: "This is a Q&A interface from Slido... showing anonymous user questions about getting DE jobs without degrees"

Verdict: MISMATCH - The image shows Slido Q&A, NOT certificate requirements. Timestamp is wrong.

**Common verification failures (detected by analyze_image):**
- Article says "Docker & Infrastructure" → Image shows "Course logistics"
- Article says "Certificate example" → Image shows "Slido Q&A interface"
- Article says "YouTube channel" → Image shows only "LinkedIn"
- Article says "Project pipeline" → Image shows "GitHub repo/commits"

**Timestamp Proximity Rule:**
- We ONLY look within ±5 seconds of the target timestamp, never more
- Frames from the fixed-offset fallback are named after their ACTUAL video time (e.g., `00_09_58.000.png`, `00_10_00.000.png`, `00_10_02.000.png`)
- This makes the exact video time each of those frames represents obvious
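
Because frame names of the form `HH_MM_SS.mmm.png` encode the video time, the ±5 second rule can be checked mechanically. This is a minimal sketch under that naming assumption; `frame_seconds` and `within_window` are hypothetical helper names, and Python 3.9+ is assumed for `str.removesuffix`.

```python
from pathlib import PurePath


def frame_seconds(filename: str) -> float:
    """Parse 'HH_MM_SS.mmm.png' (or 'MM_SS.png') into seconds."""
    stem = PurePath(filename).name.removesuffix(".png")
    seconds = 0.0
    for part in stem.split("_"):
        seconds = seconds * 60 + float(part)
    return seconds


def within_window(filename: str, target_seconds: float, window: float = 5.0) -> bool:
    """True when the frame lies within ±window seconds of the target."""
    return abs(frame_seconds(filename) - target_seconds) <= window


target = 10 * 60  # timestamp 10:00 expressed in seconds
for frame in ("00_09_58.000.png", "00_10_02.000.png", "00_10_07.000.png"):
    print(frame, within_window(frame, target))
```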

### 5. Select Best Frame

After reviewing all frames:
1. Identify the best frame
2. Explain why it was chosen
3. Note if cropping is needed

### 6. Crop if Necessary (using ImageMagick)

If the best frame needs cropping:

```bash
# Replace keyframe_XXXX with the selected frame's filename.
# First crop to a temp filename to assess (+repage resets the canvas offset left by -crop)
convert "_temp/frames/$name/keyframe_XXXX.png" -crop {width}x{height}+{x}+{y} +repage "_temp/frames/$name/keyframe_XXXX-cropped.png"

# Read and assess the cropped version, then finalize
convert "_temp/frames/$name/keyframe_XXXX-cropped.png" -quality 85 "_temp/illustrations/$name.jpg"
```

Common crop patterns:
- Browser chrome removal: `-crop 1280x650+0+70` (remove ~70px from top)
- Sidebars: Adjust width/x-offset to crop left or right
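
The geometry strings in these patterns follow from the frame size and the margins to trim. This is a small sketch of that arithmetic, assuming a 1280x720 source frame (which the `1280x650+0+70` example implies); `crop_geometry` is a hypothetical helper, not an ImageMagick API.

```python
def crop_geometry(width, height, top=0, bottom=0, left=0, right=0):
    """Return an ImageMagick -crop geometry 'WxH+X+Y' after trimming margins."""
    new_width = width - left - right
    new_height = height - top - bottom
    if new_width <= 0 or new_height <= 0:
        raise ValueError("margins remove the whole image")
    # X and Y are the offsets of the kept region's top-left corner
    return f"{new_width}x{new_height}+{left}+{top}"


chrome_only = crop_geometry(1280, 720, top=70)
chrome_and_sidebar = crop_geometry(1280, 720, top=70, left=240)
print(chrome_only)         # 1280x650+0+70
print(chrome_and_sidebar)  # 1040x650+240+70
```

The 240-pixel sidebar width is an arbitrary example value; take the real figure from the `analyze_image` answer to the cropping question.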

### 7. Clean Up

```bash
rm -rf "_temp/frames/$name"
```

## Selection Guidelines by Content Type

| Content Type | What to Look For |
|--------------|------------------|
| **UI Screenshots** | No loading spinners, fully populated data, clear labels |
| **Diagrams** | Complete diagram, no cut-off edges, clear text |
| **Code/Terminal** | Complete commands visible, no partial lines |
| **People** | Faces visible, not mid-blink, natural expression |
| **Data Visualizations** | Axes/labels visible, clear data points |
| **Websites/Pages** | Fully loaded, no broken images, header visible |

## Output Format

Save final illustration as:
- Filename: `[descriptive-name].jpg`
- Location: `_temp/illustrations/`
- Format: JPEG at quality 85 (typically around 65% smaller than the PNG, depending on content)
- Reasoning: Document why this frame was chosen