You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: CHANGELOG.md
+6-7Lines changed: 6 additions & 7 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,13 +5,12 @@ All notable changes to OpenReels will be documented in this file.
5
5
## [0.14.0] - 2026-04-09
6
6
7
7
### Added
8
-
-**CaptionWrapper abstraction** that owns timing, word-state computation, spring physics, and chunk entrance animation. Style components are now thin renderers (~25-45 LOC each).
9
-
-**Spring-animated captions**: words smoothly scale up when spoken via Remotion `spring()` with per-style physics configs (punchy, elegant, snappy, smooth).
10
-
-**Three-state word rendering**: unspoken (dim), active (bright, spring-animated), spoken (settled). The currently-spoken word is visually distinct from already-spoken words.
11
-
-**BoxHighlight** caption style (7th style): spring-animated background rectangle tracks the active word.
12
-
-**KaraokeSweep** gradient wipe: accent-colored fill progresses left-to-right during active word.
13
-
-**ColorHighlight** spotlight effect: only the active word gets a background (bouncing spotlight).
14
-
-**Visual snapshot script** (`npx tsx src/remotion/captions/visual-snapshot.ts`) for manual QA of all 7 styles.
8
+
-**Spring-animated captions**: words now smoothly scale up when spoken, with per-style physics (punchy, elegant, snappy, smooth). Captions feel alive instead of toggling on/off.
9
+
-**Three-state word rendering**: each word transitions through unspoken (dim), active (bright, animated), and spoken (settled). The currently-spoken word is always visually distinct.
10
+
-**BoxHighlight** caption style (7th style): a spring-animated background rectangle tracks the active word across the chunk.
11
+
-**KaraokeSweep** gradient wipe: an accent-colored fill sweeps left-to-right behind each word as it's spoken.
12
+
-**ColorHighlight** spotlight effect: only the active word gets a background highlight, creating a bouncing spotlight.
13
+
-**Visual snapshot script** for QA testing all 7 caption styles (`npx tsx src/remotion/captions/visual-snapshot.ts`).
15
14
16
15
### Changed
17
16
- Caption accent colors now use the archetype's `colorPalette.accent` instead of hardcoded green/red.
Copy file name to clipboardExpand all lines: README.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -70,7 +70,7 @@ Give it a topic. It handles everything:
70
70
|**Voiceover**| Generates TTS audio with word-level timestamps for karaoke-style captions |
71
71
|**Visuals**| AI images (Gemini, DALL-E), AI video clips (Google Veo, fal.ai Kling), and vision-verified stock footage that rejects bad matches and retries automatically |
72
72
|**Music**| AI-generated background score via Google Lyria 3 Pro, scene-synced to match the video's emotional arc. Or pick from 25 bundled royalty-free tracks |
73
-
|**Captions**|Styled, animated captions with proper Unicode, CJK, and RTL support across 6 caption styles|
73
+
|**Captions**|Spring-animated 3-state captions with 7 distinct styles, per-archetype theming, and word-level karaoke highlighting|
74
74
|**Assembly**| Composites everything into a vertical MP4 via Remotion with crossfade, slide, wipe, and flip transitions |
75
75
|**Critique**| AI critic scores the output. If quality is below threshold, the pipeline re-runs |
description: The 6 caption styles available in OpenReels, how they render, and which archetypes use them.
3
+
description: The 7 caption styles available in OpenReels, how they render, and which archetypes use them.
4
4
---
5
5
6
-
OpenReels includes 6 caption styles that render word-by-word captions synchronized to the voiceover using TTS word-level timestamps. Captions are rendered as a single layer on top of the full video composition -- not per-scene -- using absolute timestamps from the complete voiceover track.
6
+
OpenReels includes 7 caption styles with spring-animated word-level highlighting synchronized to the voiceover. Captions render as a single layer on top of the full video composition using absolute timestamps from the complete voiceover track.
7
7
8
8
## How captions work
9
9
10
-
All caption styles share the same rendering approach:
10
+
All caption styles share a common rendering pipeline via the `CaptionWrapper` abstraction:
11
11
12
12
1. The TTS provider generates voiceover audio with per-word start/end timestamps
13
-
2. At render time, `useCurrentFrame()` converts the current frame to a time position
14
-
3. The `getWordChunk` utility selects a window of 5 words surrounding the current playback position
15
-
4. Each word in the chunk is styled differently based on whether it has been spoken yet (`currentTime >= word.start`)
13
+
2.`CaptionWrapper` converts the current frame to a time position and selects a chunk of words
14
+
3. Each word is classified into one of three states: **unspoken** (upcoming), **active** (currently being spoken), or **spoken** (already said)
15
+
4. A Remotion `spring()` animation drives smooth transitions as words activate
16
+
5. A 6-frame fade-in animates each new chunk's entrance
16
17
17
-
Words are displayed in fixed-size chunks that advance progressively, creating a karaoke-style reading experience. The chunk stays on screen until all 5 words have been spoken, then the next chunk appears.
18
+
### Three-state word rendering
18
19
19
-
All caption components position themselves at the bottom of the frame with 18% padding, keeping them in the caption-safe zone for mobile viewing.
20
+
Unlike simple on/off highlighting, each word passes through three visual states:
20
21
21
-
## The 6 styles
22
+
-**Unspoken**: dimmed, smaller font, reduced opacity. The viewer can read ahead.
23
+
-**Active**: bright, spring-animated scale-up, full opacity. The currently-spoken word stands out.
24
+
-**Spoken**: settled, slightly dimmer than active but brighter than unspoken. Already-said words stay legible.
25
+
26
+
The active word is the one where `currentTime >= word.start && currentTime < word.end`. This creates a "bouncing ball" karaoke effect that guides the viewer's eye.
27
+
28
+
### Archetype-aware tuning
29
+
30
+
Each archetype can configure:
31
+
32
+
-**`captionChunkSize`**: how many words appear at once (default: 5). Fast archetypes use 6, cinematic use 4.
33
+
-**`captionLingerS`**: how long a chunk stays visible after its last word is spoken (default: 0.3s). Cinematic archetypes linger for 0.5s, fast archetypes for 0.15s.
34
+
-**Accent color**: KaraokeSweep, ColorHighlight, and BoxHighlight use the archetype's `colorPalette.accent` instead of hardcoded colors.
35
+
36
+
Chunks advance immediately when the next chunk's first word starts speaking, even if the current chunk is still within its linger window. This prevents words from being hidden during the transition.
37
+
38
+
### Per-style spring physics
39
+
40
+
Each style has its own spring configuration that controls how the word activation animation feels:
41
+
42
+
| Style | Spring feel | Damping | Stiffness |
43
+
|-------|------------|---------|-----------|
44
+
|`bold_outline`| Punchy | 15 | 250 |
45
+
|`clean`| Balanced | 12 | 200 |
46
+
|`gradient_rise`| Elegant | 8 | 150 |
47
+
|`box_highlight`| Smooth | 10 | 180 |
48
+
|`karaoke_sweep`| Precise | 14 | 220 |
49
+
|`color_highlight`| Balanced | 12 | 200 |
50
+
|`block_impact`| Snappy | 18 | 300 |
51
+
52
+
## The 7 styles
22
53
23
54
### `clean`
24
55
25
-
**Font:** Inter (400/500/700) | **Used by:**`cinematic_documentary`, `moody_cinematic`, `studio_realism`, `warm_narrative`
56
+
**Font:** Inter (400/500/700) | **Spring:**Balanced
26
57
27
-
Minimal and legible. Spoken words appear at 56px bold white, unspoken words at 44px medium with 50% opacity. No background, just a subtle text shadow for readability over any visual. The least visually intrusive style.
58
+
Minimal and legible. Active words spring from 44px to 52px with a weight jump to bold. Spoken words settle to 75% white. Unspoken words sit at 40% opacity. No background, just a subtle text shadow. The least visually intrusive style.
High-impact uppercase text with thick black stroke outlines. Spoken words scale up to 72px extra-bold with a 3px stroke. Unspoken words sit at 48px bold with a 2px stroke and 50% opacity. Strong text shadow adds depth. Designed to be readable over busy, colorful visuals.
64
+
High-impact uppercase text with thick black stroke outlines. Active words spring from 48px to 56px with stroke width growing from 2px to 3px. Spoken words settle at 85% opacity. Designed to be readable over busy, colorful visuals.
Uppercase text where the currently spoken word gets a red background highlight (`#E53E3E`) with rounded corners and padding. Unspoken words have no background and sit at 50% opacity. Creates a strong visual indicator of the current word without affecting the overall text layout.
70
+
A gradient wipe fills left-to-right behind the active word, creating a progress-bar effect. Once spoken, words show a solid accent-colored background. Unspoken words have no background. Uses the archetype's accent color.
Similar to `color_highlight` but with a green highlight (`#38A169`). Spoken words get a green background with rounded corners. Unspoken words are more transparent (40% opacity). The green creates a "sweep" effect as words are spoken in sequence.
76
+
Spotlight effect: only the **active** word gets an accent-colored background. Spoken words lose the highlight entirely, creating a bouncing spotlight that jumps word-to-word. Uses the archetype's accent color. Visually distinct from KaraokeSweep.
The most visually distinctive style. Spoken words render with a purple-to-red gradient (`#9F7AEA` to `#E53E3E`) applied via `background-clip: text`, creating colorful gradient text. A purple glow filter adds a soft aura. Spoken words scale up to 64px. Unspoken words appear in plain white at 45% opacity. Uses a serif font (Playfair Display) for an elegant feel.
82
+
Spoken and active words render with a purple-to-red gradient (`#9F7AEA` to `#E53E3E`) applied via `background-clip: text`. The active word gets a purple glow drop-shadow. Active words spring from 48px to 56px. Uses a serif font for an elegant feel.
Words render inside a semi-transparent black container (`rgba(0,0,0,0.85)`) with rounded corners. Spoken words appear in full white, unspoken at 35% opacity. Uppercase with letter spacing. The dark background block ensures readability over any visual, making this the most legible style at the cost of covering more of the frame.
88
+
Words render inside a semi-transparent black container with rounded corners. Active words flash bright white with a spring from 48px to 54px. Spoken words settle to 70% white. Uppercase with letter spacing. The dark background ensures readability over any visual.
58
89
59
-
## Archetype-to-caption mapping
90
+
### `box_highlight`
91
+
92
+
**Font:** Montserrat (700) | **Spring:** Smooth
60
93
61
-
The caption style is set by the archetype's `captionStyle` field. Here is the full mapping:
94
+
The active word gets a spring-animated accent-colored background with expanding padding. Spoken words show a subtle 8% white background tint. The "box" effect tracks the active word as it moves through the chunk. Uses the archetype's accent color.
0 commit comments