Hyperframe Launch Video — Storyboard v3

Format: 1920×1080 | 60 seconds max
Audio: ElevenLabs voiceover + SFX design + underscore
VO direction: Middle-aged male, calm, confident delivery. Apple keynote presenter register: Craig Federighi energy without the jokes. Economy of words. Silence between sentences is a feature. Every word earns its place.
Built with: Hyperframe (the video is its own proof)


Color & Style Direction

Do not prescribe a palette. Each composition should use the Editor Agent's own style system and skill guidance to select colors, fonts, and textures appropriate to the mood direction described per beat. The storyboard specifies aesthetic mood, energy level, and reference direction — the agent handles execution.

Global guardrails:

  • No dark-mode-default. The home-base feel across the video is light, warm, and open. Dark moments are brief departures, not the norm.
  • Push color presence. Muted is fine, flat is not. Every beat should have at least one color that pulls your eye.
  • Typography should be distinctive per beat. No single font family across the whole video. Each aesthetic world gets its own typographic voice.
  • Motion should be visible and intentional. Err toward more movement than feels safe — subtle reads as static at 30fps.

Underscore Direction

Minimal electronic. A warm sustained pad already playing when the video starts — no silence-to-sound transition. Sits underneath everything, never competing with VO. Swells gently during the flex section (0:24–0:38), drops to near-nothing for the comparison beat, resolves on a final chord at close. Reference vibe: Tycho's ambient work, the softer end of Jon Hopkins. Not trailer music. Not corporate stock.


Script (~115 words, ~60s with Apple-cadence pauses)

Your AI agent already knows how to make videos.
It just needs the right format.

[pause]

This is Hyperframe. Open source. HTML in, video out.

A div is a keyframe. Data attributes are your timeline.
CSS is your look. GSAP is your animation engine.

Anything a browser can render becomes a frame in your video.

[pause]

CSS animations. GSAP. Lottie. Shaders. Three.js.

Drop in music, sound effects, footage — it all composes together.

[pause]

No new framework to learn. No thousands of lines of instructions.
Just HTML.

The agent writes it. The renderer captures every frame.
Deterministic. Identical output, every time.

[pause]

Give your agent the skill. Tell it what to make.
Watch it build.

Hyperframe. Go make something.

Beat-by-Beat


BEAT 1 — COLD OPEN: INFINITE CANVAS (0:00–0:05)

VO: "Your AI agent already knows how to make videos."

Concept: Camera is already moving when the video starts. No fade in. No preamble. We're mid-flight over a vast canvas — an infinite artboard — and scattered across it are dozens of living composition cards.

Camera: Slow smooth diagonal drift (top-left to bottom-right), slight rotation (2-3° over 5s). Calm, authoritative. GSAP-driven path, gentle power1.inOut ease. Drone-shot-over-a-city energy.
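The camera move above could be driven by a single eased progress value. The following is a minimal sketch, not Hyperframe's API: `cameraAt`, the travel distances, and the 2.5° total rotation are illustrative assumptions (the storyboard only fixes the 5 s duration, the 2-3° range, and the power1.inOut ease, modeled here as a quadratic ease-in-out).

```typescript
// Hypothetical camera pose for the Beat 1 drift. Only the 5 s duration, the
// 2-3 degree rotation range and the power1.inOut ease come from the storyboard;
// every other constant is an assumption chosen for illustration.
interface CameraPose {
  x: number;        // px offset from the starting position
  y: number;        // px offset from the starting position
  rotation: number; // degrees
}

const DRIFT_SECONDS = 5;
const DRIFT_X = 1200;       // assumed horizontal travel (left to right)
const DRIFT_Y = 700;        // assumed vertical travel (top to bottom)
const DRIFT_ROTATION = 2.5; // midpoint of the 2-3 degree range

// Quadratic ease-in-out, the curve GSAP names "power1.inOut".
function power1InOut(t: number): number {
  return t < 0.5 ? 2 * t * t : 1 - Math.pow(-2 * t + 2, 2) / 2;
}

// Returns the camera pose at a given time, clamped to the beat's duration.
function cameraAt(seconds: number): CameraPose {
  const t = Math.min(Math.max(seconds / DRIFT_SECONDS, 0), 1);
  const p = power1InOut(t);
  return { x: DRIFT_X * p, y: DRIFT_Y * p, rotation: DRIFT_ROTATION * p };
}
```

One caveat worth checking in production: an in-out ease starts from rest, so to honor "already moving when the video starts" the tween would likely need to begin before the first frame, or use a different ease.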

The canvas: Light, open ground with a faint dot grid or subtle texture for scale. Not pure white — the agent should give it warmth and materiality. Feels like an infinite desk, not a void.

The cards: Rounded rectangles scattered at organic angles (±5-15° rotation). Soft shadows. Thin borders. They feel like physical objects on a surface, not UI windows. Some overlap slightly. Varying scales — close cards partially cropped by viewport, mid-distance cards crisp, far cards small and receding.
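Because the renderer is meant to be deterministic, the "organic" scatter probably should not rely on `Math.random()`. A minimal sketch of a seeded layout follows; `scatterCards`, the canvas size, and the scale range are hypothetical, and only the ±5-15° rotation range comes from the description above.

```typescript
interface CardPlacement {
  x: number;        // canvas-space px
  y: number;        // canvas-space px
  rotation: number; // degrees, magnitude within 5-15
  scale: number;    // larger reads as closer to the camera
}

// Small seeded PRNG (mulberry32): the same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Scatters cards across an assumed canvas; width, height and the scale range
// are placeholders, not values from the storyboard.
function scatterCards(count: number, seed: number, width = 6000, height = 4000): CardPlacement[] {
  const rand = mulberry32(seed);
  const placements: CardPlacement[] = [];
  for (let i = 0; i < count; i++) {
    const magnitude = 5 + rand() * 10;  // 5-15 degrees
    const sign = rand() < 0.5 ? -1 : 1; // tilt left or right
    placements.push({
      x: rand() * width,
      y: rand() * height,
      rotation: sign * magnitude,
      scale: 0.4 + rand() * 1.2,        // assumed: 0.4 (far) to 1.6 (close)
    });
  }
  return placements;
}
```

Calling `scatterCards(24, 7)` twice returns the same 24 placements, so repeated renders would see an identical canvas.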

Each card contains a DIFFERENT running animation — these are live previews of what Hyperframe can produce:

  • Kinetic typography composition
  • Gradient morph / color animation
  • Data visualization (bars rising, chart drawing)
  • Particle system in formation
  • Logo animation assembling itself
  • Lottie-style vector character loop
  • Shader / generative noise field
  • Video composite (static person frame with lower third + captions overlaid)
  • SVG line illustration drawing itself
  • 3D object rotating softly

Depth cues: Subtle depth-of-field. Close cards slightly blurred. Focal sweet-spot in mid-distance. Far cards smaller and slightly desaturated.

Text: None. The visual works alone. VO lands over this — the viewer's brain connects "AI agent" + "videos" while scanning all these living examples.

SFX: Warm ambient pad already playing. Faint textured hum implying activity. Not literal audio from each card; more like overhearing life from a distance.


BEAT 2 — THE FORMAT (0:05–0:09)

VO: "It just needs the right format. [pause] This is Hyperframe."

Camera slows. Drifts toward one specific card — a clean, elegant composition. That card scales up, filling the frame. Other cards drift past viewport edges.

On "This is Hyperframe" — the card border dissolves. Its content expands to fill the viewport. Seamless transition from "looking at a card on a canvas" to "being inside a composition." The canvas is gone.

What we've entered: a clean, light typographic space. "Hyperframe" appears center-screen in confident sans-serif. Not the final logo lockup — just the name. Unhurried.

SFX: Ambient hum resolves to a cleaner harmonic as we settle. Subtle shift.


BEAT 3 — THE PROPOSITION (0:09–0:12)

VO: "Open source. HTML in, video out."

Mood: Light, typographic, minimal. The simplicity IS the point.

Visual: Below the Hyperframe name, "Open source" fades in — smaller, lighter weight. Then a minimal diagram draws itself: an HTML bracket < > on the left, a gentle arrow extending rightward, a play button on the right. The stroke has a hand-drawn reveal quality — left to right, like someone sketching it. Not mechanical.

SFX: Barely-there pencil-on-paper texture as the line draws. One soft tonal ping when the play button completes.


BEAT 4 — THE ANATOMY (0:12–0:20)

VO: "A div is a keyframe. Data attributes are your timeline. CSS is your look. GSAP is your animation engine."

Mood: Warm workspace. The clean space gains faint structure — a subtle grid or graph-paper quality. More "nice notebook" than "technical blueprint."

Visual: Four sequential builds, each synced to its phrase:

"A div is a keyframe" — A rounded rectangle (the "div") fades in and slides into position on a horizontal timeline bar that draws itself beneath. Satisfying ease on the slide (power2.out).

"Data attributes are your timeline" — Small labels animate onto the rectangle in clean mono: start: 0 and duration: 5. The timeline bar fills with a gradient showing the time span. A second rectangle appears and docks at the 5-second mark. The timeline is populating itself.
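The docking behavior in this build can be expressed as simple arithmetic on start and duration. A minimal sketch, assuming each clip is a div carrying `data-start` (named later in this document) and `data-duration` (an assumed attribute name); `dockAfter`, `clipToHtml`, and the second clip's 3 s length are illustrative only.

```typescript
interface Clip {
  id: string;
  start: number;    // seconds
  duration: number; // seconds
}

// Places a new clip so it begins exactly where the previous one ends.
function dockAfter(previous: Clip, id: string, duration: number): Clip {
  return { id, start: previous.start + previous.duration, duration };
}

// Renders a clip as a div. data-duration is an assumed attribute name.
function clipToHtml(clip: Clip): string {
  return `<div id="${clip.id}" data-start="${clip.start}" data-duration="${clip.duration}"></div>`;
}

const firstClip: Clip = { id: "clip-1", start: 0, duration: 5 };
const secondClip = dockAfter(firstClip, "clip-2", 3); // 3 s is a placeholder

console.log(clipToHtml(firstClip));
console.log(clipToHtml(secondClip));
```

The second clip lands at the 5-second mark because its start is derived from the first clip's start plus duration rather than typed by hand.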

"CSS is your look" — The first rectangle transforms: background shifts to a richer color, corners round further, a shadow appears, the label type upgrades from mono to a proper sans-serif. CSS properties visually applying themselves. Wireframe → styled.

"GSAP is your animation engine" — An easing curve draws itself above the timeline: a smooth power2.out ease-out curve that starts fast and settles gently. The styled rectangle slides along the curve's path, demonstrating the ease in real time. Buttery motion. The curve stays visible, annotated.
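To draw the annotated curve, the ease can be sampled into an SVG path. The sketch below models power2.out as a cubic ease-out (fast start, gentle settle); `easeToSvgPath`, the 200×100 box, and the sample count are assumptions for illustration.

```typescript
// Cubic ease-out, the curve GSAP names "power2.out".
function power2Out(t: number): number {
  return 1 - Math.pow(1 - t, 3);
}

// Samples an ease into an SVG path "d" string. The y axis is flipped because
// SVG y grows downward, so progress 0 sits at the bottom of the box.
function easeToSvgPath(
  ease: (t: number) => number,
  width: number,
  height: number,
  samples = 32,
): string {
  const points: string[] = [];
  for (let i = 0; i <= samples; i++) {
    const t = i / samples;
    const x = (t * width).toFixed(2);
    const y = ((1 - ease(t)) * height).toFixed(2);
    points.push(`${i === 0 ? "M" : "L"}${x},${y}`);
  }
  return points.join(" ");
}

// Path data for a 200x100 curve diagram (dimensions are placeholders).
const curvePath = easeToSvgPath(power2Out, 200, 100);
```

The resulting string can be assigned to a `<path d="...">` element, and the stroke reveal can then be animated separately.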

By the end: a tiny but complete composition diagram. Timeline, clips, styling, easing. Attractive, not dry.

SFX: Soft tonal accents. A four-note ascending motif — one chime per concept, each landing on the downbeat of its phrase.


BEAT 5 — THE THESIS (0:20–0:24)

VO: "Anything a browser can render becomes a frame in your video."

Mood: Big statement. This sentence gets its own canvas. Clean, spacious, typographic.

Visual: The workspace dissolves to open space. Words appear as staggered kinetic typography:

"Anything a browser can render" — distinctive serif, gentle fade + rise (y: 24px → 0, opacity 0 → 1, 0.4s, power2.out).

Held beat — one second of stillness.

"becomes a frame in your video." — appears below, same treatment. As the final word lands, the entire text pulses once — a brief warm flash, subtle scale bump to 101% and back. The text is being "captured" as a frame. Not harsh. A soft shutter moment.

This is the setup for the flex section. Thesis stated. Now proven.

SFX: Silence under the first line. On the capture pulse — a soft, warm analog shutter click. Single.


BEAT 6 — THE FLEX (0:24–0:38)

VO: "CSS animations. GSAP. Lottie. Shaders. Three.js. Drop in music, sound effects, footage — it all composes together."

14 seconds. The centerpiece. Each capability gets its own complete visual world. HARD smash cuts between them: no dissolves, no transitions. The abrupt switches ARE the proof. Each world should feel like it was made by a different designer in a different decade.

Style instruction for all flex sub-compositions: The agent should use a DIFFERENT palette, typography, and visual language for each. No two should share a font or dominant color. Push variety as far as taste allows. These are not variations on a theme — they are different universes.


6A — "CSS animations." (0:24–0:26.5)

Mood direction: Geometric, rhythmic, precise. Pure CSS energy — transforms, transitions, clip-paths. Think Josef Albers or Bauhaus color studies, but moving. Light background, saturated shapes.

Visual: CSS-only choreography. Colored shapes (circles, rectangles) in synchronized rotation or orbit. A panel wipes open via clip-path reveal. A gradient that breathes through a color cycle. Shapes scale rhythmically. Everything is pure transforms and transitions. Mathematical but alive.

SFX: Rhythmic tick. Metronome energy.


6B — "GSAP." (0:26.5–0:29)

Mood direction: Kinetic typography. High energy. Visible easing. The motion itself is the content — you should FEEL the curves. Light or cream base, bold type, one strong accent color.

Visual: The word "GSAP" enters small, then SNAPS to fill the screen with a back.out overshoot. Letters separate, orbit, leave faint motion trails, then reform with staggered timing (0.08s/letter, power2.inOut). Secondary text elements — "timelines," "easing," "control" — cascade in with a waterfall stagger, building a typographic composition around the reformed word. Every movement has visible, intentional curve. The easing IS the show.
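With only 2.5 seconds for this beat, it helps to confirm the letter stagger fits the budget. A minimal sketch of the 0.08 s per-letter schedule; `staggerLetters`, `staggerEnd`, and the 0.5 s per-letter tween duration are assumptions, not values from the storyboard.

```typescript
interface LetterCue {
  letter: string;
  delay: number; // seconds after the reform begins
}

// Each letter starts `each` seconds after the previous one.
function staggerLetters(word: string, each: number): LetterCue[] {
  return Array.from(word).map((letter, index) => ({ letter, delay: index * each }));
}

// Time at which the last letter finishes, given a per-letter tween duration.
function staggerEnd(cues: LetterCue[], tweenDuration: number): number {
  const last = cues[cues.length - 1];
  return last ? last.delay + tweenDuration : 0;
}

const gsapCues = staggerLetters("GSAP", 0.08);
const reformEnds = staggerEnd(gsapCues, 0.5); // 0.5 s per letter is an assumption
```

Under these assumptions the reform completes well inside the beat, leaving room for the snap-in and the secondary text cascade.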

SFX: Percussive snap on the scale-up. Whooshes on orbits. Staccato taps as secondary text lands.


6C — "Lottie." (0:29–0:31)

Mood direction: Flat vector illustration. Playful. Smooth. Handmade feel. Thick strokes, rounded joins. Think Headspace or Duolingo character animation.

Visual: A Lottie-style animation — an abstract bird (or similar) assembling itself from simple geometric shapes. Parts drift in from offscreen, find positions, lock together. Once assembled, the figure takes flight — smooth looping motion. Buttery and loopable.

SFX: Playful pluck. Soft marimba. Two notes — assembly, then takeoff.


6D — "Shaders." (0:31–0:34)

Mood direction: Full-bleed generative art. This is the moment the visual language breaks free of everything before it. The agent should push into territory that feels genuinely new — not the typical dark/neon shader demo. Organic, warm-leaning, mesmerizing. Think: liquid materials, soap-bubble iridescence, molten surfaces.

Visual: A WebGL fragment shader fills the entire frame. Perlin noise or simplex driving slow organic deformation. The palette should feel like it belongs in nature or fine art, not a coding demo. Full-screen. Hypnotic. Slow.

SFX: Ambient wash. Airy, reverbed, almost vocal. A sustained tone.


6E — "Three.js." (0:34–0:36)

Mood direction: Soft 3D. Studio-lit. Gallery object. Not sci-fi, not neon. Think: a ceramic form in warm directional light. Quiet sophistication.

Visual: Camera slowly orbits a 3D abstract sculpture (torus knot, blobby form, or similar) on a light ground plane. Warm directional light, soft shadow. Matte material — not metallic, not glossy. Shallow depth of field on edges. Looks like a product shot for a design object.

SFX: Deep gentle hum. Almost subsonic. Grounding.


6F — "Drop in music, sound effects, footage — it all composes together." (0:36–0:38)

Mood direction: Production canvas. This beat demonstrates that Hyperframe is a COMPOSITION layer — users bring their own media (video files, audio files, images) and Hyperframe assembles them with motion graphics into a finished production. Hyperframe does not generate media. It composes it.

Visual: A mini composition builds itself on screen:

  1. A video frame appears (rounded rectangle, soft shadow) — inside it, a Seedance-generated creator speaking to camera. Warm lighting, natural setting, mid-sentence gesture. 2 seconds of footage.
  2. Around the video, Hyperframe elements layer on in real time:
    • A lower third slides up from bottom (name + title)
    • Kinetic captions appear, tracking speech rhythm
    • A branded frame surrounds the video
    • An audio waveform pulses gently below the frame

The point: real video + real audio + motion graphics = finished production. All composed in HTML.

SEEDANCE INSERT — this is the primary avatar moment. The creator clip is brief and embedded in the production UI, not a standalone talking head. It proves compositing works.

Seedance prompt direction: Single person, warm natural lighting, neutral warm background. Casual professional. Speaking mid-sentence with a natural hand gesture. Medium shot, chest-up, shallow DOF. Confident, warm, mid-conversation. Corporate/product genre.

SFX: For one beat, you HEAR the full composition — the creator's voice (muffled, as if through a preview monitor), a low music bed, a subtle whoosh on the lower third. Brief sensory proof that audio compositing works. Then the underscore reasserts.


BEAT 7 — THE CONTRAST (0:38–0:44)

VO: "No new framework to learn. No thousands of lines of instructions. Just HTML."

Mood direction: Clean comparison. Light base. Two worlds side by side.

Visual:

Left half: Dense code. Small, compressed, overwhelming. Real content — actual framework skill/instruction files. A label: "Other frameworks: 4,500 lines of instructions." Scrolls slowly upward. The density is uncomfortable. Slightly desaturated.

Right half: Spacious Hyperframe HTML. Syntax-highlighted with the agent's own palette choice. A label: "Hyperframe: one skill file." Generous line spacing. Legible. Inviting. Real Hyperframe composition code.

On "Just HTML." — the left side folds inward along its center line, like a book closing. Compresses to nothing. The right side gently expands to fill the full frame. The code gains more whitespace. A subtle warm glow rises behind it.

SFX: Left side carries a faint low drone. On fold: drone cuts. Silence. Then a single clean chime as the right side expands.


BEAT 8 — THE ENGINE (0:44–0:48)

VO: "The agent writes it. The renderer captures every frame. Deterministic. Identical output, every time."

Mood direction: Soft diagrammatic. Warm, clean technical illustration. Apple "how the chip works" energy, but lighter.

Visual: A rendering pipeline draws itself left to right, each node appearing as its phrase is spoken:

  • "The agent writes it" → A circle icon (with a spark/intelligence detail) labeled "Agent"
  • Arrow extends → document icon (HTML bracket inside) labeled "HTML"
  • Arrow extends → lens/aperture icon labeled "Renderer"
  • Arrow extends → play-button icon labeled "MP4"

Icons styled by the agent's palette — each node a different accent color. Arrows draw with a satisfying stroke reveal.

"Deterministic. Identical output, every time." → Below the pipeline, two thumbnail frames appear side by side — identical compositions rendered twice. A visual diff overlay sweeps across: solid match indicator. A small label: "Byte-identical." Visual and visceral proof.
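The "Byte-identical" label implies a check that can be run for real. One way to sketch it, assuming the two renders are available as byte buffers: `firstDifference` and `isByteIdentical` are hypothetical helpers, and the sample arrays are stand-ins rather than real MP4 data.

```typescript
// Returns the index of the first differing byte, or -1 when the buffers match.
function firstDifference(a: Uint8Array, b: Uint8Array): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i;
  }
  // Same prefix: identical only if the lengths also match.
  return a.length === b.length ? -1 : shared;
}

function isByteIdentical(a: Uint8Array, b: Uint8Array): boolean {
  return firstDifference(a, b) === -1;
}

// Stand-in "renders": in practice these would be the bytes of two output files.
const renderA = Uint8Array.from([0, 0, 0, 24, 102, 116, 121, 112]);
const renderB = Uint8Array.from(renderA);
const renderC = Uint8Array.from([0, 0, 0, 24, 102, 116, 121, 113]);

console.log(isByteIdentical(renderA, renderB));
console.log(firstDifference(renderA, renderC));
```

Reporting the first differing offset, rather than only true or false, makes it easier to tell a metadata difference from a difference in the frame data.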

SFX: Soft clicks as nodes appear. Drawing sound on arrows. On "deterministic" — a crisp, satisfying lock/latch sound.


BEAT 9 — THE CTA (0:48–0:54)

VO: "Give your agent the skill. Tell it what to make. Watch it build."

Mood direction: Warm workspace. Light-themed editor feel. NOT a dark terminal. Inviting, not intimidating.

Visual:

"Give your agent the skill" → A document icon with a Hyperframe mark descends into a workspace panel. Soft bounce on landing (back.out, subtle). Label: hyperframe-skill.md.

"Tell it what to make" → A light-toned input field appears. A prompt types itself with natural pacing: "a product intro with kinetic typography". Cursor blinks at natural rhythm. Clean type.

"Watch it build" → A preview panel slides in from offscreen (power2.out). Inside it, a composition assembles at 4× speed — elements appearing, positioning, animating. Code streams in a narrow panel on the left; visual output forms on the right. The output uses the agent's own style — it looks like it belongs. A complete composition, built from a sentence.

SFX: Soft typing on the prompt. Rising build sound — ascending tonal pings as the composition populates. Satisfying completion chime when done.


BEAT 10 — THE CLOSE: RETURN TO CANVAS (0:54–1:00)

VO: "Hyperframe. Go make something."

Visual:

On "Hyperframe" — the workspace pulls back. We're zooming out. The preview, input field, workspace — they shrink and settle into a rounded-rectangle card on the canvas. We're BACK on the infinite canvas from Beat 1.

Same gentle diagonal drift. All original cards still animating. But now there's one MORE card: the video we just watched. Playing on a loop among the others.

The canvas extends in every direction. Open space between cards. Room for more.

On "Go make something" — the Hyperframe wordmark fades in at center screen, over a semi-transparent overlay. Below it: github.com/heygen-com/hyperframe. The canvas is still faintly visible behind, still drifting. Compositions still alive.

Hold 2 seconds. The visible empty space on the canvas — where new compositions could go — IS the invitation.

SFX: Underscore resolves to a final chord. Holds. Faint ambient canvas hum returns underneath. Fade to silence.


Seedance Strategy

One definite insert:

Beat 6F — "A-roll" capability moment (0:36–0:38). Creator on camera, composited INTO a Hyperframe production layout with lower third, captions, waveform, branded frame. Functional, not decorative — proves video compositing.

Prompt direction: Single person, warm natural lighting (golden hour), neutral warm background (wood, linen, soft-focus plants). Casual professional wardrobe. Speaking mid-sentence, natural hand gesture. Medium shot, chest-up, shallow DOF. Confident, warm, mid-conversation. Genre: corporate/product.

Optional second insert (evaluate in production):

Beat 9 — "Watch it build" (0:50–0:52). Tight on a person's face watching a screen, reflected light shifting, a slow quiet smile. If it adds warmth, keep it. If it interrupts the build demo flow, cut it. The video works without it.


Media Clarification

Hyperframe is a composition layer, not a media generator. The framework composes video, audio, and images that the user provides — it does not create them. The VO line "Drop in music, sound effects, footage — it all composes together" is deliberately framed as additive: you bring your assets, Hyperframe brings the timeline, animation, and rendering.

Supported media inputs (user-provided):

  • <video> — A-roll, B-roll, talking heads, screen recordings
  • <audio> — Music beds, sound effects, voiceover
  • <img> — Photos, illustrations, logos, screenshots

What Hyperframe generates:

  • Motion graphics (GSAP animations, CSS animations)
  • Text/typography compositions
  • Data visualizations
  • Shader/generative visuals
  • 3D scenes (Three.js)
  • Layout and compositing of all the above with user media

This distinction matters for the flex section: Beats 6A–6E demonstrate what Hyperframe GENERATES (motion graphics, animations, shaders, 3D). Beat 6F demonstrates what Hyperframe COMPOSES (user media + generated graphics = finished production).


Production Architecture

One top-level Hyperframe composition. Major sections as sub-compositions:

index.html                          root — 60s, VO + underscore + orchestration
├── compositions/canvas-open.html   infinite canvas fly-through (0:00–0:05)
├── compositions/canvas-zoom.html   settle into card (0:05–0:09)
├── compositions/proposition.html   HTML-in-video-out diagram (0:09–0:12)
├── compositions/anatomy.html       div/timeline/css/gsap build (0:12–0:20)
├── compositions/thesis.html        big type + capture pulse (0:20–0:24)
├── compositions/flex-css.html      CSS capability (0:24–0:26.5)
├── compositions/flex-gsap.html     GSAP capability (0:26.5–0:29)
├── compositions/flex-lottie.html   Lottie capability (0:29–0:31)
├── compositions/flex-shader.html   WebGL shader (0:31–0:34)
├── compositions/flex-threejs.html  3D scene (0:34–0:36)
├── compositions/flex-compose.html  full stack + Seedance (0:36–0:38)
├── compositions/contrast.html      4500 vs skill comparison (0:38–0:44)
├── compositions/engine.html        render pipeline diagram (0:44–0:48)
├── compositions/cta.html           skill → prompt → build (0:48–0:54)
└── compositions/canvas-close.html  return to canvas + wordmark (0:54–1:00)

Root index.html handles: VO audio element, underscore audio element, sequential placement of all sub-compositions via data-start references.
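The sequential placement can be derived from durations alone, which avoids hand-maintaining start times. A minimal sketch using the durations from the file tree above; `placeSequentially` and the `start` field (standing in for the `data-start` value) are illustrative, not Hyperframe's API.

```typescript
interface SubComposition {
  file: string;
  duration: number; // seconds
}

interface PlacedComposition extends SubComposition {
  start: number; // value for the data-start attribute, in seconds
}

// Durations taken from the file tree above.
const SECTIONS: SubComposition[] = [
  { file: "compositions/canvas-open.html", duration: 5 },
  { file: "compositions/canvas-zoom.html", duration: 4 },
  { file: "compositions/proposition.html", duration: 3 },
  { file: "compositions/anatomy.html", duration: 8 },
  { file: "compositions/thesis.html", duration: 4 },
  { file: "compositions/flex-css.html", duration: 2.5 },
  { file: "compositions/flex-gsap.html", duration: 2.5 },
  { file: "compositions/flex-lottie.html", duration: 2 },
  { file: "compositions/flex-shader.html", duration: 3 },
  { file: "compositions/flex-threejs.html", duration: 2 },
  { file: "compositions/flex-compose.html", duration: 2 },
  { file: "compositions/contrast.html", duration: 6 },
  { file: "compositions/engine.html", duration: 4 },
  { file: "compositions/cta.html", duration: 6 },
  { file: "compositions/canvas-close.html", duration: 6 },
];

// Each section starts where the previous one ended.
function placeSequentially(sections: SubComposition[]): PlacedComposition[] {
  let cursor = 0;
  return sections.map((section) => {
    const placed = { ...section, start: cursor };
    cursor += section.duration;
    return placed;
  });
}

const placedSections = placeSequentially(SECTIONS);
const totalSeconds = placedSections.reduce((sum, s) => sum + s.duration, 0);
```

If a beat is retimed, only its duration changes and every later start shifts automatically.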


Timing Table

| Beat | Start | End | Dur | VO cue | Visual |
|------|-------|-----|-----|--------|--------|
| 1 | 0:00 | 0:05 | 5.0s | "Your AI agent already knows..." | Infinite canvas fly-through |
| 2 | 0:05 | 0:09 | 4.0s | "It just needs the right format..." | Camera settles into card |
| 3 | 0:09 | 0:12 | 3.0s | "Open source. HTML in, video out." | Bracket → arrow → play diagram |
| 4 | 0:12 | 0:20 | 8.0s | "A div is a keyframe..." | Four-part anatomy build |
| 5 | 0:20 | 0:24 | 4.0s | "Anything a browser can render..." | Big type + capture pulse |
| 6A | 0:24 | 0:26.5 | 2.5s | "CSS animations." | Geometric CSS choreography |
| 6B | 0:26.5 | 0:29 | 2.5s | "GSAP." | Kinetic typography explosion |
| 6C | 0:29 | 0:31 | 2.0s | "Lottie." | Vector character assembly |
| 6D | 0:31 | 0:34 | 3.0s | "Shaders." | Full-bleed generative art |
| 6E | 0:34 | 0:36 | 2.0s | "Three.js." | Soft-lit 3D orbit |
| 6F | 0:36 | 0:38 | 2.0s | "Drop in music, sound effects..." | Composition stack + Seedance |
| 7 | 0:38 | 0:44 | 6.0s | "No new framework..." | Split comparison + fold |
| 8 | 0:44 | 0:48 | 4.0s | "The agent writes it..." | Render pipeline diagram |
| 9 | 0:48 | 0:54 | 6.0s | "Give your agent the skill..." | Skill → prompt → live build |
| 10 | 0:54 | 1:00 | 6.0s | "Hyperframe. Go make something." | Pull back to canvas + wordmark |
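The rows above should tile the full 60 seconds with no gaps or overlaps, and that is easy to check mechanically after any retiming. A minimal sketch: `timecodeToSeconds` and `findTimingProblems` are hypothetical helpers, and the `m:ss` timecode format is inferred from the table.

```typescript
// Converts a storyboard timecode such as "0:26.5" or "1:00" into seconds.
function timecodeToSeconds(timecode: string): number {
  const parts = timecode.split(":");
  if (parts.length !== 2) throw new Error(`Unexpected timecode: ${timecode}`);
  const minutes = Number(parts[0]);
  const seconds = Number(parts[1]);
  if (Number.isNaN(minutes) || Number.isNaN(seconds)) {
    throw new Error(`Unexpected timecode: ${timecode}`);
  }
  return minutes * 60 + seconds;
}

interface BeatRow {
  beat: string;
  start: string;
  end: string;
}

// Start and end columns copied from the timing table.
const TIMING_ROWS: BeatRow[] = [
  { beat: "1", start: "0:00", end: "0:05" },
  { beat: "2", start: "0:05", end: "0:09" },
  { beat: "3", start: "0:09", end: "0:12" },
  { beat: "4", start: "0:12", end: "0:20" },
  { beat: "5", start: "0:20", end: "0:24" },
  { beat: "6A", start: "0:24", end: "0:26.5" },
  { beat: "6B", start: "0:26.5", end: "0:29" },
  { beat: "6C", start: "0:29", end: "0:31" },
  { beat: "6D", start: "0:31", end: "0:34" },
  { beat: "6E", start: "0:34", end: "0:36" },
  { beat: "6F", start: "0:36", end: "0:38" },
  { beat: "7", start: "0:38", end: "0:44" },
  { beat: "8", start: "0:44", end: "0:48" },
  { beat: "9", start: "0:48", end: "0:54" },
  { beat: "10", start: "0:54", end: "1:00" },
];

// Returns a list of problems; an empty list means the beats tile the video.
function findTimingProblems(rows: BeatRow[], totalSeconds: number): string[] {
  const problems: string[] = [];
  let cursor = 0;
  for (const row of rows) {
    const start = timecodeToSeconds(row.start);
    const end = timecodeToSeconds(row.end);
    if (start !== cursor) problems.push(`Beat ${row.beat} starts at ${start}s, expected ${cursor}s`);
    if (end <= start) problems.push(`Beat ${row.beat} has no duration`);
    cursor = end;
  }
  if (cursor !== totalSeconds) problems.push(`Beats end at ${cursor}s, expected ${totalSeconds}s`);
  return problems;
}
```

Running `findTimingProblems(TIMING_ROWS, 60)` on the table as written returns an empty list.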