diff --git a/.changeset/grok-imagine-video-adapter.md b/.changeset/grok-imagine-video-adapter.md
new file mode 100644
index 000000000..717a299f4
--- /dev/null
+++ b/.changeset/grok-imagine-video-adapter.md
@@ -0,0 +1,5 @@
+---
+'@tanstack/ai-grok': minor
+---
+
+Add a `grokVideo` adapter for xAI's Imagine video models. `grok-imagine-video` (v1.0) supports text-to-video and image-to-video; `grok-imagine-video-1.5` is image-to-video only — a text-only prompt is rejected by the API, so the adapter fails fast with a clear error telling you to add a starting-frame image or use `grok-imagine-video`. Image-to-video starting frames are supplied as an `image` prompt part (public URL or base64 data source), with the text part describing the motion. Follows the experimental `generateVideo()` jobs/polling architecture: `createVideoJob` posts to `/v1/videos/generations`, status polling reads `/v1/videos/{request_id}`, and the completed result carries the hosted video URL plus usage (`unitsBilled` seconds and exact `cost` in USD). Sizing uses the aspect-ratio template consistent with the grok-imagine image models (`size: '16:9_720p'` → `aspect_ratio` / `resolution`), and durations are 1–15 integer seconds.
diff --git a/docs/adapters/grok.md b/docs/adapters/grok.md
index 5b4e043be..860bf200e 100644
--- a/docs/adapters/grok.md
+++ b/docs/adapters/grok.md
@@ -2,17 +2,20 @@
title: Grok (xAI)
id: grok-adapter
order: 5
-description: "Use xAI Grok Responses models with TanStack AI — Grok 4.3 and Grok Build 0.1 via @tanstack/ai-grok."
+description: "Use xAI Grok models with TanStack AI — Grok 4.3, Grok Build 0.1, Grok Imagine image generation, and Grok Imagine video generation via @tanstack/ai-grok."
keywords:
- tanstack ai
- grok
- xai
- grok 4.3
- grok build
+ - image generation
+ - video generation
+ - grok imagine
- adapter
---
-The Grok text and summarization adapters provide access to xAI's Responses API for `grok-4.3` and `grok-build-0.1`.
+The Grok adapters provide access to xAI's Responses API for `grok-4.3` and `grok-build-0.1` (text and summarization), plus Grok Imagine image generation and Grok Imagine video generation.
## Installation
@@ -203,6 +206,67 @@ reachable; use a `data` source for private images. `grok-2-image-1212` is
text-to-image only — image prompt parts are a compile-time type error and
throw at runtime.
+## Video Generation (Experimental)
+
+Generate short video clips (1–15 seconds, with audio) with the Grok Imagine video models via xAI's asynchronous jobs/polling API.
+
+Available models:
+
+- `grok-imagine-video` (v1.0) — text-to-video and image-to-video, $0.05 per second of video.
+- `grok-imagine-video-1.5` — **image-to-video only**, $0.08 per second of video. A text-only prompt is rejected by the API; the adapter fails fast with a clear error telling you to add a starting-frame image or use `grok-imagine-video`.
+
+Text-to-video with the base `grok-imagine-video` model:
+
+```typescript
+import { generateVideo, getVideoJobStatus } from "@tanstack/ai";
+import { grokVideo } from "@tanstack/ai-grok";
+
+const adapter = grokVideo("grok-imagine-video");
+
+// 1. Create the job
+const { jobId } = await generateVideo({
+ adapter,
+ prompt: "A red panda balancing on a bamboo stalk in the rain",
+ size: "16:9_720p", // "aspectRatio" or "aspectRatio_resolution"
+ duration: 5, // integer seconds, 1–15
+});
+
+// 2. Poll until complete, then read the video URL
+let status = await getVideoJobStatus({ adapter, jobId });
+while (status.status !== "completed" && status.status !== "failed") {
+ await new Promise((r) => setTimeout(r, 5000));
+ status = await getVideoJobStatus({ adapter, jobId });
+}
+
+console.log(status.status === "completed" ? status.url : status.error); // hosted video URL, or the failure reason
+```
+
+For image-to-video (required for `grok-imagine-video-1.5`, optional for `grok-imagine-video`), include an `image` prompt part as the starting frame and describe the desired motion in the text part. URL sources are fetched by xAI's servers (so they must be publicly reachable); use a `data` source for a base64 starting frame:
+
+```typescript
+const { jobId } = await generateVideo({
+ adapter: grokVideo("grok-imagine-video-1.5"),
+ prompt: [
+ {
+ type: "text",
+ content: "Make the waterfall crash down and slowly pan out the camera",
+ },
+ {
+ type: "image",
+ source: { type: "url", value: "https://example.com/waterfall-still.png" },
+ },
+ ],
+ size: "16:9_720p",
+ duration: 10,
+});
+```
+
+Like the Grok Imagine image models, sizing is aspect-ratio based: the `size` option takes an `aspectRatio_resolution` template. Supported aspect ratios are `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, and `2:3`; supported resolutions are `480p`, `720p`, and `1080p` (e.g. `"9:16_1080p"`). The resolution suffix is optional.
+
+When the job completes, the adapter reports usage on the result: `usage.unitsBilled` carries the billed seconds of video and `usage.cost` the exact cost in USD, both as returned by the xAI API.
+
+See [Video Generation](../media/video-generation) for the full jobs/polling flow, streaming mode, and the `useGenerateVideo` hook.
+
## Text-to-Speech
Generate speech with Grok TTS:
@@ -298,6 +362,10 @@ Creates a Grok summarization adapter with an explicit API key.
Creates a Grok image generation adapter.
+### `grokVideo(model, config?)` / `createGrokVideo(model, apiKey, config?)`
+
+Creates a Grok video generation adapter (experimental) for the Grok Imagine video models (`'grok-imagine-video'`, `'grok-imagine-video-1.5'`).
+
### `grokSpeech(model, config?)` / `createGrokSpeech(model, apiKey, config?)`
Creates a Grok text-to-speech adapter.
diff --git a/docs/config.json b/docs/config.json
index 966b75108..72782240a 100644
--- a/docs/config.json
+++ b/docs/config.json
@@ -262,7 +262,7 @@
"label": "Video Generation",
"to": "media/video-generation",
"addedAt": "2026-04-15",
- "updatedAt": "2026-06-08"
+ "updatedAt": "2026-06-24"
},
{
"label": "Generation Hooks",
@@ -434,7 +434,8 @@
{
"label": "Grok (xAI)",
"to": "adapters/grok",
- "addedAt": "2026-04-15"
+ "addedAt": "2026-04-15",
+ "updatedAt": "2026-06-24"
},
{
"label": "Groq",
diff --git a/docs/media/video-generation.md b/docs/media/video-generation.md
index eebbdf530..940de2915 100644
--- a/docs/media/video-generation.md
+++ b/docs/media/video-generation.md
@@ -2,13 +2,15 @@
title: Video Generation
id: video-generation
order: 6
-description: "Generate video from text prompts with OpenAI Sora or Google Veo using TanStack AI's experimental generateVideo() jobs/polling API."
+description: "Generate video from text prompts with OpenAI Sora, Google Veo, xAI Grok Imagine, or fal.ai using TanStack AI's experimental generateVideo() jobs/polling API."
keywords:
- tanstack ai
- video generation
- sora
- veo
- gemini
+ - grok imagine
+ - fal
- generateVideo
- jobs api
- experimental
@@ -39,6 +41,8 @@ TanStack AI provides experimental support for video generation through dedicated
Currently supported:
- **OpenAI**: Sora-2 and Sora-2-Pro models (when available)
- **Google Gemini**: Veo 3.1, Veo 3, and Veo 2 models (via the long-running operations API)
+- **Grok (xAI)**: grok-imagine-video (text-to-video + image-to-video) and grok-imagine-video-1.5 (image-to-video only) models
+- **fal.ai**: MiniMax, Luma, Kling, Hunyuan, and other hosted video models
## Basic Usage
@@ -552,6 +556,59 @@ Adapters that haven't declared a per-model duration map keep the plain
> Files API and requires your API key to download (send it as an
> `x-goog-api-key` header or `key` query parameter).
+### Grok (xAI Imagine) Model Options
+
+Based on the [xAI video generation API](https://docs.x.ai/docs/guides/video-generations). Two models are available: `grok-imagine-video` (v1.0) supports **text-to-video and image-to-video**, while `grok-imagine-video-1.5` is **image-to-video only** (a text-only prompt is rejected by the API; the adapter throws a clear error pointing you at `grok-imagine-video`). Both are aspect-ratio sized — the generic `size` option takes an `aspectRatio_resolution` template (like the Grok Imagine image models), and clips can be 1–15 seconds long.
+
+Text-to-video with the base model:
+
+```typescript
+import { generateVideo } from '@tanstack/ai'
+import { grokVideo } from '@tanstack/ai-grok'
+
+const { jobId } = await generateVideo({
+ adapter: grokVideo('grok-imagine-video'),
+ prompt: 'A beautiful sunset over the ocean',
+ size: '16:9_720p', // aspect ratio: '1:1' | '16:9' | '9:16' | '4:3' | '3:4' | '3:2' | '2:3'
+ // resolution (optional suffix): '480p' | '720p' | '1080p'
+ duration: 5, // integer seconds, 1-15
+ modelOptions: {
+ aspect_ratio: '16:9', // Alternative to `size`; takes precedence if both are set
+ resolution: '720p', // Alternative to the `size` resolution suffix; also takes precedence
+ duration: 5, // Alternative to the top-level `duration`; also takes precedence
+ },
+})
+```
+
+Image-to-video (required for `grok-imagine-video-1.5`) — include an `image` prompt part as the starting frame. URL sources are fetched by xAI's servers (so they must be publicly reachable); use a `data` source for a base64 starting frame:
+
+```typescript
+const { jobId } = await generateVideo({
+ adapter: grokVideo('grok-imagine-video-1.5'),
+ prompt: [
+ { type: 'text', content: 'Slowly pan out as the waves roll in' },
+ {
+ type: 'image',
+ source: { type: 'url', value: 'https://example.com/still.png' },
+ },
+ ],
+ size: '16:9_720p',
+ duration: 5,
+})
+```
+
+Both models accept any whole second in the **1–15** range. A raw `duration` is coerced into that range rather than rejected — values are clamped to `[1, 15]` and rounded to the nearest second. Inspect or pre-snap the range the same way as Veo:
+
+```typescript
+const adapter = grokVideo('grok-imagine-video')
+
+adapter.availableDurations() // { kind: 'range', min: 1, max: 15, step: 1, unit: 'seconds' }
+adapter.snapDuration(2.5) // 3 — clamped/rounded into range
+adapter.snapDuration(99) // 15
+```
+
+Generated clips include an audio track. When the job completes, the adapter reports `usage.unitsBilled` (billed seconds of video) and `usage.cost` (exact USD cost as returned by the API) on the result.
+
## Response Types
> **Note:** The interfaces below are the underlying adapter-level types. The `getVideoJobStatus()` helper returns a single merged object, `{ status, progress?, url?, error?, usage? }` — it does not return `jobId` or `expiresAt`.
diff --git a/examples/ts-react-media/.env.example b/examples/ts-react-media/.env.example
index b7c897653..fdf123604 100644
--- a/examples/ts-react-media/.env.example
+++ b/examples/ts-react-media/.env.example
@@ -5,3 +5,7 @@ FAL_KEY=
# Get a Google API key at https://aistudio.google.com/apikey
GOOGLE_API_KEY=
+
+# Get an xAI API key at https://console.x.ai. It is used by the "xAI Direct"
+# Grok Imagine image and video models (the other Grok Imagine entries go through fal).
+XAI_API_KEY=
diff --git a/examples/ts-react-media/package.json b/examples/ts-react-media/package.json
index 80bc30ce8..9adf242d1 100644
--- a/examples/ts-react-media/package.json
+++ b/examples/ts-react-media/package.json
@@ -14,6 +14,7 @@
"@tanstack/ai": "workspace:*",
"@tanstack/ai-fal": "workspace:*",
"@tanstack/ai-gemini": "workspace:*",
+ "@tanstack/ai-grok": "workspace:*",
"@tanstack/react-router": "^1.158.4",
"@tanstack/react-start": "^1.159.0",
"@tanstack/router-plugin": "^1.158.4",
diff --git a/examples/ts-react-media/src/components/ImageGenerator.tsx b/examples/ts-react-media/src/components/ImageGenerator.tsx
index ca72e3823..9b4d5fd29 100644
--- a/examples/ts-react-media/src/components/ImageGenerator.tsx
+++ b/examples/ts-react-media/src/components/ImageGenerator.tsx
@@ -27,6 +27,7 @@ function getImageSrc(image: { url?: string; b64Json?: string }): string {
const falModels = IMAGE_MODELS.filter((m) => m.provider === 'fal')
const geminiModels = IMAGE_MODELS.filter((m) => m.provider === 'gemini')
+const xaiModels = IMAGE_MODELS.filter((m) => m.provider === 'xai')
export default function ImageGenerator({
onImageGenerated,
@@ -161,6 +162,13 @@ export default function ImageGenerator({
))}
+
+ {xaiModels.map((model) => (
+
+ {model.name}
+
+ ))}
+
{currentModel && selectedModel !== 'all' && (
diff --git a/examples/ts-react-media/src/components/VideoGenerator.tsx b/examples/ts-react-media/src/components/VideoGenerator.tsx
index 5661df9ac..f31a8078e 100644
--- a/examples/ts-react-media/src/components/VideoGenerator.tsx
+++ b/examples/ts-react-media/src/components/VideoGenerator.tsx
@@ -21,7 +21,7 @@ type JobState =
model: string
progress?: number | undefined
}
- | { status: 'completed'; url: string; unitsBilled?: number }
+ | { status: 'completed'; url: string; unitsBilled?: number; cost?: number }
| { status: 'error'; message: string }
interface VideoGeneratorProps {
@@ -42,6 +42,8 @@ export default function VideoGenerator({
const pollingRefs = useRef>(new Map())
const filteredModels = VIDEO_MODELS.filter((m) => m.mode === mode)
+ const falModels = filteredModels.filter((m) => m.provider === 'fal')
+ const xaiModels = filteredModels.filter((m) => m.provider === 'xai')
useEffect(() => {
if (initialImageUrl) {
@@ -97,6 +99,7 @@ export default function VideoGenerator({
status: 'completed',
url: url,
unitsBilled: urlResult.usage?.unitsBilled,
+ cost: urlResult.usage?.cost,
},
}))
} else if (status.status === 'processing') {
@@ -164,8 +167,11 @@ export default function VideoGenerator({
},
}))
+ // Poll keyed by the UI model id, not result.model: a direct-xAI UI id
+ // ('grok-imagine-video-1.5/image-to-video') differs from the adapter model the job
+ // reports ('grok-imagine-video-1.5'), so result.model would match neither the card nor the adapter lookup.
const interval = setInterval(() => {
- pollStatus(result.jobId, result.model)
+ pollStatus(result.jobId, modelId)
}, 4000)
pollingRefs.current.set(modelId, interval)
} catch (err) {
@@ -249,11 +255,20 @@ export default function VideoGenerator({
className="w-full px-4 py-3 bg-gray-800 border border-gray-700 rounded-lg text-white focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent disabled:opacity-50"
>
All Models
- {filteredModels.map((model) => (
-
- {model.name}
-
- ))}
+
+ {falModels.map((model) => (
+
+ {model.name}
+
+ ))}
+
+
+ {xaiModels.map((model) => (
+
+ {model.name}
+
+ ))}
+
@@ -406,12 +421,21 @@ export default function VideoGenerator({
className="w-full h-auto"
/>
- {state.unitsBilled != null && (
+ {state.cost != null ? (
- Billed {state.unitsBilled} fal unit
- {state.unitsBilled === 1 ? '' : 's'} — multiply by the
- endpoint unit price for USD cost
+ Billed ${state.cost.toFixed(3)}
+ {state.unitsBilled != null
+ ? ` for ${state.unitsBilled} second${state.unitsBilled === 1 ? '' : 's'} of video`
+ : ''}
+ ) : (
+ state.unitsBilled != null && (
+
+ Billed {state.unitsBilled} fal unit
+ {state.unitsBilled === 1 ? '' : 's'} — multiply by the
+ endpoint unit price for USD cost
+
+ )
)}
>
)}
diff --git a/examples/ts-react-media/src/lib/models.ts b/examples/ts-react-media/src/lib/models.ts
index cfa36dfc5..5947febe5 100644
--- a/examples/ts-react-media/src/lib/models.ts
+++ b/examples/ts-react-media/src/lib/models.ts
@@ -15,6 +15,22 @@ export const IMAGE_MODELS = [
sizeType: 'aspect_ratio' as const,
provider: 'fal' as const,
},
+ {
+ id: 'grok-imagine-image',
+ name: 'Grok Imagine (xAI Direct)',
+ description: 'xAI Imagine API via the native grokImage adapter',
+ defaultSize: '16:9' as const,
+ sizeType: 'aspect_ratio' as const,
+ provider: 'xai' as const,
+ },
+ {
+ id: 'grok-imagine-image-quality',
+ name: 'Grok Imagine Quality (xAI Direct)',
+ description: 'Higher-quality xAI Imagine images via the native adapter',
+ defaultSize: '16:9' as const,
+ sizeType: 'aspect_ratio' as const,
+ provider: 'xai' as const,
+ },
{
id: 'fal-ai/flux-2/klein/9b',
name: 'FLUX.2 Klein 9B',
@@ -79,48 +95,72 @@ export const VIDEO_MODELS = [
name: 'Kling 3 Pro (Text-to-Video)',
description: 'High-quality text-to-video generation',
mode: 'text-to-video' as const,
+ provider: 'fal' as const,
},
{
id: 'fal-ai/kling-video/v3/pro/image-to-video',
name: 'Kling 3 Pro (Image-to-Video)',
description: 'Animate images with Kling',
mode: 'image-to-video' as const,
+ provider: 'fal' as const,
},
{
id: 'fal-ai/veo3.1',
name: 'Veo 3.1 (Text-to-Video)',
description: 'Google Veo text-to-video',
mode: 'text-to-video' as const,
+ provider: 'fal' as const,
},
{
id: 'fal-ai/veo3.1/image-to-video',
name: 'Veo 3.1 (Image-to-Video)',
description: 'Google Veo image-to-video',
mode: 'image-to-video' as const,
+ provider: 'fal' as const,
},
{
id: 'xai/grok-imagine-video/text-to-video',
name: 'Grok Imagine Video (Text-to-Video)',
description: 'xAI video generation from text',
mode: 'text-to-video' as const,
+ provider: 'fal' as const,
},
{
id: 'xai/grok-imagine-video/image-to-video',
name: 'Grok Imagine Video (Image-to-Video)',
description: 'xAI animate images to video',
mode: 'image-to-video' as const,
+ provider: 'fal' as const,
+ },
+ {
+ id: 'grok-imagine-video',
+ name: 'Grok Imagine Video 1.0 (Text-to-Video)',
+ description:
+ 'xAI Imagine API via the native grokVideo adapter (v1.0 supports text-to-video)',
+ mode: 'text-to-video' as const,
+ provider: 'xai' as const,
+ },
+ {
+ id: 'grok-imagine-video-1.5/image-to-video',
+ name: 'Grok Imagine Video 1.5 (Image-to-Video)',
+ description:
+ 'Animate a starting frame via the native grokVideo adapter (1.5 is image-to-video only)',
+ mode: 'image-to-video' as const,
+ provider: 'xai' as const,
},
{
id: 'fal-ai/ltx-2.3/text-to-video/fast',
name: 'LTX-2.3 Fast (Text-to-Video)',
description: 'Fast text-to-video generation',
mode: 'text-to-video' as const,
+ provider: 'fal' as const,
},
{
id: 'fal-ai/ltx-2.3/image-to-video/fast',
name: 'LTX-2.3 Fast (Image-to-Video)',
description: 'Fast image-to-video animation',
mode: 'image-to-video' as const,
+ provider: 'fal' as const,
},
] as const
diff --git a/examples/ts-react-media/src/lib/server-functions.ts b/examples/ts-react-media/src/lib/server-functions.ts
index d4b010ad2..1b3b52639 100644
--- a/examples/ts-react-media/src/lib/server-functions.ts
+++ b/examples/ts-react-media/src/lib/server-functions.ts
@@ -1,9 +1,9 @@
import { createServerFn } from '@tanstack/react-start'
import { falImage, falVideo } from '@tanstack/ai-fal'
import { geminiImage } from '@tanstack/ai-gemini'
+import { grokImage, grokVideo } from '@tanstack/ai-grok'
import { generateImage, generateVideo, getVideoJobStatus } from '@tanstack/ai'
-import type { FalModel } from '@tanstack/ai-fal'
import type {
ImagePart,
MediaInputMetadata,
@@ -67,6 +67,21 @@ function asImageToVideoPrompt(
return narrowed
}
+/**
+ * Resolves the video adapter for a UI model id. The native grok-imagine
+ * entries hit xAI's Imagine API directly via the `grokVideo` adapter
+ * (XAI_API_KEY); everything else is a fal-hosted model.
+ */
+function videoAdapterForModel(model: string) {
+ if (model === 'grok-imagine-video') {
+ return grokVideo('grok-imagine-video')
+ }
+ if (model === 'grok-imagine-video-1.5/image-to-video') {
+ return grokVideo('grok-imagine-video-1.5')
+ }
+ return falVideo(model)
+}
+
export const generateImageFn = createServerFn({ method: 'POST' })
.inputValidator((data: { prompt: MediaPrompt; model: string }) => {
if (!hasPromptContent(data.prompt)) throw new Error('Prompt is required')
@@ -104,6 +119,26 @@ export const generateImageFn = createServerFn({ method: 'POST' })
modelOptions: { aspect_ratio: '16:9' },
})
}
+ case 'grok-imagine-image': {
+ // Direct xAI Imagine API (XAI_API_KEY) via the native grokImage
+ // adapter — no fal in between. The grok-imagine models accept image
+ // prompt parts for image-conditioned generation, so we narrow with
+ // asImagePrompt. Sizing uses the aspect-ratio template.
+ return generateImage({
+ adapter: grokImage('grok-imagine-image'),
+ prompt: asImagePrompt(data.prompt),
+ numberOfImages: 1,
+ size: '16:9',
+ })
+ }
+ case 'grok-imagine-image-quality': {
+ return generateImage({
+ adapter: grokImage('grok-imagine-image-quality'),
+ prompt: asImagePrompt(data.prompt),
+ numberOfImages: 1,
+ size: '16:9',
+ })
+ }
case 'fal-ai/flux-2/klein/9b': {
// NOTE: Newer models are untyped (at the moment)
return generateImage({
@@ -214,6 +249,18 @@ export const createVideoJobFn = createServerFn({ method: 'POST' })
},
})
}
+ case 'grok-imagine-video': {
+ // Direct xAI Imagine API (XAI_API_KEY) — no fal in between. The base
+ // grok-imagine-video (v1.0) supports text-to-video; durations are
+ // 1-15 integer seconds. Completed jobs report usage.unitsBilled
+ // (billed seconds) and usage.cost (exact USD).
+ return generateVideo({
+ adapter: grokVideo('grok-imagine-video'),
+ prompt: asTextPrompt(data.prompt),
+ size: '16:9_720p',
+ duration: 5,
+ })
+ }
case 'fal-ai/ltx-2.3/text-to-video/fast': {
return generateVideo({
adapter: falVideo('fal-ai/ltx-2.3/text-to-video/fast'),
@@ -252,6 +299,17 @@ export const createVideoJobFn = createServerFn({ method: 'POST' })
},
})
}
+ case 'grok-imagine-video-1.5/image-to-video': {
+ // Direct xAI Imagine API. The starting frame is supplied as an image
+ // prompt part (asImageToVideoPrompt requires one); the grokVideo
+ // adapter forwards it to the Imagine endpoint as the start frame.
+ return generateVideo({
+ adapter: grokVideo('grok-imagine-video-1.5'),
+ prompt: asImageToVideoPrompt(data.prompt),
+ size: '16:9_720p',
+ duration: 5,
+ })
+ }
case 'fal-ai/ltx-2.3/image-to-video/fast': {
return generateVideo({
adapter: falVideo('fal-ai/ltx-2.3/image-to-video/fast'),
@@ -265,9 +323,9 @@ export const createVideoJobFn = createServerFn({ method: 'POST' })
})
export const getVideoStatusFn = createServerFn({ method: 'GET' })
- .inputValidator((data: { jobId: string; model: FalModel }) => data)
+ .inputValidator((data: { jobId: string; model: string }) => data)
.handler(async ({ data }) => {
- const adapter = falVideo(data.model)
+ const adapter = videoAdapterForModel(data.model)
return await getVideoJobStatus({
adapter,
jobId: data.jobId,
@@ -277,7 +335,7 @@ export const getVideoStatusFn = createServerFn({ method: 'GET' })
export const getVideoUrlFn = createServerFn({ method: 'GET' })
.inputValidator((data: { jobId: string; model: string }) => data)
.handler(async ({ data }) => {
- const adapter = falVideo(data.model)
+ const adapter = videoAdapterForModel(data.model)
return await getVideoJobStatus({
adapter,
jobId: data.jobId,
diff --git a/packages/ai-grok/src/adapters/video.ts b/packages/ai-grok/src/adapters/video.ts
new file mode 100644
index 000000000..a59c45230
--- /dev/null
+++ b/packages/ai-grok/src/adapters/video.ts
@@ -0,0 +1,462 @@
+import { resolveMediaPrompt } from '@tanstack/ai'
+import { BaseVideoAdapter, snapToDurationOption } from '@tanstack/ai/adapters'
+import { toRunErrorPayload } from '@tanstack/ai/adapter-internals'
+import { getGrokApiKeyFromEnv, withGrokDefaults } from '../utils/client'
+import {
+ getGrokVideoDurationOptions,
+ isImageToVideoOnlyModel,
+ parseGrokVideoSize,
+ validateVideoSize,
+} from '../video/video-provider-options'
+import type { DurationOptions } from '@tanstack/ai/adapters'
+import type {
+ ImagePart,
+ MediaInputMetadata,
+ TokenUsage,
+ VideoGenerationOptions,
+ VideoJobResult,
+ VideoStatusResult,
+ VideoUrlResult,
+} from '@tanstack/ai'
+import type { GrokVideoModel } from '../model-meta'
+import type {
+ GrokVideoModelDurationByName,
+ GrokVideoModelInputModalitiesByName,
+ GrokVideoModelProviderOptionsByName,
+ GrokVideoModelSizeByName,
+ GrokVideoProviderOptions,
+} from '../video/video-provider-options'
+import type { GrokClientConfig } from '../utils'
+
+/**
+ * Configuration for Grok video adapter.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export interface GrokVideoConfig extends GrokClientConfig {}
+
+/**
+ * xAI bills video generation in "USD ticks": 10^10 ticks per US dollar
+ * (e.g. one grok-imagine-video-1.5 second costs $0.08 = 800_000_000 ticks).
+ */
+const USD_TICKS_PER_DOLLAR = 10_000_000_000
+
+/** Response of POST /v1/videos/generations. */
+interface GrokVideoCreateResponse {
+ request_id?: string
+}
+
+/** Response of GET /v1/videos/{request_id}. */
+interface GrokVideoStatusResponse {
+ status?: string
+ progress?: number
+ model?: string
+ video?: {
+ url?: string
+ duration?: number
+ }
+ usage?: {
+ cost_in_usd_ticks?: number
+ }
+ error?: string
+}
+
+/**
+ * Convert a TanStack ImagePart to the URL string accepted by xAI's Imagine
+ * video endpoint: public URLs pass through (fetched by xAI's servers), data
+ * sources become base64 data URIs.
+ */
+function imagePartToUrl(part: ImagePart): string {
+ if (part.source.type === 'url') return part.source.value
+ return `data:${part.source.mimeType};base64,${part.source.value}`
+}
+
+function buildGrokVideoUsage(
+ response: GrokVideoStatusResponse,
+): TokenUsage | undefined {
+ const seconds = response.video?.duration
+ const ticks = response.usage?.cost_in_usd_ticks
+ if (seconds === undefined && ticks === undefined) return undefined
+ return {
+ promptTokens: 0,
+ completionTokens: 0,
+ totalTokens: 0,
+ ...(seconds !== undefined && { unitsBilled: seconds }),
+ ...(ticks !== undefined && { cost: ticks / USD_TICKS_PER_DOLLAR }),
+ }
+}
+
+/**
+ * Grok Video Generation Adapter (xAI Imagine API)
+ *
+ * Tree-shakeable adapter for the grok-imagine video models using the
+ * async jobs/polling architecture: create a generation request, poll it,
+ * then read the completed video URL.
+ *
+ * `grok-imagine-video` (v1.0) supports text-to-video and image-to-video.
+ * `grok-imagine-video-1.5` is image-to-video only — every request needs an
+ * image prompt part as the starting frame, and the adapter rejects a
+ * text-only prompt with a clear error rather than a raw API 400.
+ *
+ * The Imagine video endpoints are not part of the OpenAI SDK surface (and
+ * xAI rejects the SDK's multipart paths), so requests are plain JSON calls
+ * issued with the configured `fetch` (or the global one).
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ *
+ * Features:
+ * - Async job-based video generation (1–15 second clips with audio)
+ * - Aspect-ratio sizing via the "aspectRatio_resolution" size template
+ * (e.g. '16:9_720p'), consistent with the grok-imagine image models
+ * - Image-to-video via an `image` prompt part (starting frame URL or data URI)
+ * - Usage reporting: billed seconds (`unitsBilled`) and exact cost
+ */
+export class GrokVideoAdapter<
+ TModel extends GrokVideoModel,
+> extends BaseVideoAdapter<
+ TModel,
+ GrokVideoProviderOptions,
+ GrokVideoModelProviderOptionsByName,
+ GrokVideoModelSizeByName,
+ GrokVideoModelInputModalitiesByName,
+ GrokVideoModelDurationByName
+> {
+ readonly name = 'grok' as const
+
+ private readonly clientConfig: GrokVideoConfig
+
+ constructor(config: GrokVideoConfig, model: TModel) {
+ super({}, model)
+ this.clientConfig = withGrokDefaults(config)
+ }
+
+ private get fetch(): (
+ input: string,
+ init?: RequestInit,
+ ) => Promise<Response> {
+ return this.clientConfig.fetch ?? fetch
+ }
+
+ private async request(
+ path: string,
+ init?: Omit,
+ ): Promise<Response> {
+ return await this.fetch(`${this.clientConfig.baseURL}${path}`, {
+ ...init,
+ headers: {
+ 'Content-Type': 'application/json',
+ Authorization: `Bearer ${this.clientConfig.apiKey}`,
+ },
+ })
+ }
+
+ /**
+ * Reads the error message out of an Imagine API error body
+ * (`{"code": "...", "error": "..."}`), falling back to the raw text.
+ */
+ private async errorMessage(response: Response): Promise<string> {
+ const body = await response.text()
+ try {
+ const parsed: unknown = JSON.parse(body)
+ if (
+ typeof parsed === 'object' &&
+ parsed !== null &&
+ 'error' in parsed &&
+ typeof parsed.error === 'string'
+ ) {
+ return parsed.error
+ }
+ } catch {
+ // not JSON — fall through to the raw body
+ }
+ return body
+ }
+
+ async createVideoJob(
+ options: VideoGenerationOptions<
+ GrokVideoProviderOptions,
+ GrokVideoModelSizeByName[TModel],
+ GrokVideoModelDurationByName[TModel]
+ >,
+ ): Promise<VideoJobResult> {
+ const { model, size, modelOptions, logger } = options
+
+ validateVideoSize(model, size)
+
+ // Coerce the requested duration into the model's valid range (1–15s,
+ // integer) instead of rejecting it — `snapDuration` clamps and rounds.
+ // modelOptions wins over the generic `duration`, mirroring the size
+ // precedence below.
+ const rawDuration = modelOptions?.duration ?? options.duration
+ const duration =
+ rawDuration !== undefined ? this.snapDuration(rawDuration) : undefined
+
+ // The interleaved prompt decomposes into verbatim text plus typed media
+ // buckets. The Imagine video endpoint takes a text prompt and an optional
+ // starting frame; reject the modalities it can't consume.
+ const resolved = resolveMediaPrompt(options.prompt)
+ if (resolved.videos.length > 0) {
+ throw new Error(
+ `${this.name}.createVideoJob does not support video prompt parts (model: ${model}).`,
+ )
+ }
+ if (resolved.audios.length > 0) {
+ throw new Error(
+ `${this.name}.createVideoJob does not support audio prompt parts (model: ${model}).`,
+ )
+ }
+ // grok-imagine-video-1.5 is image-to-video only — text-to-video is
+ // rejected by the API, so fail fast with a clear, actionable message
+ // pointing at the model that does support text-to-video.
+ if (resolved.images.length === 0 && isImageToVideoOnlyModel(model)) {
+ throw new Error(
+ `${this.name}: ${model} does not support text-to-video — it is image-to-video only. ` +
+ `Include an image prompt part as the starting frame, or use 'grok-imagine-video' for text-to-video.`,
+ )
+ }
+ if (resolved.images.length > 1) {
+ throw new Error(
+ `${this.name}: ${model} accepts at most one starting-frame image; received ${resolved.images.length}.`,
+ )
+ }
+
+ // Image-to-video: the single image prompt part becomes the starting frame
+ // and the prompt text describes the desired motion. URL sources are
+ // fetched by xAI's servers; data sources are sent as base64 data URIs.
+ const [startFrame] = resolved.images
+
+ // The generic `size` option carries an "aspectRatio_resolution" template
+ // (e.g. '16:9_720p') and maps to the Imagine API's `aspect_ratio` /
+ // `resolution` parameters; explicit modelOptions win over the template.
+ const parsedSize = size !== undefined ? parseGrokVideoSize(size) : undefined
+ const request = {
+ model,
+ prompt: resolved.text,
+ ...(startFrame && { image: { url: imagePartToUrl(startFrame) } }),
+ ...(parsedSize && {
+ aspect_ratio: parsedSize.aspectRatio,
+ ...(parsedSize.resolution !== undefined && {
+ resolution: parsedSize.resolution,
+ }),
+ }),
+ ...modelOptions,
+ // Spread after modelOptions so the snapped duration is authoritative
+ // (modelOptions.duration is folded into `duration` via snapDuration above).
+ ...(duration !== undefined && { duration }),
+ }
+
+ try {
+ logger.request(
+ `activity=video.create provider=${this.name} model=${model} size=${size ?? 'default'} duration=${duration ?? 'default'}`,
+ { provider: this.name, model },
+ )
+
+ const response = await this.request('/videos/generations', {
+ method: 'POST',
+ body: JSON.stringify(request),
+ })
+ if (!response.ok) {
+ throw new Error(
+ `grok: video generation request failed (${response.status} ${response.statusText}): ${await this.errorMessage(response)}`,
+ )
+ }
+
+ const result = (await response.json()) as GrokVideoCreateResponse
+ if (!result.request_id) {
+ throw new Error(
+ 'grok: video generation response contained no request_id',
+ )
+ }
+ return { jobId: result.request_id, model }
+ } catch (error: unknown) {
+ logger.errors(`${this.name}.createVideoJob fatal`, {
+ error: toRunErrorPayload(error, `${this.name}.createVideoJob failed`),
+ source: `${this.name}.createVideoJob`,
+ })
+ throw error
+ }
+ }
+
+ private async retrieveJob(jobId: string): Promise<GrokVideoStatusResponse> {
+ const response = await this.request(`/videos/${jobId}`)
+ if (!response.ok) {
+ const error = new Error(
+ `grok: video status request failed (${response.status} ${response.statusText}): ${await this.errorMessage(response)}`,
+ )
+ ;(error as { status?: number }).status = response.status
+ throw error
+ }
+ return (await response.json()) as GrokVideoStatusResponse
+ }
+
+ async getVideoStatus(jobId: string): Promise<VideoStatusResult> {
+ let response: GrokVideoStatusResponse
+ try {
+ response = await this.retrieveJob(jobId)
+ } catch (error) {
+ if ((error as { status?: number }).status === 404) {
+ return { jobId, status: 'failed', error: 'Job not found' }
+ }
+ throw error
+ }
+
+ return {
+ jobId,
+ status: this.mapStatus(response.status),
+ ...(response.progress !== undefined && { progress: response.progress }),
+ ...(response.error !== undefined && { error: response.error }),
+ }
+ }
+
+ async getVideoUrl(jobId: string): Promise<VideoUrlResult> {
+ let response: GrokVideoStatusResponse
+ try {
+ response = await this.retrieveJob(jobId)
+ } catch (error) {
+ if ((error as { status?: number }).status === 404) {
+ throw new Error(`Video job not found: ${jobId}`)
+ }
+ throw error
+ }
+
+ const status = this.mapStatus(response.status)
+ if (status === 'failed') {
+ throw new Error(
+ `Video generation failed${response.error ? `: ${response.error}` : ''}. Job ID: ${jobId}`,
+ )
+ }
+ const url = response.video?.url
+ if (!url) {
+ throw new Error(
+ `Video is not ready for download. Check status first. Job ID: ${jobId}`,
+ )
+ }
+
+ const usage = buildGrokVideoUsage(response)
+ return {
+ jobId,
+ url,
+ ...(usage && { usage }),
+ }
+ }
+
+ /**
+ * Maps Imagine API job statuses onto the generic video status set. The
+ * API reports 'pending' while queued/generating (with a numeric
+ * `progress`), then a terminal 'done' / 'failed' / 'expired'.
+ */
+ protected mapStatus(
+ apiStatus: string | undefined,
+ ): 'pending' | 'processing' | 'completed' | 'failed' {
+ switch (apiStatus) {
+ case 'pending':
+ case 'queued':
+ return 'pending'
+ case 'done':
+ case 'completed':
+ case 'succeeded':
+ return 'completed'
+ case 'failed':
+ case 'expired':
+ case 'error':
+ case 'cancelled':
+ return 'failed'
+ case undefined:
+ default:
+ return 'processing'
+ }
+ }
+
+ /**
+ * Both grok-imagine video models accept a continuous 1–15 integer-second
+ * range. Consumers can use this to render UI without provider knowledge.
+ */
+ override availableDurations(): DurationOptions<
+ GrokVideoModelDurationByName[TModel]
+ > {
+ return getGrokVideoDurationOptions(this.model)
+ }
+
+ /**
+ * Coerce a raw seconds value to the closest valid duration (clamped to
+ * [1, 15] and rounded to whole seconds).
+ */
+ override snapDuration(
+ seconds: number,
+ ): GrokVideoModelDurationByName[TModel] | undefined {
+ return snapToDurationOption(seconds, this.availableDurations())
+ }
+}
+
+/**
+ * Creates a Grok video adapter with an explicit API key.
+ * Type resolution happens here at the call site.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ *
+ * @param model - The model name (e.g., 'grok-imagine-video')
+ * @param apiKey - Your xAI API key
+ * @param config - Optional additional configuration
+ * @returns Configured Grok video adapter instance with resolved types
+ *
+ * @example
+ * ```typescript
+ * // grok-imagine-video (v1.0) supports text-to-video.
+ * const adapter = createGrokVideo('grok-imagine-video', 'xai-...');
+ *
+ * const { jobId } = await generateVideo({
+ * adapter,
+ * prompt: 'A beautiful sunset over the ocean',
+ * size: '16:9_720p',
+ * duration: 5
+ * });
+ * ```
+ */
+export function createGrokVideo<TModel extends GrokVideoModel>(
+ model: TModel,
+ apiKey: string,
+ config?: Omit<GrokVideoConfig, 'apiKey'>,
+): GrokVideoAdapter<TModel> {
+ return new GrokVideoAdapter({ apiKey, ...config }, model)
+}
+
+/**
+ * Creates a Grok video adapter with automatic API key detection from environment variables.
+ * Type resolution happens here at the call site.
+ *
+ * Looks for `XAI_API_KEY` in:
+ * - `process.env` (Node.js)
+ * - `window.env` (Browser with injected env)
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ *
+ * @param model - The model name (e.g., 'grok-imagine-video-1.5')
+ * @param config - Optional configuration (excluding apiKey which is auto-detected)
+ * @returns Configured Grok video adapter instance with resolved types
+ * @throws Error if XAI_API_KEY is not found in environment
+ *
+ * @example
+ * ```typescript
+ * // Automatically uses XAI_API_KEY from environment
+ * const adapter = grokVideo('grok-imagine-video-1.5');
+ *
+ * // Image-to-video only: the prompt must carry a starting-frame image part.
+ * const { jobId } = await generateVideo({
+ * adapter,
+ * prompt: [
+ * { type: 'text', content: 'Make the cat start playing the piano' },
+ * { type: 'image', source: { type: 'url', value: 'https://example.com/cat.png' } },
+ * ],
+ * });
+ *
+ * // Poll for status
+ * const status = await getVideoJobStatus({ adapter, jobId });
+ * ```
+ */
+export function grokVideo<TModel extends GrokVideoModel>(
+ model: TModel,
+ config?: Omit<GrokVideoConfig, 'apiKey'>,
+): GrokVideoAdapter<TModel> {
+ const apiKey = getGrokApiKeyFromEnv()
+ return createGrokVideo(model, apiKey, config)
+}
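
Reviewer note: the completed-job result above reports usage as billed seconds plus a USD cost. The following is a minimal sketch of that conversion, not the adapter's actual `buildGrokVideoUsage`. The names `sketchVideoUsage`, `StatusLike`, `UsageSketch`, and `USD_TICKS_PER_DOLLAR` are hypothetical. The 1e10 ticks-per-dollar ratio is an assumption inferred from the test fixture in this diff (2,500,000,000 ticks reported as a cost of 0.25), and the token-count fields are left out.

```typescript
// Hypothetical sketch; field names mirror the status response used in this diff.
// Assumption: 1 USD = 1e10 ticks, inferred from the test fixture only.
const USD_TICKS_PER_DOLLAR = 10_000_000_000

interface StatusLike {
  video?: { url?: string; duration?: number }
  usage?: { cost_in_usd_ticks?: number }
}

interface UsageSketch {
  unitsBilled?: number // seconds of generated video
  cost?: number // USD
}

// Returns undefined when the response carries neither a duration nor a cost.
function sketchVideoUsage(response: StatusLike): UsageSketch | undefined {
  const seconds = response.video?.duration
  const ticks = response.usage?.cost_in_usd_ticks
  if (seconds === undefined && ticks === undefined) return undefined
  return {
    ...(seconds !== undefined && { unitsBilled: seconds }),
    ...(ticks !== undefined && { cost: ticks / USD_TICKS_PER_DOLLAR }),
  }
}
```

Returning `undefined` rather than an empty object lets a caller keep the `...(usage && { usage })` spread shown in `getVideoUrl`.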
diff --git a/packages/ai-grok/src/index.ts b/packages/ai-grok/src/index.ts
index 142ab3346..e342645ca 100644
--- a/packages/ai-grok/src/index.ts
+++ b/packages/ai-grok/src/index.ts
@@ -31,6 +31,27 @@ export type {
GrokImageModelProviderOptionsByName,
} from './image/image-provider-options'
+// Video adapter - for video generation (xAI Imagine API)
+export {
+ GrokVideoAdapter,
+ createGrokVideo,
+ grokVideo,
+ type GrokVideoConfig,
+} from './adapters/video'
+export {
+ GROK_VIDEO_DURATIONS,
+ getGrokVideoDurationOptions,
+} from './video/video-provider-options'
+export type {
+ GrokVideoProviderOptions,
+ GrokVideoModelProviderOptionsByName,
+ GrokVideoModelSizeByName,
+ GrokVideoModelDurationByName,
+ GrokVideoAspectRatio,
+ GrokVideoResolution,
+ GrokVideoSize,
+} from './video/video-provider-options'
+
// Speech (TTS) adapter - for text-to-speech
export {
GrokSpeechAdapter,
@@ -68,6 +89,7 @@ export type {
ResolveInputModalities,
GrokChatModel,
GrokImageModel,
+ GrokVideoModel,
GrokTTSModel,
GrokTranscriptionModel,
GrokRealtimeModel,
@@ -75,6 +97,7 @@ export type {
export {
GROK_CHAT_MODELS,
GROK_IMAGE_MODELS,
+ GROK_VIDEO_MODELS,
GROK_TTS_MODELS,
GROK_TRANSCRIPTION_MODELS,
GROK_REALTIME_MODELS,
diff --git a/packages/ai-grok/src/model-meta.ts b/packages/ai-grok/src/model-meta.ts
index 6f9caa6a2..91c0d6105 100644
--- a/packages/ai-grok/src/model-meta.ts
+++ b/packages/ai-grok/src/model-meta.ts
@@ -91,6 +91,47 @@ const GROK_IMAGINE_IMAGE_QUALITY = {
},
} as const satisfies ModelMeta
+// Imagine API video models. Pricing is per second of generated video
+// (output only); generated videos carry an audio track.
+//
+// grok-imagine-video (v1.0) supports both text-to-video (a starting image is
+// optional) and image-to-video. grok-imagine-video-1.5 is image-to-video
+// only: a starting-frame image is required (the text prompt describes the
+// desired motion); text-to-video requests are rejected by the API.
+const GROK_IMAGINE_VIDEO = {
+ name: 'grok-imagine-video',
+ supports: {
+ input: ['text', 'image'],
+ output: ['video', 'audio'],
+ },
+ pricing: {
+ input: {
+ normal: 0,
+ },
+ output: {
+ // per second of video
+ normal: 0.05,
+ },
+ },
+} as const satisfies ModelMeta
+
+const GROK_IMAGINE_VIDEO_1_5 = {
+ name: 'grok-imagine-video-1.5',
+ supports: {
+ input: ['text', 'image'],
+ output: ['video', 'audio'],
+ },
+ pricing: {
+ input: {
+ normal: 0,
+ },
+ output: {
+ // per second of video
+ normal: 0.08,
+ },
+ },
+} as const satisfies ModelMeta
+
const GROK_4_3 = {
name: 'grok-4.3',
context_window: 1_000_000,
@@ -145,6 +186,16 @@ export const GROK_IMAGE_MODELS = [
GROK_IMAGINE_IMAGE_QUALITY.name,
] as const
+/**
+ * Grok Video Generation Models (xAI Imagine API)
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export const GROK_VIDEO_MODELS = [
+ GROK_IMAGINE_VIDEO.name,
+ GROK_IMAGINE_VIDEO_1_5.name,
+] as const
+
// xAI's `/v1/tts` endpoint is endpoint-addressed and does not take a `model`
// parameter. This synthetic identifier satisfies the SDK's `TTSOptions.model`
// contract and provides a stable value for logging and fixture matching.
@@ -198,6 +249,7 @@ export const GROK_REALTIME_MODELS = [
export type GrokChatModel = (typeof GROK_CHAT_MODELS)[number]
export type GrokImageModel = (typeof GROK_IMAGE_MODELS)[number]
+export type GrokVideoModel = (typeof GROK_VIDEO_MODELS)[number]
export type GrokTTSModel = (typeof GROK_TTS_MODELS)[number]
export type GrokTranscriptionModel = (typeof GROK_TRANSCRIPTION_MODELS)[number]
export type GrokRealtimeModel = (typeof GROK_REALTIME_MODELS)[number]
diff --git a/packages/ai-grok/src/video/video-provider-options.ts b/packages/ai-grok/src/video/video-provider-options.ts
new file mode 100644
index 000000000..b84c03f8b
--- /dev/null
+++ b/packages/ai-grok/src/video/video-provider-options.ts
@@ -0,0 +1,241 @@
+/**
+ * Grok Video Generation Provider Options (xAI Imagine API)
+ *
+ * Based on https://docs.x.ai/docs/guides/video-generations
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+
+import type { DurationOptions } from '@tanstack/ai/adapters'
+import type { GrokVideoModel } from '../model-meta'
+
+/**
+ * Aspect ratios accepted by the grok-imagine video models.
+ *
+ * Note: this is a narrower set than the grok-imagine image models — the
+ * video endpoint rejects the phone-screen ratios ('9:19.5', '9:20', …) and
+ * 'auto'.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export type GrokVideoAspectRatio =
+ | '1:1'
+ | '16:9'
+ | '9:16'
+ | '4:3'
+ | '3:4'
+ | '3:2'
+ | '2:3'
+
+/**
+ * Resolution tiers for the grok-imagine video models.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export type GrokVideoResolution = '480p' | '720p' | '1080p'
+
+/**
+ * Size strings for grok-imagine video models. The Imagine API is
+ * aspect-ratio based rather than pixel-size based; like the grok-imagine
+ * image models, the generic `size` option uses an
+ * `aspectRatio_resolution` template ("16:9_720p") — the resolution suffix
+ * is optional ("16:9" uses the API default).
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export type GrokVideoSize =
+ | GrokVideoAspectRatio
+ | `${GrokVideoAspectRatio}_${GrokVideoResolution}`
+
+const GROK_VIDEO_ASPECT_RATIOS: ReadonlyArray<string> = [
+ '1:1',
+ '16:9',
+ '9:16',
+ '4:3',
+ '3:4',
+ '3:2',
+ '2:3',
+]
+
+const GROK_VIDEO_RESOLUTIONS: ReadonlyArray<string> = ['480p', '720p', '1080p']
+
+/**
+ * Video duration limits enforced by the Imagine API (seconds).
+ */
+export const GROK_VIDEO_MIN_DURATION = 1
+export const GROK_VIDEO_MAX_DURATION = 15
+
+/**
+ * Parses a grok video size string into its components.
+ * Format: "aspectRatio" or "aspectRatio_resolution",
+ * e.g. "16:9_720p" → { aspectRatio: "16:9", resolution: "720p" }.
+ * Returns undefined when the string doesn't match the template.
+ */
+export function parseGrokVideoSize(
+ size: string,
+): { aspectRatio: string; resolution?: string } | undefined {
+ const match = size.match(/^([\d.]+:[\d.]+)(?:_(.+))?$/)
+ const [, aspectRatio, resolution] = match ?? []
+ if (aspectRatio === undefined) return undefined
+ return { aspectRatio, ...(resolution !== undefined && { resolution }) }
+}
+
+/**
+ * Validate the `size` template for a given grok video model.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export function validateVideoSize(
+ model: string,
+ size?: string,
+): asserts size is GrokVideoSize | undefined {
+ if (size === undefined) return
+ const parsed = parseGrokVideoSize(size)
+ if (!parsed || !GROK_VIDEO_ASPECT_RATIOS.includes(parsed.aspectRatio)) {
+ throw new Error(
+ `Size "${size}" is not supported by model "${model}". Expected ` +
+ `"aspectRatio" or "aspectRatio_resolution" (e.g. "16:9_720p") with ` +
+ `aspect ratio one of: ${GROK_VIDEO_ASPECT_RATIOS.join(', ')}`,
+ )
+ }
+ if (
+ parsed.resolution !== undefined &&
+ !GROK_VIDEO_RESOLUTIONS.includes(parsed.resolution)
+ ) {
+ throw new Error(
+ `Resolution "${parsed.resolution}" is not supported by model "${model}". ` +
+ `Supported resolutions: ${GROK_VIDEO_RESOLUTIONS.join(', ')}`,
+ )
+ }
+}
+
+/**
+ * Per-model duration type. The Imagine API accepts any integer second in the
+ * 1–15 range, so this is a continuous range expressed as `number` (a literal
+ * union can't represent it). `snapDuration()` coerces a raw seconds value into
+ * the valid range at runtime.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export type GrokVideoModelDurationByName = {
+ 'grok-imagine-video': number
+ 'grok-imagine-video-1.5': number
+}
+
+/**
+ * Runtime duration table backing `availableDurations()` / `snapDuration()`.
+ * Both grok-imagine video models accept the same continuous 1–15 integer-second
+ * range.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export const GROK_VIDEO_DURATIONS: {
+ readonly [TModel in GrokVideoModel]: DurationOptions<
+ GrokVideoModelDurationByName[TModel]
+ >
+} = {
+ 'grok-imagine-video': {
+ kind: 'range',
+ min: GROK_VIDEO_MIN_DURATION,
+ max: GROK_VIDEO_MAX_DURATION,
+ step: 1,
+ unit: 'seconds',
+ },
+ 'grok-imagine-video-1.5': {
+ kind: 'range',
+ min: GROK_VIDEO_MIN_DURATION,
+ max: GROK_VIDEO_MAX_DURATION,
+ step: 1,
+ unit: 'seconds',
+ },
+}
+
+/**
+ * Look up the duration options for a grok video model.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export function getGrokVideoDurationOptions<TModel extends GrokVideoModel>(
+ model: TModel,
+): DurationOptions<GrokVideoModelDurationByName[TModel]> {
+ return GROK_VIDEO_DURATIONS[model]
+}
+
+/**
+ * Provider-specific options for grok video generation. These map directly
+ * onto the Imagine API request body and take precedence over the generic
+ * `size` / `duration` options when both are provided.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export interface GrokVideoProviderOptions {
+ /**
+ * Output aspect ratio.
+ */
+ aspect_ratio?: GrokVideoAspectRatio
+
+ /**
+ * Output resolution tier.
+ */
+ resolution?: GrokVideoResolution
+
+ /**
+ * Video duration in integer seconds (1–15).
+ */
+ duration?: number
+}
+
+/**
+ * Type-only map from model name to its specific provider options.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export type GrokVideoModelProviderOptionsByName = {
+ 'grok-imagine-video': GrokVideoProviderOptions
+ 'grok-imagine-video-1.5': GrokVideoProviderOptions
+}
+
+/**
+ * Type-only map from model name to its supported `size` strings.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export type GrokVideoModelSizeByName = {
+ 'grok-imagine-video': GrokVideoSize
+ 'grok-imagine-video-1.5': GrokVideoSize
+}
+
+/**
+ * Type-only map from model name to the non-text prompt modalities it accepts.
+ * Both models accept an `image` prompt part as the starting frame:
+ * `grok-imagine-video` (v1.0) does text-to-video and image-to-video, while
+ * `grok-imagine-video-1.5` is image-to-video only (the image is required).
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export type GrokVideoModelInputModalitiesByName = {
+ 'grok-imagine-video': readonly ['image']
+ 'grok-imagine-video-1.5': readonly ['image']
+}
+
+/**
+ * Models that only support image-to-video — a starting-frame image is
+ * required and text-to-video is rejected by the Imagine API. Used by the
+ * adapter to fail fast with a clear message instead of surfacing the raw
+ * "Text-to-video is not supported for this model" 400.
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+const GROK_VIDEO_IMAGE_TO_VIDEO_ONLY: ReadonlySet<string> = new Set([
+ 'grok-imagine-video-1.5',
+])
+
+/**
+ * True when the model only supports image-to-video (a starting frame is
+ * required).
+ *
+ * @experimental Video generation is an experimental feature and may change.
+ */
+export function isImageToVideoOnlyModel(model: string): boolean {
+ return GROK_VIDEO_IMAGE_TO_VIDEO_ONLY.has(model)
+}
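
Reviewer note: the size template and the duration range in this file can be illustrated with a small standalone sketch. `splitSizeTemplate` and `snapSeconds` are hypothetical stand-ins, not the package's `parseGrokVideoSize` or `snapToDurationOption`. The snapping rule (round to whole seconds, then clamp to 1 through 15) is an assumption based on the doc comments and test cases in this diff.

```typescript
// Hypothetical stand-ins for illustration; not the package's actual helpers.
type ParsedSize = { aspectRatio: string; resolution?: string }

const MIN_SECONDS = 1
const MAX_SECONDS = 15

// Split "16:9_720p" into an aspect ratio and an optional resolution suffix.
function splitSizeTemplate(size: string): ParsedSize | undefined {
  const match = /^([\d.]+:[\d.]+)(?:_(.+))?$/.exec(size)
  if (!match) return undefined
  const [, aspectRatio, resolution] = match
  if (aspectRatio === undefined) return undefined
  return resolution === undefined ? { aspectRatio } : { aspectRatio, resolution }
}

// Assumed snapping rule: round to whole seconds, then clamp to [1, 15].
function snapSeconds(seconds: number): number {
  return Math.min(MAX_SECONDS, Math.max(MIN_SECONDS, Math.round(seconds)))
}
```

Parsing only checks the shape of the string. Whether the aspect ratio or resolution is actually supported is a separate step, as in `validateVideoSize` above.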
diff --git a/packages/ai-grok/tests/video-adapter.test.ts b/packages/ai-grok/tests/video-adapter.test.ts
new file mode 100644
index 000000000..a6239adbc
--- /dev/null
+++ b/packages/ai-grok/tests/video-adapter.test.ts
@@ -0,0 +1,644 @@
+import { describe, expect, it, vi } from 'vitest'
+import { resolveDebugOption } from '@tanstack/ai/adapter-internals'
+import {
+ GrokVideoAdapter,
+ createGrokVideo,
+ grokVideo,
+} from '../src/adapters/video'
+import {
+ getGrokVideoDurationOptions,
+ parseGrokVideoSize,
+ validateVideoSize,
+} from '../src/video/video-provider-options'
+
+const testLogger = resolveDebugOption(false)
+
+function jsonResponse(body: unknown, status = 200): Response {
+ return new Response(JSON.stringify(body), {
+ status,
+ headers: { 'Content-Type': 'application/json' },
+ })
+}
+
+/**
+ * A `vi.fn` fetch stub with the real fetch parameter list, so call
+ * assertions (`mock.calls[0]`) are typed as `[input, init?]`.
+ */
+function mockFetch(handler: () => Response) {
+ return vi.fn(async (_input: string | URL | Request, _init?: RequestInit) =>
+ handler(),
+ )
+}
+
+/**
+ * Builds an adapter whose HTTP layer is the provided mock — injected via
+ * the adapter config's `fetch` seam, so no globals are touched.
+ */
+function adapterWithFetch(
+ fetchMock: (
+ input: string | URL | Request,
+ init?: RequestInit,
+ ) => Promise<Response>,
+) {
+ return createGrokVideo('grok-imagine-video-1.5', 'test-api-key', {
+ fetch: fetchMock,
+ })
+}
+
+/**
+ * grok-imagine-video-1.5 is image-to-video only, so every request needs a
+ * starting-frame image part. This builds a text + image prompt for the
+ * request-shape / status / error tests.
+ */
+function i2vPrompt(text = 'p') {
+ return [
+ { type: 'text' as const, content: text },
+ {
+ type: 'image' as const,
+ source: { type: 'url' as const, value: 'https://example.com/start.png' },
+ },
+ ]
+}
+
+describe('Grok Video Adapter', () => {
+ describe('factories', () => {
+ it('creates an adapter with the provided API key', () => {
+ const adapter = createGrokVideo('grok-imagine-video-1.5', 'test-api-key')
+ expect(adapter).toBeInstanceOf(GrokVideoAdapter)
+ expect(adapter.kind).toBe('video')
+ expect(adapter.name).toBe('grok')
+ expect(adapter.model).toBe('grok-imagine-video-1.5')
+ })
+
+ it('grokVideo reads XAI_API_KEY from the environment', () => {
+ vi.stubEnv('XAI_API_KEY', 'env-key')
+ try {
+ const adapter = grokVideo('grok-imagine-video-1.5')
+ expect(adapter).toBeInstanceOf(GrokVideoAdapter)
+ } finally {
+ vi.unstubAllEnvs()
+ }
+ })
+ })
+
+ describe('createVideoJob', () => {
+ it('posts a JSON request to the Imagine generations endpoint', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'req-123' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ const result = await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt('A red ball bouncing once'),
+ size: '16:9_720p',
+ duration: 5,
+ logger: testLogger,
+ })
+
+ expect(result).toEqual({
+ jobId: 'req-123',
+ model: 'grok-imagine-video-1.5',
+ })
+ expect(fetchMock).toHaveBeenCalledTimes(1)
+ const [url, init] = fetchMock.mock.calls[0]!
+ expect(url).toBe('https://api.x.ai/v1/videos/generations')
+ expect(init?.method).toBe('POST')
+ expect(init?.headers).toMatchObject({
+ 'Content-Type': 'application/json',
+ Authorization: 'Bearer test-api-key',
+ })
+ expect(JSON.parse(String(init?.body))).toEqual({
+ model: 'grok-imagine-video-1.5',
+ prompt: 'A red ball bouncing once',
+ image: { url: 'https://example.com/start.png' },
+ aspect_ratio: '16:9',
+ resolution: '720p',
+ duration: 5,
+ })
+ })
+
+ it('maps a bare aspect-ratio size without a resolution', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt(),
+ size: '9:16',
+ logger: testLogger,
+ })
+
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ expect(body.aspect_ratio).toBe('9:16')
+ expect(body).not.toHaveProperty('resolution')
+ expect(body).not.toHaveProperty('duration')
+ })
+
+ it('passes modelOptions through', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt('make the waterfall crash down'),
+ modelOptions: {
+ resolution: '1080p',
+ duration: 10,
+ },
+ logger: testLogger,
+ })
+
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ expect(body.prompt).toBe('make the waterfall crash down')
+ expect(body.resolution).toBe('1080p')
+ expect(body.duration).toBe(10)
+ })
+
+ it('maps an image prompt part to the starting frame (image-to-video)', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: [
+ { type: 'text', content: 'make the waterfall crash down' },
+ {
+ type: 'image',
+ source: { type: 'url', value: 'https://example.com/still.png' },
+ },
+ ],
+ duration: 10,
+ logger: testLogger,
+ })
+
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ // Prompt text is sent verbatim; the image becomes the starting frame.
+ expect(body.prompt).toBe('make the waterfall crash down')
+ expect(body.image).toEqual({ url: 'https://example.com/still.png' })
+ expect(body.duration).toBe(10)
+ })
+
+ it('sends a base64 data source as a data URI starting frame', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: [
+ { type: 'text', content: 'pan out slowly' },
+ {
+ type: 'image',
+ source: { type: 'data', mimeType: 'image/png', value: 'AAAA' },
+ },
+ ],
+ logger: testLogger,
+ })
+
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ expect(body.image).toEqual({ url: 'data:image/png;base64,AAAA' })
+ })
+
+ it('rejects more than one image prompt part before calling the API', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(
+ adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: [
+ { type: 'text', content: 'p' },
+ {
+ type: 'image',
+ source: { type: 'url', value: 'https://example.com/a.png' },
+ },
+ {
+ type: 'image',
+ source: { type: 'url', value: 'https://example.com/b.png' },
+ },
+ ],
+ logger: testLogger,
+ }),
+ ).rejects.toThrow(/at most one starting-frame image/)
+ expect(fetchMock).not.toHaveBeenCalled()
+ })
+
+ it('rejects video and audio prompt parts before calling the API', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(
+ adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: [
+ { type: 'text', content: 'p' },
+ {
+ type: 'video',
+ source: { type: 'url', value: 'https://example.com/clip.mp4' },
+ },
+ ],
+ logger: testLogger,
+ }),
+ ).rejects.toThrow(/does not support video prompt parts/)
+ expect(fetchMock).not.toHaveBeenCalled()
+ })
+
+ it('rejects a text-only prompt on 1.5 — image-to-video only', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(
+ adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: 'a red ball bouncing once',
+ logger: testLogger,
+ }),
+ ).rejects.toThrow(/does not support text-to-video/)
+ expect(fetchMock).not.toHaveBeenCalled()
+ })
+
+ it('allows a text-only prompt on grok-imagine-video (text-to-video)', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'tv-1' }))
+ const adapter = createGrokVideo('grok-imagine-video', 'test-api-key', {
+ fetch: fetchMock,
+ })
+
+ const result = await adapter.createVideoJob({
+ model: 'grok-imagine-video',
+ prompt: 'A beautiful sunset over the ocean',
+ size: '16:9_720p',
+ duration: 5,
+ logger: testLogger,
+ })
+
+ expect(result).toEqual({ jobId: 'tv-1', model: 'grok-imagine-video' })
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ expect(body.prompt).toBe('A beautiful sunset over the ocean')
+ expect(body).not.toHaveProperty('image')
+ })
+
+ it('maps a starting frame on grok-imagine-video (image-to-video)', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'iv-1' }))
+ const adapter = createGrokVideo('grok-imagine-video', 'test-api-key', {
+ fetch: fetchMock,
+ })
+
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video',
+ prompt: i2vPrompt('animate this'),
+ logger: testLogger,
+ })
+
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ expect(body.image).toEqual({ url: 'https://example.com/start.png' })
+ expect(body.prompt).toBe('animate this')
+ })
+
+ it('lets modelOptions win over the generic size template', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt(),
+ size: '16:9_480p',
+ modelOptions: { resolution: '1080p' },
+ logger: testLogger,
+ })
+
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ expect(body.aspect_ratio).toBe('16:9')
+ expect(body.resolution).toBe('1080p')
+ })
+
+ it('rejects unsupported sizes before calling the API', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(
+ adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: 'p',
+ // @ts-expect-error invalid size is also rejected at compile time
+ size: '7:5',
+ logger: testLogger,
+ }),
+ ).rejects.toThrow(/Size "7:5" is not supported/)
+ await expect(
+ adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: 'p',
+ // @ts-expect-error invalid resolution is also rejected at compile time
+ size: '16:9_9k',
+ logger: testLogger,
+ }),
+ ).rejects.toThrow(/Resolution "9k" is not supported/)
+ expect(fetchMock).not.toHaveBeenCalled()
+ })
+
+ it('snaps out-of-range and non-integer durations into the valid range', async () => {
+ // [requested, snapped]: clamp to [1, 15], round to whole seconds.
+ const cases: Array<[number, number]> = [
+ [0, 1],
+ [16, 15],
+ [2.5, 3],
+ [7, 7],
+ ]
+ for (const [requested, snapped] of cases) {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt(),
+ duration: requested,
+ logger: testLogger,
+ })
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ expect(body.duration).toBe(snapped)
+ }
+ })
+
+ it('snaps a duration supplied via modelOptions', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt(),
+ modelOptions: { duration: 99 },
+ logger: testLogger,
+ })
+
+ const body = JSON.parse(String(fetchMock.mock.calls[0]![1]?.body))
+ expect(body.duration).toBe(15)
+ })
+
+ it('surfaces API error messages from the xAI error body', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse(
+ {
+ code: 'invalid-argument',
+ error: 'Duration must be between 1 and 15 seconds',
+ },
+ 400,
+ ),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(
+ adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt(),
+ logger: testLogger,
+ }),
+ ).rejects.toThrow(
+ /video generation request failed \(400.*Duration must be between 1 and 15 seconds/,
+ )
+ })
+
+ it('throws when the response carries no request_id', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({}))
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(
+ adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt(),
+ logger: testLogger,
+ }),
+ ).rejects.toThrow(/no request_id/)
+ })
+
+ it('honours a custom baseURL', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ request_id: 'r' }))
+ const adapter = createGrokVideo('grok-imagine-video-1.5', 'k', {
+ baseURL: 'https://proxy.example.com/v1',
+ fetch: fetchMock,
+ })
+
+ await adapter.createVideoJob({
+ model: 'grok-imagine-video-1.5',
+ prompt: i2vPrompt(),
+ logger: testLogger,
+ })
+
+ expect(fetchMock.mock.calls[0]![0]).toBe(
+ 'https://proxy.example.com/v1/videos/generations',
+ )
+ })
+ })
+
+ describe('getVideoStatus', () => {
+ it('maps a pending job with progress', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({ status: 'pending', progress: 18 }),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ const status = await adapter.getVideoStatus('req-123')
+
+ expect(fetchMock.mock.calls[0]![0]).toBe(
+ 'https://api.x.ai/v1/videos/req-123',
+ )
+ expect(status).toEqual({
+ jobId: 'req-123',
+ status: 'pending',
+ progress: 18,
+ })
+ })
+
+ it('maps a done job to completed', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({
+ status: 'done',
+ progress: 100,
+ video: { url: 'https://vidgen.x.ai/video.mp4', duration: 5 },
+ }),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ expect(await adapter.getVideoStatus('req-123')).toEqual({
+ jobId: 'req-123',
+ status: 'completed',
+ progress: 100,
+ })
+ })
+
+ it.each(['failed', 'expired'])('maps %s to failed', async (apiStatus) => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({ status: apiStatus, error: 'moderation' }),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ expect(await adapter.getVideoStatus('req-123')).toEqual({
+ jobId: 'req-123',
+ status: 'failed',
+ error: 'moderation',
+ })
+ })
+
+ it('maps an unknown in-flight status to processing', async () => {
+ const fetchMock = mockFetch(() => jsonResponse({ status: 'generating' }))
+ const adapter = adapterWithFetch(fetchMock)
+
+ expect((await adapter.getVideoStatus('req-123')).status).toBe(
+ 'processing',
+ )
+ })
+
+ it('reports a 404 as a failed job rather than throwing', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse(
+ { code: 'not-found', error: 'Failed to read static file.' },
+ 404,
+ ),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ expect(await adapter.getVideoStatus('missing')).toEqual({
+ jobId: 'missing',
+ status: 'failed',
+ error: 'Job not found',
+ })
+ })
+
+ it('throws on non-404 API errors', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({ error: 'server exploded' }, 500),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(adapter.getVideoStatus('req-123')).rejects.toThrow(
+ /video status request failed \(500/,
+ )
+ })
+ })
+
+ describe('getVideoUrl', () => {
+ it('returns the video URL with billed seconds and exact cost', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({
+ status: 'done',
+ progress: 100,
+ model: 'grok-imagine-video-1.5',
+ video: {
+ url: 'https://vidgen.x.ai/video.mp4',
+ duration: 5,
+ },
+ usage: { cost_in_usd_ticks: 2_500_000_000 },
+ }),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ expect(await adapter.getVideoUrl('req-123')).toEqual({
+ jobId: 'req-123',
+ url: 'https://vidgen.x.ai/video.mp4',
+ usage: {
+ promptTokens: 0,
+ completionTokens: 0,
+ totalTokens: 0,
+ unitsBilled: 5,
+ cost: 0.25,
+ },
+ })
+ })
+
+ it('omits usage when the response carries none', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({
+ status: 'done',
+ video: { url: 'https://vidgen.x.ai/video.mp4' },
+ }),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ expect(await adapter.getVideoUrl('req-123')).toEqual({
+ jobId: 'req-123',
+ url: 'https://vidgen.x.ai/video.mp4',
+ })
+ })
+
+ it('throws when the job is not finished yet', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({ status: 'pending', progress: 40 }),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(adapter.getVideoUrl('req-123')).rejects.toThrow(
+ /not ready for download/,
+ )
+ })
+
+ it('throws with the provider error when the job failed', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({ status: 'failed', error: 'moderation' }),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(adapter.getVideoUrl('req-123')).rejects.toThrow(
+ /Video generation failed: moderation/,
+ )
+ })
+
+ it('throws a not-found error for unknown jobs', async () => {
+ const fetchMock = mockFetch(() =>
+ jsonResponse({ code: 'not-found', error: 'nope' }, 404),
+ )
+ const adapter = adapterWithFetch(fetchMock)
+
+ await expect(adapter.getVideoUrl('missing')).rejects.toThrow(
+ /Video job not found: missing/,
+ )
+ })
+ })
+
+ describe('video provider option helpers', () => {
+ it('parses size templates', () => {
+ expect(parseGrokVideoSize('16:9_720p')).toEqual({
+ aspectRatio: '16:9',
+ resolution: '720p',
+ })
+ expect(parseGrokVideoSize('3:4')).toEqual({ aspectRatio: '3:4' })
+ expect(parseGrokVideoSize('not-a-size')).toBeUndefined()
+ })
+
+ it('validates sizes', () => {
+ expect(() => validateVideoSize('m', '16:9')).not.toThrow()
+ expect(() => validateVideoSize('m', '2:3_1080p')).not.toThrow()
+ expect(() => validateVideoSize('m', undefined)).not.toThrow()
+ expect(() => validateVideoSize('m', '9:19.5')).toThrow(/not supported/)
+ expect(() => validateVideoSize('m', 'auto')).toThrow(/not supported/)
+ expect(() => validateVideoSize('m', '16:9_2k')).toThrow(/Resolution/)
+ })
+
+ it('exposes the 1–15s duration range via getGrokVideoDurationOptions', () => {
+ expect(getGrokVideoDurationOptions('grok-imagine-video')).toEqual({
+ kind: 'range',
+ min: 1,
+ max: 15,
+ step: 1,
+ unit: 'seconds',
+ })
+ expect(getGrokVideoDurationOptions('grok-imagine-video-1.5')).toEqual({
+ kind: 'range',
+ min: 1,
+ max: 15,
+ step: 1,
+ unit: 'seconds',
+ })
+ })
+
+ it('availableDurations / snapDuration coerce raw seconds into range', () => {
+ const adapter = createGrokVideo('grok-imagine-video', 'test-api-key')
+ expect(adapter.availableDurations()).toEqual({
+ kind: 'range',
+ min: 1,
+ max: 15,
+ step: 1,
+ unit: 'seconds',
+ })
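+ expect(adapter.snapDuration(0)).toBe(1) // below min: clamps up to 1
+ expect(adapter.snapDuration(16)).toBe(15) // above max: clamps down to 15
+ expect(adapter.snapDuration(2.5)).toBe(3) // fractional input snaps to a whole second
+ expect(adapter.snapDuration(7)).toBe(7) // in-range integer passes through unchanged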
+ })
+ })
+})
diff --git a/packages/ai/skills/ai-core/media-generation/SKILL.md b/packages/ai/skills/ai-core/media-generation/SKILL.md
index cae40b000..966e6253e 100644
--- a/packages/ai/skills/ai-core/media-generation/SKILL.md
+++ b/packages/ai/skills/ai-core/media-generation/SKILL.md
@@ -3,10 +3,10 @@ name: ai-core/media-generation
description: >
Image, audio, video, speech (TTS), and transcription generation using
activity-specific adapters: generateImage() with openaiImage/geminiImage,
- generateAudio() with geminiAudio/falAudio, generateVideo() with
- openaiVideo/geminiVideo (async polling, per-model typed durations),
- generateSpeech() with openaiSpeech, generateTranscription() with
- openaiTranscription. React hooks: useGenerateImage, useGenerateAudio,
+ generateAudio() with geminiAudio/falAudio, generateVideo() with async
+ polling (openaiVideo/geminiVideo/grokVideo/falVideo, per-model typed
+ durations), generateSpeech() with openaiSpeech, generateTranscription()
+ with openaiTranscription. React hooks: useGenerateImage, useGenerateAudio,
useGenerateSpeech, useTranscription, useGenerateVideo.
TanStack Start server function integration with toServerSentEventsResponse.
type: sub-skill
@@ -454,6 +454,13 @@ const { jobId } = await generateVideo({
// (x-goog-api-key header or ?key= query parameter).
```
+Other video adapters: `openaiVideo('sora-2')` (pixel sizes like `'1280x720'`,
+durations of 4, 8, or 12s, single `input_reference` image prompt part), `grokVideo(...)`
+(`grok-imagine-video` supports text-to-video and image-to-video; `grok-imagine-video-1.5` is
+image-to-video only, so it needs an `image` prompt part as the starting frame and a text-only prompt throws;
+aspect-ratio size template like `'16:9_720p'`, integer durations of 1–15s, reports
+`usage.unitsBilled` in seconds and exact `usage.cost` in USD), and `falVideo(...)` (hosted models, see cost tracking below).
+
Client hook with job tracking:
```tsx
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index daec47674..c1fdd141c 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -555,6 +555,9 @@ importers:
'@tanstack/ai-gemini':
specifier: workspace:*
version: link:../../packages/ai-gemini
+ '@tanstack/ai-grok':
+ specifier: workspace:*
+ version: link:../../packages/ai-grok
'@tanstack/react-router':
specifier: ^1.158.4
version: 1.159.5(react-dom@19.2.3(react@19.2.3))(react@19.2.3)