TanStack · tombeckenham · Jun 24, 2026 · Jun 24, 2026 · coderabbitai · Jun 23, 2026
diff --git a/.changeset/grok-imagine-video-adapter.md b/.changeset/grok-imagine-video-adapter.md
@@ -0,0 +1,5 @@
+---
+'@tanstack/ai-grok': minor
+---
+
+Add a `grokVideo` adapter for xAI's Imagine video models. `grok-imagine-video` (v1.0) supports text-to-video and image-to-video; `grok-imagine-video-1.5` is image-to-video only. The API rejects a text-only prompt for that model, so the adapter fails fast with an error telling you to add a starting-frame image or use `grok-imagine-video`. Image-to-video starting frames are supplied as an `image` prompt part (public URL or base64 data source), with the text part describing the motion. The adapter follows the experimental `generateVideo()` jobs/polling architecture: `createVideoJob` posts to `/v1/videos/generations`, status polling reads `/v1/videos/{request_id}`, and the completed result carries the hosted video URL plus usage (`unitsBilled` in seconds and the exact `cost` in USD). Sizing uses the same aspect-ratio template as the grok-imagine image models (`size: '16:9_720p'` → `aspect_ratio` / `resolution`), and durations are whole seconds from 1 to 15.
diff --git a/docs/adapters/grok.md b/docs/adapters/grok.md
@@ -2,17 +2,20 @@
 title: Grok (xAI)
 id: grok-adapter
 order: 5
-description: "Use xAI Grok Responses models with TanStack AI — Grok 4.3 and Grok Build 0.1 via @tanstack/ai-grok."
+description: "Use xAI Grok models with TanStack AI — Grok 4.3, Grok Build 0.1, Grok Imagine image generation, and Grok Imagine video generation via @tanstack/ai-grok."
 keywords:
   - tanstack ai
   - grok
   - xai
   - grok 4.3
   - grok build
+  - image generation
+  - video generation
+  - grok imagine
   - adapter
 ---
 
-The Grok text and summarization adapters provide access to xAI's Responses API for `grok-4.3` and `grok-build-0.1`.
+The Grok text and summarization adapters provide access to xAI's Responses API for `grok-4.3` and `grok-build-0.1`, plus Grok Imagine image generation and Grok Imagine video generation.
 
 ## Installation
 
@@ -203,6 +206,67 @@ reachable; use a `data` source for private images. `grok-2-image-1212` is
 text-to-image only — image prompt parts are a compile-time type error and
 throw at runtime.
 
+## Video Generation (Experimental)
+
+Generate short video clips (1–15 seconds, with audio) using the Grok Imagine video models via xAI's asynchronous jobs/polling API.
+
+Available models:
+
+- `grok-imagine-video` (v1.0) — text-to-video and image-to-video, $0.05 per second of video.
+- `grok-imagine-video-1.5` — **image-to-video only**, $0.08 per second of video. A text-only prompt is rejected by the API; the adapter fails fast with a clear error telling you to add a starting-frame image or use `grok-imagine-video`.
+
+Text-to-video with the base `grok-imagine-video` model:
+
+```typescript
+import { generateVideo, getVideoJobStatus } from "@tanstack/ai";
+import { grokVideo } from "@tanstack/ai-grok";
+
+const adapter = grokVideo("grok-imagine-video");
+
+// 1. Create the job
+const { jobId } = await generateVideo({
+  adapter,
+  prompt: "A red panda balancing on a bamboo stalk in the rain",
+  size: "16:9_720p", // "aspectRatio" or "aspectRatio_resolution"
+  duration: 5, // integer seconds, 1–15
+});
+
+// 2. Poll until complete, then read the video URL
+let status = await getVideoJobStatus({ adapter, jobId });
+while (status.status !== "completed" && status.status !== "failed") {
+  await new Promise((r) => setTimeout(r, 5000));
+  status = await getVideoJobStatus({ adapter, jobId });
+}
+
+console.log(status.status === "completed" ? status.url : status.error); // hosted .mp4 URL, or the error if the job failed
+```
+
+For image-to-video (required for `grok-imagine-video-1.5`, optional for `grok-imagine-video`), include an `image` prompt part as the starting frame and describe the desired motion in the text part. URL sources are fetched by xAI's servers (so they must be publicly reachable); use a `data` source for a base64 starting frame:
+
+```typescript
+const { jobId } = await generateVideo({
+  adapter: grokVideo("grok-imagine-video-1.5"),
+  prompt: [
+    {
+      type: "text",
+      content: "Make the waterfall crash down and slowly pan out the camera",
+    },
+    {
+      type: "image",
+      source: { type: "url", value: "https://example.com/waterfall-still.png" },
+    },
+  ],
+  size: "16:9_720p",
+  duration: 10,
+});
+```
+
+Like the Grok Imagine image models, sizing is aspect-ratio based: the `size` option takes an `aspectRatio_resolution` template. Supported aspect ratios are `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, and `2:3`; supported resolutions are `480p`, `720p`, and `1080p` (e.g. `"9:16_1080p"`). The resolution suffix is optional.
+
+When the job completes, the adapter reports usage on the result: `usage.unitsBilled` carries the billed seconds of video and `usage.cost` the exact cost in USD, both as returned by the xAI API.
+
+See [Video Generation](../media/video-generation) for the full jobs/polling flow, streaming mode, and the `useGenerateVideo` hook.
+
 ## Text-to-Speech
 
 Generate speech with Grok TTS:
@@ -298,6 +362,10 @@ Creates a Grok summarization adapter with an explicit API key.
 
 Creates a Grok image generation adapter.
 
+### `grokVideo(model, config?)` / `createGrokVideo(model, apiKey, config?)`
+
+Creates a Grok video generation adapter (experimental) for the Grok Imagine video models (`'grok-imagine-video'`, `'grok-imagine-video-1.5'`).
+
 ### `grokSpeech(model, config?)` / `createGrokSpeech(model, apiKey, config?)`
 
 Creates a Grok text-to-speech adapter.
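
For illustration, the `size` template documented in this file (`aspectRatio_resolution`, with the resolution suffix optional) could be split into `aspect_ratio` and `resolution` roughly as follows. This is a minimal sketch under stated assumptions, not the adapter's actual code: `parseGrokVideoSize` and `ParsedSize` are hypothetical names, and it assumes only the aspect ratios and resolutions listed in the docs are valid.

```typescript
// Hypothetical helper for illustration; not part of @tanstack/ai-grok.
const ASPECT_RATIOS = ['1:1', '16:9', '9:16', '4:3', '3:4', '3:2', '2:3'] as const
const RESOLUTIONS = ['480p', '720p', '1080p'] as const

type AspectRatio = (typeof ASPECT_RATIOS)[number]
type Resolution = (typeof RESOLUTIONS)[number]

interface ParsedSize {
  aspect_ratio: AspectRatio
  resolution?: Resolution
}

function parseGrokVideoSize(size: string): ParsedSize {
  // "16:9_720p" -> ["16:9", "720p"]; "16:9" -> ["16:9"]
  const [ratio, resolution, ...rest] = size.split('_')

  if (rest.length > 0 || !ASPECT_RATIOS.includes(ratio as AspectRatio)) {
    throw new Error(`Unsupported size "${size}"`)
  }
  if (resolution === undefined) {
    // The resolution suffix is optional.
    return { aspect_ratio: ratio as AspectRatio }
  }
  if (!RESOLUTIONS.includes(resolution as Resolution)) {
    throw new Error(`Unsupported resolution in size "${size}"`)
  }
  return {
    aspect_ratio: ratio as AspectRatio,
    resolution: resolution as Resolution,
  }
}
```

With this sketch, `parseGrokVideoSize('9:16_1080p')` yields `{ aspect_ratio: '9:16', resolution: '1080p' }`, while an unlisted value such as `'21:9'` throws instead of being forwarded to the API.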

diff --git a/docs/config.json b/docs/config.json
@@ -262,7 +262,7 @@
           "label": "Video Generation",
           "to": "media/video-generation",
           "addedAt": "2026-04-15",
-          "updatedAt": "2026-06-08"
+          "updatedAt": "2026-06-24"
         },
         {
           "label": "Generation Hooks",
@@ -434,7 +434,8 @@
         {
           "label": "Grok (xAI)",
           "to": "adapters/grok",
-          "addedAt": "2026-04-15"
+          "addedAt": "2026-04-15",
+          "updatedAt": "2026-06-24"
         },
         {
           "label": "Groq",

diff --git a/docs/media/video-generation.md b/docs/media/video-generation.md
@@ -2,13 +2,15 @@
 title: Video Generation
 id: video-generation
 order: 6
-description: "Generate video from text prompts with OpenAI Sora or Google Veo using TanStack AI's experimental generateVideo() jobs/polling API."
+description: "Generate video from text prompts with OpenAI Sora, Google Veo, xAI Grok Imagine, or fal.ai using TanStack AI's experimental generateVideo() jobs/polling API."
 keywords:
   - tanstack ai
   - video generation
   - sora
   - veo
   - gemini
+  - grok imagine
+  - fal
   - generateVideo
   - jobs api
   - experimental
@@ -39,6 +41,8 @@ TanStack AI provides experimental support for video generation through dedicated
 Currently supported:
 - **OpenAI**: Sora-2 and Sora-2-Pro models (when available)
 - **Google Gemini**: Veo 3.1, Veo 3, and Veo 2 models (via the long-running operations API)
+- **Grok (xAI)**: grok-imagine-video (text-to-video + image-to-video) and grok-imagine-video-1.5 (image-to-video only) models
+- **fal.ai**: MiniMax, Luma, Kling, Hunyuan, and other hosted video models
 
 ## Basic Usage
 
@@ -552,6 +556,59 @@ Adapters that haven't declared a per-model duration map keep the plain
 > Files API and requires your API key to download (send it as an
 > `x-goog-api-key` header or `key` query parameter).
 
+### Grok (xAI Imagine) Model Options
+
+Based on the [xAI video generation API](https://docs.x.ai/docs/guides/video-generations). Two models are available: `grok-imagine-video` (v1.0) supports **text-to-video and image-to-video**, while `grok-imagine-video-1.5` is **image-to-video only** (a text-only prompt is rejected by the API; the adapter throws a clear error pointing you at `grok-imagine-video`). Both are aspect-ratio sized — the generic `size` option takes an `aspectRatio_resolution` template (like the Grok Imagine image models), and clips can be 1–15 seconds long.
+
+Text-to-video with the base model:
+
+```typescript
+import { generateVideo } from '@tanstack/ai'
+import { grokVideo } from '@tanstack/ai-grok'
+
+const { jobId } = await generateVideo({
+  adapter: grokVideo('grok-imagine-video'),
+  prompt: 'A beautiful sunset over the ocean',
+  size: '16:9_720p',  // aspect ratio: '1:1' | '16:9' | '9:16' | '4:3' | '3:4' | '3:2' | '2:3'
+                      // resolution (optional suffix): '480p' | '720p' | '1080p'
+  duration: 5,        // integer seconds, 1-15
+  modelOptions: {
+    aspect_ratio: '16:9',  // Alternative way to specify the aspect ratio
+    resolution: '720p',    // Alternative way to specify the resolution
+    duration: 5,           // Alternative way to specify the duration
+  },
+})
+```
+
+Image-to-video (required for `grok-imagine-video-1.5`) — include an `image` prompt part as the starting frame. URL sources are fetched by xAI's servers (so they must be publicly reachable); use a `data` source for a base64 starting frame:
+
+```typescript
+const { jobId } = await generateVideo({
+  adapter: grokVideo('grok-imagine-video-1.5'),
+  prompt: [
+    { type: 'text', content: 'Slowly pan out as the waves roll in' },
+    {
+      type: 'image',
+      source: { type: 'url', value: 'https://example.com/still.png' },
+    },
+  ],
+  size: '16:9_720p',
+  duration: 5,
+})
+```
+
+Both models accept any whole number of seconds in the **1–15** range. A raw `duration` is coerced into that range rather than rejected: values are clamped to `[1, 15]` and rounded to the nearest second. Inspect the range or pre-snap a duration the same way as with Veo:
+
+```typescript
+const adapter = grokVideo('grok-imagine-video')
+
+adapter.availableDurations() // { kind: 'range', min: 1, max: 15, step: 1, unit: 'seconds' }
+adapter.snapDuration(2.5) // 3 — clamped/rounded into range
+adapter.snapDuration(99) // 15
+```
+
+Generated clips include an audio track. When the job completes, the adapter reports `usage.unitsBilled` (billed seconds of video) and `usage.cost` (exact USD cost as returned by the API) on the result.
+
 ## Response Types
 
 > **Note:** The interfaces below are the underlying adapter-level types. The `getVideoJobStatus()` helper returns a single merged object, `{ status, progress?, url?, error?, usage? }` — it does not return `jobId` or `expiresAt`.
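
For illustration, the clamp-and-round behavior and the per-second prices documented in this PR can be combined into a rough pre-flight cost estimate. This is a sketch under stated assumptions, not the adapter's source: `snapDuration` here is a standalone re-implementation of the documented behavior, `estimateCostUsd` and `PRICE_PER_SECOND_USD` are hypothetical names, and the prices are the ones quoted in `docs/adapters/grok.md`.

```typescript
// Hypothetical helpers mirroring the documented behavior; not the adapter's code.
const PRICE_PER_SECOND_USD = {
  'grok-imagine-video': 0.05,
  'grok-imagine-video-1.5': 0.08,
} as const

type GrokVideoModel = keyof typeof PRICE_PER_SECOND_USD

// Round to the nearest whole second, then clamp into [min, max].
function snapDuration(raw: number, min = 1, max = 15): number {
  return Math.min(max, Math.max(min, Math.round(raw)))
}

// Estimate only: the job result's usage.cost is the authoritative figure.
function estimateCostUsd(model: GrokVideoModel, rawDuration: number): number {
  return snapDuration(rawDuration) * PRICE_PER_SECOND_USD[model]
}

console.log(snapDuration(2.5)) // 3
console.log(snapDuration(99)) // 15
```

An estimate like this is only useful for showing an expected price before the job is created; once the job completes, prefer `usage.cost` as reported by the API.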

diff --git a/examples/ts-react-media/.env.example b/examples/ts-react-media/.env.example
@@ -5,3 +5,7 @@ FAL_KEY=
 
 # Get a Google API key at https://aistudio.google.com/apikey
 GOOGLE_API_KEY=
+
+# Get an xAI API key at https://console.x.ai — used by the "xAI Direct"
+# Grok Imagine video models (the other Grok Imagine entries go through fal).
+XAI_API_KEY=
diff --git a/examples/ts-react-media/package.json b/examples/ts-react-media/package.json
@@ -14,6 +14,7 @@
     "@tanstack/ai": "workspace:*",
     "@tanstack/ai-fal": "workspace:*",
     "@tanstack/ai-gemini": "workspace:*",
+    "@tanstack/ai-grok": "workspace:*",
     "@tanstack/react-router": "^1.158.4",
     "@tanstack/react-start": "^1.159.0",
     "@tanstack/router-plugin": "^1.158.4",

diff --git a/examples/ts-react-media/src/components/ImageGenerator.tsx b/examples/ts-react-media/src/components/ImageGenerator.tsx
@@ -27,6 +27,7 @@ function getImageSrc(image: { url?: string; b64Json?: string }): string {
 
 const falModels = IMAGE_MODELS.filter((m) => m.provider === 'fal')
 const geminiModels = IMAGE_MODELS.filter((m) => m.provider === 'gemini')
+const xaiModels = IMAGE_MODELS.filter((m) => m.provider === 'xai')
 
 export default function ImageGenerator({
   onImageGenerated,
@@ -161,6 +162,13 @@ export default function ImageGenerator({
                 </option>
               ))}
             </optgroup>
+            <optgroup label="xAI (direct)">
+              {xaiModels.map((model) => (
+                <option key={model.id} value={model.id}>
+                  {model.name}
+                </option>
+              ))}
+            </optgroup>
           </select>
           {currentModel && selectedModel !== 'all' && (
             <p className="mt-1 text-xs text-gray-500">

diff --git a/examples/ts-react-media/src/components/VideoGenerator.tsx b/examples/ts-react-media/src/components/VideoGenerator.tsx
@@ -21,7 +21,7 @@ type JobState =
       model: string
       progress?: number | undefined
     }
-  | { status: 'completed'; url: string; unitsBilled?: number }
+  | { status: 'completed'; url: string; unitsBilled?: number; cost?: number }
   | { status: 'error'; message: string }
 
 interface VideoGeneratorProps {
@@ -42,6 +42,8 @@ export default function VideoGenerator({
   const pollingRefs = useRef<Map<string, NodeJS.Timeout>>(new Map())
 
   const filteredModels = VIDEO_MODELS.filter((m) => m.mode === mode)
+  const falModels = filteredModels.filter((m) => m.provider === 'fal')
+  const xaiModels = filteredModels.filter((m) => m.provider === 'xai')
 
   useEffect(() => {
     if (initialImageUrl) {
@@ -97,6 +99,7 @@ export default function VideoGenerator({
             status: 'completed',
             url: url,
             unitsBilled: urlResult.usage?.unitsBilled,
+            cost: urlResult.usage?.cost,
           },
         }))
       } else if (status.status === 'processing') {
@@ -164,8 +167,11 @@ export default function VideoGenerator({
         },
       }))
 
+      // Poll keyed by the UI model id, not result.model: the direct-xAI
+      // entries share one adapter model ('grok-imagine-video-1.5'),
+      // so result.model wouldn't identify the card (or the adapter) uniquely.
       const interval = setInterval(() => {
-        pollStatus(result.jobId, result.model)
+        pollStatus(result.jobId, modelId)
       }, 4000)
       pollingRefs.current.set(modelId, interval)
     } catch (err) {
@@ -249,11 +255,20 @@ export default function VideoGenerator({
             className="w-full px-4 py-3 bg-gray-800 border border-gray-700 rounded-lg text-white focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent disabled:opacity-50"
           >
             <option value="all">All Models</option>
-            {filteredModels.map((model) => (
-              <option key={model.id} value={model.id}>
-                {model.name}
-              </option>
-            ))}
+            <optgroup label="fal.ai">
+              {falModels.map((model) => (
+                <option key={model.id} value={model.id}>
+                  {model.name}
+                </option>
+              ))}
+            </optgroup>
+            <optgroup label="xAI (direct)">
+              {xaiModels.map((model) => (
+                <option key={model.id} value={model.id}>
+                  {model.name}
+                </option>
+              ))}
+            </optgroup>
           </select>
         </div>
 
@@ -406,12 +421,21 @@ export default function VideoGenerator({
                         className="w-full h-auto"
                       />
                     </div>
-                    {state.unitsBilled != null && (
+                    {state.cost != null ? (
                       <p className="text-xs text-gray-500">
-                        Billed {state.unitsBilled} fal unit
-                        {state.unitsBilled === 1 ? '' : 's'} — multiply by the
-                        endpoint unit price for USD cost
+                        Billed ${state.cost.toFixed(3)}
+                        {state.unitsBilled != null
+                          ? ` for ${state.unitsBilled} second${state.unitsBilled === 1 ? '' : 's'} of video`
+                          : ''}
                       </p>
+                    ) : (
+                      state.unitsBilled != null && (
+                        <p className="text-xs text-gray-500">
+                          Billed {state.unitsBilled} fal unit
+                          {state.unitsBilled === 1 ? '' : 's'} — multiply by the
+                          endpoint unit price for USD cost
+                        </p>
+                      )
                     )}
                   </>
                 )}
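
For illustration, the nested ternary in the hunk above could be expressed as a pure function, which makes the precedence (exact cost first, fal units as the fallback) easier to unit test. This is a sketch only: `formatBilling` and `BillingUsage` are hypothetical names that do not exist in the example app, and the fal fallback text is shortened here.

```typescript
// Hypothetical refactor of the billing label logic; not code from this PR.
interface BillingUsage {
  cost?: number
  unitsBilled?: number
}

function formatBilling({ cost, unitsBilled }: BillingUsage): string | null {
  if (cost != null) {
    // Exact USD cost is available (direct xAI): show it, plus seconds if known.
    const seconds =
      unitsBilled != null
        ? ` for ${unitsBilled} second${unitsBilled === 1 ? '' : 's'} of video`
        : ''
    return `Billed $${cost.toFixed(3)}${seconds}`
  }
  if (unitsBilled != null) {
    // No cost reported (fal): fall back to the raw unit count.
    return `Billed ${unitsBilled} fal unit${unitsBilled === 1 ? '' : 's'}`
  }
  return null // nothing to display
}

console.log(formatBilling({ cost: 0.4, unitsBilled: 5 }))
// "Billed $0.400 for 5 seconds of video"
```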