
Native support for async video generation: Google Veo (:predictLongRunning) and xAI Grok Imagine (/v1/videos/generations) #278

Description

@tombeckenham

Summary

aimock (1.28/1.29) natively mocks OpenAI/Sora's /v1/videos job lifecycle, but two other async video-generation APIs are not modeled and currently require hand-written mock.mount() handlers in the consuming project:

  1. Google Veo (Gemini) — long-running :predictLongRunning + operations polling.
  2. xAI Grok Imagine — JSON POST /v1/videos/generations + GET /v1/videos/{request_id} polling.

Both follow the same "create job → poll until done → read hosted URL + usage" shape aimock already supports for Sora, OpenRouter (#261), and fal (#170) — just with provider-specific request/response envelopes. First-class support would let downstream suites (e.g. TanStack AI) drop their custom mounts and run these providers end-to-end like OpenAI.

1. Google Veo (Gemini)

The @google/genai SDK drives video generation through a long-running operation, not the Imagen :predict endpoint aimock currently mocks:

  • Create: POST /v1beta/models/{model}:predictLongRunning → returns { "name": "<operation-id>" }
  • Poll: the SDK calls operations.getVideosOperation({ operation }), which returns { "name": ..., "done": false } until complete, then:
    {
      "name": "<operation-id>",
      "done": true,
      "response": {
        "generateVideoResponse": {
          "generatedSamples": [{ "video": { "uri": "https://.../video.mp4" } }]
        }
      }
    }
  • The returned video URI is served by the Gemini Files API and needs the API key to download (x-goog-api-key header or key query parameter).
  • Models: veo-3.1-generate-preview, veo-3.0-generate-001, veo-2.0-generate-001, etc.

Today this requires mounting both :predictLongRunning and the operations-polling path manually.
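
To make the operation lifecycle above concrete, here is a minimal sketch of the envelope a mock would return on each poll. Only the JSON shapes come from this issue; the names VeoOperation and veoOperationAt, and the idea of completing after a fixed number of polls, are assumptions for illustration and not aimock or @google/genai API.

```typescript
// Shape of the long-running operation, as described in this issue.
interface VeoOperation {
  name: string;
  done: boolean;
  response?: {
    generateVideoResponse: {
      generatedSamples: { video: { uri: string } }[];
    };
  };
}

// Hypothetical helper: build the envelope for a given poll of a job
// that completes once `pollsUntilDone` polls have been made.
function veoOperationAt(
  name: string,
  pollCount: number,
  pollsUntilDone: number,
  videoUri: string,
): VeoOperation {
  if (pollCount < pollsUntilDone) {
    // Still running: only the name and done flag are returned.
    return { name, done: false };
  }
  return {
    name,
    done: true,
    response: {
      generateVideoResponse: {
        generatedSamples: [{ video: { uri: videoUri } }],
      },
    },
  };
}
```

A handler built on this would return { name } from the create call and then feed an incrementing poll counter into veoOperationAt for each operations poll.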

2. xAI Grok Imagine

xAI's Imagine video API is plain JSON (not the OpenAI SDK surface — the SDK's multipart paths are rejected):

  • Create: POST /v1/videos/generations
    Request body:
    { "model": "grok-imagine-video", "prompt": "...", "image": { "url": "..." }, "aspect_ratio": "16:9", "resolution": "720p", "duration": 5 }
    Response:
    { "request_id": "<id>" }
  • Poll: GET /v1/videos/{request_id}
    {
      "status": "pending" | "done" | "failed" | "expired",
      "progress": 0.5,
      "model": "grok-imagine-video",
      "video": { "url": "https://.../video.mp4", "duration": 5 },
      "usage": { "cost_in_usd_ticks": 250000000 }
    }
    (10^10 ticks = $1, so cost_in_usd_ticks / 1e10 is the USD cost; the 250000000 ticks above are $0.025.)
  • Errors: non-2xx responses return { "code": "...", "error": "..." }.
  • Models: grok-imagine-video (text-to-video + image-to-video), grok-imagine-video-1.5 (image-to-video only).
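
As a rough sketch of how a client or a mock assertion might read the poll response above, the snippet below types the envelope and converts ticks to USD. The field names and the 10^10 ticks-per-dollar rate come from this issue; XaiVideoPoll, ticksToUsd, readXaiResult, and the assumption that video and usage are only present once the job is done are illustrative guesses.

```typescript
type XaiVideoStatus = "pending" | "done" | "failed" | "expired";

// Poll response shape from this issue. Which fields are optional
// in which state is an assumption.
interface XaiVideoPoll {
  status: XaiVideoStatus;
  progress?: number;
  model: string;
  video?: { url: string; duration: number };
  usage?: { cost_in_usd_ticks: number };
}

const TICKS_PER_USD = 1e10; // 10^10 ticks = $1

function ticksToUsd(ticks: number): number {
  return ticks / TICKS_PER_USD;
}

// Hypothetical helper: null means "keep polling"; terminal failures throw.
function readXaiResult(
  poll: XaiVideoPoll,
): { url: string; costUsd: number } | null {
  if (poll.status === "pending") return null;
  if (poll.status !== "done") {
    throw new Error(`xAI video job ended with status "${poll.status}"`);
  }
  if (poll.video === undefined) {
    throw new Error("done response is missing the video object");
  }
  // Treat a missing usage block as zero cost (assumption).
  const ticks = poll.usage?.cost_in_usd_ticks ?? 0;
  return { url: poll.video.url, costUsd: ticksToUsd(ticks) };
}
```

With the example payload above, readXaiResult would report a cost of 0.025 USD for 250000000 ticks.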

Why it matters

Downstream suites can only run OpenAI/Sora through aimock's native /v1/videos pipeline; Veo and Grok fall back to custom mounts or unit-test-only coverage. Native handlers (ideally with progress ramping like the fal queue realism in #170) would unify video-gen e2e across providers.
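
For illustration, progress ramping could be as simple as the sketch below. This is an assumed linear ramp driven by poll count, not a description of how #170 implements it; rampProgress and rampedStatus are hypothetical names, and progress is assumed to be a 0 to 1 fraction as in the xAI example payload.

```typescript
// Assumed linear ramp: progress grows by an equal step per poll
// and is clamped to the range [0, 1].
function rampProgress(pollCount: number, pollsUntilDone: number): number {
  if (pollsUntilDone <= 0) return 1; // nothing to wait for
  const fraction = pollCount / pollsUntilDone;
  return Math.min(1, Math.max(0, fraction));
}

// Hypothetical helper: derive the status reported for a given poll.
function rampedStatus(
  pollCount: number,
  pollsUntilDone: number,
): { status: "pending" | "done"; progress: number } {
  const progress = rampProgress(pollCount, pollsUntilDone);
  return { status: progress >= 1 ? "done" : "pending", progress };
}
```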

Notes

  • Happy to share the reference custom mounts and the xAI/Veo adapter implementations from TanStack AI if useful.
  • Both fit aimock's existing record/replay + job-lifecycle model; the main work is the provider-specific request matchers and response envelopes.
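
To give a sense of the matcher side, here is a sketch that classifies the three routes spelled out in this issue. VideoRoute and classifyVideoRoute are hypothetical names rather than aimock internals. The Veo operations-polling path is left out because it is not spelled out above, and the sketch does not address how the xAI poll route would be told apart from aimock's existing /v1/videos handling.

```typescript
type VideoRoute =
  | { provider: "veo"; kind: "create"; model: string }
  | { provider: "xai"; kind: "create" }
  | { provider: "xai"; kind: "poll"; requestId: string };

// Hypothetical matcher: returns null for anything that is not one of
// the routes listed in this issue.
function classifyVideoRoute(method: string, path: string): VideoRoute | null {
  if (method === "POST") {
    // POST /v1beta/models/{model}:predictLongRunning
    const veo = /^\/v1beta\/models\/([^/:]+):predictLongRunning$/.exec(path);
    const model = veo?.[1];
    if (model !== undefined) {
      return { provider: "veo", kind: "create", model };
    }
    if (path === "/v1/videos/generations") {
      return { provider: "xai", kind: "create" };
    }
    return null;
  }
  if (method === "GET") {
    // GET /v1/videos/{request_id}
    const poll = /^\/v1\/videos\/([^/]+)$/.exec(path);
    const requestId = poll?.[1];
    if (requestId !== undefined) {
      return { provider: "xai", kind: "poll", requestId };
    }
  }
  return null;
}
```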
