Expert reference for async video and image generation APIs. Covers authentication, submit/poll/retrieve patterns, asset management, error handling, and ComfyUI integration.
Built from real-world production experience implementing these providers in comfyui-seedance.
| Provider | Use case |
|---|---|
| AnyFast | Seedance 2.0 video generation — text-to-video, image-to-video, multimodal refs. Full asset management (upload → Active poll → asset:// URI) |
| fal.ai | Queue-based video generation. Auth header uses Key not Bearer. |
| Replicate | Prediction API — submit + poll + output array |
| NVIDIA NIM | LLM inference (OpenAI-compatible). Stable Video Diffusion. Cosmos world models. Free tier available. |
| OpenAI gpt-image-1 | Image inpainting via mask. Multipart/form-data. Base64 output. Org verification required. |
SKILL_VIDEO_API_EXPERT.md is a structured reference document organized into 12 sections:
- General async pattern — Submit → Poll → Retrieve. BFS walker for schema-drifting JSON.
- AnyFast — Seedance 2.0 generation endpoint, asset lifecycle (CreateAssetGroup → CreateAsset → ListAssets poll → `asset://`), anti-error patterns.
- fal.ai — Queue API, Seedance app IDs, auth gotcha (`Key` not `Bearer`).
- Replicate — Prediction API, cold start, output array.
- NVIDIA NIM — OpenAI-compatible LLM endpoint, Stable Video Diffusion, Cosmos models, DeepSeek reasoning activation, free tier.
- OpenAI gpt-image-1 — Inpainting API, mask convention (ComfyUI vs GPT), supported sizes, pricing, org verification.
- Robust implementation patterns — Retry with backoff, ID extraction, polling with deadline, asset lifecycle diagram.
- ComfyUI node integration — Custom types, optional inputs, OUTPUT_NODE, IS_CHANGED, tensor↔PIL↔base64, VIDEO input.
- Debugging checklist — Submit errors, asset-not-found, poll never ends, error message extraction.
- Provider comparison table — Auth, format, response mode, status values, output paths, rate limits.
- Master gotchas list — 20 non-obvious issues confirmed in production.
- Project structure — comfyui-seedance function map and data flow.
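
The "BFS walker for schema-drifting JSON" named in the first item can be illustrated with a short sketch. This is not the code from comfyui-seedance; the function name `bfs_find` and the candidate key names in the usage lines are assumptions made for illustration. The idea is to search a response breadth-first for the first non-empty value under any of several candidate keys, so that a provider moving `id` from the top level into a nested object does not break ID extraction.

```python
from collections import deque
from typing import Any, Iterable, Optional


def bfs_find(payload: Any, keys: Iterable[str]) -> Optional[Any]:
    """Return the shallowest non-empty value stored under any of `keys`.

    Hypothetical helper: walks dicts and lists breadth-first, so a match
    near the top of the response wins over a deeply nested one.
    """
    wanted = set(keys)
    queue = deque([payload])
    while queue:
        node = queue.popleft()
        if isinstance(node, dict):
            # Check this level's keys before descending any further.
            for key, value in node.items():
                if key in wanted and value not in (None, "", [], {}):
                    return value
            queue.extend(node.values())
        elif isinstance(node, list):
            queue.extend(node)
    return None


# Two response shapes that might come back from the same endpoint over time.
print(bfs_find({"id": "abc"}, ["id", "task_id"]))
print(bfs_find({"data": {"task": {"task_id": "abc"}}}, ["id", "task_id"]))
```

Breadth-first order is a deliberate choice here: a top-level `id` is more likely to be the job ID than an `id` buried inside an unrelated nested object.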

Add the file as context at the start of a session:

    claude --context SKILL_VIDEO_API_EXPERT.md

Or reference it in your CLAUDE.md:

    Read SKILL_VIDEO_API_EXPERT.md for reference on video/image generation API patterns.

Open SKILL_VIDEO_API_EXPERT.md directly. Each section is self-contained. Use the debugging checklist and gotchas list when troubleshooting API integrations.

    # AnyFast, Replicate, NVIDIA NIM, OpenAI
    "Authorization": f"Bearer {api_key}"
    # fal.ai — uses Key, not Bearer
    "Authorization": f"Key {api_key}"
    # NVIDIA NIM — key prefix is nvapi-
    "Authorization": f"Bearer nvapi-{key}"

AnyFast asset lifecycle:

    POST /volc/asset/CreateAssetGroup → group_id
    POST /volc/asset/CreateAsset → asset_id
    POST /volc/asset/ListAssets → poll until Status == "Active" (timeout: 300s)
    Use in generation: "asset://asset-id" (lowercase)
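
The polling step of the asset lifecycle can be sketched as a deadline loop. The sketch below makes no HTTP calls: `list_assets` is a caller-supplied function standing in for the ListAssets request, and the `"Id"` field name is an assumption (only `Status == "Active"` and the 300 s timeout come from the steps above). The function name and the 3 s polling interval are also illustrative.

```python
import time
from typing import Any, Callable, Dict, List


def wait_until_active(
    list_assets: Callable[[], List[Dict[str, Any]]],
    asset_id: str,
    timeout_s: float = 300.0,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the asset is Active, then return its asset:// URI.

    `list_assets` is a hypothetical callable wrapping the ListAssets
    request; it returns the items of one response as a list of dicts.
    """
    deadline = clock() + timeout_s
    while True:
        for item in list_assets():
            # "Id" is an assumed field name; "Status" is from the lifecycle.
            if item.get("Id") == asset_id and item.get("Status") == "Active":
                return f"asset://{asset_id}"  # scheme must stay lowercase
        if clock() >= deadline:
            raise TimeoutError(
                f"asset {asset_id} not Active after {timeout_s:.0f}s"
            )
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; production callers can leave both at their defaults.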

| Provider | Pending | Running | Done | Failed |
|---|---|---|---|---|
| AnyFast | QUEUING | PROCESSING | SUCCESS | FAILED |
| fal.ai | IN_QUEUE | IN_PROGRESS | COMPLETED | — |
| Replicate | starting | processing | succeeded | failed |

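
One way to use the status table is to normalize every provider's raw status into a single internal state before the polling loop inspects it. The sketch below is an illustration under assumptions: the `JobState` enum, the provider identifiers (`"anyfast"`, `"fal"`, `"replicate"`), and the function name are hypothetical; only the raw status strings are taken from the table.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


# Raw status strings copied from the table; provider keys are illustrative.
_STATUS_MAP = {
    "anyfast": {
        "QUEUING": JobState.PENDING,
        "PROCESSING": JobState.RUNNING,
        "SUCCESS": JobState.DONE,
        "FAILED": JobState.FAILED,
    },
    "fal": {
        "IN_QUEUE": JobState.PENDING,
        "IN_PROGRESS": JobState.RUNNING,
        "COMPLETED": JobState.DONE,
        # The table lists no failure status for fal.ai.
    },
    "replicate": {
        "starting": JobState.PENDING,
        "processing": JobState.RUNNING,
        "succeeded": JobState.DONE,
        "failed": JobState.FAILED,
    },
}


def normalize_status(provider: str, raw: str) -> JobState:
    """Map a provider-specific status string to a JobState."""
    try:
        return _STATUS_MAP[provider][raw]
    except KeyError:
        # Fail loudly on unknown values instead of polling forever.
        raise ValueError(
            f"unknown status {raw!r} for provider {provider!r}"
        ) from None
```

Raising on an unrecognized status is a design choice: a status the table does not cover should surface as an error rather than be treated as "still running".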

    ComfyUI MASK: white (1.0) = area to edit
    GPT mask PNG: transparent (alpha=0) = area to edit
    → Where ComfyUI mask > 0.5 → set alpha = 0 in RGBA PNG

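
The mask inversion can be shown with a dependency-free sketch that operates on plain nested lists. A real node would more likely do this with tensors or PIL images; that code is not shown here, and the function name `apply_edit_mask` is an assumption. Only the rule itself (mask > 0.5 means alpha = 0) comes from the convention above.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def apply_edit_mask(
    pixels: Sequence[Sequence[RGB]],
    mask: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> List[List[RGBA]]:
    """Build RGBA rows where masked pixels become fully transparent.

    `mask` follows the ComfyUI convention (1.0 = edit here); the output
    follows the GPT convention (alpha 0 = edit here, alpha 255 = keep).
    """
    if len(pixels) != len(mask):
        raise ValueError("pixels and mask must have the same height")
    out: List[List[RGBA]] = []
    for pixel_row, mask_row in zip(pixels, mask):
        if len(pixel_row) != len(mask_row):
            raise ValueError("pixels and mask must have the same width")
        out.append([
            (r, g, b, 0 if m > threshold else 255)
            for (r, g, b), m in zip(pixel_row, mask_row)
        ])
    return out


# One row, two pixels: the first is marked for editing, the second is kept.
print(apply_edit_mask([[(10, 20, 30), (40, 50, 60)]], [[1.0, 0.0]]))
```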

- AnyFast `GroupType` in ListAssets — groups have no type. `GroupType: "AIGC"` always returns `Items: []`. Remove it.
- fal.ai auth is `Key`, not `Bearer` — every other provider uses Bearer.
- AnyFast `asset://` is lowercase — `Asset://` may be rejected by the generation endpoint.
- AnyFast `AssetType` in multipart — omitting it defaults to `"Image"` silently, even for video/audio.
- gpt-image-1 only accepts 3 sizes — `1024x1024`, `1536x1024`, `1024x1536`. No other dimensions accepted.

Full list of 20 gotchas in SKILL_VIDEO_API_EXPERT.md §11.
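
Several of these gotchas can be caught before a request leaves the client. The guards below are a minimal sketch, not code from comfyui-seedance: the function names and the `"fal"` provider identifier are assumptions, while the auth schemes, the lowercase `asset://` scheme, and the three accepted sizes come from the list above.

```python
ALLOWED_SIZES = ("1024x1024", "1536x1024", "1024x1536")


def auth_header(provider: str, api_key: str) -> dict:
    """Build the Authorization header; fal.ai uses Key, the rest Bearer."""
    scheme = "Key" if provider == "fal" else "Bearer"
    return {"Authorization": f"{scheme} {api_key}"}


def normalize_asset_uri(uri: str) -> str:
    """Force the scheme to lowercase asset:// and leave the ID untouched."""
    scheme, sep, rest = uri.partition("://")
    if not sep or scheme.lower() != "asset" or not rest:
        raise ValueError(f"not an asset URI: {uri!r}")
    return f"asset://{rest}"


def check_image_size(size: str) -> str:
    """Reject any size gpt-image-1 is not listed as accepting."""
    if size not in ALLOWED_SIZES:
        raise ValueError(
            f"unsupported size {size!r}; expected one of {ALLOWED_SIZES}"
        )
    return size


print(auth_header("fal", "my-key"))
print(normalize_asset_uri("Asset://asset-123"))
```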
- comfyui-seedance — ComfyUI custom node for Seedance 2.0 via AnyFast and fal.ai
- comfyui-inpaint-cropstitch-nb2 — Crop/stitch inpainting nodes for Nano Banana 2
Apache 2.0