Merge branch 'main' into promptless/ai-sdk-custom-endpoints

lavanya-gunreddi · web-flow · commit 158e17d2cbd6 · 2026-06-05T10:06:13.000-04:00
diff --git a/flash/apps/deploy-apps.mdx b/flash/apps/deploy-apps.mdx
@@ -20,6 +20,8 @@ This command performs the following steps:
 3. **Provision**: Creates or updates Serverless endpoints.
 4. **Configure**: Sets up environment variables and service discovery.
 
+When you deploy updates to an existing application, Flash automatically triggers a rolling release if your source code has changed. Flash computes a fingerprint of your source files during build, so code-only changes (without resource configuration changes) still result in updated endpoints.
+
 ### Deployment architecture
 
 Flash deploys your application as multiple independent Serverless endpoints. Each endpoint configuration in your worker files becomes a separate endpoint.
diff --git a/flash/cli/build.mdx b/flash/cli/build.mdx
@@ -53,8 +53,9 @@ Target Python version for worker images (3.10, 3.11, 3.12, or 3.13). Overrides p
 2. **Function discovery**: Finds all `@Endpoint` decorated functions.
 3. **Grouping**: Groups functions by their endpoint configuration.
 4. **Manifest generation**: Creates `.flash/flash_manifest.json` with endpoint definitions.
-5. **Dependency installation**: Installs Python packages for Linux x86_64.
-6. **Packaging**: Bundles everything into `.flash/artifact.tar.gz`.
+5. **Source fingerprinting**: Computes a SHA-256 fingerprint of your source files to detect code changes between deployments.
+6. **Dependency installation**: Installs Python packages for Linux x86_64.
+7. **Packaging**: Bundles everything into `.flash/artifact.tar.gz`.
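
The source fingerprinting step can be pictured with a small sketch. This is not Flash's actual implementation: the helper name `fingerprintSources`, the in-memory file map, and the choice to hash paths together with contents are all assumptions made for illustration. It only shows how a SHA-256 digest over source files changes when any file changes.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: `files` maps a relative path to that file's contents.
// Flash's real fingerprinting logic may select and combine files differently.
function fingerprintSources(files: Map<string, string>): string {
  const hash = createHash("sha256");
  // Sort paths so the digest does not depend on insertion order.
  for (const path of Array.from(files.keys()).sort()) {
    hash.update(path);
    hash.update("\0"); // separator so path and content cannot run together
    hash.update(files.get(path) ?? "");
    hash.update("\0");
  }
  return hash.digest("hex");
}

// Example: editing one file yields a different fingerprint.
const before = fingerprintSources(new Map([["worker.py", "print('v1')"]]));
const after = fingerprintSources(new Map([["worker.py", "print('v2')"]]));
console.log(before !== after);
```

Sorting the paths keeps the fingerprint stable across machines that list files in a different order, which matters when the value is compared between deployments.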
 
 ## Built-in ignore patterns
 
diff --git a/flash/cli/deploy.mdx b/flash/cli/deploy.mdx
@@ -73,6 +73,14 @@ Target Python version for worker images (3.10, 3.11, 3.12, or 3.13). Overrides p
 4. **Provisioning**: Creates or updates Serverless endpoints.
 5. **Configuration**: Sets up environment variables and service discovery.
 
+## Rolling releases for code changes
+
+When you run `flash deploy` on an already-deployed application, Flash compares your current build against the previous deployment to determine what needs updating.
+
+Flash triggers a rolling release when your source code changes, even if your resource configuration stays the same. During the build phase, Flash computes a fingerprint of your source files. If this fingerprint differs from the previous deployment, Flash treats it as a configuration change and initiates a rolling update to your endpoints.
+
+This means you can iterate on your code without modifying resource configurations like GPU types or worker counts. Run `flash deploy` after making code changes, and Flash rolls out the updated code to your endpoints.
+
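The comparison described above can be sketched as follows. This is an illustrative model only: the `BuildState` shape, the `planDeploy` name, and the `"unchanged"` outcome are assumptions, not part of Flash's API. It shows why a code-only change still results in a rolling update.

```typescript
// Hypothetical snapshot of what a deployment records about a build.
interface BuildState {
  sourceFingerprint: string; // e.g. a hex SHA-256 digest of the source files
  resourceConfig: Record<string, unknown>; // e.g. GPU type, worker counts
}

type DeployPlan = "rolling-update" | "unchanged";

// A differing fingerprint is handled the same way as a configuration change.
function planDeploy(previous: BuildState, current: BuildState): DeployPlan {
  const codeChanged = previous.sourceFingerprint !== current.sourceFingerprint;
  // JSON.stringify is key-order sensitive; adequate for a sketch only.
  const configChanged =
    JSON.stringify(previous.resourceConfig) !== JSON.stringify(current.resourceConfig);
  return codeChanged || configChanged ? "rolling-update" : "unchanged";
}
```

With this model, two builds that share the same resource configuration but have different fingerprints produce `"rolling-update"`.
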
 ## Architecture
 
 After deployment, your Flash app runs as independent Serverless endpoints on Runpod:
diff --git a/flash/overview.mdx b/flash/overview.mdx
@@ -60,7 +60,7 @@ Install Flash using `pip` or `uv`:
 pip install runpod-flash
 
 # Or uv
-uv add runpod-flash
+uv tool install runpod-flash
 ```
 
 ### Authentication
diff --git a/public-endpoints/ai-sdk.mdx b/public-endpoints/ai-sdk.mdx
@@ -5,13 +5,13 @@ description: "Use the @runpod/ai-sdk-provider package to integrate Public Endpoi
 tag: "NEW"
 ---
 
-The `@runpod/ai-sdk-provider` package integrates Runpod Public Endpoints with the [Vercel AI SDK](https://ai-sdk.dev/docs/introduction). This gives you a streamlined, type-safe interface for text generation, streaming, and image generation in JavaScript and TypeScript projects.
+The `@runpod/ai-sdk-provider` package integrates Runpod Public Endpoints with the [Vercel AI SDK](https://ai-sdk.dev/docs/introduction). This gives you a streamlined, type-safe interface for text generation, streaming, image generation, and video generation in JavaScript and TypeScript projects.
 
 The Vercel AI SDK is a popular open-source library for building AI applications. By using the Runpod provider, you can access Runpod's Public Endpoints using the same patterns and APIs you'd use with other AI providers like OpenAI or Anthropic.
 
 ## Why use the Vercel AI SDK?
 
-- **Unified interface**: Use the same `generateText`, `streamText`, and `generateImage` functions regardless of which AI provider you're using.
+- **Unified interface**: Use the same `generateText`, `streamText`, `generateImage`, and `experimental_generateVideo` functions regardless of which AI provider you're using.
 - **Type safety**: Full TypeScript support with typed responses and parameters.
 - **Streaming built-in**: First-class support for streaming text responses.
 - **Framework integrations**: Works seamlessly with Next.js, React, Svelte, and other frameworks.
@@ -304,6 +304,91 @@ const { image } = await generateImage({
 | `maxPollAttempts` | Max polling attempts for async generation |
 | `pollIntervalMillis` | Milliseconds between status polls |
 
+## Video generation
+
+Use `experimental_generateVideo` to generate videos from text prompts or images. The Runpod provider supports 15 video models, including Sora, Wan, Seedance, and Kling.
+
+Video generation is asynchronous: the SDK submits a job, polls until the job completes, and then returns the video URL.
+
+### Text-to-video
+
+Generate videos from text prompts:
+
+```typescript
+import { runpod } from "@runpod/ai-sdk-provider";
+import { experimental_generateVideo as generateVideo } from "ai";
+
+const { video } = await generateVideo({
+  model: runpod.video("alibaba/wan-2.6-t2v"),
+  prompt: "A golden retriever running on a sunny beach, cinematic, 4k",
+});
+
+console.log(video.url);
+```
+
+The response includes:
+- `video.url`: URL to the generated video
+- `video.mediaType`: Video MIME type (`video/mp4`)
+
+### Image-to-video
+
+Animate an existing image:
+
+```typescript
+import { runpod } from "@runpod/ai-sdk-provider";
+import { experimental_generateVideo as generateVideo } from "ai";
+
+const { video } = await generateVideo({
+  model: runpod.video("alibaba/wan-2.6-i2v"),
+  prompt: "Animate this scene with gentle camera movement",
+  image: new URL("https://example.com/image.png"),
+});
+
+console.log(video.url);
+```
+
+### Video generation parameters
+
+Set the duration, aspect ratio, and seed with additional parameters:
+
+```typescript
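+import { runpod } from "@runpod/ai-sdk-provider";
+import { experimental_generateVideo as generateVideo } from "ai";
+
+const { video } = await generateVideo({
+  model: runpod.video("alibaba/wan-2.6-t2v"),
+  prompt: "A serene mountain landscape with flowing water",
+  duration: 5, // seconds
+  aspectRatio: "16:9",
+  seed: 42,
+});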
+```
+
+### Video provider options
+
+Pass model-specific parameters using `providerOptions`:
+
+```typescript
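+import { runpod } from "@runpod/ai-sdk-provider";
+import { experimental_generateVideo as generateVideo } from "ai";
+
+const { video } = await generateVideo({
+  model: runpod.video("alibaba/wan-2.6-t2v"),
+  prompt: "A serene mountain landscape with flowing water",
+  duration: 5, // seconds
+  aspectRatio: "16:9",
+  providerOptions: {
+    // Options under the `runpod` key are model-specific.
+    runpod: {
+      negative_prompt: "blurry, low quality",
+      guidance_scale: 7.5,
+    },
+  },
+});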
+```
+
+| Option | Description |
+|--------|-------------|
+| `negative_prompt` | Elements to exclude from the video |
+| `guidance_scale` | How closely to follow the prompt |
+| `num_inference_steps` | Number of inference steps |
+| `style` | Style preset (model-specific) |
+| `maxPollAttempts` | Max polling attempts (default: 120) |
+| `pollIntervalMillis` | Milliseconds between status polls (default: 5000) |
+
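The two polling options bound how long a request waits. With the defaults in the table above, the wait between polls adds up to at most 120 × 5000 ms = 600,000 ms, or 10 minutes. The sketch below illustrates that kind of loop; it is not the provider's actual code, and the `JobStatus` shape, `pollUntilDone`, and `maxWaitMillis` are hypothetical names introduced for this example.

```typescript
// Hypothetical job states; the real API's status values may differ.
type JobStatus =
  | { state: "pending" }
  | { state: "completed"; url: string }
  | { state: "failed"; error: string };

interface PollOptions {
  maxPollAttempts: number; // documented default: 120
  pollIntervalMillis: number; // documented default: 5000
}

const DEFAULT_POLL_OPTIONS: PollOptions = { maxPollAttempts: 120, pollIntervalMillis: 5000 };

// Upper bound on time spent sleeping between polls (excludes request time).
function maxWaitMillis(opts: PollOptions = DEFAULT_POLL_OPTIONS): number {
  return opts.maxPollAttempts * opts.pollIntervalMillis;
}

// Polls `getStatus` until the job completes, fails, or attempts run out.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  opts: PollOptions = DEFAULT_POLL_OPTIONS,
): Promise<string> {
  for (let attempt = 1; attempt <= opts.maxPollAttempts; attempt++) {
    const status = await getStatus();
    if (status.state === "completed") return status.url;
    if (status.state === "failed") throw new Error(status.error);
    await new Promise((resolve) => setTimeout(resolve, opts.pollIntervalMillis));
  }
  throw new Error(`Job still pending after ${opts.maxPollAttempts} attempts`);
}
```

For long videos or slower models, raising `maxPollAttempts` extends this budget without making each poll less frequent.
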
 ## Supported models
 
 ### Text models
@@ -320,7 +405,29 @@ const { image } = await generateImage({
 | `black-forest-labs-flux-1-dev` | [Flux Dev](/public-endpoints/models/flux-dev). High quality, detailed images. |
 | `black-forest-labs-flux-1-schnell` | [Flux Schnell](/public-endpoints/models/flux-schnell). Fast generation, good for prototyping. |
 | `google-nano-banana-edit` | [Nano Banana Edit](/public-endpoints/models/nano-banana-edit). Supports multiple reference images. |
+| `google/nano-banana-2-edit` | [Nano Banana 2 Edit](/public-endpoints/models/nano-banana-2-edit). Image editing with 14 aspect ratios and resolution options (1k/2k/4k). |
 | `bytedance-seedream-4-0-t2i` | [Seedream 4.0](/public-endpoints/models/seedream-4-t2i). Text-to-image with good prompt adherence. |
+| `tongyi-mai/z-image-turbo` | [Z-Image Turbo](/public-endpoints/models/z-image-turbo). Fast 6B parameter model with text-to-image support. |
+
+### Video models
+
+| Model ID | Type | Resolution | Aspect Ratios | Duration |
+|----------|------|------------|---------------|----------|
+| `pruna/p-video` | t2v | 720p, 1080p | 16:9, 9:16 | 5s |
+| `vidu/q3-t2v` | t2v | 720p, 1080p | 16:9, 9:16, 1:1 | 5, 10s |
+| `vidu/q3-i2v` | i2v | 720p, 1080p | 16:9, 9:16, 1:1 | 5, 10s |
+| `kwaivgi/kling-v2.6-std-motion-control` | i2v + video | 720p | 16:9, 9:16, 1:1 | 5, 10s |
+| `kwaivgi/kling-video-o1-r2v` | i2v | 720p | 16:9, 9:16, 1:1 | 3–10s |
+| `kwaivgi/kling-v2.1-i2v-pro` | i2v | 720p | 16:9, 9:16, 1:1 | 5, 10s |
+| `alibaba/wan-2.6-t2v` | t2v | 720p, 1080p | 16:9, 9:16 | 5, 10, 15s |
+| `alibaba/wan-2.6-i2v` | i2v | 720p, 1080p | 16:9, 9:16 | 5, 10, 15s |
+| `alibaba/wan-2.5` | i2v | 480p, 720p, 1080p | 16:9, 9:16 | 5, 10s |
+| `alibaba/wan-2.2-t2v-720-lora` | t2v | 720p | 16:9 | 5, 8s |
+| `alibaba/wan-2.2-i2v-720` | i2v | 720p | 16:9 | 5, 8s |
+| `alibaba/wan-2.1-i2v-720` | i2v | 720p | 16:9 | 5s |
+| `bytedance/seedance-v1.5-pro-i2v` | i2v | 480p, 720p | 21:9, 16:9, 9:16, 1:1, 4:3, 3:4 | 4–12s |
+| `openai/sora-2-pro-i2v` | i2v | 720p, 1080p | 16:9, 9:16, 1:1 | 4, 8, 12s |
+| `openai/sora-2-i2v` | i2v | 720p, 1080p | 16:9, 9:16, 1:1 | 4, 8, 12s |
 
 For a complete list of available models and their parameters, see the [model reference](/public-endpoints/reference).
 
diff --git a/runpodctl/reference/runpodctl-serverless.mdx b/runpodctl/reference/runpodctl-serverless.mdx
@@ -173,6 +173,10 @@ runpodctl serverless update <endpoint-id> --workers-max 5
 New name for the endpoint.
 </ResponseField>
 
+<ResponseField name="--template-id" type="string">
+ID of the template to attach to the endpoint. Use this to swap the template on an existing endpoint without recreating it.
+</ResponseField>
+
 <ResponseField name="--workers-min" type="int">
 New minimum number of workers.
 </ResponseField>
diff --git a/runpodctl/reference/runpodctl-template.mdx b/runpodctl/reference/runpodctl-template.mdx
@@ -194,6 +194,10 @@ New environment variables as a JSON object.
 New README content.
 </ResponseField>
 
+<ResponseField name="--container-disk-in-gb" type="int">
+New container disk size in GB.
+</ResponseField>
+
 ### Delete a template
 
 Delete a template:
diff --git a/sdks/graphql/manage-endpoints.mdx b/sdks/graphql/manage-endpoints.mdx
@@ -25,6 +25,7 @@ Endpoints require the following fields:
 | `gpuIds` | String | GPU tier identifier. Options: `AMPERE_16` (16GB), `AMPERE_24` (24GB), `ADA_24` (24GB Ada), `AMPERE_48` (48GB), `ADA_48_PRO` (48GB Ada Pro), `AMPERE_80` (80GB), `ADA_80_PRO` (80GB Ada Pro). |
 | `name` | String | Endpoint name. |
 | `templateId` | String | ID of the Serverless template to use. |
+| `type` | String | Endpoint type. `QB` for queue-based (default), `LB` for [load balancing](/serverless/load-balancing/overview). |
 
 ## Create an endpoint