Commit 158e17d

Merge branch 'main' into promptless/ai-sdk-custom-endpoints
2 parents ce1389b + d3cb8ab commit 158e17d

8 files changed

Lines changed: 132 additions & 5 deletions

flash/apps/deploy-apps.mdx

Lines changed: 2 additions & 0 deletions
@@ -20,6 +20,8 @@ This command performs the following steps:
 3. **Provision**: Creates or updates Serverless endpoints.
 4. **Configure**: Sets up environment variables and service discovery.

+When you deploy updates to an existing application, Flash automatically triggers a rolling release if your source code has changed. Flash computes a fingerprint of your source files during build, so code-only changes (without resource configuration changes) still result in updated endpoints.
+
 ### Deployment architecture

 Flash deploys your application as multiple independent Serverless endpoints. Each endpoint configuration in your worker files becomes a separate endpoint.

flash/cli/build.mdx

Lines changed: 3 additions & 2 deletions
@@ -53,8 +53,9 @@ Target Python version for worker images (3.10, 3.11, 3.12, or 3.13). Overrides p
 2. **Function discovery**: Finds all `@Endpoint` decorated functions.
 3. **Grouping**: Groups functions by their endpoint configuration.
 4. **Manifest generation**: Creates `.flash/flash_manifest.json` with endpoint definitions.
-5. **Dependency installation**: Installs Python packages for Linux x86_64.
-6. **Packaging**: Bundles everything into `.flash/artifact.tar.gz`.
+5. **Source fingerprinting**: Computes a SHA-256 fingerprint of your source files to detect code changes between deployments.
+6. **Dependency installation**: Installs Python packages for Linux x86_64.
+7. **Packaging**: Bundles everything into `.flash/artifact.tar.gz`.

 ## Built-in ignore patterns
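
Review note on the new source fingerprinting step: the diff states that a SHA-256 fingerprint is computed over the source files, but not how. The following is a minimal TypeScript sketch of one way to produce a deterministic digest over a set of files. The function name `fingerprintSources`, the in-memory `Map` input, and the path/length framing are all assumptions made for illustration; this is not Flash's implementation (Flash itself is a Python tool).

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper, for illustration only (not Flash's actual code).
// Input: relative path -> file contents. Output: hex-encoded SHA-256 digest.
function fingerprintSources(files: Map<string, string>): string {
  const hash = createHash("sha256");
  // Sort the paths so the digest does not depend on insertion or directory order.
  const paths = Array.from(files.keys()).sort();
  for (const path of paths) {
    const content = files.get(path) ?? "";
    // Frame each entry with its path and byte length so that renames and
    // shifted file boundaries change the digest.
    hash.update(`${path}\0${Buffer.byteLength(content)}\0`);
    hash.update(content);
  }
  return hash.digest("hex");
}

// Usage: the same files in a different order give the same fingerprint.
const first = fingerprintSources(new Map([["worker.py", "print('a')"], ["util.py", "x = 1"]]));
const second = fingerprintSources(new Map([["util.py", "x = 1"], ["worker.py", "print('a')"]]));
console.log(first === second);
```

Sorting before hashing is the important design choice here: without it, two builds of identical sources could yield different fingerprints and trigger a needless release.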

flash/cli/deploy.mdx

Lines changed: 8 additions & 0 deletions
@@ -73,6 +73,14 @@ Target Python version for worker images (3.10, 3.11, 3.12, or 3.13). Overrides p
 4. **Provisioning**: Creates or updates Serverless endpoints.
 5. **Configuration**: Sets up environment variables and service discovery.

+## Rolling releases for code changes
+
+When you run `flash deploy` on an already-deployed application, Flash compares your current build against the previous deployment to determine what needs updating.
+
+Flash triggers a rolling release when your source code changes, even if your resource configuration stays the same. During the build phase, Flash computes a fingerprint of your source files. If this fingerprint differs from the previous deployment, Flash treats it as a configuration change and initiates a rolling update to your endpoints.
+
+This means you can iterate on your code without modifying resource configurations like GPU types or worker counts. Run `flash deploy` after making code changes, and Flash rolls out the updated code to your endpoints.
+
 ## Architecture

 After deployment, your Flash app runs as independent Serverless endpoints on Runpod:
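
Review note on the rolling-release section in this hunk: the comparison it describes can be read as a small decision function. The sketch below is a hedged TypeScript illustration of that logic. The `BuildSnapshot` shape, the `planDeploy` name, and the `"no-op"` outcome for an unchanged build are assumptions; the diff only confirms that a changed fingerprint triggers a rolling release even when the resource configuration is unchanged.

```typescript
// Hypothetical record of what a deployment was built from (field names are assumptions).
interface BuildSnapshot {
  sourceFingerprint: string; // e.g. the hex SHA-256 computed during build
  resourceConfig: Record<string, string | number>; // e.g. GPU type, worker counts
}

type DeployAction = "create" | "rolling-release" | "no-op";

// Serialize a config with sorted keys so key order does not look like a change.
function stableJson(config: Record<string, string | number>): string {
  const sortedEntries = Object.keys(config).sort().map((key) => [key, config[key]]);
  return JSON.stringify(sortedEntries);
}

// Decide what a deploy should do, given the previous snapshot (if any) and the current one.
function planDeploy(previous: BuildSnapshot | undefined, current: BuildSnapshot): DeployAction {
  if (previous === undefined) return "create"; // first deployment
  const codeChanged = previous.sourceFingerprint !== current.sourceFingerprint;
  const configChanged = stableJson(previous.resourceConfig) !== stableJson(current.resourceConfig);
  // A code-only change is treated the same way as a configuration change.
  return codeChanged || configChanged ? "rolling-release" : "no-op";
}

// Usage: same resources, new code.
const deployed: BuildSnapshot = { sourceFingerprint: "aaa", resourceConfig: { gpu: "ADA_24", workersMax: 3 } };
console.log(planDeploy(deployed, { ...deployed, sourceFingerprint: "bbb" }));
```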

flash/overview.mdx

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ Install Flash using `pip` or `uv`:
 pip install runpod-flash

 # Or uv
-uv add runpod-flash
+uv tool install runpod-flash
 ```

 ### Authentication

public-endpoints/ai-sdk.mdx

Lines changed: 109 additions & 2 deletions
@@ -5,13 +5,13 @@ description: "Use the @runpod/ai-sdk-provider package to integrate Public Endpoi
 tag: "NEW"
 ---

-The `@runpod/ai-sdk-provider` package integrates Runpod Public Endpoints with the [Vercel AI SDK](https://ai-sdk.dev/docs/introduction). This gives you a streamlined, type-safe interface for text generation, streaming, and image generation in JavaScript and TypeScript projects.
+The `@runpod/ai-sdk-provider` package integrates Runpod Public Endpoints with the [Vercel AI SDK](https://ai-sdk.dev/docs/introduction). This gives you a streamlined, type-safe interface for text generation, streaming, image generation, and video generation in JavaScript and TypeScript projects.

 The Vercel AI SDK is a popular open-source library for building AI applications. By using the Runpod provider, you can access Runpod's Public Endpoints using the same patterns and APIs you'd use with other AI providers like OpenAI or Anthropic.

 ## Why use the Vercel AI SDK?

-- **Unified interface**: Use the same `generateText`, `streamText`, and `generateImage` functions regardless of which AI provider you're using.
+- **Unified interface**: Use the same `generateText`, `streamText`, `generateImage`, and `generateVideo` functions regardless of which AI provider you're using.
 - **Type safety**: Full TypeScript support with typed responses and parameters.
 - **Streaming built-in**: First-class support for streaming text responses.
 - **Framework integrations**: Works seamlessly with Next.js, React, Svelte, and other frameworks.
@@ -304,6 +304,91 @@ const { image } = await generateImage({
 | `maxPollAttempts` | Max polling attempts for async generation |
 | `pollIntervalMillis` | Milliseconds between status polls |

+## Video generation
+
+Use `experimental_generateVideo` to generate videos from text prompts or images. The Runpod provider supports 15 video models, including Sora, Wan, Seedance, and Kling.
+
+Video generation is asynchronous: the SDK submits a job, polls for completion, and returns the video URL when ready.
+
+### Text-to-video
+
+Generate videos from text prompts:
+
+```typescript
+import { runpod } from "@runpod/ai-sdk-provider";
+import { experimental_generateVideo as generateVideo } from "ai";
+
+const { video } = await generateVideo({
+  model: runpod.video("alibaba/wan-2.6-t2v"),
+  prompt: "A golden retriever running on a sunny beach, cinematic, 4k",
+});
+
+console.log(video.url);
+```
+
+The response includes:
+- `video.url`: URL to the generated video
+- `video.mediaType`: Video MIME type (`video/mp4`)
+
+### Image-to-video
+
+Animate an existing image:
+
+```typescript
+import { runpod } from "@runpod/ai-sdk-provider";
+import { experimental_generateVideo as generateVideo } from "ai";
+
+const { video } = await generateVideo({
+  model: runpod.video("alibaba/wan-2.6-i2v"),
+  prompt: "Animate this scene with gentle camera movement",
+  image: new URL("https://example.com/image.png"),
+});
+
+console.log(video.url);
+```
+
+### Video generation parameters
+
+Control the video generation with additional parameters:
+
+```typescript
+const { video } = await generateVideo({
+  model: runpod.video("alibaba/wan-2.6-t2v"),
+  prompt: "A serene mountain landscape with flowing water",
+  duration: 5,
+  aspectRatio: "16:9",
+  seed: 42,
+});
+```
+
+### Video provider options
+
+Pass model-specific parameters using `providerOptions`:
+
+```typescript
+const { video } = await generateVideo({
+  model: runpod.video("alibaba/wan-2.6-t2v"),
+  prompt: "A serene mountain landscape with flowing water",
+  duration: 5,
+  aspectRatio: "16:9",
+  providerOptions: {
+    runpod: {
+      negative_prompt: "blurry, low quality",
+      guidance_scale: 7.5,
+    },
+  },
+});
+```
+
+| Option | Description |
+|--------|-------------|
+| `negative_prompt` | Elements to exclude from the video |
+| `guidance_scale` | How closely to follow the prompt |
+| `num_inference_steps` | Number of inference steps |
+| `style` | Style preset (model-specific) |
+| `maxPollAttempts` | Max polling attempts (default: 120) |
+| `pollIntervalMillis` | Milliseconds between status polls (default: 5000) |
+
 ## Supported models

 ### Text models
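
Review note on the polling options added in this hunk: with the documented defaults, 120 attempts spaced 5000 ms apart give at most 600,000 ms (10 minutes) of waiting between polls, not counting request time. The TypeScript sketch below illustrates a generic submit-then-poll loop under those two options. It is not the provider's code: `pollUntilDone`, `JobStatus`, and the injected `checkStatus` and `sleep` functions are hypothetical names used only to show the pattern.

```typescript
// Option names mirror the documented provider options; everything else is illustrative.
interface PollOptions {
  maxPollAttempts: number; // documented default: 120
  pollIntervalMillis: number; // documented default: 5000
}

// Hypothetical job states for the sketch.
type JobStatus =
  | { state: "pending" }
  | { state: "done"; url: string }
  | { state: "failed"; error: string };

// Upper bound on the time spent sleeping between polls (excludes request latency).
function maxWaitMillis(opts: PollOptions): number {
  return opts.maxPollAttempts * opts.pollIntervalMillis;
}

// Poll until the job finishes, fails, or the attempt budget runs out.
async function pollUntilDone(
  checkStatus: () => Promise<JobStatus>,
  opts: PollOptions,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<string> {
  for (let attempt = 1; attempt <= opts.maxPollAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") return status.url;
    if (status.state === "failed") throw new Error(status.error);
    await sleep(opts.pollIntervalMillis);
  }
  throw new Error(`Job still pending after ${opts.maxPollAttempts} attempts`);
}

// With the documented defaults: 120 * 5000 ms.
console.log(maxWaitMillis({ maxPollAttempts: 120, pollIntervalMillis: 5000 }));
```

Injecting `sleep` keeps the loop testable without real delays. For long videos, raising `maxPollAttempts` extends the time budget without increasing how often the status is requested.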
@@ -320,7 +405,29 @@ const { image } = await generateImage({
 | `black-forest-labs-flux-1-dev` | [Flux Dev](/public-endpoints/models/flux-dev). High quality, detailed images. |
 | `black-forest-labs-flux-1-schnell` | [Flux Schnell](/public-endpoints/models/flux-schnell). Fast generation, good for prototyping. |
 | `google-nano-banana-edit` | [Nano Banana Edit](/public-endpoints/models/nano-banana-edit). Supports multiple reference images. |
+| `google/nano-banana-2-edit` | [Nano Banana 2 Edit](/public-endpoints/models/nano-banana-2-edit). Image editing with 14 aspect ratios and resolution options (1k/2k/4k). |
 | `bytedance-seedream-4-0-t2i` | [Seedream 4.0](/public-endpoints/models/seedream-4-t2i). Text-to-image with good prompt adherence. |
+| `tongyi-mai/z-image-turbo` | [Z-Image Turbo](/public-endpoints/models/z-image-turbo). Fast 6B parameter model with text-to-image support. |
+
+### Video models
+
+| Model ID | Type | Resolution | Aspect Ratios | Duration |
+|----------|------|------------|---------------|----------|
+| `pruna/p-video` | t2v | 720p, 1080p | 16:9, 9:16 | 5s |
+| `vidu/q3-t2v` | t2v | 720p, 1080p | 16:9, 9:16, 1:1 | 5, 10s |
+| `vidu/q3-i2v` | i2v | 720p, 1080p | 16:9, 9:16, 1:1 | 5, 10s |
+| `kwaivgi/kling-v2.6-std-motion-control` | i2v + video | 720p | 16:9, 9:16, 1:1 | 5, 10s |
+| `kwaivgi/kling-video-o1-r2v` | i2v | 720p | 16:9, 9:16, 1:1 | 3–10s |
+| `kwaivgi/kling-v2.1-i2v-pro` | i2v | 720p | 16:9, 9:16, 1:1 | 5, 10s |
+| `alibaba/wan-2.6-t2v` | t2v | 720p, 1080p | 16:9, 9:16 | 5, 10, 15s |
+| `alibaba/wan-2.6-i2v` | i2v | 720p, 1080p | 16:9, 9:16 | 5, 10, 15s |
+| `alibaba/wan-2.5` | i2v | 480p, 720p, 1080p | 16:9, 9:16 | 5, 10s |
+| `alibaba/wan-2.2-t2v-720-lora` | i2v | 720p | 16:9 | 5, 8s |
+| `alibaba/wan-2.2-i2v-720` | i2v | 720p | 16:9 | 5, 8s |
+| `alibaba/wan-2.1-i2v-720` | i2v | 720p | 16:9 | 5s |
+| `bytedance/seedance-v1.5-pro-i2v` | i2v | 480p, 720p | 21:9, 16:9, 9:16, 1:1, 4:3, 3:4 | 4–12s |
+| `openai/sora-2-pro-i2v` | i2v | 720p, 1080p | 16:9, 9:16, 1:1 | 4, 8, 12s |
+| `openai/sora-2-i2v` | i2v | 720p, 1080p | 16:9, 9:16, 1:1 | 4, 8, 12s |

 For a complete list of available models and their parameters, see the [model reference](/public-endpoints/reference).

runpodctl/reference/runpodctl-serverless.mdx

Lines changed: 4 additions & 0 deletions
@@ -173,6 +173,10 @@ runpodctl serverless update <endpoint-id> --workers-max 5
 New name for the endpoint.
 </ResponseField>

+<ResponseField name="--template-id" type="string">
+New template ID to swap to. Use this to change the template attached to an existing endpoint without recreating it.
+</ResponseField>
+
 <ResponseField name="--workers-min" type="int">
 New minimum number of workers.
 </ResponseField>

runpodctl/reference/runpodctl-template.mdx

Lines changed: 4 additions & 0 deletions
@@ -194,6 +194,10 @@ New environment variables as a JSON object.
 New README content.
 </ResponseField>

+<ResponseField name="--container-disk-in-gb" type="int">
+New container disk size in GB.
+</ResponseField>
+
 ### Delete a template

 Delete a template:

sdks/graphql/manage-endpoints.mdx

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ Endpoints require the following fields:
 | `gpuIds` | String | GPU tier identifier. Options: `AMPERE_16` (16GB), `AMPERE_24` (24GB), `ADA_24` (24GB Ada), `AMPERE_48` (48GB), `ADA_48_PRO` (48GB Ada Pro), `AMPERE_80` (80GB), `ADA_80_PRO` (80GB Ada Pro). |
 | `name` | String | Endpoint name. |
 | `templateId` | String | ID of the Serverless template to use. |
+| `type` | String | Endpoint type. `QB` for queue-based (default), `LB` for [load balancing](/serverless/load-balancing/overview). |

 ## Create an endpoint
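
Review note on the new `type` field: the TypeScript sketch below shows one way a client might model the field and apply the documented `QB` default before sending a request. It covers only the four fields visible in this hunk, since the rest of the table lies outside the diff context. The `EndpointInput` interface and the `toGraphQLInput` helper are hypothetical, not part of any Runpod SDK.

```typescript
// "QB" = queue-based (documented default), "LB" = load balancing.
type EndpointType = "QB" | "LB";

// Illustrative subset of the endpoint fields; names follow the table in this hunk.
interface EndpointInput {
  gpuIds: string;
  name: string;
  templateId: string;
  type?: EndpointType;
}

// Render the fields as a GraphQL input-object literal, making the default type explicit.
function toGraphQLInput(input: EndpointInput): string {
  const withDefault = { ...input, type: input.type ?? "QB" };
  const fields = Object.entries(withDefault).map(([key, value]) => `${key}: ${JSON.stringify(value)}`);
  return `{ ${fields.join(", ")} }`;
}

// Usage: omitting `type` yields a queue-based endpoint input.
console.log(toGraphQLInput({ gpuIds: "AMPERE_16", name: "demo", templateId: "abc" }));
```

Restricting `type` to a string-literal union lets the compiler reject values other than `QB` and `LB` before a request is made.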
