README.md
1 addition & 0 deletions
@@ -165,6 +165,7 @@ The functions include a built-in encryption mechanism for sensitive information:
- **Configurable Parameters**: Environment variables for image optimization (quality, max dimensions, format conversion).
- **Multi-Image History**: Configurable history image limit, hash-based deduplication, and automatic `[Image N]` labels so the model can reference earlier images.
- **Image Generation (Gemini 3)**: Configurable aspect ratio (e.g. `16:9`, `1:1`) and resolution (`1K`/`2K`/`4K`) for Gemini 3 image models; per-user valve overrides supported.
- **Video Generation (Veo)**: Generate videos with Google Veo models (3.1, 3, 2). Configurable aspect ratio, resolution, duration, negative prompt, and person generation controls. Supports text-to-video and image-to-video for all supported Veo models. Videos are automatically uploaded and embedded with playback controls.
- **Token Usage Tracking**: Returns prompt, completion, and total token counts to Open WebUI for automatic saving to the database.
- **Model Whitelist & Additional Models**: Restrict the visible model list via `GOOGLE_MODEL_WHITELIST` and add SDK-unsupported models via `GOOGLE_MODEL_ADDITIONAL`.
- Grounding with Google Search via the [google_search_tool.py filter](./filters/google_search_tool.py)

docs/google-gemini-integration.md
113 additions & 0 deletions
@@ -37,6 +37,9 @@ This integration enables **Open WebUI** to interact with **Google Gemini** model
- **Advanced Image Generation**
  Support for text-to-image and image-to-image generation with Gemini 2.5 Flash Image Preview models.

- **Video Generation with Google Veo**
  Generate videos using Veo 3.1, 3, and 2 models with configurable aspect ratio, resolution, duration, and more. Supports text-to-video and image-to-video for all supported Veo models. Videos are automatically uploaded and embedded with playback controls.

- **Flexible Error Handling**
  Retries failed requests and logs errors for transparency.

@@ -339,6 +342,116 @@ for part in response.parts:
| Other gemini-3-\* models | ❌ Not image generation models |
| Other models | ❌ Not image generation models |

## Video Generation Configuration

The Google Gemini pipeline supports video generation using **Google Veo models** (Veo 3.1, 3, and 2). Veo models appear automatically in the model list with a 🎬 indicator.

> [!IMPORTANT]
> Video generation uses a different API path than text/image generation. Requests are **always non-streaming**: the pipeline submits a video generation job, polls for completion, and returns the result with embedded video playback.

Not all parameters are supported by every Veo model. The pipeline automatically gates features based on the model used. Unsupported parameters are silently skipped to avoid API errors.

| Feature | Veo 3.1 | Veo 3.1 Fast | Veo 3 | Veo 3 Fast | Veo 2 |
> ¹ The Veo API supports up to 3 reference images for Veo 3.1, but the pipeline currently only forwards a single attached image via the `image` parameter.
>
> ² Last-frame interpolation and video extension are Veo API capabilities not yet exposed by the pipeline.
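
The per-model gating described above can be pictured roughly as follows. This is a minimal sketch, not the pipeline's actual code: the function name `gate_video_params`, the dictionary keys, and the model-ID prefixes (`veo-3.1`, `veo-3`, `veo-2`) are assumptions made for illustration. The capability table only encodes what this document states: aspect ratio applies to all Veo models, resolution is ignored for Veo 2, and `4k` is limited to Veo 3.1.

```python
from typing import Any

# Capability table built only from the notes in this document.
# The family prefixes used as keys are hypothetical model-ID prefixes.
_RESOLUTIONS_BY_FAMILY = {
    "veo-3.1": {"720p", "1080p", "4k"},
    "veo-3": {"720p", "1080p"},
    "veo-2": set(),  # resolution is ignored for Veo 2
}


def _family(model_id: str) -> str:
    """Return the longest family prefix that matches the model ID."""
    name = model_id.lower().split("/")[-1]
    matches = [f for f in _RESOLUTIONS_BY_FAMILY if name.startswith(f)]
    if not matches:
        raise ValueError(f"not a recognised Veo model: {model_id}")
    return max(matches, key=len)


def gate_video_params(model_id: str, requested: dict[str, Any]) -> dict[str, Any]:
    """Keep only the settings the model accepts; "default" lets the API decide."""
    family = _family(model_id)
    gated: dict[str, Any] = {}

    # Aspect ratio is supported by every Veo model.
    aspect = requested.get("aspect_ratio", "default")
    if aspect != "default":
        gated["aspect_ratio"] = aspect

    # Resolution is silently dropped when the model family does not support it.
    resolution = requested.get("resolution", "default")
    if resolution in _RESOLUTIONS_BY_FAMILY[family]:
        gated["resolution"] = resolution

    return gated
```

With this sketch, requesting `4k` on a Veo 3 model or any resolution on a Veo 2 model simply yields a request without a `resolution` entry, which mirrors the "silently skipped" behaviour rather than raising an error.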

### Environment Variables

```bash
# Default aspect ratio for videos (16:9 landscape or 9:16 portrait)
# Supported by: all Veo models
# Default: "default" (API decides)
GOOGLE_VIDEO_GENERATION_ASPECT_RATIO="default"

# Default video resolution (720p, 1080p, or 4k)
# Supported by: Veo 3.1/3 only (ignored for Veo 2; 4k only on Veo 3.1)
```

When using any Veo model, attach an image to your message to use it as the starting frame for video generation. The pipeline automatically detects attached images and passes the first one to the Veo API via the `image` parameter.

> [!NOTE]
> All Veo models support single-image image-to-video. **Multi-reference images** (up to 3 style/content guides, Veo 3.1 only) and **last-frame interpolation** are Veo API capabilities not yet exposed by the pipeline.
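
To illustrate the "first attached image" behaviour, here is a small sketch of how such a lookup might work. It assumes, purely for illustration, that the user message follows the OpenAI-style multi-part layout (a `content` list containing `image_url` parts with base64 data URLs); the helper name `first_attached_image` is hypothetical and is not taken from the pipeline.

```python
import base64
import re
from typing import Optional, Tuple

# Matches inline data URLs such as "data:image/png;base64,....".
_DATA_URL = re.compile(
    r"^data:(?P<mime>[\w.+-]+/[\w.+-]+);base64,(?P<data>.+)$", re.DOTALL
)


def first_attached_image(message: dict) -> Optional[Tuple[bytes, str]]:
    """Return (raw_bytes, mime_type) for the first inline image, or None."""
    content = message.get("content")
    if not isinstance(content, list):
        return None  # plain-text message, so this is a text-to-video request

    for part in content:
        if not isinstance(part, dict) or part.get("type") != "image_url":
            continue
        url = part.get("image_url", {}).get("url", "")
        match = _DATA_URL.match(url)
        if match:
            # Only the first decodable image is used; later ones are ignored.
            return base64.b64decode(match.group("data")), match.group("mime")

    return None
```

Returning `None` for text-only messages keeps the text-to-video path unchanged, while any further attachments after the first are ignored, consistent with the single-image limitation noted above.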

### How It Works

1. Select a Veo model (marked with 🎬) from the model list
2. Type your video description prompt
3. Optionally attach an image for image-to-video (supported by all Veo models)
4. The pipeline submits the request and shows polling status updates
5. Once complete, the video is uploaded to Open WebUI and embedded with a `<video>` player
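
Steps 4 and 5 amount to a submit-then-poll loop. The sketch below shows one way such a loop could look; it is not the pipeline's implementation. `VideoJob`, `wait_for_video`, and the `refresh` callback are hypothetical stand-ins for whatever long-running operation object and status call the Veo API client provides, and the interval and timeout values are arbitrary.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VideoJob:
    """Hypothetical stand-in for a long-running video generation operation."""
    name: str
    done: bool = False
    video_bytes: Optional[bytes] = None


def wait_for_video(
    job: VideoJob,
    refresh: Callable[[VideoJob], VideoJob],
    on_status: Callable[[str], None] = print,
    interval_s: float = 10.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Poll until the job finishes, emitting a status update before each wait."""
    waited = 0.0
    while not job.done:
        if waited >= timeout_s:
            raise TimeoutError(f"{job.name} did not finish within {timeout_s:.0f}s")
        on_status(f"Generating video... {waited:.0f}s elapsed")
        sleep(interval_s)
        waited += interval_s
        job = refresh(job)  # ask the backend for the latest job state

    if job.video_bytes is None:
        # For example, the backend returned a storage URI instead of raw bytes.
        raise RuntimeError(f"{job.name} finished without video bytes")
    return job.video_bytes
```

Passing `refresh`, `on_status`, and `sleep` as callables keeps the loop independent of any particular SDK and makes it easy to exercise with a fake job in tests.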

### Vertex AI Note

When using Vertex AI, video download via `files.download()` is not available. If the Veo API returns a GCS URI instead of raw bytes, the current pipeline does not yet surface that URI or attach the video output in the chat. You may need to retrieve the generated video directly from Vertex AI or the underlying GCS bucket.

## Model Configuration

The Google Gemini pipeline provides two complementary mechanisms for controlling which models appear in the model list: `MODEL_ADDITIONAL` and `MODEL_WHITELIST`.