
feat: support gemini imagen 4 generate model#4890

Open
SwillerDawn wants to merge 3 commits into
QuantumNous:mainfrom
NaciTech:feat/imagen-4-generate-001

SwillerDawn commented May 15, 2026

⚠️ PR Notice

Important

  • Please provide a concise, human-written summary; avoid pasting unedited AI output.

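📝 Description

This PR adds support for the Gemini Imagen 4 model. The target model is imagen-4.0-generate-001.

Main changes:

  • Add a default model price for imagen-4.0-generate-001.
  • Reuse the existing Gemini Imagen image-generation path, with a minimal mapping from requested sizes to the aspect ratios Imagen 4 supports.
  • Add parsing, forwarding, response handling, and usage accounting for Gemini native :predict image requests.
  • Make GeminiImageRequest satisfy the existing relay request interface, so that model mapping, parameter overrides, and quota consumption reuse the existing logic.
  • Route Vertex Gemini native :predict image responses through the same handling logic.

This supports both:

  • OpenAI-compatible /v1/images/generations
  • Gemini native /v1/models/imagen-4.0-generate-001:predict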
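The size mapping mentioned in the change list can be illustrated with a minimal sketch. This is not the PR's code: the helper name `sizeToAspectRatio`, the nearest-ratio strategy, the pass-through of colon-formatted values, and the fallback to `1:1` are all assumptions made for illustration, as is the list of ratios (the five commonly documented for Imagen).

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// supportedRatios lists the aspect ratios assumed to be accepted by Imagen 4.
// The list is an assumption for this sketch, not taken from the PR.
var supportedRatios = []struct {
	label string
	value float64
}{
	{"1:1", 1.0},
	{"3:4", 3.0 / 4.0},
	{"4:3", 4.0 / 3.0},
	{"9:16", 9.0 / 16.0},
	{"16:9", 16.0 / 9.0},
}

// sizeToAspectRatio is a hypothetical helper that maps an OpenAI-style
// "WIDTHxHEIGHT" size string to the nearest assumed-supported aspect ratio.
// Values that already look like a ratio ("16:9") are passed through, and
// unparseable input falls back to "1:1".
func sizeToAspectRatio(size string) string {
	if strings.Contains(size, ":") {
		return size
	}
	parts := strings.SplitN(strings.ToLower(size), "x", 2)
	if len(parts) != 2 {
		return "1:1"
	}
	w, errW := strconv.Atoi(strings.TrimSpace(parts[0]))
	h, errH := strconv.Atoi(strings.TrimSpace(parts[1]))
	if errW != nil || errH != nil || w <= 0 || h <= 0 {
		return "1:1"
	}
	target := float64(w) / float64(h)
	best, bestDiff := "1:1", math.MaxFloat64
	for _, r := range supportedRatios {
		// Compare in log space so landscape and portrait are treated symmetrically.
		if d := math.Abs(math.Log(target / r.value)); d < bestDiff {
			best, bestDiff = r.label, d
		}
	}
	return best
}

func main() {
	for _, s := range []string{"1024x1024", "1792x1024", "1024x1792", "16:9"} {
		fmt.Printf("%s -> %s\n", s, sizeToAspectRatio(s))
	}
}
```

A lookup table keyed on exact size strings would be stricter; the nearest-ratio approach is chosen here only because it degrades gracefully for sizes that are not listed.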

🚀 Type of change

  • 🐛 Bug fix - please link the corresponding Issue; do not classify design trade-offs, misunderstandings, or mismatched expectations as bugs
  • ✨ New feature - for major features, discuss in an Issue first
  • ⚡ Performance optimization / Refactor
  • 📝 Documentation

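🔗 Related Issue

✅ Checklist

  • Human-verified: I have personally organized and written this description and did not paste unprocessed AI output.
  • Not a duplicate: I have searched existing Issues and PRs and confirmed this is not a duplicate submission.
  • [ ] Bug fix note: If this PR is marked as a bug fix, I have filed or linked the corresponding Issue, and I am not classifying design trade-offs, mismatched expectations, or misunderstandings as bugs.
  • Understanding of changes: I understand how these changes work and what they may affect.
  • Focused scope: This PR contains no code changes unrelated to the current task.
  • Local verification: I have run and passed tests or verified manually on my machine, so maintainers can use this to re-check the result.
  • Security and compliance: The code contains no sensitive credentials and follows the project's coding standards.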

📸 Proof of Work

Verification with curl:

curl http://localhost:3000/v1/images/generations \
  --request POST \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "imagen-4.0-generate-001",
    "prompt": "A clean product photo of a ceramic coffee mug on a walnut desk, soft daylight, realistic",
    "n": 1,
    "size": "1024x1024",
    "quality": "standard",
    "response_format": "b64_json"
  }'

curl http://localhost:3000/v1/models/imagen-4.0-generate-001:predict \
  --request POST \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "instances": [
      {
        "prompt": "A clean product photo of a ceramic coffee mug on a walnut desk, soft daylight, realistic"
      }
    ],
    "parameters": {
      "sampleCount": 1,
      "aspectRatio": "1:1",
      "personGeneration": "allow_adult"
    }
  }'

Local verification passed:

[Screenshot 2026-05-15 175731]

[Screenshot 2026-05-15 180115]

Summary by CodeRabbit

  • New Features

    • Added support for Gemini/Imagen native image prediction flows with a dedicated predict handler.
    • Automatic size-to-aspect-ratio conversion for Imagen 4.0 model.
    • Added pricing entry for Imagen 4.0 image generation.
  • Bug Fixes / Validation

    • Reject empty image-generation requests and ensure image requests are not treated as streaming.
    • Improved image-generation usage reporting for accurate quota accounting.

coderabbitai Bot commented May 15, 2026

Walkthrough

Adds end-to-end handling for Gemini native Imagen predict requests: DTO/validation for image requests, relay orchestration for predict flows, adaptor size→aspect mapping and routing, a native response handler counting non-RAI images and returning usage, and a pricing entry for imagen-4.0-generate-001.
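The counting step in the walkthrough (skip RAI-filtered entries, then charge 258 tokens per image) might look roughly like the sketch below. The struct shapes, the field name `raiFilteredReason`, and the helpers `countImages` and `imageTokens` are assumptions for illustration; the PR itself unmarshals into `dto.GeminiImageResponse` via `common.Unmarshal`, for which the standard `encoding/json` package stands in here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokensPerImage mirrors the 258-tokens-per-image figure quoted in the walkthrough.
const tokensPerImage = 258

// prediction and predictResponse are assumed shapes for a native :predict
// response; they are stand-ins for dto.GeminiImageResponse, not copies of it.
type prediction struct {
	BytesBase64Encoded string `json:"bytesBase64Encoded"`
	MimeType           string `json:"mimeType"`
	RaiFilteredReason  string `json:"raiFilteredReason"`
}

type predictResponse struct {
	Predictions []prediction `json:"predictions"`
}

// countImages decodes the upstream body and counts entries that carry image
// bytes and were not filtered by responsible-AI (RAI) checks.
func countImages(body []byte) (int, error) {
	var resp predictResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, fmt.Errorf("decode predict response: %w", err)
	}
	n := 0
	for _, p := range resp.Predictions {
		if p.RaiFilteredReason != "" || p.BytesBase64Encoded == "" {
			continue // filtered or empty entries are not billable images
		}
		n++
	}
	return n, nil
}

// imageTokens converts an image count into a token figure for usage reporting.
func imageTokens(n int) int {
	return n * tokensPerImage
}

func main() {
	body := []byte(`{"predictions":[` +
		`{"bytesBase64Encoded":"aGVsbG8=","mimeType":"image/png"},` +
		`{"raiFilteredReason":"blocked"}]}`)
	n, err := countImages(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(n, imageTokens(n))
}
```

With the sample body in `main`, one of the two predictions is filtered, so the program prints `1 258`.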

Changes

Gemini Native Imagen Prediction

| Layer / File(s) | Summary |
|---|---|
| Gemini Image Request Contract and Validation (dto/gemini.go, relay/helper/valid_request.go) | GeminiImageRequest gains GetTokenCountMeta(), IsStream(), and SetModelName(). Validators detect imagen*:predict paths, extract model names, and require non-empty instances. |
| Gemini Image Prediction Relay Handler (relay/gemini_handler.go) | GeminiHelper routes imagen :predict requests to geminiImagePredictHelper, which maps channel models, initializes adaptors, forwards raw request bodies (with optional param overrides), executes upstream calls, handles responses, and posts quota usage with structured errors. |
| Channel Adaptor Size Conversion and Response Routing (relay/channel/gemini/adaptor.go, relay/channel/vertex/adaptor.go) | Adaptor logic converts Imagen-4 size strings to aspect ratios via convertImagen4SizeToAspectRatio() for imagen-4.0-generate-001, preserves colon-formatted overrides for other models, switches vertex extra-body unmarshal to common.Unmarshal, and routes imagen :predict responses to GeminiNativeImagePredictHandler. |
| Native Imagen Prediction Response Handler (relay/channel/gemini/relay-gemini-native.go, relay/channel/gemini/relay-gemini.go) | New GeminiNativeImagePredictHandler reads and optionally logs the raw upstream body, unmarshals into dto.GeminiImageResponse, counts generated images skipping RAI-filtered entries, copies the body to the client, and returns usage from buildGeminiImageUsage() (258 tokens per image). |
| Model Pricing Configuration (setting/ratio_setting/model_ratio.go) | Adds defaultModelPrice["imagen-4.0-generate-001"] = 0.04. |
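The validation row above (detect imagen `:predict` paths, extract the model name, require non-empty instances) can be sketched as follows. All names here (`modelFromPredictPath`, `isImagenPredictPath`, `validateImageRequest`, and the request structs) are hypothetical; the PR's own functions are `isGeminiImagePredictPath()` and `GetAndValidateGeminiImageRequest()`, whose bodies are not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// modelFromPredictPath extracts the model name from a path of the form
// ".../models/<model>:predict". It returns false for any other shape.
func modelFromPredictPath(path string) (string, bool) {
	const marker = "/models/"
	i := strings.LastIndex(path, marker)
	if i < 0 {
		return "", false
	}
	model, action, found := strings.Cut(path[i+len(marker):], ":")
	if !found || action != "predict" || model == "" {
		return "", false
	}
	return model, true
}

// isImagenPredictPath reports whether the path targets an imagen model's
// :predict action (assumption: imagen models share the "imagen" name prefix).
func isImagenPredictPath(path string) bool {
	model, ok := modelFromPredictPath(path)
	return ok && strings.HasPrefix(model, "imagen")
}

// imageInstance and imageRequest are assumed, simplified request shapes.
type imageInstance struct {
	Prompt string `json:"prompt"`
}

type imageRequest struct {
	Instances []imageInstance `json:"instances"`
}

var errEmptyInstances = errors.New("instances must not be empty")

// validateImageRequest rejects requests that carry no instances.
func validateImageRequest(r *imageRequest) error {
	if r == nil || len(r.Instances) == 0 {
		return errEmptyInstances
	}
	return nil
}

func main() {
	path := "/v1/models/imagen-4.0-generate-001:predict"
	model, ok := modelFromPredictPath(path)
	fmt.Println(model, ok, isImagenPredictPath(path))
	fmt.Println(validateImageRequest(&imageRequest{}))
}
```

Rejecting empty `instances` before the upstream call keeps an obviously invalid request from consuming quota or a channel retry.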

Sequence Diagram

sequenceDiagram
  participant Client
  participant RequestValidator
  participant RelayHandler as geminiImagePredictHelper
  participant ChannelAdaptor as Gemini Adaptor
  participant UpstreamAPI as GoogleCloudImagen
  participant ResponseHandler as GeminiNativeImagePredictHandler

  Client->>RequestValidator: POST /models/imagen-4.0-generate-001:predict (GeminiImageRequest)
  RequestValidator->>RequestValidator: isGeminiImagePredictPath() & GetAndValidateGeminiImageRequest()
  RequestValidator->>RelayHandler: validated *GeminiImageRequest
  RelayHandler->>ChannelAdaptor: ConvertImageRequest (size → aspect ratio)
  ChannelAdaptor->>UpstreamAPI: POST upstream predict request
  UpstreamAPI-->>ChannelAdaptor: 200 + GeminiImageResponse
  ChannelAdaptor->>ResponseHandler: DoResponse routes to native handler
  ResponseHandler->>ResponseHandler: unmarshal, count non-RAI BytesBase64Encoded
  ResponseHandler-->>Client: raw body + Usage header

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • seefs001
  • creamlike1024

Poem

🐇 I nibble bytes and bind the frames,
From size to ratio, I hop and name,
Predicts return with images bright,
I count the seeds that passed the light,
A tiny rabbit cheers the pipeline's flight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title accurately summarizes the main objective: adding support for the Gemini Imagen 4 generate model, which aligns with the extensive changes across multiple files implementing this feature. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
