Commit 5659331

danishi and claude authored
Add generate_image tool for Nanobanana Pro / Nanobanana 2 (#11)
* Add generate_image tool for Nanobanana Pro / Nanobanana 2 image generation

  Reference: https://github.com/danishi/slack-nano-banana-bot-on-google-cloud

  - Create app/tools/generate_image.py: ADK tool that calls Gemini image generation models via google-genai SDK
  - Supports gemini-3-pro-image-preview (Nanobanana Pro) for higher quality
  - Supports gemini-3.1-flash-image-preview (Nanobanana 2) for faster generation
  - Stores generated images in thread-safe dict for Slack upload
  - Update app/main.py: register tool, set session_id in state, upload generated images to Slack thread after agent response
  - Add google-genai>=1.56.0 dependency for image generation support
  - Add IMAGE_MODEL_NAME env var to .env.example

  https://claude.ai/code/session_01VycrggwZLpx8ZRV8AiWFpq

* Update README with image generation feature and files:write scope

  https://claude.ai/code/session_01VycrggwZLpx8ZRV8AiWFpq

* Fix: register generate_image directly on agent tools list

  SkillToolset only exposes skill-related tools (list_skills, load_skill, etc.). Move generate_image to the agent's tools list so it is directly available to the LLM.

  https://claude.ai/code/session_01VycrggwZLpx8ZRV8AiWFpq

* Fix: use contextvars instead of session state for image keying

  tool_context.state did not reliably reflect session.state, causing generated images to be stored under "unknown" key. Switch to contextvars.ContextVar which is set by the request handler and read by the tool within the same async context.

  https://claude.ai/code/session_01VycrggwZLpx8ZRV8AiWFpq

* Add IMAGE_MODEL_NAME env var to Cloud Run deploy script

  https://claude.ai/code/session_01VycrggwZLpx8ZRV8AiWFpq

* Fix: use content=bytes instead of file=BytesIO for files_upload_v2

  BytesIO may not report file length correctly in some slack_sdk versions, causing files.getUploadURLExternal to fail. Pass raw bytes via content parameter instead.

  https://claude.ai/code/session_01VycrggwZLpx8ZRV8AiWFpq

---------

Co-authored-by: Claude <noreply@anthropic.com>
1 parent 8da7d9b commit 5659331
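Note on the contextvars fix described in the commit message: the sketch below illustrates why a `ContextVar` can key images per request, assuming each Slack event is handled in its own asyncio task. `handle_request` and `fake_tool` are hypothetical stand-ins for the request handler and the tool, not functions from this commit, and the timestamps are made-up values.

```python
import asyncio
import contextvars

# Same shape as the ContextVar introduced in this commit.
current_session_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "current_session_id", default="unknown"
)


async def fake_tool() -> str:
    """Hypothetical stand-in for a tool that reads the session key."""
    await asyncio.sleep(0)  # yield control so the two handlers interleave
    return current_session_id.get()


async def handle_request(thread_ts: str) -> str:
    """Hypothetical stand-in for a request handler."""
    current_session_id.set(thread_ts)
    return await fake_tool()


async def run_two_handlers() -> list:
    # gather() wraps each coroutine in its own Task, and every Task runs in
    # a copy of the current context, so set() calls do not leak between them.
    return await asyncio.gather(
        handle_request("1700000000.000100"),
        handle_request("1700000000.000200"),
    )


results = asyncio.run(run_two_handlers())
```

Each handler reads back its own value even though the two interleave, and the module-level default stays untouched. This holds only when the tool runs in the same context as the handler, or in a copy taken after the `set()` call.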

6 files changed

Lines changed: 132 additions & 1 deletion

.env.example

Lines changed: 1 addition & 0 deletions
@@ -6,5 +6,6 @@ GOOGLE_CLOUD_PROJECT="your-gcp-project"
 GOOGLE_CLOUD_LOCATION="global"
 CLOUD_RUN_LOCATION="asia-northeast1"
 MODEL_NAME="gemini-3.1-pro-preview"
+IMAGE_MODEL_NAME="gemini-3.1-flash-image-preview"
 REACTION_PROCESSING="eyes"
 REACTION_COMPLETED="white_check_mark"
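Note on how `IMAGE_MODEL_NAME` is consumed: in this commit the tool prefers an explicit `model` argument, then the environment variable, then a hard-coded default. A minimal sketch of that precedence, where `resolve_image_model` is a hypothetical helper name used only for illustration:

```python
import os

DEFAULT_IMAGE_MODEL = "gemini-3.1-flash-image-preview"


def resolve_image_model(model: str = "") -> str:
    """Hypothetical helper mirroring the precedence used by the tool."""
    if model:
        # An explicit, non-empty argument always wins.
        return model
    # Otherwise fall back to the env var, then to the built-in default.
    return os.environ.get("IMAGE_MODEL_NAME", DEFAULT_IMAGE_MODEL)
```

One consequence of using `os.environ.get` with a default: a variable that is set but empty yields an empty string, not the default. The deploy script in this commit sidesteps that with `${IMAGE_MODEL_NAME:-...}`.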

README.md

Lines changed: 6 additions & 0 deletions
@@ -13,6 +13,10 @@ If you want a simpler, lightweight Slack bot without the ADK framework, check ou
 ## Features
 - Responds to `@mention` messages in Slack channels.
 - Supports text, image, PDF, text file, video, and audio inputs from Slack messages. Files are fetched via authenticated URLs and sent to Gemini for multimodal understanding.
+- **Image generation** via `generate_image` tool using Gemini image generation models:
+  - `gemini-3-pro-image-preview` ([Nanobanana Pro](https://github.com/danishi/slack-nano-banana-bot-on-google-cloud)) — higher quality
+  - `gemini-3.1-flash-image-preview` ([Nanobanana 2](https://github.com/danishi/slack-nano-banana-bot-on-google-cloud)) — faster generation
+  - Generated images are automatically uploaded to the Slack thread.
 - Maintains conversation context by retrieving prior messages in a thread and sending them as conversation history to Gemini.
 - Formats responses using Slack-compatible Markdown for rich text output.
 - FastAPI-based web server suitable for Cloud Run.
@@ -25,6 +29,7 @@ app/
   agents/
     comedian.py # ex: Comedian agent implementation
   tools/
+    generate_image.py # ex: Image generation tool (Nanobanana Pro / Nanobanana 2)
     get_current_datetime.py # ex: Date/time utility tool
   skills/
     greeting-skill/ # ex: Greeting skill (file-based ADK Skill)
@@ -90,6 +95,7 @@ The Agent Development Kit includes a built-in web-based Development UI that you
    - `im:history`
    - `mpim:history`
    - `files:read`
+   - `files:write`
    - `reactions:write`
    - `users:read`
 3. Install the app to your workspace to obtain `SLACK_BOT_TOKEN` and `SLACK_SIGNING_SECRET`.

app/main.py

Lines changed: 27 additions & 0 deletions
@@ -21,6 +21,7 @@
 
 from .agents.comedian import comedian_agent
 from .tools.get_current_datetime import get_current_datetime
+from .tools.generate_image import generate_image, get_and_clear_images, current_session_id
 
 # Environment variables
 load_dotenv()
@@ -171,6 +172,15 @@ async def _populate_session_from_thread(
 - If asked to summarize a thread, list each person's key points by name.
 - Do NOT include the `[Speaker: ...]` tag in your replies.
 
+### Image Generation
+- When the user asks you to create, draw, generate, or design an image, use the `generate_image` tool.
+- Available models:
+  - `gemini-3.1-flash-image-preview` (Nanobanana 2): Fast generation (default)
+  - `gemini-3-pro-image-preview` (Nanobanana Pro): Higher quality
+- If the user requests a specific model or quality level, set the `model` parameter accordingly.
+- Write a detailed, descriptive prompt for best results.
+- Generated images will be automatically uploaded to the Slack thread.
+
 ### Formatting Rules
 - **Headings / emphasis**: Use `*bold*` for section titles or important words.
 - *Italics*: Use `_underscores_` for emphasis when needed.
@@ -182,6 +192,7 @@ async def _populate_session_from_thread(
 Always structure your response clearly, using these rules so it renders correctly in Slack.""",
     tools=[
         skill_toolset,
+        generate_image,
     ],
     sub_agents=[
         comedian_agent,
@@ -237,6 +248,8 @@ async def handle_mention(body, say, client, logger, ack):
     session = await session_service.get_session(
         app_name=APP_NAME, user_id=user_id, session_id=thread_ts
     )
+    # Set context var so generate_image tool can key images by session
+    current_session_id.set(thread_ts)
     if session and not session.events:
         await _populate_session_from_thread(
             session=session,
@@ -267,6 +280,20 @@ async def handle_mention(body, say, client, logger, ack):
         thread_ts=thread_ts,
     )
 
+    # Upload any images generated by the generate_image tool
+    generated_images = get_and_clear_images(thread_ts)
+    for idx, image_bytes in enumerate(generated_images, start=1):
+        try:
+            await client.files_upload_v2(
+                channel=channel,
+                thread_ts=thread_ts,
+                filename=f"generated-image-{idx}.png",
+                title=f"Generated image {idx}",
+                content=image_bytes,
+            )
+        except Exception:
+            logger.exception("Failed to upload generated image %d", idx)
+
     # Add ✅ reaction to indicate the reply is complete
     try:
         await client.reactions_add(channel=channel, timestamp=message_ts, name=REACTION_COMPLETED)
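Note on the upload loop in the last hunk: because the `try`/`except` sits inside the `for`, one failed upload is logged and the remaining images are still attempted. The sketch below shows that behaviour with a stub. `FakeSlackClient` and `upload_images` are hypothetical names, the stub only mimics the keyword arguments used in this commit, and the channel ID and timestamp are made-up values.

```python
import asyncio
import logging

logger = logging.getLogger("upload-sketch")


class FakeSlackClient:
    """Hypothetical stand-in for the Slack client; fails on the second upload."""

    def __init__(self) -> None:
        self.uploaded = []

    async def files_upload_v2(self, *, channel, thread_ts, filename, title, content):
        if filename == "generated-image-2.png":
            raise RuntimeError("simulated upload failure")
        self.uploaded.append(filename)


async def upload_images(client, channel: str, thread_ts: str, images: list) -> None:
    """Upload each image to the thread; log and skip any that fail."""
    for idx, image_bytes in enumerate(images, start=1):
        try:
            await client.files_upload_v2(
                channel=channel,
                thread_ts=thread_ts,
                filename=f"generated-image-{idx}.png",
                title=f"Generated image {idx}",
                content=image_bytes,  # raw bytes, as in the final fix of this commit
            )
        except Exception:
            # Log with traceback and continue with the next image.
            logger.exception("Failed to upload generated image %d", idx)


fake_client = FakeSlackClient()
asyncio.run(
    upload_images(fake_client, "C0000000000", "1700000000.000100", [b"a", b"b", b"c"])
)
```

With the stub above, the first and third files are recorded and the second is only logged. The user gets no in-thread notice about the missing image unless the handler adds one.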

app/tools/generate_image.py

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
+import asyncio
+import contextvars
+import os
+import threading
+from typing import List
+
+from google import genai
+from google.adk.tools import ToolContext
+from google.genai.types import GenerateContentConfig, Modality
+
+# Default image generation model (Nanobanana 2)
+DEFAULT_IMAGE_MODEL = "gemini-3.1-flash-image-preview"
+
+# Thread-safe storage for generated images keyed by session_id
+_generated_images: dict[str, List[bytes]] = {}
+_images_lock = threading.Lock()
+
+# ContextVar set by the request handler before running the agent
+current_session_id: contextvars.ContextVar[str] = contextvars.ContextVar(
+    "current_session_id", default="unknown"
+)
+
+
+def get_and_clear_images(session_id: str) -> List[bytes]:
+    """Retrieve and remove generated images for a session."""
+    with _images_lock:
+        return _generated_images.pop(session_id, [])
+
+
+async def generate_image(prompt: str, tool_context: ToolContext, model: str = ""):
+    """Generates images using Gemini image generation models (Nanobanana Pro / Nanobanana 2).
+
+    Use this tool when the user asks you to create, draw, generate, or design an image.
+
+    Args:
+        prompt: A detailed description of the image to generate.
+        model: The model to use for image generation.
+            Use "gemini-3-pro-image-preview" (Nanobanana Pro) for higher quality.
+            Use "gemini-3.1-flash-image-preview" (Nanobanana 2) for faster generation.
+            Defaults to Nanobanana 2 if not specified.
+    """
+    image_model = model if model else os.environ.get(
+        "IMAGE_MODEL_NAME", DEFAULT_IMAGE_MODEL
+    )
+
+    project_id = os.environ.get("GOOGLE_CLOUD_PROJECT")
+    location = os.environ.get("GOOGLE_CLOUD_LOCATION", "global")
+
+    def call_gemini():
+        client = genai.Client(vertexai=True, project=project_id, location=location)
+        response = client.models.generate_content(
+            model=image_model,
+            contents=prompt,
+            config=GenerateContentConfig(
+                response_modalities=[Modality.TEXT, Modality.IMAGE],
+            ),
+        )
+        return response
+
+    try:
+        response = await asyncio.to_thread(call_gemini)
+    except Exception as e:
+        return {"error": f"Image generation failed: {e}"}
+
+    text_parts = []
+    images = []
+
+    candidates = getattr(response, "candidates", None)
+    if candidates:
+        for part in candidates[0].content.parts or []:
+            if getattr(part, "thought", None):
+                continue
+            if getattr(part, "text", None):
+                text_parts.append(part.text)
+                continue
+            inline = getattr(part, "inline_data", None)
+            if inline and getattr(inline, "data", None):
+                images.append(inline.data)
+
+    if not images:
+        return {
+            "status": "no_image_generated",
+            "text": "\n".join(text_parts) if text_parts else "No image was generated.",
+        }
+
+    # Store images for the main handler to upload to Slack
+    session_id = current_session_id.get()
+    with _images_lock:
+        _generated_images.setdefault(session_id, []).extend(images)
+
+    return {
+        "status": "success",
+        "model": image_model,
+        "image_count": len(images),
+        "text": "\n".join(text_parts) if text_parts else f"{len(images)} image(s) generated successfully.",
+    }
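Note on the response parsing in `generate_image`: the loop skips parts flagged as thoughts, collects text parts, and keeps only inline data that actually carries bytes. The sketch below reproduces that filtering in isolation. `split_parts` is a hypothetical helper name, and the `SimpleNamespace` objects are stand-ins for SDK response parts, not real google-genai types.

```python
from types import SimpleNamespace


def split_parts(parts):
    """Hypothetical helper: separate text and image bytes, skipping thought parts."""
    text_parts, images = [], []
    for part in parts or []:
        if getattr(part, "thought", None):
            continue  # thought parts are dropped entirely, including their text
        if getattr(part, "text", None):
            text_parts.append(part.text)
            continue
        inline = getattr(part, "inline_data", None)
        if inline and getattr(inline, "data", None):
            images.append(inline.data)
    return text_parts, images


# Stand-in parts: a thought, a text reply, an image, and an empty inline part.
fake_parts = [
    SimpleNamespace(thought=True, text="internal reasoning", inline_data=None),
    SimpleNamespace(thought=None, text="Here is your image.", inline_data=None),
    SimpleNamespace(thought=None, text=None, inline_data=SimpleNamespace(data=b"\x89PNG")),
    SimpleNamespace(thought=None, text=None, inline_data=SimpleNamespace(data=None)),
]
texts, images = split_parts(fake_parts)
```

With these stand-ins, one text string and one image survive. The `parts or []` guard covers a `None` parts list, but a candidate whose `content` is itself `None` would still raise in the committed loop, which may be worth a follow-up check.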

requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -5,5 +5,6 @@ uvicorn[standard]>=0.41.0
 google-adk>=1.25.1
 httpx>=0.28.1
 python-dotenv>=1.2.1
+google-genai>=1.56.0
 aiohttp>=3.13.3
 pytz>=2025.2

scripts/deploy.sh

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ SERVICE_URL=$(gcloud run deploy "${SERVICE_NAME}" \
   --allow-unauthenticated \
   --no-cpu-throttling \
   --project "${PROJECT_ID}" \
-  --set-env-vars "SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN},SLACK_SIGNING_SECRET=${SLACK_SIGNING_SECRET},GOOGLE_GENAI_USE_VERTEXAI=${GOOGLE_GENAI_USE_VERTEXAI},GOOGLE_CLOUD_PROJECT=${PROJECT_ID},GOOGLE_CLOUD_LOCATION=global,ALLOWED_SLACK_WORKSPACE=${ALLOWED_SLACK_WORKSPACE:-},MODEL_NAME=${MODEL_NAME:-gemini-3.1-pro-preview},REACTION_PROCESSING=${REACTION_PROCESSING:-},REACTION_COMPLETED=${REACTION_COMPLETED:-}" \
+  --set-env-vars "SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN},SLACK_SIGNING_SECRET=${SLACK_SIGNING_SECRET},GOOGLE_GENAI_USE_VERTEXAI=${GOOGLE_GENAI_USE_VERTEXAI},GOOGLE_CLOUD_PROJECT=${PROJECT_ID},GOOGLE_CLOUD_LOCATION=global,ALLOWED_SLACK_WORKSPACE=${ALLOWED_SLACK_WORKSPACE:-},MODEL_NAME=${MODEL_NAME:-gemini-3.1-pro-preview},IMAGE_MODEL_NAME=${IMAGE_MODEL_NAME:-gemini-3.1-flash-image-preview},REACTION_PROCESSING=${REACTION_PROCESSING:-},REACTION_COMPLETED=${REACTION_COMPLETED:-}" \
   --format 'value(status.url)')
 
 echo "--------------------------------------------"
