Commit abe1095

Add gemini gen mcp

1 parent ef955aa commit abe1095

4 files changed: +219 −2 lines changed

content/docs/mcp/fast_mcp.mdx

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,6 +1,6 @@
 ---
 title: MCP Support
-description: The fast_mcp extension brings Model Context Protocol (MCP) support to `llms.py`, allowing you to extend LLM capabilities with a wide range of external tools and services.
+description: The fast_mcp extension brings Model Context Protocol (MCP) support to llms.py, allowing you to extend LLM capabilities with a wide range of external tools
 ---
 
 ## Install
```
Lines changed: 217 additions & 0 deletions (new file)

---
title: Gemini Gen MCP
description: MCP Server for Gemini Image and Text to Speech (TTS) Audio generation
---

### Using in llms.py

Paste the server configuration into [llms.py MCP Servers](https://llmspy.org/docs/extensions/fast_mcp):

Name: `gemini-gen`

```json
{
  "description": "Gemini Image and Audio TTS generation",
  "command": "uvx",
  "args": [
    "gemini-gen-mcp"
  ],
  "env": {
    "GEMINI_API_KEY": "$GEMINI_API_KEY"
  }
}
```

You can either edit the `mcp.json` file directly to add your own servers, use the UI to **Add**, **Edit**, or **Delete** servers, or use the **Copy** button to copy an individual server's configuration.

<div className="grid grid-cols-1 md:grid-cols-2 gap-4">
  <Screenshot src="/img/mcp-add.webp" />
  <Screenshot src="/img/mcp-edit.webp" />
</div>

### Using with Claude Desktop

Add this to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "gemini-gen": {
      "description": "Gemini Image and Audio TTS generation",
      "command": "uvx",
      "args": [
        "gemini-gen-mcp"
      ],
      "env": {
        "GEMINI_API_KEY": "$GEMINI_API_KEY"
      }
    }
  }
}
```

### Development Server

For development, you can run this server using `uv`:

```json
{
  "mcpServers": {
    "gemini-gen": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/ServiceStack/gemini-gen-mcp",
        "gemini-gen-mcp"
      ],
      "env": {
        "GEMINI_API_KEY": "$GEMINI_API_KEY"
      }
    }
  }
}
```

## Features

This MCP server provides tools to:

- **Generate images from text** using Gemini's Flash Image model
- **Generate audio from text** using the Gemini 2.5 Flash Preview TTS model

<Screenshot src="/img/mcp/gemini-gen-mcp/tools_gemini_mcp_text2img.webp" />

#### Results

Upon execution, the tool's output is displayed in a results dialog with rendering specific to the output type:

<Screenshot src="/img/tools/tools-exec-results.webp" />

When included, the same tools can also be invoked indirectly by LLMs during chat sessions:

<Screenshot src="/img/tools/tools-chat-gemini-image.webp" />

## Prerequisites

You need a Google Gemini API key to use this server. Get one from [Google AI Studio](https://aistudio.google.com/apikey).

## Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `GEMINI_API_KEY` | Yes | - | Your Google Gemini API key |
| `GEMINI_DOWNLOAD_PATH` | No | `/tmp/gemini_gen_mcp` | Directory where generated files are saved |

Set the environment variables:

```bash
export GEMINI_API_KEY='your-api-key-here'
export GEMINI_DOWNLOAD_PATH='/path/to/downloads' # optional
```

Generated files are organized by type and date:

- Images: `$GEMINI_DOWNLOAD_PATH/images/YYYY-MM-DD/`
- Audio: `$GEMINI_DOWNLOAD_PATH/audios/YYYY-MM-DD/`

Each generated file includes a companion `.info.json` file with generation metadata.
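
The directory convention above can be sketched in a few lines. Note `output_dir` is a hypothetical helper written for illustration, not the server's actual code; it only shows how the `GEMINI_DOWNLOAD_PATH` layout resolves:

```python
import os
from datetime import date
from pathlib import Path

def output_dir(kind: str) -> Path:
    """Resolve the dated folder where generated files land:
    $GEMINI_DOWNLOAD_PATH/<kind>/YYYY-MM-DD/"""
    base = Path(os.environ.get("GEMINI_DOWNLOAD_PATH", "/tmp/gemini_gen_mcp"))
    folder = base / kind / date.today().isoformat()
    folder.mkdir(parents=True, exist_ok=True)
    return folder

print(output_dir("images"))  # e.g. /tmp/gemini_gen_mcp/images/2025-01-15
```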

## Usage

### Running the Server

Run the MCP server directly:

```bash
gemini-gen-mcp
```

Or as a Python module:

```bash
python -m gemini_gen_mcp.server
```

## Available Tools

### `text_to_image`

Generate images from text using Gemini's Flash (Nano Banana) Image models.

```python
@mcp.tool()
async def text_to_image(
    prompt: Annotated[str, "Text description of the image to generate"],
    model: ImageModels = ImageModels.NANO_BANANA,
    aspect_ratio: AspectRatio = AspectRatio.SQUARE,
    temperature: Annotated[
        float, "Sampling temperature for image generation (default: 1.0)"
    ] = 1.0,
    top_p: Annotated[
        Optional[float], "Nucleus sampling parameter for image generation (optional)"
    ] = None,
) -> Image
```

**Parameters:**

- `prompt` (string, required): Text description of the image to generate
- `model` (string, optional): Gemini model to use
  - `gemini-2.5-flash-image` (default)
  - `gemini-3-pro-image-preview`
- `aspect_ratio` (string, optional): Aspect ratio for the generated image (default: "1:1")
  - Supported: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`
- `temperature` (float, optional): Sampling temperature for image generation (default: 1.0)
- `top_p` (float, optional): Nucleus sampling parameter

**Example:**

```json
{
  "prompt": "A serene mountain landscape at sunset with a lake",
  "model": "gemini-2.5-flash-image",
  "aspect_ratio": "16:9",
  "temperature": 1.0
}
```
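
As a rough illustration of the documented constraints, a client could pre-validate a request before invoking the tool. `validate_image_request` and its checks are hypothetical; the parameter names, defaults, and supported values come from the list above:

```python
# Documented values from the text_to_image parameter list
SUPPORTED_ASPECT_RATIOS = {
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"
}
IMAGE_MODELS = {"gemini-2.5-flash-image", "gemini-3-pro-image-preview"}

def validate_image_request(req: dict) -> dict:
    """Fill in documented defaults and reject unsupported values."""
    if not req.get("prompt"):
        raise ValueError("prompt is required")
    out = {
        "prompt": req["prompt"],
        "model": req.get("model", "gemini-2.5-flash-image"),
        "aspect_ratio": req.get("aspect_ratio", "1:1"),
        "temperature": req.get("temperature", 1.0),
    }
    if out["model"] not in IMAGE_MODELS:
        raise ValueError(f"unsupported model: {out['model']}")
    if out["aspect_ratio"] not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {out['aspect_ratio']}")
    return out

print(validate_image_request({"prompt": "A serene mountain landscape", "aspect_ratio": "16:9"}))
```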

### `text_to_speech`

Generate speech audio from text using the Gemini Flash TTS model. Output is saved in WAV format.

```python
@mcp.tool()
async def text_to_speech(
    text: Annotated[str, "Text to convert to speech"],
    model: AudioModels = AudioModels.GEMINI_2_5_FLASH_PREVIEW_TTS,
    voice: VoiceName = VoiceName.KORE,
) -> Audio
```

**Parameters:**

- `text` (string, required): Text to convert to speech
- `model` (string, optional): Gemini TTS model to use
  - `gemini-2.5-flash-preview-tts` (default)
  - `gemini-2.5-pro-preview-tts`
- `voice` (string, optional): Voice to use for speech generation (default: "Kore")

**Available Voices:**

| Voice | Style | Voice | Style | Voice | Style |
|-----------|------------|---------------|---------------|--------------|-------------|
| Zephyr | Bright | Puck | Upbeat | Charon | Informative |
| Kore | Firm | Fenrir | Excitable | Leda | Youthful |
| Orus | Firm | Aoede | Breezy | Callirrhoe | Easy-going |
| Autonoe | Bright | Enceladus | Breathy | Iapetus | Clear |
| Umbriel | Easy-going | Algieba | Smooth | Despina | Smooth |
| Erinome | Clear | Algenib | Gravelly | Rasalgethi | Informative |
| Laomedeia | Upbeat | Achernar | Soft | Alnilam | Firm |
| Schedar | Even | Gacrux | Mature | Pulcherrima | Forward |
| Achird | Friendly | Zubenelgenubi | Casual | Vindemiatrix | Gentle |
| Sadachbia | Lively | Sadaltager | Knowledgeable | Sulafat | Warm |

**Example:**

```json
{
  "text": "Hello, this is a test of the Gemini text to speech system.",
  "model": "gemini-2.5-flash-preview-tts",
  "voice": "Kore"
}
```
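Since output is plain WAV, generated audio can be inspected with Python's standard-library `wave` module. A minimal sketch, using a locally written file of silence as a stand-in for real server output (the 24 kHz mono 16-bit format here is an assumption for illustration, not a documented guarantee):

```python
import struct
import wave

# Write 1 second of 16-bit mono silence as a stand-in for a generated file
# (real output lands under $GEMINI_DOWNLOAD_PATH/audios/YYYY-MM-DD/).
path = "sample.wav"
with wave.open(path, "wb") as w:
    w.setnchannels(1)        # mono
    w.setsampwidth(2)        # 16-bit PCM
    w.setframerate(24000)    # 24 kHz (assumed for this example)
    w.writeframes(struct.pack("<h", 0) * 24000)

# Inspect it the same way you would inspect generated audio
with wave.open(path, "rb") as w:
    duration = w.getnframes() / w.getframerate()
    print(f"{w.getnchannels()} ch, {w.getframerate()} Hz, {duration:.1f}s")
# prints "1 ch, 24000 Hz, 1.0s"
```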

content/docs/mcp/meta.json

Lines changed: 1 addition & 1 deletion

```diff
@@ -3,7 +3,7 @@
   "defaultOpen": true,
   "pages": [
     "fast_mcp",
-    "gemini_mcp",
+    "gemini_gen_mcp",
     "omarchy_mcp"
   ]
 }
```
(binary file changed: 92.5 KB)