Qwen-Image-Edit-2511 RunPod Serverless Deployment

Deploy the Qwen-Image-Edit-2511 model as a serverless endpoint on RunPod for scalable image editing capabilities.

Model Information

  • Model: Qwen/Qwen-Image-Edit-2511
  • Task: Image-to-Image editing with prompt guidance
  • Framework: Diffusers
  • GPU Required: CUDA-compatible GPU (16GB+ VRAM recommended)

Features

  • Auto-scaling serverless deployment
  • Base64 image input/output for easy API integration
  • Configurable inference parameters
  • Reproducible results with seed control
  • Fast cold starts with RunPod's FlashBoot
  • ⚡ Ultra-fast builds with uv - 10-100x faster than pip

Why uv?

This project uses uv, an extremely fast Python package installer written in Rust by Astral (creators of Ruff). Benefits:

  • 🚀 10-100x faster than pip/poetry - Docker builds complete in seconds
  • ⚡ Faster cold starts - Less time installing dependencies on RunPod
  • 🔧 Drop-in replacement - Works with existing requirements.txt
  • 📦 Modern tooling - Supports pyproject.toml
  • 💰 Lower costs - Faster builds mean lower build-time charges

For serverless/containerized deployments, uv is significantly faster than Poetry while maintaining simplicity.

Repository Structure

Runpod_Qwen_Image_Edit-2511/
├── handler.py            # Main handler function for RunPod
├── Dockerfile            # Container configuration with uv
├── pyproject.toml        # Modern Python project config (recommended)
├── requirements.txt      # Python dependencies (legacy support)
├── build_and_push.sh     # Automated build script
├── test_client.py        # Python test client
├── example_request.json  # Sample API request
├── .dockerignore         # Files to exclude from Docker build
├── .gitignore            # Git ignore rules
└── README.md             # This file

Local Development (Optional)

If you want to test locally before deploying:

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv pip install -r requirements.txt

# Or using pyproject.toml (recommended)
uv sync

# Run handler locally (requires CUDA GPU)
python handler.py

Quick Start

1. Build the Docker Image

# Using the automated script (recommended)
./build_and_push.sh your-dockerhub-username

# Or manually
docker build -t your-username/qwen-image-edit:latest .

2. Push to Docker Registry

docker push your-username/qwen-image-edit:latest

3. Deploy on RunPod

  1. Go to RunPod Serverless Console

  2. Click "New Endpoint"

  3. Configure:

    • Name: Qwen-Image-Edit-2511
    • Container Image: your-username/qwen-image-edit:latest
    • GPU Type: Select GPU with 16GB+ VRAM (e.g., RTX 4090, A40, A100)
    • Max Workers: Based on your scaling needs
    • Idle Timeout: 60 seconds (recommended)
    • Execution Timeout: 120 seconds (recommended)
  4. Click "Deploy"
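
Once the endpoint is live, you can smoke-test it with the runpod Python SDK. A minimal sketch (standard runpod-python usage; YOUR_ENDPOINT_ID, the API key, and the base64 payload are placeholders):

import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"
endpoint = runpod.Endpoint("YOUR_ENDPOINT_ID")

# Small synchronous request just to confirm a worker boots and responds.
result = endpoint.run_sync(
    {"input": {"image": "<base64 image>", "prompt": "Add a sunset background"}},
    timeout=120,
)
print(result)

The raw-HTTP equivalent is shown under API Usage below.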

API Usage

Request Format

{
  "input": {
    "image": "base64_encoded_image_string",
    "prompt": "Add a sunset background",
    "true_cfg_scale": 4.0,
    "guidance_scale": 1.0,
    "num_inference_steps": 40,
    "seed": 42
  }
}

Parameters

Parameter            Type    Required  Default  Description
image                string  Yes       -        Base64-encoded input image
prompt               string  Yes       -        Text description of the desired edits
true_cfg_scale       float   No        4.0      Classifier-free guidance scale
guidance_scale       float   No        1.0      Additional guidance scale
num_inference_steps  int     No        40       Number of denoising steps
seed                 int     No        None     Random seed for reproducibility
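
To make the mapping concrete, here is a rough sketch of how these parameters would feed the diffusers pipeline call inside the handler. Names like pipe, inputs, and input_image are illustrative; the actual handler.py may differ:

import torch

# "inputs" is the parsed "input" object from the request,
# "pipe" a loaded QwenImageEditPlusPipeline, "input_image" a PIL image.
seed = inputs.get("seed")
generator = torch.Generator("cuda").manual_seed(seed) if seed is not None else None

result = pipe(
    image=input_image,
    prompt=inputs["prompt"],
    true_cfg_scale=inputs.get("true_cfg_scale", 4.0),
    guidance_scale=inputs.get("guidance_scale", 1.0),
    num_inference_steps=inputs.get("num_inference_steps", 40),
    generator=generator,
)
edited_image = result.images[0]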

Response Format

{
  "image": "base64_encoded_edited_image",
  "prompt": "Add a sunset background",
  "parameters": {
    "true_cfg_scale": 4.0,
    "guidance_scale": 1.0,
    "num_inference_steps": 40,
    "seed": 42
  }
}

Python Client Example

import requests
import base64
from PIL import Image
import io

def encode_image(image_path):
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode()

def decode_image(base64_string):
    image_data = base64.b64decode(base64_string)
    return Image.open(io.BytesIO(image_data))

# Prepare request
endpoint_url = "https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/runsync"
api_key = "YOUR_RUNPOD_API_KEY"

image_base64 = encode_image("input.jpg")

payload = {
    "input": {
        "image": image_base64,
        "prompt": "Make it look like autumn",
        "true_cfg_scale": 4.0,
        "num_inference_steps": 40,
        "seed": 42
    }
}

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

# Make request (timeout matches the recommended execution timeout)
response = requests.post(endpoint_url, json=payload, headers=headers, timeout=120)
result = response.json()

# Save output image
if "output" in result and "image" in result["output"]:
    output_image = decode_image(result["output"]["image"])
    output_image.save("output.jpg")
    print("Image saved successfully!")
else:
    print("Error:", result)

Local Testing

Test the handler locally before deploying (see Local Development section for setup):

# Using test client
python test_client.py \
  --image input.jpg \
  --prompt "Add sunset background" \
  --endpoint YOUR_ENDPOINT_ID \
  --api-key YOUR_API_KEY

# Or test handler directly (requires GPU)
python handler.py
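
To drive handler.py without going through the RunPod queue, you can also invoke the handler function directly. A minimal sketch, assuming handler.py exposes a handler(event) function in the usual RunPod serverless style (the import and event shape are assumptions, not guaranteed by this repo):

import base64
from handler import handler  # assumed entry point; check handler.py

with open("input.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

event = {"input": {"image": image_b64, "prompt": "Add sunset background"}}
result = handler(event)
print(list(result))  # expect keys like "image", "prompt", "parameters"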

Performance Optimization

Model Caching (Recommended)

The default configuration downloads the model from Hugging Face on first use and reuses the local cache afterwards, which provides:

  • Fast cold starts after the initial download
  • No additional storage costs
  • Automatic pickup of model updates

Bake Model into Image

To skip the download entirely and get the fastest worker start, uncomment this line in the Dockerfile:

RUN python3 -c "from diffusers import QwenImageEditPlusPipeline; QwenImageEditPlusPipeline.from_pretrained('Qwen/Qwen-Image-Edit-2511')"

Trade-offs:

  • Pros: Faster worker initialization
  • Cons: Larger image size (~10-15GB), longer build times

GPU Selection

Recommended GPUs for this model:

GPU        VRAM  Performance  Cost
RTX 4090   24GB  Excellent    Medium
A40        48GB  Excellent    Medium
A100 40GB  40GB  Best         High
A100 80GB  80GB  Best         Highest

Cost Estimation

RunPod serverless pricing is pay-per-use:

  • Only charged for actual inference time
  • No charges during idle periods
  • Typical inference: 5-15 seconds depending on steps

Example: 1000 requests/day at 10s each = ~2.8 hours compute time
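
The arithmetic as a quick back-of-envelope script (the per-hour rate is a placeholder; check current RunPod pricing for your GPU):

requests_per_day = 1000
seconds_per_request = 10

compute_hours = requests_per_day * seconds_per_request / 3600  # ~2.78 h/day
gpu_rate_per_hour = 0.50  # hypothetical USD/hour, varies by GPU and region

print(f"~{compute_hours:.2f} compute-hours/day")
print(f"~${compute_hours * gpu_rate_per_hour:.2f}/day at the assumed rate")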

Troubleshooting

Out of Memory Errors

  • Use a GPU with more VRAM (24GB+)
  • Reduce the input image resolution (step count affects speed, not peak memory)
  • Lower the max_workers setting

Slow Cold Starts

  • Enable model caching (default)
  • Consider baking model into image
  • Use RunPod's FlashBoot feature

Model Not Loading

  • Ensure the GPU has CUDA support (see the check below)
  • Check Docker image has correct CUDA version
  • Verify model name is correct: Qwen/Qwen-Image-Edit-2511
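
A quick way to confirm CUDA is usable inside the container:

import torch

# All three should succeed on a correctly configured worker.
print(torch.cuda.is_available())      # True if a CUDA device is visible
print(torch.version.cuda)             # CUDA version PyTorch was built against
print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA A40"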

Advanced Configuration

Environment Variables

Add to your RunPod endpoint configuration to keep the model cache on a network volume, so downloads persist across workers:

TRANSFORMERS_CACHE=/runpod-volume/huggingface
HF_HOME=/runpod-volume/huggingface

(Recent transformers releases deprecate TRANSFORMERS_CACHE in favor of HF_HOME; setting both keeps older versions working.)

Multi-GPU Support

RunPod supports up to 4 GPUs per worker:

  • Useful for batch processing
  • Configure in endpoint settings
  • Update the handler to use model parallelism (a sketch follows)
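
A minimal sketch of one approach, assuming a recent diffusers version with accelerate installed so that device_map is supported for pipelines (an illustration, not the repo's shipped handler):

import torch
from diffusers import QwenImageEditPlusPipeline

# Ask diffusers to spread pipeline components across all visible GPUs.
pipe = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2511",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
)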

License

This deployment code is provided as-is. Please refer to the Qwen-Image-Edit-2511 model license for model usage terms.

Support

For issues with this deployment code, open an issue in this repository. For questions about the model itself, see the Qwen/Qwen-Image-Edit-2511 model page; for platform problems, consult the RunPod documentation.
