Redis OpenAI Agents

Production-ready Redis integrations for the OpenAI Agents SDK

Documentation • Examples • GitHub

Introduction

Redis OpenAI Agents is a production-ready Python library that provides Redis-powered infrastructure for the OpenAI Agents SDK. It can replace five or more separate systems (session storage, caching, vector search, streaming, and metrics) with a single Redis deployment.

Sessions & Memory	Caching & Search	Streaming & Coordination
AgentSession: Persistent conversation storage	SemanticCache: Reduce LLM costs by 25%+	RedisStreamTransport: Reliable, replayable streaming
JSONSession: Complex nested data storage	RedisVectorStore: Fast similarity search	AgentCoordinator: Multi-agent orchestration
SemanticRouter: Intent-based agent routing	HybridSearchService: BM25 + vector combined	RobustStreamProcessor: Consumer groups & replay

Built for OpenAI Agents SDK

Drop-in Session Storage → Replace SQLite with distributed Redis sessions
Cost Reduction → Semantic caching reduces LLM API calls by 25%+
Production Streaming → Redis Streams for reliable token delivery
Multi-Agent Systems → Coordinate agents with atomic operations

Getting Started

Installation

Install redis-openai-agents into your Python (>=3.10) environment:

pip install redis-openai-agents

Redis

Choose from multiple Redis deployment options:

Redis Cloud: Managed cloud database (free tier available)
Redis (Docker): The official redis:8 image ships with Search, JSON, Time Series, and Bloom filters built in - no separate stack image required.
```
docker run -d --name redis -p 6379:6379 redis:8
```
Redis Enterprise: Commercial, self-hosted database

Want a GUI? Run Redis Insight separately with `docker run -d --name redisinsight -p 5540:5540 redis/redisinsight:latest`

Overview

Agent Sessions

Replace SQLite sessions with Redis for persistent, distributed conversation storage:

from agents import Agent, Runner
from redis_openai_agents import AgentSession

# Create a session
session = AgentSession.create(
    user_id="user_123",
    redis_url="redis://localhost:6379"
)

# Define your agent
agent = Agent(name="assistant", instructions="You are a helpful assistant.")

# Run the agent
result = await Runner.run(agent, input="Hello!")

# Store the conversation
session.store_agent_result(result)

# Later: Load and continue the conversation
session = AgentSession.load(
    conversation_id=session.conversation_id,
    user_id="user_123",
    redis_url="redis://localhost:6379"
)

# Get conversation history in SDK format
history = session.to_agent_inputs()
result = await Runner.run(agent, input=history + [{"role": "user", "content": "Follow up"}])

For complex nested data, an async-compatible JSON-backed session, JSONSession, is also available.

Semantic Caching

Reduce LLM costs by caching responses for similar queries:

from redis_openai_agents import SemanticCache

cache = SemanticCache(
    redis_url="redis://localhost:6379",
    distance_threshold=0.1,  # Similarity threshold (lower = stricter)
    ttl=3600                  # 1 hour TTL
)

# Check cache before calling LLM
result = cache.check(query="What is the capital of France?")
if result:
    print(f"Cache hit: {result.response}")
else:
    # Call LLM and store result
    response = "Paris is the capital of France."
    cache.store(query="What is the capital of France?", response=response)

Learn more about semantic caching.
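
To make the distance_threshold setting more concrete, here is a minimal, library-independent sketch of a threshold check on cosine distance. The three-dimensional vectors and the `is_cache_hit` helper are illustrative assumptions and are not part of redis-openai-agents; real embeddings come from an embedding model and have far more dimensions, and the library's exact comparison may differ.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def is_cache_hit(query_vec, cached_vec, distance_threshold=0.1):
    """Hypothetical helper: treat the entry as a hit when it is close enough."""
    return cosine_distance(query_vec, cached_vec) <= distance_threshold


# Toy "embeddings" (assumed values, chosen by hand for illustration)
cached = [0.9, 0.1, 0.0]          # embedding of the cached query
paraphrase = [0.85, 0.15, 0.05]   # a rephrasing of the same question
unrelated = [0.0, 0.2, 0.9]       # a different topic

print(is_cache_hit(paraphrase, cached))  # True: distance is well under 0.1
print(is_cache_hit(unrelated, cached))   # False: distance is close to 1
```

Lowering the threshold makes the check stricter, so fewer paraphrases count as hits but the risk of returning an answer to a different question also drops.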

Semantic Routing

Route queries to the appropriate agent using vector similarity - no LLM calls required:

from redis_openai_agents import SemanticRouter, Route

router = SemanticRouter(
    name="support-router",
    redis_url="redis://localhost:6379",
    routes=[
        Route(
            name="billing",
            references=["payment issue", "invoice", "refund request"],
            metadata={"agent": "billing_agent"},
            distance_threshold=0.3
        ),
        Route(
            name="technical",
            references=["bug report", "error message", "not working"],
            metadata={"agent": "tech_agent"},
            distance_threshold=0.3
        ),
    ]
)

# Route a query (vector lookup, not LLM call)
match = router.route("I need help with my payment")
print(f"Route to: {match.name}")  # "billing"

Learn more about semantic routing.
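
As a rough illustration of routing by vector similarity, the sketch below picks the route whose nearest reference vector lies within that route's distance threshold. `ToyRoute`, `pick_route`, and the two-dimensional vectors are hypothetical stand-ins; this is not the library's implementation, which embeds the reference phrases and searches them in Redis.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyRoute:
    """Hypothetical route holding pre-computed reference vectors."""
    name: str
    reference_vectors: list
    distance_threshold: float = 0.3


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def pick_route(query_vec, routes):
    """Return the name of the closest route within its threshold, else None."""
    best_name, best_dist = None, float("inf")
    for route in routes:
        # A route is as close as its nearest reference vector.
        dist = min(cosine_distance(query_vec, ref) for ref in route.reference_vectors)
        if dist <= route.distance_threshold and dist < best_dist:
            best_name, best_dist = route.name, dist
    return best_name


routes = [
    ToyRoute("billing", [[1.0, 0.0], [0.9, 0.1]]),
    ToyRoute("technical", [[0.0, 1.0], [0.1, 0.9]]),
]

print(pick_route([0.95, 0.05], routes))   # billing
print(pick_route([-1.0, -1.0], routes))   # None: no route is close enough
```

Returning None for an out-of-range query is one possible design choice; it lets the caller fall back to a default agent instead of forcing a poor match.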

Vector Search (RAG)

Build RAG applications with Redis vector search:

from redis_openai_agents import RedisVectorStore

store = RedisVectorStore(
    name="knowledge-base",
    redis_url="redis://localhost:6379"
)

# Add documents
store.add_documents([
    {"content": "Redis is an in-memory data store.", "source": "docs"},
    {"content": "Python is a programming language.", "source": "wiki"},
])

# Search with metadata filtering
results = store.search(
    query="What is Redis?",
    k=5,
    filter={"source": "docs"}
)

for result in results:
    print(f"{result.content} (score: {result.score})")

Hybrid Search

Combine vector similarity with BM25 full-text search for better retrieval:

from redis_openai_agents import HybridSearchService

search = HybridSearchService(
    name="hybrid-search",
    redis_url="redis://localhost:6379"
)

# Search with both vector and text matching
results = search.search(
    query="Redis performance optimization",
    k=10,
    vector_weight=0.7,  # 70% vector similarity
    text_weight=0.3     # 30% BM25 text match
)
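
One plausible reading of the two weights is a linear blend of a vector score and a BM25 score. The sketch below assumes both scores are already normalized to the range 0 to 1; `hybrid_score` and the candidate values are hypothetical, and the library's actual fusion formula may differ.

```python
def hybrid_score(vector_score, text_score, vector_weight=0.7, text_weight=0.3):
    """Weighted sum of two normalized relevance scores (assumed formula)."""
    return vector_weight * vector_score + text_weight * text_score


# Assumed, hand-picked scores for three documents
candidates = {
    "doc_a": {"vector": 0.9, "text": 0.2},
    "doc_b": {"vector": 0.6, "text": 1.0},
    "doc_c": {"vector": 0.3, "text": 0.1},
}

ranked = sorted(
    candidates,
    key=lambda doc: hybrid_score(candidates[doc]["vector"], candidates[doc]["text"]),
    reverse=True,
)
print(ranked)  # ['doc_b', 'doc_a', 'doc_c']
```

Here doc_b overtakes doc_a (0.72 versus 0.69) because its strong exact-term match outweighs a weaker vector score, which is the kind of case hybrid retrieval is meant to catch.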

Token Streaming

Reliable, replayable token streaming via Redis Streams:

from redis_openai_agents import RedisStreamTransport, RobustStreamProcessor

# Publisher side
transport = RedisStreamTransport(
    stream_name="agent-output",
    redis_url="redis://localhost:6379"
)

await transport.publish({"type": "token", "data": {"text": "Hello"}})
await transport.publish({"type": "token", "data": {"text": " world!"}})
await transport.publish({"type": "complete", "data": {}})

# Consumer side with automatic recovery
processor = RobustStreamProcessor(
    stream_name="agent-output",
    consumer_group="clients",
    redis_url="redis://localhost:6379"
)

async for event in processor.process():
    if event["type"] == "token":
        print(event["data"]["text"], end="")

Supports consumer groups, automatic acknowledgment, and replay from any position.
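
On the consumer side, events of the shape published above can be folded back into text. The sketch below uses a plain Python list as a stand-in for the stream, so it runs without Redis; `render_tokens` is a hypothetical helper, and replaying from a list index only approximates replaying from a stream entry ID.

```python
def render_tokens(events):
    """Concatenate token text until a 'complete' event arrives."""
    parts = []
    for event in events:
        if event["type"] == "complete":
            break
        if event["type"] == "token":
            parts.append(event["data"]["text"])
    return "".join(parts)


# In-memory stand-in for the "agent-output" stream
log = [
    {"type": "token", "data": {"text": "Hello"}},
    {"type": "token", "data": {"text": " world!"}},
    {"type": "complete", "data": {}},
]

full = render_tokens(log)         # read from the beginning
resumed = render_tokens(log[1:])  # a client that reconnects after the first token

print(repr(full))     # 'Hello world!'
print(repr(resumed))  # ' world!'
```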

Agent Coordination

Coordinate multiple agents with Redis pub/sub and atomic operations:

from redis_openai_agents import AgentCoordinator, EventType

coordinator = AgentCoordinator(
    session_id="multi-agent-session",
    redis_url="redis://localhost:6379"
)

# Agent 1: Signal handoff ready
await coordinator.publish(EventType.HANDOFF_READY, {
    "from_agent": "triage",
    "to_agent": "specialist",
    "context": {"topic": "billing"}
})

# Agent 2: Listen for handoffs
async for event in coordinator.subscribe():
    if event.type == EventType.HANDOFF_READY:
        print(f"Handoff from {event.data['from_agent']}")

Middleware for the Model Call

Compose cross-cutting concerns around the agent's LLM call with an around-style middleware protocol modelled on LangChain's AgentMiddleware:

from agents import Agent, Runner
from agents.models.openai_responses import OpenAIResponsesModel
from openai import AsyncOpenAI

from redis_openai_agents import (
    MiddlewareStack, Route, SemanticCache, SemanticRouter,
)
from redis_openai_agents.middleware import (
    SemanticCacheMiddleware, SemanticRouterMiddleware,
)

router = SemanticRouter(
    name="support-router", redis_url="redis://localhost:6379",
    routes=[Route(name="greeting", references=["hello", "hi"])],
)
router_mw = SemanticRouterMiddleware(router=router, responses={"greeting": "Hi!"})

cache = SemanticCache(redis_url="redis://localhost:6379", similarity_threshold=0.92)
cache_mw = SemanticCacheMiddleware(cache=cache)

stack = MiddlewareStack(
    model=OpenAIResponsesModel(model="gpt-4o-mini", openai_client=AsyncOpenAI()),
    middlewares=[router_mw, cache_mw],  # outer-to-inner
)

agent = Agent(name="assistant", instructions="Be concise.", model=stack)
result = await Runner.run(agent, "hello")  # short-circuited by router

Ships with SemanticCacheMiddleware, SemanticRouterMiddleware, and ConversationMemoryMiddleware. Write your own: any object with an async awrap_model_call(request, handler) coroutine is a middleware.
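
A custom middleware might look like the following sketch. It assumes that `handler` is an async callable taking the request as its only argument, which the description above suggests but does not spell out; `TimingMiddleware` and `fake_handler` are hypothetical names, and the string request stands in for whatever request object the stack actually passes.

```python
import asyncio
import time


class TimingMiddleware:
    """Hypothetical middleware that records how long the wrapped call takes."""

    def __init__(self):
        self.durations = []

    async def awrap_model_call(self, request, handler):
        start = time.perf_counter()
        try:
            # Delegate to the next layer (assumed call shape: handler(request)).
            return await handler(request)
        finally:
            self.durations.append(time.perf_counter() - start)


async def fake_handler(request):
    """Stand-in for the inner model call so the sketch runs without an API key."""
    await asyncio.sleep(0)
    return f"echo: {request}"


timing = TimingMiddleware()
response = asyncio.run(timing.awrap_model_call("hello", fake_handler))

print(response)               # echo: hello
print(len(timing.durations))  # 1
```

Because the middleware wraps the call rather than hooking before and after it, a single method can time the call, short-circuit it by returning without invoking the handler, or post-process the response.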

Tool Result Caching

Memoize a tool's Python callable in Redis, keyed by argument hash. Side-effect prefixes (send_, delete_, ...) and volatile args (timestamp, now, ...) bypass the cache automatically.

from agents import function_tool
from redis_openai_agents import cached_tool


@function_tool
@cached_tool(name="lookup_company", redis_url="redis://localhost:6379", ttl=3600)
async def lookup_company(ticker: str) -> str:
    return await _hit_paid_api(ticker)
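
The keying and bypass rules described above can be sketched without Redis. Everything here is an assumption for illustration: the `cache_key` helper, the key format, the choice of SHA-256, and the reading that the prefixes apply to the tool name. Only the prefixes and argument names that the text lists explicitly are included; the elided ones are left out.

```python
import hashlib
import json

# Only the examples named in the text; the full lists are not reproduced here.
SIDE_EFFECT_PREFIXES = ("send_", "delete_")
VOLATILE_ARGS = {"timestamp", "now"}


def cache_key(tool_name, kwargs):
    """Return a deterministic key, or None when the call should bypass the cache."""
    if tool_name.startswith(SIDE_EFFECT_PREFIXES):
        return None  # side-effecting tools must always run
    if VOLATILE_ARGS.intersection(kwargs):
        return None  # time-dependent arguments make results non-reusable
    # sort_keys makes the hash independent of argument order
    payload = json.dumps(kwargs, sort_keys=True, default=str)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"tool_cache:{tool_name}:{digest}"


print(cache_key("lookup_company", {"ticker": "ACME"}) is not None)  # True
print(cache_key("send_email", {"to": "a@example.com"}))             # None
print(cache_key("lookup_company", {"ticker": "ACME", "now": 1}))    # None
```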

Metrics & Observability

Built-in observability with RedisTimeSeries and Prometheus:

from redis_openai_agents import AgentMetrics, PrometheusExporter

metrics = AgentMetrics(redis_url="redis://localhost:6379")

# Record metrics
await metrics.record_latency("agent_run", 150.5)
await metrics.record_tokens("gpt-4", input_tokens=100, output_tokens=50)
await metrics.record_cache_hit("semantic_cache")

# Get statistics
stats = await metrics.get_stats("latency", aggregation="avg", time_range="1h")

# Prometheus export (http://localhost:9090/metrics)
exporter = PrometheusExporter(metrics)
await exporter.start_server(port=9090)

Components

Sessions & Memory

Component	Description
`AgentSession`	Hash-based session storage built on RedisVL MessageHistory
`JSONSession`	JSON document storage for complex nested session data
`SemanticRouter`	Vector-based intent routing without LLM calls

Caching & Search

Component	Description
`SemanticCache`	Two-level cache (exact match + semantic similarity)
`RedisCachingModel`	Model wrapper with automatic response caching
`RedisVectorStore`	HNSW vector search for RAG applications
`RedisFullTextSearch`	BM25 full-text search with filters
`HybridSearchService`	Combined vector + text search with configurable weights

Streaming & Coordination

Component	Description
`RedisStreamTransport`	Redis Streams-based event transport
`RobustStreamProcessor`	Consumer groups with automatic recovery
`ResumableStreamRunner`	Checkpoint-based stream resumption
`AgentCoordinator`	Multi-agent coordination via pub/sub
`AtomicOperations`	Lua script-based atomic Redis operations

Observability

Component	Description
`AgentMetrics`	RedisTimeSeries metrics collection
`PrometheusExporter`	Prometheus metrics endpoint
`RedisTracingProcessor`	SDK-compatible trace storage in Redis Streams

SDK Integration

Component	Description
`RedisAgentRunner`	Enhanced runner with caching and metrics
`RedisFileSearchTool`	Drop-in replacement for OpenAI file search
`RedisRateLimitGuardrail`	SDK guardrail with Redis-backed rate limiting
`MiddlewareStack`	Around-style middleware wrapping the SDK `Model` interface
`SemanticCacheMiddleware`	Cache LLM responses by input similarity
`SemanticRouterMiddleware`	Short-circuit matched intents with canned responses
`ConversationMemoryMiddleware`	Inject semantically relevant past messages
`cached_tool`	Decorator that memoizes a tool callable's result in Redis

Advanced Features

Component	Description
`RankedOperations`	Sorted set rankings for agents and tools
`DeduplicationService`	Bloom filter request deduplication
`RedisConnectionPool`	Connection pooling with retry logic

Examples

Example	Description
01-routing-agents	Multi-agent routing with handoffs
02-semantic-cache	Reduce LLM costs with caching
03-vector-search	Build RAG applications
04-full-text-search	BM25 full-text search
05-token-streaming	Real-time streaming with Redis Streams
06-time-series-metrics	Observability with TimeSeries
07-full-stack-integration	Complete integration example
08-runner-integration	RedisAgentRunner usage
09-hybrid-search	Combined vector + full-text search
10-agent-ranking	Sorted set rankings
11-deduplication	Bloom filter deduplication
12-agent-coordinator	Multi-agent orchestration
13-robust-streaming	Consumer groups & recovery
14-atomic-operations	Lua script atomicity
15-semantic-router	Intent-based routing
16-middleware	Cache + router + composition around the Model
17-tool-caching	`@cached_tool` for idempotent tools

Why Redis OpenAI Agents?

Challenge	Without Redis	With Redis OpenAI Agents
Session Storage	SQLite (single-node)	Distributed Redis sessions
Caching	None or external service	Built-in semantic cache
Vector Search	Pinecone, Qdrant ($70+/mo)	Redis Vector Search (free)
Streaming	Custom WebSocket code	Redis Streams (reliable)
Metrics	Prometheus + Grafana setup	Built-in TimeSeries
Total Services	5+ separate systems	1 Redis deployment

Development

This project uses uv for dependency management.

# Install dependencies
uv sync --all-extras --group dev

# Run tests
uv run pytest --run-api-tests

# Format and lint
make format
make lint

# Type check
make mypy

# Build documentation
make docs

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Built with ❤️ by Redis for the OpenAI Agents SDK community
