This sample demonstrates how to use semantic caching with ADK agents using the adk-redis community package. Semantic caching stores LLM responses and retrieves them for semantically similar queries, reducing latency and API costs.
- Python 3.9+ (Python 3.11+ recommended)
- Redis server (local or cloud)
- ADK and adk-redis installed
- Google API key (for the LLM)
First, install uv if you haven't already:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or with pip
pip install uv

Then install the package:
uv pip install "adk-redis[all]"

Option A: Automated setup (recommended)
# Run from the repository root
./scripts/start-redis.sh

This script will automatically start Redis 8.4 with health checks and verify it's running correctly.
Option B: Manual setup
docker run -d --name redis -p 6379:6379 redis:8.4-alpine

Verify Redis is running:
docker ps | grep redis
# Or test the connection
docker exec redis redis-cli ping
# Should return: PONG

Note: Redis 8.4 includes the Redis Query Engine (evolved from RediSearch) with native support for vector search, full-text search, and JSON operations. Docker will automatically download the image (~40MB) on first run.
Create a .env file in this directory (or copy from .env.example):
# Required: Google API key for the agent
GOOGLE_API_KEY=your-google-api-key
# Optional: Redis URL (defaults to redis://localhost:6379)
REDIS_URL=redis://localhost:6379

Note: This example uses the redis/langcache-embed-v1 embedding model, which runs locally and doesn't require an API key. RedisVL supports many other vectorizers including OpenAI, Cohere, HuggingFace, Mistral, and more. See the RedisVL Vectorizers documentation for all options.
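As a rough illustration of how these two settings might be consumed, here is a minimal sketch. It assumes the variables from .env have already been loaded into the process environment; `load_settings` and `DEFAULT_REDIS_URL` are hypothetical names used only for this example, not part of adk-redis.

```python
import os

# Fallback that mirrors the default documented above.
DEFAULT_REDIS_URL = "redis://localhost:6379"


def load_settings(env=None):
    """Return (google_api_key, redis_url) from an environment mapping.

    Hypothetical helper: the API key is required, the Redis URL is optional.
    """
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY")
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is required")
    # An unset or empty REDIS_URL falls back to the local default.
    redis_url = env.get("REDIS_URL") or DEFAULT_REDIS_URL
    return api_key, redis_url


# A plain dict stands in for the real environment here.
print(load_settings({"GOOGLE_API_KEY": "your-google-api-key"}))
```

Passing a mapping instead of reading `os.environ` directly keeps the helper easy to exercise without touching real credentials.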
uv run python main.py

This runs a demo that:
- Creates an agent with semantic caching enabled
- Sends multiple queries, including semantically similar ones
- Shows cache hits for similar queries
adk web .

Then open http://localhost:8000 to interact with the cached agent.
semantic_cache/
├── main.py # Demo script
├── semantic_cache_agent/
│ ├── __init__.py # Agent package initialization
│ └── agent.py # Agent with caching callbacks
└── README.md # This file
- Before Model Callback: Checks if a semantically similar prompt exists in the cache. If found, returns the cached response immediately.
- After Model Callback: Stores the prompt-response pair in the cache for future similar queries.
- Semantic Similarity: Uses vector embeddings to find similar prompts, not exact string matching. "What is Python?" and "Tell me about Python" would match.
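The flow described in the list above can be sketched in plain Python. This is a toy stand-in, not the adk-redis implementation: the vectors in `TOY_EMBEDDINGS` are hand-picked for illustration rather than produced by an embedding model, the cache is an in-memory list rather than Redis, and `ToySemanticCache`, `ask`, and `fake_model` are hypothetical names.

```python
import math

# Hand-picked toy vectors standing in for real embeddings (assumption).
TOY_EMBEDDINGS = {
    "What is Python?": [0.9, 0.1, 0.0],
    "Tell me about Python": [0.85, 0.2, 0.05],
    "How do I bake bread?": [0.0, 0.1, 0.95],
}


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


class ToySemanticCache:
    def __init__(self, distance_threshold=0.1):
        self.distance_threshold = distance_threshold
        self.entries = []  # list of (embedding, response) pairs

    def before_model(self, prompt):
        """Return the closest cached response within the threshold, else None."""
        vector = TOY_EMBEDDINGS[prompt]
        best = None
        for cached_vector, response in self.entries:
            distance = cosine_distance(vector, cached_vector)
            if distance <= self.distance_threshold and (
                best is None or distance < best[0]
            ):
                best = (distance, response)
        return None if best is None else best[1]

    def after_model(self, prompt, response):
        """Store the prompt embedding together with the model response."""
        self.entries.append((TOY_EMBEDDINGS[prompt], response))


def ask(cache, prompt, call_model):
    """Return (response, cache_hit), calling the model only on a miss."""
    cached = cache.before_model(prompt)
    if cached is not None:
        return cached, True
    response = call_model(prompt)
    cache.after_model(prompt, response)
    return response, False


def fake_model(prompt):
    # Stand-in for a real LLM call.
    return f"answer to: {prompt}"


cache = ToySemanticCache()
print(ask(cache, "What is Python?", fake_model))       # miss, stored
print(ask(cache, "Tell me about Python", fake_model))  # hit on the first entry
print(ask(cache, "How do I bake bread?", fake_model))  # miss, unrelated topic
```

The second query never reaches `fake_model`: its vector is close enough to the first prompt's vector that the stored response is returned instead.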
- redis_url (str): Redis connection string
- name (str): Cache index name
- ttl (int): Time-to-live in seconds for cached entries
- distance_threshold (float): Semantic similarity threshold (0-2 for COSINE)

- first_message_only (bool): Only cache first message in session
- include_app_name (bool): Include app name in cache key
- include_user_id (bool): Include user ID in cache key
- include_session_id (bool): Include session ID in cache key
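Two of these options are easier to reason about with a small sketch. The first part shows why cosine distance spans 0 to 2, which is the range `distance_threshold` is measured against. The second part shows one way the `include_*` flags could narrow a cache key; `scope_key` and its key format are assumptions made for illustration, and the actual key layout used by adk-redis may differ.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


# Same direction, orthogonal, and opposite vectors cover the 0-2 range.
print(cosine_distance([1, 0], [1, 0]))   # 0.0
print(cosine_distance([1, 0], [0, 1]))   # 1.0
print(cosine_distance([1, 0], [-1, 0]))  # 2.0


def scope_key(app_name, user_id, session_id, *,
              include_app_name, include_user_id, include_session_id):
    """Hypothetical scope builder: each enabled flag adds one part to the key."""
    parts = []
    if include_app_name:
        parts.append(f"app={app_name}")
    if include_user_id:
        parts.append(f"user={user_id}")
    if include_session_id:
        parts.append(f"session={session_id}")
    return "|".join(parts)


print(scope_key("demo", "alice", "s1",
                include_app_name=True,
                include_user_id=True,
                include_session_id=False))  # app=demo|user=alice
```

A lower threshold demands closer matches and so produces fewer cache hits. In this sketch, enabling more scope flags means fewer callers share an entry, for example one cache per user instead of one per application.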
You can also cache tool results:
from google.adk.agents import Agent

from adk_redis.cache import (
    ToolCache,
    ToolCacheConfig,
    create_tool_cache_callbacks,
)

# `provider` is assumed to be the cache provider instance created earlier.
tool_cache = ToolCache(
    provider=provider,
    config=ToolCacheConfig(
        tool_names={"search_web", "get_weather"},  # Tools to cache
    ),
)

# Build the before/after callback pair for the tool cache.
before_tool_cb, after_tool_cb = create_tool_cache_callbacks(tool_cache)

agent = Agent(
    name="my_agent",
    before_tool_callback=before_tool_cb,
    after_tool_callback=after_tool_cb,
)
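To make the idea of caching tool results concrete, here is a toy in-memory sketch of a before/after tool callback pair. It is not the adk-redis implementation: keying on the exact tool name and arguments is an assumption made for simplicity, and `ToyToolCache`, `run_tool`, and this `get_weather` are hypothetical stand-ins.

```python
import json


class ToyToolCache:
    def __init__(self, tool_names):
        self.tool_names = set(tool_names)  # only these tools are cached
        self.store = {}

    def _key(self, tool_name, args):
        # Sorted JSON makes the key independent of argument order.
        return tool_name + ":" + json.dumps(args, sort_keys=True)

    def before_tool(self, tool_name, args):
        """Return a cached result, or None if the tool should actually run."""
        if tool_name not in self.tool_names:
            return None
        return self.store.get(self._key(tool_name, args))

    def after_tool(self, tool_name, args, result):
        """Store the result for tools that are configured for caching."""
        if tool_name in self.tool_names:
            self.store[self._key(tool_name, args)] = result


calls = []  # records every real tool invocation


def get_weather(city):
    calls.append(city)
    return {"city": city, "forecast": "sunny"}


def run_tool(cache, tool_name, tool, args):
    """Run a tool through the cache, invoking it only on a miss."""
    cached = cache.before_tool(tool_name, args)
    if cached is not None:
        return cached
    result = tool(**args)
    cache.after_tool(tool_name, args, result)
    return result


cache = ToyToolCache({"search_web", "get_weather"})
run_tool(cache, "get_weather", get_weather, {"city": "Paris"})
run_tool(cache, "get_weather", get_weather, {"city": "Paris"})
print(calls)  # the tool body ran once; the second call was served from cache
```

Restricting the cache to named tools, as `tool_names` does in the configuration above, keeps tools with side effects or fast-changing results out of the cache.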