Feat/42 redis search tools #43
Open: nkanu17 wants to merge 16 commits into google:main from nkanu17:feat/redis-search-tools
16 commits
8b6ad10 feat: add RedisVL search tools (nkanu17)
e3ecb52 chore: update redisvl dependency to >=0.13.2 (nkanu17)
c427907 feat(redis): add RedisVL 0.13.2 HybridQuery parameters (nkanu17)
baa5ff5 docs: update GitHub issue with redisvl>=0.13.2 requirement (nkanu17)
c4d3247 fix: update default model to gemini-2.5-flash (nkanu17)
cd22ab4 refactor: introduce two-tier hierarchy for Redis search tools (nkanu17)
8618ceb style: run autoformat (isort + pyink) (nkanu17)
e0636fd feat(redis): add Pydantic config classes for Redis search tools (nkanu17)
8e42778 refactor(redis): use config objects in Redis search tool constructors (nkanu17)
cb26dbb feat(redis): export config classes from tools module (nkanu17)
b4e81d0 test(redis): update tests to use config-based API (nkanu17)
8bd8dc3 docs(samples): update redis_vl_search sample to use config-based API (nkanu17)
de730ed feat(redis): add version-aware hybrid search with dual config support (nkanu17)
25eec89 chore(deps): remove upper bound from redis dependency (nkanu17)
17f2f0f style: run autoformat (isort + pyink) (nkanu17)
fa3b9fd feat(redis): check both redisvl and Redis server versions for native … (nkanu17)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,215 @@ | ||
| # RedisVL Search Agent | ||
|
|
||
| This sample demonstrates using Redis search tools to give an ADK agent | ||
| access to a Redis-based knowledge base with multiple search capabilities. | ||
|
|
||
| ## What This Sample Shows | ||
|
|
||
| - Setting up a Redis vector index with a schema | ||
| - Using three Redis search tools in one agent, plus an optional fourth (hybrid search) that requires Redis 8.4+ and redis-py 7.1+: | ||
|   - **RedisVectorSearchTool**: Semantic similarity search (KNN) - finds conceptually similar content | ||
|   - **RedisTextSearchTool**: Full-text keyword search (BM25) - matches exact terms and phrases | ||
|   - **RedisRangeSearchTool**: Distance threshold search - returns ALL docs within a relevance radius | ||
|   - **RedisHybridSearchTool**: Combined vector + text search (requires Redis 8.4+ and redis-py 7.1+) | ||
| - Integrating RedisVL with an ADK agent | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| 1. **Redis Stack** running locally (or Redis Cloud with Search capability) | ||
| ```bash | ||
| # Using Docker | ||
| docker run -d --name redis-stack -p 6379:6379 redis/redis-stack:latest | ||
| ``` | ||
|
|
||
| 2. **No API keys needed for embeddings** - uses Redis' open-source `redis/langcache-embed-v2` model (768 dimensions) | ||
|
|
||
| ## Setup | ||
|
|
||
| 1. Install dependencies: | ||
| ```bash | ||
| pip install "google-adk-community[redis-vl]" | ||
| ``` | ||
|
|
||
| 2. Download NLTK stopwords (required for keyword search): | ||
| ```bash | ||
| python -c "import nltk; nltk.download('stopwords')" | ||
| ``` | ||
|
|
||
| 3. Set environment variables (or create a `.env` file): | ||
| ```bash | ||
| export REDIS_URL=redis://localhost:6379 | ||
| export GOOGLE_API_KEY=your-google-api-key # For Gemini LLM | ||
| ``` | ||
|
|
||
| 4. Load sample data into Redis: | ||
| ```bash | ||
| cd contributing/samples/redis_vl_search | ||
| python load_data.py | ||
| ``` | ||
|
|
||
| 5. Run the agent: | ||
| ```bash | ||
| cd contributing/samples/redis_vl_search | ||
| adk web | ||
| ``` | ||
|
|
||
| ## Files | ||
|
|
||
| | File | Description | | ||
| |------|-------------| | ||
| | `schema.yaml` | Redis index schema defining document structure | | ||
| | `load_data.py` | Script to populate Redis with sample documents | | ||
| | `redis_vl_search_agent/agent.py` | Agent definition with all Redis search tools | | ||
|
|
||
| ## How It Works | ||
|
|
||
| 1. **Schema Definition** (`schema.yaml`): Defines the index structure with fields | ||
| for title, content, URL, category, and a vector embedding field. | ||
|
|
||
| 2. **Data Loading** (`load_data.py`): Populates Redis with sample documents about | ||
| Redis and ADK, embedding the content using Redis' langcache-embed-v2 model. | ||
|
|
||
| 3. **Agent** (`redis_vl_search_agent/agent.py`): Creates an agent with access to | ||
| multiple search tools for different use cases. | ||
|
|
||
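As a rough illustration of step 1, the index described above might look like the following when written as a Python dictionary instead of YAML. This is a sketch only: the index name, key prefix, field names, and the choice of HNSW are assumptions, since the actual `schema.yaml` is not shown here. The 768 dimensions and the cosine metric come from the README text.

```python
# Hypothetical schema mirroring the fields the README lists.
# Index name, prefix, field names, and algorithm are illustrative assumptions.
EXAMPLE_SCHEMA = {
    "index": {"name": "example_docs", "prefix": "doc"},
    "fields": [
        {"name": "title", "type": "text"},
        {"name": "content", "type": "text"},
        {"name": "url", "type": "text"},
        {"name": "category", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": 768,  # langcache-embed-v2 output size per the README
                "distance_metric": "cosine",
                "algorithm": "hnsw",  # assumption
                "datatype": "float32",
            },
        },
    ],
}


def vector_dims(schema):
    """Return the `dims` of the first vector field, or None if there is none."""
    for field in schema["fields"]:
        if field["type"] == "vector":
            return field["attrs"]["dims"]
    return None


def field_names(schema):
    """List the field names declared in the schema."""
    return [field["name"] for field in schema["fields"]]
```

A helper like `vector_dims` is handy when swapping vectorizers, because the embedding length has to match the schema.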
| ## Search Tools | ||
|
|
||
| ### semantic_search (RedisVectorSearchTool) | ||
| **Best for:** Conceptual questions, natural language queries, finding similar content. | ||
|
|
||
| **How it works:** Converts the query to a vector embedding, then finds the K nearest neighbors by cosine similarity. | ||
|
|
||
| **Returns:** Top-K most similar documents (default: 5). | ||
|
|
||
| **Example queries:** | ||
| - "What is Redis?" → finds docs about Redis even if they don't say "What is Redis" | ||
| - "How do I build a chatbot?" → finds docs about "intelligent assistants", "conversational AI" | ||
| - "Fast database for caching" → finds Redis docs even without exact keyword match | ||
|
|
||
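To make the KNN idea concrete, here is a minimal pure-Python sketch of top-K retrieval by cosine similarity. It is not how Redis or RedisVL implement the search (Redis uses an index such as HNSW instead of a brute-force scan), and the three-dimensional vectors and document IDs are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn(query_vec, docs, k=5):
    """Brute-force top-k: score every document, keep the k most similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in docs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical); real ones would have hundreds of dimensions.
knn_docs = {
    "redis-intro": [0.9, 0.1, 0.0],
    "adk-agents": [0.1, 0.9, 0.1],
    "caching": [0.8, 0.2, 0.1],
}
knn_top = knn([1.0, 0.0, 0.0], knn_docs, k=2)
```

With these toy vectors, the two documents pointing in roughly the same direction as the query come back first, regardless of which words they contain.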
| ### keyword_search (RedisTextSearchTool) | ||
| **Best for:** Exact terms, acronyms, technical jargon, API names, error messages. | ||
|
|
||
| **How it works:** BM25 text scoring algorithm - matches exact tokens and weights them by term frequency. | ||
|
|
||
| **Returns:** Top-K documents ranked by keyword relevance. | ||
|
|
||
| **Example queries:** | ||
| - "HNSW algorithm" → exact match on "HNSW" acronym | ||
| - "BM25 formula" → finds docs containing "BM25" | ||
| - "VectorQuery class" → API/class name lookup | ||
| - "RRF ranking" → technical term that needs exact match | ||
|
|
||
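The scoring idea behind keyword search can be sketched with a textbook BM25 function. This is an illustration under stated assumptions, not the scorer Redis runs: tokenization here is a plain lowercase whitespace split, there is no stopword removal or stemming, and `k1=1.2` and `b=0.75` are common textbook defaults, not values taken from this PR.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook BM25."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue  # only exact token matches contribute
            idf = math.log(
                1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores[doc_id] = score
    return scores


# Hypothetical mini-corpus.
bm25_docs = {
    "hnsw": "hnsw is a graph index for approximate nearest neighbor search",
    "bm25": "bm25 ranks documents by term frequency and document length",
    "intro": "redis is an in-memory data store",
}
bm25_result = bm25_scores("HNSW algorithm", bm25_docs)
```

Only the document containing the literal token `hnsw` receives a non-zero score, which is why this tool suits acronyms and API names.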
| ### range_search (RedisRangeSearchTool) | ||
| **Best for:** Exhaustive retrieval, comprehensive coverage, finding ALL related documents. | ||
|
|
||
| **How it works:** Returns ALL documents within a distance threshold (not just top-K). | ||
|
|
||
| **Returns:** A variable number of results - every document whose vector distance falls within the threshold. | ||
|
|
||
| **Use when:** | ||
| - User wants "everything" about a topic | ||
| - Comprehensive research needed | ||
| - Quality filtering (only highly relevant docs) | ||
| - Clustering/grouping similar content | ||
|
|
||
| **Example queries:** | ||
| - "Tell me everything about RAG pipelines" → returns all RAG-related docs | ||
| - "All Redis data structures" → comprehensive list | ||
| - "Complete guide to embeddings" → exhaustive retrieval | ||
|
|
||
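The difference from top-K search can be shown with a small brute-force sketch: instead of keeping a fixed number of results, keep every document whose cosine distance to the query is at or below a threshold. The vectors, document IDs, and threshold values are hypothetical, and Redis performs this with an index, not a full scan.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def range_search(query_vec, docs, distance_threshold=0.2):
    """Return every document within the threshold, closest first."""
    hits = []
    for doc_id, vec in docs.items():
        distance = cosine_distance(query_vec, vec)
        if distance <= distance_threshold:
            hits.append((doc_id, distance))
    hits.sort(key=lambda pair: pair[1])
    return hits


range_docs = {
    "redis-intro": [0.9, 0.1, 0.0],
    "adk-agents": [0.1, 0.9, 0.1],
    "caching": [0.8, 0.2, 0.1],
}
loose_hits = range_search([1.0, 0.0, 0.0], range_docs, distance_threshold=0.2)
strict_hits = range_search([1.0, 0.0, 0.0], range_docs, distance_threshold=0.01)
```

Tightening the threshold shrinks the result set, which is the "quality filtering" use case listed above; loosening it trades precision for coverage.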
| ### hybrid_search (RedisHybridSearchTool) | ||
| **Best for:** Queries that benefit from both semantic understanding AND exact keyword matching. | ||
|
|
||
| **How it works:** Combines vector similarity + BM25 text scores using RRF or linear weighting. | ||
|
|
||
| **Requires:** Redis 8.4+ and redis-py 7.1+ | ||
|
|
||
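The two fusion strategies named above can be sketched in a few lines. This is a generic illustration of Reciprocal Rank Fusion (RRF) and linear weighting, not the code path Redis uses for hybrid queries. The constant `k=60` and the weight `alpha=0.7` are assumptions chosen for the example, not defaults taken from this PR, and the linear variant assumes both score sets are already on a comparable 0-1 scale.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over each ranked list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def linear_fuse(vector_scores, text_scores, alpha=0.7):
    """Weighted sum: alpha * vector score + (1 - alpha) * text score."""
    doc_ids = set(vector_scores) | set(text_scores)
    combined = {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + (1 - alpha) * text_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical results from a vector search and a text search.
vector_ranking = ["a", "b", "c"]
text_ranking = ["b", "d", "a"]
rrf_order = rrf_fuse([vector_ranking, text_ranking])

vector_scores = {"a": 0.9, "b": 0.8, "c": 0.5}
text_scores = {"b": 1.0, "d": 0.6, "a": 0.2}
linear_order = linear_fuse(vector_scores, text_scores, alpha=0.7)
```

RRF only looks at rank positions, so it needs no score normalization; linear weighting uses the raw scores and lets you bias toward either signal through `alpha`.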
| ## Example Queries | ||
|
|
||
| Once running, try asking the agent: | ||
|
|
||
| | Query | Expected Tool | Why | | ||
| |-------|---------------|-----| | ||
| | "What is Redis?" | semantic_search | Conceptual question | | ||
| | "HNSW algorithm details" | keyword_search | Technical acronym | | ||
| | "Tell me everything about RAG" | range_search | Exhaustive retrieval | | ||
| | "How do I build a chatbot?" | semantic_search | Natural language | | ||
| | "BM25 formula" | keyword_search | Exact term lookup | | ||
| | "All vector search methods" | range_search | Comprehensive coverage | | ||
|
|
||
| ## Customization | ||
|
|
||
| ### Using a Different Vectorizer | ||
|
|
||
| ```python | ||
| from redisvl.utils.vectorize import HuggingFaceTextVectorizer | ||
|
|
||
| vectorizer = HuggingFaceTextVectorizer(model="sentence-transformers/all-MiniLM-L6-v2") | ||
| ``` | ||
|
|
||
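| Note: Update `dims` in `schema.yaml` to match your model's embedding dimensions (for example, `all-MiniLM-L6-v2` produces 384-dimensional vectors, not the 768 used by the default model), then recreate the index and reload the data. | ||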
|
|
||
| ### Adding Filters | ||
|
|
||
| You can add filter expressions to narrow search results: | ||
|
|
||
| ```python | ||
| from redisvl.query.filter import Tag | ||
| from google.adk_community.tools.redis import RedisVectorSearchTool, RedisVectorQueryConfig | ||
|
|
||
| config = RedisVectorQueryConfig(num_results=5) | ||
| redis_search = RedisVectorSearchTool( | ||
| index=index, | ||
| vectorizer=vectorizer, | ||
| config=config, | ||
| return_fields=["title", "content", "url", "category"], | ||
| filter_expression=Tag("category") == "redis", # Only search Redis docs | ||
| ) | ||
| ``` | ||
|
|
||
| See [RedisVL Filter documentation](https://docs.redisvl.com/api/filter.html) for more filter options. | ||
|
|
||
| ### Advanced Query Options | ||
|
|
||
| `RedisVectorSearchTool` uses a `RedisVectorQueryConfig` object for query parameters: | ||
|
|
||
| ```python | ||
| from google.adk_community.tools.redis import RedisVectorSearchTool, RedisVectorQueryConfig | ||
| from redisvl.query.filter import Tag | ||
|
|
||
| # Configure query parameters via config object | ||
| config = RedisVectorQueryConfig( | ||
| num_results=10, | ||
| # Query tuning | ||
| dtype="float32", # Vector dtype | ||
| return_score=True, # Include similarity score | ||
| normalize_vector_distance=True, # Convert to 0-1 similarity | ||
| # Hybrid filtering | ||
| hybrid_policy="BATCHES", # or "ADHOC_BF" | ||
| batch_size=100, # For BATCHES policy | ||
| # HNSW tuning | ||
| ef_runtime=150, # Higher = better recall, slower | ||
| epsilon=0.01, # Range search approximation | ||
| # SVS-VAMANA tuning | ||
| search_window_size=20, # Search window size | ||
| use_search_history="AUTO", # "OFF", "ON", or "AUTO" | ||
| search_buffer_capacity=30, # 2-level compression tuning | ||
| ) | ||
|
|
||
| redis_search = RedisVectorSearchTool( | ||
| index=index, | ||
| vectorizer=vectorizer, | ||
| config=config, | ||
| return_fields=["title", "content"], | ||
| filter_expression=Tag("category") == "redis", | ||
| ) | ||
| ``` | ||
|
|
||
| See [RedisVL Query documentation](https://docs.redisvl.com/api/query.html) for details. | ||
|
|
||
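The `normalize_vector_distance=True` option above is described as converting distances to a 0-1 similarity. One common mapping for cosine distance, which ranges from 0 to 2, is sketched below. Treat it as an assumption about the general idea: the exact formula RedisVL applies, and how it treats other metrics, should be confirmed in the RedisVL documentation.

```python
def cosine_distance_to_similarity(distance):
    """Map a cosine distance in [0, 2] onto a similarity in [0, 1].

    0 (identical direction) -> 1.0, 1 (orthogonal) -> 0.5, 2 (opposite) -> 0.0.
    Values above 2 are clamped to 0.0.
    """
    return max((2.0 - distance) / 2.0, 0.0)
```

With a normalized score, higher always means more relevant, which is usually easier for an LLM agent to interpret than a raw distance where lower is better.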
| ### Connecting to Redis Cloud | ||
|
|
||
| ```bash | ||
| export REDIS_URL=redis://default:password@your-redis-cloud-host:port | ||
| ``` | ||
|
|
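Before pointing the sample at a remote instance, it can help to sanity-check the URL without printing the password. The sketch below uses only the Python standard library; the host, port, and credentials in the example are placeholders, and the `rediss://` check reflects the URL scheme redis-py accepts for TLS connections.

```python
from urllib.parse import urlparse


def describe_redis_url(url):
    """Summarize a Redis connection URL without exposing the password."""
    parsed = urlparse(url)
    return {
        "scheme": parsed.scheme,
        "tls": parsed.scheme == "rediss",  # rediss:// means TLS
        "host": parsed.hostname,
        "port": parsed.port or 6379,  # fall back to the default Redis port
        "username": parsed.username,
        "has_password": parsed.password is not None,
    }


# Placeholder values; substitute the endpoint from your Redis Cloud console.
cloud_info = describe_redis_url("redis://default:secret@example.com:12345")
local_info = describe_redis_url("redis://localhost:6379")
```

In the sample itself the URL would come from the `REDIS_URL` environment variable, for example via `os.environ.get("REDIS_URL", "redis://localhost:6379")`.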
The sample is very thorough and easy to follow. Providing a
`load_data.py` script makes it much easier for users to get started.