Commit 9dfa134

Merge pull request #46 from authzed/milvus
feat: migrate agentic-rag-authorization from Weaviate to Milvus
2 parents 9dbe7fb + 23b4586 commit 9dfa134

20 files changed

Lines changed: 428 additions & 276 deletions

agentic-rag-authorization/README.md

Lines changed: 53 additions & 44 deletions
Original file line number | Diff line number | Diff line change
@@ -1,12 +1,24 @@
11
# Agentic RAG with Fine-Grained Authorization
22

3+
> **Also available:** [Weaviate version](https://github.com/authzed/examples/tree/weaviate/agentic-rag-authorization) (BM25 keyword search)
34
4-
This repository demonstrates how to combine agentic behavior with deterministic fine-grained authorization using LangGraph, SpiceDB, and Weaviate. You'll learn to build RAG systems where a user can view information only based on the documents they have access to.
5+
This repository demonstrates how to combine agentic behavior with deterministic fine-grained authorization using LangGraph, SpiceDB, and [Milvus](https://github.com/milvus-io/milvus). You'll learn to build RAG systems where a user can only see information from the documents they have access to.
56

6-
This project uses the [LangChain SpiceDB](https://pypi.org/project/langchain-spicedb/) library
7+
This project uses the [LangChain SpiceDB](https://pypi.org/project/langchain-spicedb/) library.
78

89
![screengrab](agentic-rag.gif)
910

11+
12+
## TL;DR (human-written)
13+
14+
RAG systems typically focus on retrieval mechanisms but lack fine-grained access control to check whether the retrieved information is accessible to the user asking the query. This demo shows the setup for a production-like agentic RAG system. It has a corpus of 50 documents with complex sharing requirements that span individuals, departments, and exceptions.
15+
16+
The two takeaways from this demo are:
17+
18+
1. Using ReBAC makes it simple to model complex hierarchical permissions. The complexity increases in the context of RAG and AI applications because there are 10x more principals, so traditional authorization methods such as RBAC fall flat.
19+
20+
2. Never ever let an AI Agent *decide* if it needs to check for authorization. Gen AI is inherently probabilistic so you have to ensure that permission checks are deterministic and cannot be skipped.
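The second takeaway can be sketched in a few lines. This is a minimal illustration, not the repo's LangGraph graph: `ALLOWED`, `retrieve`, `authorize`, `generate`, and `answer` are hypothetical stand-ins (an in-memory table instead of SpiceDB, a stub instead of Milvus). The point is only that the authorization step is wired into the control flow, so no model output can skip it.

```python
# Hypothetical permission table standing in for SpiceDB: user -> viewable doc IDs.
ALLOWED: dict[str, set[str]] = {
    "bob": {"handbook", "sales-playbook"},
    "hr_manager": {"handbook", "sales-playbook", "hr-policy"},
}


def retrieve(query: str) -> list[str]:
    """Stub retriever: returns the same candidates for every query."""
    return ["handbook", "hr-policy", "sales-playbook"]


def authorize(user: str, doc_ids: list[str]) -> list[str]:
    """Deterministic filter: keep only documents the user may view."""
    viewable = ALLOWED.get(user, set())  # unknown users get nothing
    return [doc_id for doc_id in doc_ids if doc_id in viewable]


def generate(query: str, doc_ids: list[str]) -> str:
    """Stub generator: reports which documents it was allowed to use."""
    if not doc_ids:
        return "No accessible documents for this query."
    return f"Answer based on: {', '.join(doc_ids)}"


def answer(user: str, query: str) -> str:
    # The order is fixed in code; nothing here is chosen by a model.
    candidates = retrieve(query)
    permitted = authorize(user, candidates)
    return generate(query, permitted)


print(answer("bob", "What are the company handbook guidelines?"))
```

Because `generate` only ever receives the output of `authorize`, a prompt cannot talk the system into reading a denied document.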
21+
1022
## Documentation Navigation
1123

1224
- **[README.md](README.md)** (you are here) - Overview, quick start, core concepts
@@ -17,12 +29,12 @@ This project uses the [LangChain SpiceDB](https://pypi.org/project/langchain-spi
1729

1830
This repo demonstrates:
1931

20-
1. **Fine-grained authorization in RAG** - How to enforce document-level permissions with SpiceDB to ensure the user only information based on what they have access to
21-
2. **Security architecture** - Deterministic authorization boundary that cannot be bypassed
32+
1. **Fine-grained authorization in RAG** - How to enforce document-level permissions with SpiceDB so users only see what they're allowed to see
33+
2. **Security architecture** - A deterministic authorization boundary that cannot be bypassed by the agent
2234
3. **Production features** - Structured logging, connection pooling, batch operations, error handling
23-
4. **Real-world complexity** - 50 documents, 4 permission patterns with hierarchies.
35+
4. **Real-world complexity** - 50 documents, 4 permission patterns with hierarchies
2436

25-
Note: Despite the "agentic RAG" name, the default mode is intentionally simple and deterministic (3 nodes: retrieve → authorize → generate). This provides fast, predictable behavior suitable for most use cases.
37+
Note: Despite the "agentic RAG" name, the default mode is intentionally simple and deterministic (3 nodes: retrieve → authorize → generate). This provides fast, predictable behavior suitable for most use cases. An optional `MAX_RETRIEVAL_ATTEMPTS` setting lets the agent reason about whether it needs to retrieve more data.
2638

2739
## The Problem This Solves
2840

@@ -31,14 +43,14 @@ Traditional RAG retrieves documents by semantic similarity without considering p
3143
1. **Security risk**: Users might see documents they shouldn't access
3244
2. **Poor UX**: Silent failures when documents are denied, with no explanation
3345

34-
Read the [OWASP Top 10 for LLM](https://owasp.org/www-project-top-10-for-large-language-model-applications/) and [OWASP Top 10 Risks to Web Apps](https://owasp.org/Top10/2025/A01_2025-Broken_Access_Control/) for more information on why access control matters.
46+
Read the [OWASP Top 10 for LLM](https://owasp.org/www-project-top-10-for-large-language-model-applications/) and [OWASP Top 10 Risks to Web Apps](https://owasp.org/Top10/2025/A01_2025-Broken_Access_Control/) for more on why access control matters.
3547

3648
## The Solution
3749

3850
This implementation shows how to combine:
39-
- **Retrieval-first approach**: Direct semantic/keyword search without upfront planning overhead
51+
- **Retrieval-first approach**: Semantic vector search without upfront planning overhead
4052
- **Deterministic security**: SpiceDB authorization that cannot be bypassed
41-
- **Transparency**: Users understand what they can/can't access and why
53+
- **Transparency**: Users understand what they can and can't access, and why
4254

4355
```
4456
Traditional RAG: Query → Retrieve → Generate
@@ -123,29 +135,30 @@ pip install -r requirements.txt # Includes fastapi and uvicorn
123135
python3 run_ui.py
124136
```
125137

126-
The `setup-environment.py` file sets up Weaviate as the vector DB and SpiceDB with sample documents and department-based access control for the agentic RAG system.
127-
128-
We're creating a schema and writing relationships for a hierarchical permission model with users assigned to departments, department-wide document access, 3 cross-department collaboration grants, and 3 individual user exceptions.
138+
The `setup_environment.py` script sets up Milvus as the vector database and SpiceDB with sample documents and department-based access control. It embeds all 50 documents using OpenAI's `text-embedding-3-small` and inserts them into Milvus, then writes a hierarchical permission model to SpiceDB: users assigned to departments, department-wide document access, 3 cross-department collaboration grants, and 3 individual user exceptions.
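To make the shape of that permission model concrete, here is a small sketch that formats the four kinds of relationships as `resource#relation@subject` strings. The document and user IDs are invented for illustration and are not taken from the repo's dataset; `rel` is a hypothetical helper, and the `member` relation on `department` is assumed from the `department#member` subject type in the schema.

```python
def rel(resource: str, relation: str, subject: str) -> str:
    """Format one relationship as resource#relation@subject."""
    return f"{resource}#{relation}@{subject}"


relationships = [
    # A user assigned to a department.
    rel("department:sales", "member", "user:bob"),
    # Department-wide access to one of the department's own documents.
    rel("document:sales-playbook", "viewer", "department:sales#member"),
    # Cross-department grant: sales members may view an engineering document.
    rel("document:eng-roadmap", "viewer", "department:sales#member"),
    # Individual exception: one user is granted a document directly.
    rel("document:hr-policy", "viewer", "user:bob"),
]

for relationship in relationships:
    print(relationship)
```

Department-wide access and cross-department grants use the same relation; the only difference is which department's members are named as the subject.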
129139

130140
The UI launcher will:
131-
- Verify documents are loaded
132-
- Starts the FastAPI server
141+
- Verify documents are loaded in Milvus
142+
- Start the FastAPI server
133143
- Open your browser to http://localhost:8000
134144

135-
Here are few sample prompts you can run:
145+
Here are a few sample prompts to try:
136146

137-
Choose "Bob" from "Sales" as the user and the query as "What are the company handbook guidelines?"
147+
Choose "Bob" from "Sales" as the user and run the query "What are the company handbook guidelines?"
138148

139-
You should see:
149+
You should see:
150+
```
140151
📊 Retrieved: 5
141152
✅ Authorized: 3
142153
❌ Denied: 2
154+
```
143155

144-
Now run the same query as the "HR Manager". You should see:
156+
Now run the same query as "HR Manager":
157+
```
145158
📊 Retrieved: 5
146159
✅ Authorized: 5
147160
❌ Denied: 0
148-
161+
```
149162

150163
### Manual Start
151164

@@ -162,7 +175,7 @@ open http://localhost:8000
162175

163176
## Run Without UI
164177

165-
```
178+
```bash
166179
# Initialize data
167180
python3 examples/setup_environment.py
168181

@@ -182,8 +195,11 @@ definition department {
182195
}
183196
184197
definition document {
198+
relation owner: user
185199
relation viewer: user | department#member
186-
permission view = viewer
200+
201+
permission view = viewer + owner
202+
permission edit = owner
187203
}
188204
```
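As a rough mental model of how `view = viewer + owner` and `edit = owner` resolve, the sketch below evaluates the same rules over in-memory sets. It is a toy, not SpiceDB and not the repo's code: the dictionaries, IDs, and the `can_view` / `can_edit` helpers are all hypothetical. In the schema language `+` is a union, so a user can view a document if any one of the paths grants it.

```python
# Toy relationship data (hypothetical IDs).
members = {"sales": {"bob"}}                      # department -> member users
owners = {"q3-forecast": {"alice"}}               # document -> owner users
viewer_users = {"q3-forecast": {"carol"}}         # document -> direct viewers
viewer_departments = {"q3-forecast": {"sales"}}   # document -> viewer departments


def can_view(user: str, doc: str) -> bool:
    """view = viewer + owner, where viewer is a user or a department#member."""
    direct = user in viewer_users.get(doc, set())
    via_department = any(
        user in members.get(dept, set())
        for dept in viewer_departments.get(doc, set())
    )
    is_owner = user in owners.get(doc, set())
    return direct or via_department or is_owner


def can_edit(user: str, doc: str) -> bool:
    """edit = owner."""
    return user in owners.get(doc, set())


for user in ("alice", "bob", "carol", "dave"):
    print(user, can_view(user, "q3-forecast"), can_edit(user, "q3-forecast"))
```

Here `bob` can view through department membership but cannot edit, while `dave` has no path to the document and is denied both.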
189205

@@ -198,7 +214,7 @@ definition document {
198214
```
199215
User Query
200216
201-
Retrieval Node ← Weaviate BM25 keyword search
217+
Retrieval Node ← Milvus semantic vector search (text-embedding-3-small)
202218
203219
Authorization Node ← SpiceDB filters (SECURITY BOUNDARY - cannot be bypassed)
204220
@@ -225,7 +241,7 @@ Reasoning Node ← LLM decides: retry with different query, or give up?
225241
Generation Node ← explains the denial
226242
```
227243

228-
For example, if Bob (sales) asks about "microservices architecture" and the first retrieval returns only engineering-only docs, the reasoning node might try a broader query that surfaces a shared architecture doc Bob can actually access.
244+
For example, if Bob (sales) asks about "microservices architecture" and the first retrieval returns only engineering-restricted docs, the reasoning node might try a broader query that surfaces a shared architecture doc Bob can actually access.
229245

230246
Enable it by setting `MAX_RETRIEVAL_ATTEMPTS` in `.env` (or passing `max_attempts` directly):
231247

@@ -241,31 +257,30 @@ result = run_agentic_rag(query="...", subject_id="bob", max_attempts=3)
241257

242258
### 3. Security Guarantees
243259

244-
- **Authorization always runs**: Hardcoded in LangGraph workflow, agent cannot skip
245-
- **Deterministic checks**: SpiceDB enforces permissions (no LLM involved)
260+
- **Authorization always runs**: Hardcoded in the LangGraph workflow — the agent cannot skip it
261+
- **Deterministic checks**: SpiceDB enforces permissions (no LLM involved in the decision)
246262
- **Fail closed**: Access denied unless explicitly granted
247263
- **Observable**: Full audit trail in state
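The "fail closed" and "observable" guarantees can be illustrated with a short sketch. It assumes a hypothetical `check(user, doc_id)` callable in place of the real SpiceDB call, and the names `filter_fail_closed` and `flaky_check` are invented here. A document is kept only when the check explicitly returns `True`; an error counts as a denial and is recorded in the audit trail.

```python
from typing import Callable


def filter_fail_closed(
    user: str,
    doc_ids: list[str],
    check: Callable[[str, str], bool],
) -> tuple[list[str], list[dict]]:
    """Return (authorized doc IDs, audit trail); any error counts as a denial."""
    authorized: list[str] = []
    audit: list[dict] = []
    for doc_id in doc_ids:
        try:
            allowed = check(user, doc_id) is True
            reason = "granted" if allowed else "not granted"
        except Exception as exc:
            # Fail closed: a failed check never grants access.
            allowed = False
            reason = f"check failed: {type(exc).__name__}"
        audit.append({"doc": doc_id, "allowed": allowed, "reason": reason})
        if allowed:
            authorized.append(doc_id)
    return authorized, audit


def flaky_check(user: str, doc_id: str) -> bool:
    """Hypothetical checker that errors for one document."""
    if doc_id == "hr-policy":
        raise TimeoutError("permission service unavailable")
    return doc_id == "handbook"


docs, trail = filter_fail_closed(
    "bob", ["handbook", "hr-policy", "eng-roadmap"], flaky_check
)
print(docs)
for entry in trail:
    print(entry)
```

Keeping the reason next to each decision is what allows a later step to explain a denial without re-running the check.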
248264

249265
## Project Structure
250266

251267
```
252-
agentic-rag-weaviate/
268+
agentic-rag-authorization/
253269
├── agentic_rag/
254270
│ ├── graph.py # LangGraph state machine
255271
│ ├── state.py # State schema
256272
│ ├── config.py # Configuration management
257273
│ ├── nodes/
258-
│ │ ├── retrieval_node.py # Weaviate BM25 search
274+
│ │ ├── retrieval_node.py # Milvus semantic vector search
259275
│ │ ├── authorization_node.py # SpiceDB filtering (security boundary)
260-
│ │ ├── reasoning_node.py # Optional: adaptive retry logic
261-
│ │ └── generation_node.py # Final answer with context
262-
│ ├── authorization_helpers.py # Batch permission checking
263-
│ ├── weaviate_client.py # Connection pooling for Weaviate
276+
│ │ ├── reasoning_node.py # Optional: adaptive retry logic
277+
│ │ └── generation_node.py # Final answer with context
278+
│ ├── milvus_client.py # Connection pooling for Milvus
264279
│ ├── grpc_helpers.py # Connection pooling for SpiceDB
265280
│ ├── logging_config.py # Structured JSON logging
266281
│ └── validation.py # Input validation and sanitization
267282
├── examples/
268-
│ ├── setup_environment.py # Initialize data (loads 50 documents)
283+
│ ├── setup_environment.py # Initialize data (embeds and loads 50 documents)
269284
│ └── basic_example.py # 8 demo scenarios
270285
├── scripts/
271286
│ ├── generate_documents.py # Generate 50 .txt files
@@ -275,7 +290,7 @@ agentic-rag-weaviate/
275290
│ ├── documents/ # 50 .txt files (5 departments)
276291
│ ├── schema.zed # SpiceDB permission schema
277292
│ └── PERMISSIONS.md # Permission matrix
278-
└── docker-compose.yml # Weaviate + SpiceDB
293+
└── docker-compose.yml # Milvus + SpiceDB
279294
```
280295

281296
## Configuration
@@ -287,10 +302,11 @@ Environment variables (`.env`):
287302
OPENAI_API_KEY=sk-...
288303

289304
# Optional (defaults shown)
290-
WEAVIATE_URL=http://localhost:8080
305+
MILVUS_URI=http://localhost:19530
306+
MILVUS_TOKEN=
291307
SPICEDB_ENDPOINT=localhost:50051
292308
SPICEDB_TOKEN=devtoken
293-
MAX_RETRIEVAL_ATTEMPTS=3
309+
MAX_RETRIEVAL_ATTEMPTS=1
294310
```
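A minimal sketch of how these variables might be read with their defaults. `Settings` and `load_settings` are hypothetical names used only for illustration (the project's own loader is `Config.from_env` in `agentic_rag/config.py`), and the integer parsing of `MAX_RETRIEVAL_ATTEMPTS` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    milvus_uri: str
    milvus_token: str
    spicedb_endpoint: str
    spicedb_token: str
    max_retrieval_attempts: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    return Settings(
        milvus_uri=env.get("MILVUS_URI", "http://localhost:19530"),
        milvus_token=env.get("MILVUS_TOKEN", ""),
        spicedb_endpoint=env.get("SPICEDB_ENDPOINT", "localhost:50051"),
        spicedb_token=env.get("SPICEDB_TOKEN", "devtoken"),
        # Assumed to be an integer; 1 means a single retrieval pass.
        max_retrieval_attempts=int(env.get("MAX_RETRIEVAL_ATTEMPTS", "1")),
    )


print(load_settings({}))
print(load_settings({"MAX_RETRIEVAL_ATTEMPTS": "3"}).max_retrieval_attempts)
```

Passing the environment as a mapping keeps the loader easy to test without touching real environment variables.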
295311

296312
## Dataset Overview
@@ -318,14 +334,6 @@ The `examples/basic_example.py` demonstrates 8 scenarios:
318334
7. **HR Department** - hr_manager queries HR policies
319335
8. **Transparent Explanations** - Agent explains why access was denied
320336

321-
## Contributing & Extending
322-
323-
See [CONTRIBUTING.md](CONTRIBUTING.md) for:
324-
- Development setup
325-
- Adding documents and permissions
326-
- Customizing agent behavior
327-
- Extending the system
328-
329337
## Testing
330338

331339
```bash
@@ -339,8 +347,9 @@ pytest tests/test_basic_flow.py::test_authorized_access
339347
## Learn More
340348

341349
- **SpiceDB**: https://authzed.com/docs
342-
- **Weaviate**: https://weaviate.io/developers/weaviate
350+
- **Milvus**: https://milvus.io/docs
343351
- **LangGraph**: https://langchain-ai.github.io/langgraph/
352+
- **langchain-spicedb**: https://github.com/authzed/langchain-spicedb
344353

345354
## License
346355

agentic-rag-authorization/agentic_rag/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
"""Agentic RAG with fine-grained authorization using Weaviate and SpiceDB."""
1+
"""Agentic RAG with fine-grained authorization using Milvus and SpiceDB."""
22

33
__version__ = "0.1.0"
44

agentic-rag-authorization/agentic_rag/config.py

Lines changed: 5 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,6 @@
22

33
from dataclasses import dataclass
44
from functools import lru_cache
5-
from typing import Optional
65
import os
76
from dotenv import load_dotenv
87

@@ -13,9 +12,9 @@
1312
class Config:
1413
    """Configuration for agentic RAG system."""
1514

16-
    # Weaviate
17-
    weaviate_url: str
18-
    weaviate_api_key: Optional[str]
15+
    # Milvus
16+
    milvus_uri: str
17+
    milvus_token: str
1918

2019
    # SpiceDB
2120
    spicedb_endpoint: str
@@ -34,8 +33,8 @@ class Config:
3433
    def from_env(cls):
3534
        """Load configuration from environment variables."""
3635
        return cls(
37-
            weaviate_url=os.getenv("WEAVIATE_URL", "http://localhost:8080"),
38-
            weaviate_api_key=os.getenv("WEAVIATE_API_KEY"),
36+
            milvus_uri=os.getenv("MILVUS_URI", "http://localhost:19530"),
37+
            milvus_token=os.getenv("MILVUS_TOKEN", ""),
3938
            spicedb_endpoint=os.getenv("SPICEDB_ENDPOINT", "localhost:50051"),
4039
            spicedb_token=os.getenv("SPICEDB_TOKEN", "devtoken"),
4140
            openai_api_key=os.getenv("OPENAI_API_KEY", ""),

agentic-rag-authorization/agentic_rag/graph.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,7 @@ def build_agentic_rag_graph():
5555
"""Build the agentic RAG graph with deterministic authorization.
5656
5757
Simplified Flow:
58-
1. Retrieval: Fetch documents from Weaviate
58+
1. Retrieval: Fetch documents from Milvus
5959
2. Authorization: Deterministic permission check (security boundary)
6060
3. Conditional:
6161
- If authorized docs exist: Generate answer

agentic-rag-authorization/agentic_rag/logging_config.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -104,7 +104,7 @@ def setup_logging(level: str = "INFO") -> None:
104104
    logging.getLogger("httpx").setLevel(logging.WARNING)
105105
    logging.getLogger("httpcore").setLevel(logging.WARNING)
106106
    logging.getLogger("openai").setLevel(logging.WARNING)
107-
    logging.getLogger("weaviate").setLevel(logging.WARNING)
107+
    logging.getLogger("pymilvus").setLevel(logging.WARNING)
108108
    logging.getLogger("grpc").setLevel(logging.WARNING)
109109

110110

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,26 @@
1+
"""Milvus client connection pooling."""
2+
3+
from pymilvus import MilvusClient
4+
from threading import Lock
5+
from typing import Optional
6+
7+
_milvus_client: Optional[MilvusClient] = None
8+
_milvus_lock = Lock()
9+
10+
11+
def get_milvus_client(uri: str, token: str = "") -> MilvusClient:
12+
    """Get or create reusable MilvusClient (singleton, thread-safe)."""
13+
    global _milvus_client
14+
    if _milvus_client is not None:
15+
        return _milvus_client
16+
    with _milvus_lock:
17+
        if _milvus_client is None:
18+
            _milvus_client = MilvusClient(uri=uri, token=token)
19+
    return _milvus_client
20+
21+
22+
def reset_milvus_client():
23+
    """Reset singleton (useful for testing)."""
24+
    global _milvus_client
25+
    with _milvus_lock:
26+
        _milvus_client = None
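
The double-checked locking pattern used above can be exercised without a Milvus server. The sketch below repeats the same structure with a hypothetical `FakeClient` in place of `MilvusClient` (and hypothetical names `get_client`, `_client`, `_lock`), then calls it from several threads to show that only one instance is ever constructed.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock
from typing import Optional


class FakeClient:
    """Stand-in for MilvusClient so the sketch runs without a server."""

    created = 0  # counts how many instances were constructed

    def __init__(self, uri: str):
        FakeClient.created += 1
        self.uri = uri


_client: Optional[FakeClient] = None
_lock = Lock()


def get_client(uri: str) -> FakeClient:
    """Return the shared client, creating it on first use."""
    global _client
    if _client is not None:   # fast path: no lock once initialized
        return _client
    with _lock:
        if _client is None:   # re-check: another thread may have won the race
            _client = FakeClient(uri)
    return _client


with ThreadPoolExecutor(max_workers=8) as pool:
    clients = list(pool.map(get_client, ["http://localhost:19530"] * 32))

print(FakeClient.created)
print(all(client is clients[0] for client in clients))
```

One consequence of this pattern is that arguments passed on later calls are ignored once the first client exists, so a different `uri` only takes effect after the singleton is reset.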
