Agentic RAG with Fine-Grained Authorization

Also available: Weaviate version (BM25 keyword search)

This repository demonstrates how to combine agentic behavior with deterministic fine-grained authorization using LangGraph, SpiceDB, and Milvus. You'll learn to build RAG systems where a user can only see information from the documents they have access to.

This project uses the LangChain SpiceDB library.

TL;DR (human-written)

RAG systems typically focus on retrieval mechanisms but lack fine-grained access control to check whether the retrieved information is accessible to the user asking the query. This demo shows the setup for a prod-like agentic RAG. It has a corpus of 50 documents with complex sharing requirements that span individuals, departments, and exceptions.

The two takeaways from this demo are:

Using ReBAC (relationship-based access control) makes it simple to model complex hierarchical permissions. The complexity increases in the context of RAG and AI applications as there are 10x more principals, so traditional authorization methods such as RBAC fall flat.
Never ever let an AI Agent decide if it needs to check for authorization. Gen AI is inherently probabilistic so you have to ensure that permission checks are deterministic and cannot be skipped.

Documentation Navigation

README.md (you are here) - Overview, quick start, core concepts
ARCHITECTURE.md - Deep dive into system design, security model, and trade-offs
data/PERMISSIONS.md - Permission matrix and authorization patterns

What You'll Learn

This repo demonstrates:

Fine-grained authorization in RAG - How to enforce document-level permissions with SpiceDB so users only see what they're allowed to see
Security architecture - A deterministic authorization boundary that cannot be bypassed by the agent
Production features - Structured logging, connection pooling, batch operations, error handling
Real-world complexity - 50 documents, 4 permission patterns with hierarchies

Note: Despite the "agentic RAG" name, the default mode is intentionally simple and deterministic (3 nodes: retrieve → authorize → generate). This provides fast, predictable behavior suitable for most use cases. Setting MAX_RETRIEVAL_ATTEMPTS above 1 enables an adaptive mode in which the agent can reason about whether it has to retrieve more data.

The Problem This Solves

Traditional RAG retrieves documents by semantic similarity without considering permissions. This creates two issues:

Security risk: Users might see documents they shouldn't access
Poor UX: Silent failures when documents are denied, with no explanation

Read the OWASP Top 10 for LLM Applications and the OWASP Top 10 for web applications for more on why access control matters.

The Solution

This implementation shows how to combine:

Retrieval-first approach: Semantic vector search without upfront planning overhead
Deterministic security: SpiceDB authorization that cannot be bypassed
Transparency: Users understand what they can and can't access, and why

Traditional RAG:  Query → Retrieve → Generate
                           ↓
                    (no permission checks)

This approach (default):  Query → Retrieve → [SpiceDB Authorizes] → Generate
                                               ↓
                                       Security boundary

Quick Example

# Alice (engineering department) queries engineering docs
Query: "What are our system architecture best practices?"
User: alice

Result:
✅ Retrieved: 3 documents via semantic search
✅ Authorized: 2 documents (eng-001, eng-002)
❌ Denied: 1 document (hr-001)

Answer: "Based on the engineering documents, our system uses microservices
architecture with event-driven patterns..."

# Bob (sales department) queries engineering docs
Query: "What are our system architecture best practices?"
User: bob

Result:
✅ Retrieved: 3 documents
❌ Authorized: 0 documents
❌ Denied: 3 documents

Answer: "I don't have access to the engineering documents needed to answer
this question. This information is restricted to the engineering department."

The agent transparently explains access limitations instead of failing silently.

Setup (5 minutes)

Prerequisites

Docker & Docker Compose
Python 3.11+
OpenAI API key

Steps

# 1. Configure
cp .env.example .env
# Edit .env with your actual OpenAI API key

# 2. Start services
docker-compose up -d

# 3. Install dependencies
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Web UI

A web interface is available to demonstrate the authorization capabilities interactively.

Quick Start

# 1. Ensure services are running and data is initialized
docker-compose up -d
python3 examples/setup_environment.py

# 2. Install web dependencies
pip install -r requirements.txt  # Includes fastapi and uvicorn

# 3. Launch UI (includes pre-flight checks)
python3 run_ui.py

The setup_environment.py script initializes Milvus (the vector database) and SpiceDB with the sample documents and department-based access control. It embeds all 50 documents using OpenAI's text-embedding-3-small and inserts them into Milvus. It then writes a hierarchical permission model to SpiceDB: users assigned to departments, department-wide document access, 3 cross-department collaboration grants, and 3 individual user exceptions.

The UI launcher will:

Verify documents are loaded in Milvus
Start the FastAPI server
Open your browser to http://localhost:8000

Here are a few sample prompts to try:

Choose "Bob" from "Sales" as the user and run the query "What are the company handbook guidelines?"

You should see:

📊 Retrieved: 5
✅ Authorized: 3
❌ Denied: 2

Now run the same query as "HR Manager":

📊 Retrieved: 5
✅ Authorized: 5
❌ Denied: 0

Manual Start

# Terminal 1: Start services (if not running)
docker-compose up -d

# Terminal 2: Start API server
uvicorn api.main:app --reload --host 0.0.0.0 --port 8000

# Browser
open http://localhost:8000

Run Without UI

# Initialize data
python3 examples/setup_environment.py

# Run demo
python3 examples/basic_example.py

How It Works

1. Authorization Model (SpiceDB)

definition user {}

definition department {
    relation member: user
}

definition document {
    relation owner: user
    relation viewer: user | department#member

    permission view = viewer + owner
    permission edit = owner
}

Relationships:

alice is a member of engineering department
eng-001 document has viewer = engineering#member
Result: alice can view eng-001 ✅
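
To make the evaluation of this schema concrete, here is a minimal in-memory sketch of how `view = viewer + owner` resolves for a user. It is a toy stand-in for illustration only and does not call SpiceDB; the `RelationshipStore` class and its method names are hypothetical and not part of this repository or of any SpiceDB client.

```python
from dataclasses import dataclass, field


@dataclass
class RelationshipStore:
    """Toy stand-in for SpiceDB relationships (illustration only)."""

    # department name -> user ids with the `member` relation
    members: dict[str, set[str]] = field(default_factory=dict)
    # document id -> user ids with the `owner` relation
    owners: dict[str, set[str]] = field(default_factory=dict)
    # document id -> subjects with the `viewer` relation, written as
    # "user:<id>" or "department:<name>#member"
    viewers: dict[str, set[str]] = field(default_factory=dict)

    def can_view(self, user: str, document: str) -> bool:
        """permission view = viewer + owner (union of both relations)."""
        if user in self.owners.get(document, set()):
            return True
        for subject in self.viewers.get(document, set()):
            if subject == f"user:{user}":
                return True
            if subject.startswith("department:") and subject.endswith("#member"):
                dept = subject[len("department:"):-len("#member")]
                if user in self.members.get(dept, set()):
                    return True
        # Nothing matched: no relationship, no access.
        return False

    def can_edit(self, user: str, document: str) -> bool:
        """permission edit = owner."""
        return user in self.owners.get(document, set())


store = RelationshipStore(
    members={"engineering": {"alice"}, "sales": {"bob"}},
    viewers={"eng-001": {"department:engineering#member"}},
)

print(store.can_view("alice", "eng-001"))  # True: alice is an engineering member
print(store.can_view("bob", "eng-001"))    # False: no relationship grants access
```

The absence of a matching relationship is what denies access; there is no separate deny rule to forget.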

2. State Flow

Default Mode (max_attempts=1)

User Query
    ↓
Retrieval Node ← Milvus semantic vector search (text-embedding-3-small)
    ↓
Authorization Node ← SpiceDB filters (SECURITY BOUNDARY - cannot be bypassed)
    ↓
Generation Node ← Answer with authorized context + explanations
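
The default flow above can be sketched without any framework as three steps wired in a fixed order. This is a dependency-free illustration, not the repository's LangGraph code: `make_pipeline` and the `search`, `check`, and `generate` callables are hypothetical names, and the lambdas stand in for Milvus, SpiceDB, and the LLM.

```python
from typing import Callable

State = dict  # plain dict standing in for the graph state schema


def make_pipeline(
    search: Callable[[str], list[str]],
    check: Callable[[str, str], bool],
    generate: Callable[[str, list[str], list[str]], str],
) -> Callable[[str, str], State]:
    """Wire retrieve -> authorize -> generate in a fixed order."""

    def run(query: str, subject_id: str) -> State:
        state: State = {"query": query, "subject_id": subject_id}
        # 1. Retrieval: similarity only, permissions are not consulted here.
        state["retrieved"] = search(query)
        # 2. Authorization: runs on every call; there is no branch around it.
        state["authorized"] = [
            doc for doc in state["retrieved"] if check(subject_id, doc)
        ]
        state["denied"] = [
            doc for doc in state["retrieved"] if doc not in state["authorized"]
        ]
        # 3. Generation: receives only the authorized documents as context.
        state["answer"] = generate(query, state["authorized"], state["denied"])
        return state

    return run


# Stubbed dependencies that mirror the Quick Example.
acl = {("alice", "eng-001"), ("alice", "eng-002")}
run = make_pipeline(
    search=lambda query: ["eng-001", "eng-002", "hr-001"],
    check=lambda user, doc: (user, doc) in acl,
    generate=lambda query, ok, denied: (
        f"answered from {len(ok)} docs; {len(denied)} withheld"
    ),
)

result = run("What are our system architecture best practices?", "alice")
print(result["authorized"], result["denied"])
```

Because the order is fixed in code rather than chosen by a model, the generation step never sees a document that skipped the check.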

Adaptive Mode (max_attempts > 1)

When max_attempts is set above 1, a reasoning node activates if none of the retrieved documents pass authorization. The LLM analyzes why access was denied and decides whether a different retrieval strategy might find documents the user can access:

User Query
    ↓
Retrieval Node
    ↓
Authorization Node ← still deterministic, still non-bypassable
    ↓
 some docs authorized? → Yes → Generation Node
    ↓ No
Reasoning Node ← LLM decides: retry with different query, or give up?
    ↓
 attempts left? → Yes → Retrieval Node (loop)
    ↓ No
Generation Node ← explains the denial

For example, if Bob (sales) asks about "microservices architecture" and the first retrieval returns only engineering-restricted docs, the reasoning node might try a broader query that surfaces a shared architecture doc Bob can actually access.
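
A rough sketch of that retry loop follows, assuming the reasoning step is a function that returns a new query or `None` to give up. All names here (`run_with_retries`, `rewrite`, and the `shared-arch-001` document id) are hypothetical; in the real system the rewrite decision would come from the LLM.

```python
from typing import Callable, Optional


def run_with_retries(
    query: str,
    subject_id: str,
    search: Callable[[str], list[str]],
    check: Callable[[str, str], bool],
    rewrite: Callable[[str, list[str]], Optional[str]],
    max_attempts: int = 1,
) -> dict:
    """Retry retrieval with a rewritten query until something is authorized."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    attempts = []
    current = query
    for attempt in range(1, max_attempts + 1):
        retrieved = search(current)
        # Authorization runs on every attempt; the reasoning step cannot skip it.
        authorized = [doc for doc in retrieved if check(subject_id, doc)]
        attempts.append(
            {"query": current, "retrieved": retrieved, "authorized": authorized}
        )
        if authorized or attempt == max_attempts:
            break
        # Reasoning step: propose a different query, or None to give up.
        next_query = rewrite(current, retrieved)
        if next_query is None:
            break
        current = next_query

    return {"authorized": attempts[-1]["authorized"], "attempts": attempts}


# Stubbed index and permissions for Bob (sales).
index = {
    "microservices architecture": ["eng-001", "eng-002"],
    "architecture overview": ["shared-arch-001"],
}
acl = {("bob", "shared-arch-001")}

out = run_with_retries(
    "microservices architecture",
    "bob",
    search=lambda q: index.get(q, []),
    check=lambda user, doc: (user, doc) in acl,
    rewrite=lambda q, denied_docs: "architecture overview",
    max_attempts=3,
)
print(out["authorized"], len(out["attempts"]))
```

The model only influences which query is tried next. Whether a document is visible is still decided by the check on each pass.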

Enable it by setting MAX_RETRIEVAL_ATTEMPTS in .env (or passing max_attempts directly):

MAX_RETRIEVAL_ATTEMPTS=3  # default is 1

Or in code:

result = run_agentic_rag(query="...", subject_id="bob", max_attempts=3)

3. Security Guarantees

Authorization always runs: Hardcoded in the LangGraph workflow — the agent cannot skip it
Deterministic checks: SpiceDB enforces permissions (no LLM involved in the decision)
Fail closed: Access denied unless explicitly granted
Observable: Full audit trail in state
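
The fail-closed and observability guarantees can be illustrated with a small filter that treats any error from the permission check as a denial and records every decision. This is a sketch under assumptions: `filter_fail_closed`, the audit record shape, and the `flaky_check` stub are hypothetical and not taken from the repository.

```python
import logging
from typing import Callable

logger = logging.getLogger("authz_sketch")


def filter_fail_closed(
    subject_id: str,
    doc_ids: list[str],
    check: Callable[[str, str], bool],
) -> tuple[list[str], list[str], list[dict]]:
    """Split documents into authorized and denied, denying on any error."""
    authorized: list[str] = []
    denied: list[str] = []
    audit: list[dict] = []
    for doc_id in doc_ids:
        try:
            # Only an explicit True grants access; anything else is a denial.
            allowed = check(subject_id, doc_id) is True
            reason = "granted" if allowed else "no_permission"
        except Exception as exc:
            # Fail closed: an unreachable or failing check never grants access.
            allowed = False
            reason = f"check_error:{type(exc).__name__}"
            logger.warning("check failed for %s on %s", subject_id, doc_id)
        (authorized if allowed else denied).append(doc_id)
        audit.append({"doc_id": doc_id, "allowed": allowed, "reason": reason})
    return authorized, denied, audit


def flaky_check(user: str, doc: str) -> bool:
    """Stub check that fails for one document to exercise the error path."""
    if doc == "hr-001":
        raise ConnectionError("authorization service unreachable")
    return (user, doc) == ("alice", "eng-001")


ok, denied, audit = filter_fail_closed(
    "alice", ["eng-001", "eng-002", "hr-001"], flaky_check
)
print(ok, denied)
print(audit[-1]["reason"])
```

Keeping the audit records alongside the result is one way to make each denial explainable later.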

Project Structure

agentic-rag-authorization/
├── agentic_rag/
│   ├── graph.py               # LangGraph state machine
│   ├── state.py               # State schema
│   ├── config.py              # Configuration management
│   ├── nodes/
│   │   ├── retrieval_node.py      # Milvus semantic vector search
│   │   ├── authorization_node.py  # SpiceDB filtering (security boundary)
│   │   ├── reasoning_node.py      # Optional: adaptive retry logic
│   │   └── generation_node.py     # Final answer with context
│   ├── milvus_client.py       # Connection pooling for Milvus
│   ├── grpc_helpers.py        # Connection pooling for SpiceDB
│   ├── logging_config.py      # Structured JSON logging
│   └── validation.py          # Input validation and sanitization
├── examples/
│   ├── setup_environment.py   # Initialize data (embeds and loads 50 documents)
│   └── basic_example.py       # 8 demo scenarios
├── scripts/
│   ├── generate_documents.py  # Generate 50 .txt files
│   ├── parse_documents.py     # Parse documents into objects
│   └── verify_permissions.py  # Test authorization patterns
├── data/
│   ├── documents/             # 50 .txt files (5 departments)
│   ├── schema.zed             # SpiceDB permission schema
│   └── PERMISSIONS.md         # Permission matrix
└── docker-compose.yml         # Milvus + SpiceDB

Configuration

Environment variables (.env):

# Required
OPENAI_API_KEY=sk-...

# Optional (defaults shown)
MILVUS_URI=http://localhost:19530
MILVUS_TOKEN=
SPICEDB_ENDPOINT=localhost:50051
SPICEDB_TOKEN=devtoken
MAX_RETRIEVAL_ATTEMPTS=1
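
As an illustration of how these variables and their defaults might be read, here is a small sketch using only the standard library. The `Settings` class and `load_settings` function are hypothetical and may differ from what agentic_rag/config.py does. The sketch also assumes the values are already present in the process environment; it does not parse the .env file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Configuration values; defaults mirror the list above."""

    openai_api_key: str
    milvus_uri: str = "http://localhost:19530"
    milvus_token: str = ""
    spicedb_endpoint: str = "localhost:50051"
    spicedb_token: str = "devtoken"
    max_retrieval_attempts: int = 1


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise ValueError("OPENAI_API_KEY is required")

    attempts = int(env.get("MAX_RETRIEVAL_ATTEMPTS", "1"))
    if attempts < 1:
        raise ValueError("MAX_RETRIEVAL_ATTEMPTS must be at least 1")

    return Settings(
        openai_api_key=api_key,
        milvus_uri=env.get("MILVUS_URI", Settings.milvus_uri),
        milvus_token=env.get("MILVUS_TOKEN", Settings.milvus_token),
        spicedb_endpoint=env.get("SPICEDB_ENDPOINT", Settings.spicedb_endpoint),
        spicedb_token=env.get("SPICEDB_TOKEN", Settings.spicedb_token),
        max_retrieval_attempts=attempts,
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"OPENAI_API_KEY": "sk-test", "MAX_RETRIEVAL_ATTEMPTS": "3"})
print(settings.spicedb_endpoint, settings.max_retrieval_attempts)
```

Validating the required key and the attempt count at startup surfaces misconfiguration before any query runs.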

Dataset Overview

The repository includes a realistic 50-document dataset across 5 departments.

Authorization Patterns:

Department-based access (primary pattern)
Cross-department collaboration (3 shared documents)
Individual user exceptions (3 special grants)
Public documents (accessible to all users)

See data/PERMISSIONS.md for the complete permission matrix.

Sample Scenarios

The examples/basic_example.py script demonstrates 8 scenarios:

Department Access - alice queries engineering documents
Access Denial - bob attempts to access engineering documents
Cross-Department - bob accesses shared architecture document
Individual Exception - alice accesses sales proposal (special grant)
Public Access - Anyone can access company handbooks
Finance Department - finance_manager queries financial reports
HR Department - hr_manager queries HR policies
Transparent Explanations - Agent explains why access was denied

Testing

# Run all tests
pytest tests/

# Run specific test
pytest tests/test_basic_flow.py::test_authorized_access

Learn More

SpiceDB: https://authzed.com/docs
Milvus: https://milvus.io/docs
LangGraph: https://langchain-ai.github.io/langgraph/
langchain-spicedb: https://github.com/authzed/langchain-spicedb

License

MIT
