CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a Retrieval-Augmented Generation (RAG) chatbot system that answers questions about DeepLearning.AI course materials. It uses ChromaDB for vector storage, Anthropic's Claude API with tool calling, and provides a web interface for conversational queries.

Development Commands

Setup

# Install dependencies (uses uv package manager)
uv sync

# Set up environment variables
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY

Running the Application

# Quick start (recommended)
./run.sh

# Manual start (from project root)
cd backend && uv run uvicorn app:app --reload --port 8000

# Access points
# - Web UI: http://localhost:8000
# - API docs: http://localhost:8000/docs

Adding Course Documents

Place text files in the docs/ folder. Files are automatically loaded on server startup. See "Document Format Requirements" below.

Architecture

RAG Pipeline Flow

The system implements a tool-based RAG architecture where Claude decides when to search:

User Query → FastAPI → RAGSystem → AIGenerator → Claude API
                                                      ↓
                                            (Claude calls tool)
                                                      ↓
                                    SearchTool → VectorStore → ChromaDB
                                                      ↓
                                         (search results returned)
                                                      ↓
                                            Claude synthesizes response
                                                      ↓
                                          SessionManager stores history
                                                      ↓
                                            Return answer + sources

Key architectural decision: Claude has search as a callable tool, not an always-on feature. The system prompt instructs Claude to call search_course_content when needed, making searches contextual rather than automatic.

Dual Collection Strategy

ChromaDB uses two separate collections with different purposes:

course_catalog (vector_store.py:51)
- Purpose: Fuzzy matching of course names for search filtering
- Documents: Course titles only
- Metadata: Full course info (instructor, links, lesson metadata)
- IDs: Course title (serves as unique identifier)
- Usage: When user specifies a course name, semantic search finds the best match
course_content (vector_store.py:52)
- Purpose: Actual semantic search of course material
- Documents: Text chunks with enriched context
- Metadata: {course_title, lesson_number, chunk_index}
- IDs: "{course_title_snake_case}_{chunk_index}"
- Usage: Primary search collection for answering queries

This separation enables fuzzy course name matching (e.g., "MCP" → "MCP: Build Rich-Context AI Apps") before searching content.

Component Relationships

rag_system.py is the orchestration layer that:

Coordinates all components (VectorStore, AIGenerator, SearchTool, SessionManager)
Manages document ingestion and deduplication
Handles query flow from input to response

ai_generator.py handles Claude API interactions:

Builds API requests with system prompt, history, and tool definitions
Processes tool calls from Claude
Extracts responses and sources from Claude's output
Uses temperature=0 for deterministic responses

session_manager.py maintains conversation state:

Thread-safe session storage with dict-based in-memory storage
Automatically trims history to last MAX_HISTORY exchanges (default: 2)
Each session tracks conversation context for multi-turn queries

Text Chunking Strategy

document_processor.py:25-91 implements sentence-aware chunking:

Sentence splitting using regex that handles abbreviations (Mr., Dr., etc.)
Chunk building up to 800 characters per chunk
Overlap calculation - 100 characters shared between consecutive chunks by counting backwards from chunk end
Context enrichment - First chunk of each lesson prefixed with "Lesson N content: ...", last lesson chunks include course title

This preserves semantic boundaries and context across chunk boundaries.

Document Format Requirements

Course documents must follow this structure:

Course Title: [title]
Course Link: [url]
Course Instructor: [name]

Lesson 0: [title]
Lesson Link: [url]
[content...]

Lesson 1: [title]
Lesson Link: [url]
[content...]

Processing behavior:

Lines 1-3: Metadata extraction with regex matching
Remaining lines: Parsed for ^Lesson\s+(\d+):\s*(.+)$ markers
Content between lesson markers becomes lesson content
Lesson links (optional) must appear immediately after lesson headers
If no lesson markers found, entire file treated as single document

Configuration

All configuration in backend/config.py as a dataclass:

CHUNK_SIZE: 800 characters (sentence-aware, not hard cutoff)
CHUNK_OVERLAP: 100 characters between chunks
MAX_RESULTS: 5 search results per query
MAX_HISTORY: 2 conversation exchanges retained
EMBEDDING_MODEL: "all-MiniLM-L6-v2" (384-dimensional embeddings)
ANTHROPIC_MODEL: "claude-sonnet-4-20250514"
CHROMA_PATH: "./chroma_db" (persistent vector storage)

Important Patterns

Document Deduplication

rag_system.py:76 checks existing course titles before processing. If a course with the same title already exists in the vector store, it's skipped. To reload a course, clear the vector store first.

Tool Definition

search_tools.py defines the search_course_content tool with three parameters:

query (required): What to search for
course_name (optional): Fuzzy-matched against course_catalog
lesson_number (optional): Filter to specific lesson

The system prompt instructs Claude to use this tool strategically, not for every query.

Search Filtering

vector_store.py:118-133 builds ChromaDB filters:

Both course + lesson: {"$and": [{"course_title": "..."}, {"lesson_number": N}]}
Course only: {"course_title": "..."}
Lesson only: {"lesson_number": N}
Neither: No filter (search all content)

Session Management

Sessions are created implicitly if no session_id is provided. Frontend passes session_id back to maintain conversation context. Sessions are stored in-memory (lost on restart).

Key Files

app.py: FastAPI application, startup document loading, API endpoints
rag_system.py: Main orchestration, coordinates all components
vector_store.py: ChromaDB wrapper, dual collection management, search logic
ai_generator.py: Claude API integration, tool call handling
document_processor.py: Metadata extraction, chunking algorithm
search_tools.py: Tool definitions for Claude function calling
session_manager.py: Conversation history management
config.py: Centralized configuration
models.py: Pydantic data models (Course, Lesson, CourseChunk)

Frontend

Vanilla JavaScript application (frontend/) with no framework dependencies:

index.html: Chat UI structure
script.js: API communication, message handling
style.css: Responsive styling

Frontend communicates with backend via /api/query POST endpoint, receives responses with {answer, sources, session_id}.

Extending the System

Adding New Course Sources

Place files in docs/ folder matching the required format. Supported extensions: .txt, .pdf, .docx. Server automatically loads on startup.

Modifying Chunking Behavior

Edit CHUNK_SIZE and CHUNK_OVERLAP in config.py. Larger chunks provide more context but reduce granularity. More overlap improves context preservation but increases storage.

Changing Search Results1

Modify MAX_RESULTS in config.py to return more/fewer chunks per search. More results give Claude more context but increase token usage.

Adjusting Conversation Memory

Change MAX_HISTORY in config.py. Higher values retain more context but increase token costs. Each exchange = 2 messages (user + assistant).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLAUDE.md

Project Overview

Development Commands

Setup

Running the Application

Adding Course Documents

Architecture

RAG Pipeline Flow

Dual Collection Strategy

Component Relationships

Text Chunking Strategy

Document Format Requirements

Configuration

Important Patterns

Document Deduplication

Tool Definition

Search Filtering

Session Management

Key Files

Frontend

Extending the System

Adding New Course Sources

Modifying Chunking Behavior

Changing Search Results1

Adjusting Conversation Memory

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

CLAUDE.md

Project Overview

Development Commands

Setup

Running the Application

Adding Course Documents

Architecture

RAG Pipeline Flow

Dual Collection Strategy

Component Relationships

Text Chunking Strategy

Document Format Requirements

Configuration

Important Patterns

Document Deduplication

Tool Definition

Search Filtering

Session Management

Key Files

Frontend

Extending the System

Adding New Course Sources

Modifying Chunking Behavior

Changing Search Results1

Adjusting Conversation Memory