This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This is a Retrieval-Augmented Generation (RAG) chatbot system that answers questions about DeepLearning.AI course materials. It uses ChromaDB for vector storage, Anthropic's Claude API with tool calling, and provides a web interface for conversational queries.
# Install dependencies (uses uv package manager)
uv sync
# Set up environment variables
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY# Quick start (recommended)
./run.sh
# Manual start (from project root)
cd backend && uv run uvicorn app:app --reload --port 8000
# Access points
# - Web UI: http://localhost:8000
# - API docs: http://localhost:8000/docsPlace text files in the docs/ folder. Files are automatically loaded on server startup. See "Document Format Requirements" below.
The system implements a tool-based RAG architecture where Claude decides when to search:
User Query → FastAPI → RAGSystem → AIGenerator → Claude API
↓
(Claude calls tool)
↓
SearchTool → VectorStore → ChromaDB
↓
(search results returned)
↓
Claude synthesizes response
↓
SessionManager stores history
↓
Return answer + sources
Key architectural decision: Claude has search as a callable tool, not an always-on feature. The system prompt instructs Claude to call search_course_content when needed, making searches contextual rather than automatic.
ChromaDB uses two separate collections with different purposes:
-
course_catalog(vector_store.py:51)- Purpose: Fuzzy matching of course names for search filtering
- Documents: Course titles only
- Metadata: Full course info (instructor, links, lesson metadata)
- IDs: Course title (serves as unique identifier)
- Usage: When user specifies a course name, semantic search finds the best match
-
course_content(vector_store.py:52)- Purpose: Actual semantic search of course material
- Documents: Text chunks with enriched context
- Metadata:
{course_title, lesson_number, chunk_index} - IDs:
"{course_title_snake_case}_{chunk_index}" - Usage: Primary search collection for answering queries
This separation enables fuzzy course name matching (e.g., "MCP" → "MCP: Build Rich-Context AI Apps") before searching content.
rag_system.py is the orchestration layer that:
- Coordinates all components (VectorStore, AIGenerator, SearchTool, SessionManager)
- Manages document ingestion and deduplication
- Handles query flow from input to response
ai_generator.py handles Claude API interactions:
- Builds API requests with system prompt, history, and tool definitions
- Processes tool calls from Claude
- Extracts responses and sources from Claude's output
- Uses temperature=0 for deterministic responses
session_manager.py maintains conversation state:
- Thread-safe session storage with dict-based in-memory storage
- Automatically trims history to last
MAX_HISTORYexchanges (default: 2) - Each session tracks conversation context for multi-turn queries
document_processor.py:25-91 implements sentence-aware chunking:
- Sentence splitting using regex that handles abbreviations (Mr., Dr., etc.)
- Chunk building up to 800 characters per chunk
- Overlap calculation - 100 characters shared between consecutive chunks by counting backwards from chunk end
- Context enrichment - First chunk of each lesson prefixed with
"Lesson N content: ...", last lesson chunks include course title
This preserves semantic boundaries and context across chunk boundaries.
Course documents must follow this structure:
Course Title: [title]
Course Link: [url]
Course Instructor: [name]
Lesson 0: [title]
Lesson Link: [url]
[content...]
Lesson 1: [title]
Lesson Link: [url]
[content...]
Processing behavior:
- Lines 1-3: Metadata extraction with regex matching
- Remaining lines: Parsed for
^Lesson\s+(\d+):\s*(.+)$markers - Content between lesson markers becomes lesson content
- Lesson links (optional) must appear immediately after lesson headers
- If no lesson markers found, entire file treated as single document
All configuration in backend/config.py as a dataclass:
CHUNK_SIZE: 800 characters (sentence-aware, not hard cutoff)CHUNK_OVERLAP: 100 characters between chunksMAX_RESULTS: 5 search results per queryMAX_HISTORY: 2 conversation exchanges retainedEMBEDDING_MODEL: "all-MiniLM-L6-v2" (384-dimensional embeddings)ANTHROPIC_MODEL: "claude-sonnet-4-20250514"CHROMA_PATH: "./chroma_db" (persistent vector storage)
rag_system.py:76 checks existing course titles before processing. If a course with the same title already exists in the vector store, it's skipped. To reload a course, clear the vector store first.
search_tools.py defines the search_course_content tool with three parameters:
query(required): What to search forcourse_name(optional): Fuzzy-matched against course_cataloglesson_number(optional): Filter to specific lesson
The system prompt instructs Claude to use this tool strategically, not for every query.
vector_store.py:118-133 builds ChromaDB filters:
- Both course + lesson:
{"$and": [{"course_title": "..."}, {"lesson_number": N}]} - Course only:
{"course_title": "..."} - Lesson only:
{"lesson_number": N} - Neither: No filter (search all content)
Sessions are created implicitly if no session_id is provided. Frontend passes session_id back to maintain conversation context. Sessions are stored in-memory (lost on restart).
- app.py: FastAPI application, startup document loading, API endpoints
- rag_system.py: Main orchestration, coordinates all components
- vector_store.py: ChromaDB wrapper, dual collection management, search logic
- ai_generator.py: Claude API integration, tool call handling
- document_processor.py: Metadata extraction, chunking algorithm
- search_tools.py: Tool definitions for Claude function calling
- session_manager.py: Conversation history management
- config.py: Centralized configuration
- models.py: Pydantic data models (Course, Lesson, CourseChunk)
Vanilla JavaScript application (frontend/) with no framework dependencies:
- index.html: Chat UI structure
- script.js: API communication, message handling
- style.css: Responsive styling
Frontend communicates with backend via /api/query POST endpoint, receives responses with {answer, sources, session_id}.
Place files in docs/ folder matching the required format. Supported extensions: .txt, .pdf, .docx. Server automatically loads on startup.
Edit CHUNK_SIZE and CHUNK_OVERLAP in config.py. Larger chunks provide more context but reduce granularity. More overlap improves context preservation but increases storage.
Modify MAX_RESULTS in config.py to return more/fewer chunks per search. More results give Claude more context but increase token usage.
Change MAX_HISTORY in config.py. Higher values retain more context but increase token costs. Each exchange = 2 messages (user + assistant).