A production-grade multi-agent AI system for conversational data analysis. Upload a CSV and ask questions in natural language: specialized AI agents collaborate to analyze the data, generate interactive visualizations, and create professional PDF/PowerPoint reports in real time.
Key Feature: Agents maintain context across the conversation. Generate charts, then ask for a report with "these charts"; the system loads the stored charts from the session and reuses them instead of regenerating them.
User: "Analyze this sales data and create some insightful charts"
→ Code Interpreter Agent computes metrics (revenue, trends, categories)
→ Visualization Agent creates interactive Plotly charts
→ Charts are persisted to session for future use
User: "Create a PDF and PPT report from these charts with executive summary"
→ Orchestrator recognizes existing charts, skips regeneration
→ Presentation Agent generates professional reports with AI summaries
→ PDF and PowerPoint files ready for download
- Browser (index.html)
  - Single-page chat interface
  - Session ID persisted in memory
  - Real-time SSE streaming
  - Interactive Plotly charts
  - Calls the backend with POST /api/chat?session_id=xxx&file_id=xxx
- FastAPI Backend (api container)
  - Orchestrator Agent
    - Receives user query + conversation history
    - Loads session artifacts (charts) from Redis
    - Builds system prompt with context
    - Uses Gemini function calling to route to specialists
    - Implements smart chart reuse logic
    - Streams SSE events back to browser
  - Code Interpreter (specialist, sends code to the Executor Sidecar)
    - Gemini writes Python code
    - Executor runs in sandbox
    - Self-corrects on errors
  - Visualization Agent (specialist)
    - Gemini writes Plotly specs
    - Multi-chart support
    - Persists to session
  - Presentation Agent (specialist)
    - Synthesizes findings
    - AI executive summaries
    - Generates PDF/PPTX
- Executor Sidecar (Docker sandbox)
  - 256MB RAM limit
  - No network access
  - Read-only FS
  - 30s timeout
  - pandas + numpy
- Redis 7
  - session:{sid}:meta: Session metadata (30-day TTL)
  - session:{sid}:messages: Conversation history
  - session:{sid}:artifacts: Generated charts (for cross-request reuse)
  - session:{sid}:active_file: Current file reference
- Shared Volume (/data)
  - /data/uploads/{file_id}.parquet: Uploaded CSVs converted to Parquet
  - /data/reports/{session_id}/: Generated PDF/PPTX reports
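
The Redis key layout listed above can be sketched as a small helper. This is only an illustration: `session_keys` and `SESSION_TTL_SECONDS` are hypothetical names, and only the key patterns and the 30-day TTL come from the architecture overview.

```python
def session_keys(session_id: str) -> dict[str, str]:
    """Build the four per-session Redis keys named in the overview."""
    prefix = f"session:{session_id}"
    return {
        "meta": f"{prefix}:meta",                # session metadata
        "messages": f"{prefix}:messages",        # conversation history
        "artifacts": f"{prefix}:artifacts",      # generated charts
        "active_file": f"{prefix}:active_file",  # current file reference
    }


# 30-day TTL for session metadata, expressed in seconds.
SESSION_TTL_SECONDS = 30 * 24 * 60 * 60

keys = session_keys("abc123")
print(keys["artifacts"])
```

Keeping the key construction in one place means every agent that reads or writes session state agrees on the same names.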
This diagram shows how a multi-turn conversation flows through the system:
TURN 1: "Analyze the data and create some insightful charts"

1. Orchestrator Agent
   - Loads empty session
   - Builds system prompt
   - Calls Gemini with tools
2. Gemini decides: "Need to analyze data first"
   - FunctionCall: run_code_interpreter {task: "analyze sales data, compute key metrics"}
3. CODE INTERPRETER AGENT
   1. Build schema context (columns, dtypes, row count)
   2. Ask Gemini to write Python analysis code
   3. Send code to Executor sidecar
      - EXECUTION (in sandbox):
        - `df = pd.read_parquet('/data/uploads/{file_id}.parquet')`
        - `# ... analysis code ...`
        - `result = {"total_revenue": 150000, "top_products": [...]}`
      - SUCCESS → Return: stdout, result_dict, code
      - ERROR → SELF-CORRECTION (max 2 retries): feed error back to Gemini → retry
4. FunctionResponse to Gemini: {analysis_output: "...", computed_metrics: {...}}
5. Gemini decides: "Now create visualizations"
   - FunctionCall: run_visualization_agent {chart_type: "bar", chart_title: "Revenue..."}
6. VISUALIZATION AGENT
   1. Build prompt with analysis results + chart requirements
   2. Ask Gemini to write Plotly chart specifications
   3. Parse and validate JSON specs
   4. Convert to PNG for reports (using Kaleido)
7. CHART GENERATION
   - Spec: `{"data": [{"type": "bar", "x": [...], "y": [...]}], "layout": {"title": {"text": "Monthly Revenue Trend"}}}`
   - → Emitted to browser as SSE event (interactive): SSE chart_plotly
   - → Saved to Redis session (for future requests): session:{sid}:artifacts
8. FunctionResponse to Gemini: {chart_generated: true, chart_title: "Monthly Revenue..."}
9. Gemini decides: "Analysis complete, return summary"
10. FINAL TEXT RESPONSE, streamed as SSE text: "I've analyzed the data and created 2 insightful charts: 1. Monthly Revenue Trend... 2. Revenue by Category..."
11. END OF TURN 1
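
The first turn is essentially a tool loop: the model either requests a function call or returns final text, and the orchestrator dispatches until text arrives. A minimal sketch of that loop follows, with a scripted stand-in instead of Gemini. `run_tool_loop`, `scripted_model`, the tuple-based step format, and `max_steps` are all assumptions for illustration; only the tool names come from the flow above.

```python
from typing import Callable

# A model step is ("call", tool_name, args) or ("text", message).
ModelStep = tuple


def run_tool_loop(
    next_step: Callable[[list], ModelStep],
    tools: dict[str, Callable[[dict], dict]],
    max_steps: int = 8,  # hypothetical safety cap, not from the README
) -> str:
    """Dispatch function calls until the model returns plain text."""
    history: list = []
    for _ in range(max_steps):
        step = next_step(history)
        if step[0] == "text":
            return step[1]
        _, name, args = step
        result = tools[name](args)  # run the specialist agent
        history.append({"tool": name, "args": args, "result": result})
    raise RuntimeError("tool loop did not finish within max_steps")


def scripted_model(history: list) -> ModelStep:
    """Stand-in for Gemini: analyze first, then chart, then summarize."""
    if len(history) == 0:
        return ("call", "run_code_interpreter", {"task": "compute key metrics"})
    if len(history) == 1:
        return ("call", "run_visualization_agent", {"chart_type": "bar"})
    return ("text", f"Done after {len(history)} tool calls")


tools = {
    "run_code_interpreter": lambda args: {"computed_metrics": {"total_revenue": 150000}},
    "run_visualization_agent": lambda args: {"chart_generated": True},
}
summary = run_tool_loop(scripted_model, tools)
print(summary)  # Done after 2 tool calls
```

In the real system the function responses would be fed back to Gemini rather than to a scripted function, but the control flow has the same shape.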
TURN 2: "Create a PDF and PPT report from these charts with executive summary"

1. Orchestrator Agent: LOADS SESSION ARTIFACTS
   - 2 charts from Turn 1
   - Adds to system prompt
   - System prompt now includes: "Session has 2 charts: 1. Monthly Revenue Trend 2. Revenue by Category"
2. Gemini sees: "User wants report with EXISTING charts"; Gemini decides: "Skip visualization, go to presentation"
   - FunctionCall: run_presentation_agent {instructions: "Create PDF and PPTX with executive summary using existing charts"}
3. SMART CHART REUSE: Orchestrator checks: Should we reuse existing charts?
   - Query analysis:
     - "these charts" → REUSE INDICATOR ✓
     - "create a report" → REUSE INDICATOR ✓
     - No "new chart", "different", "another" → NOT regenerate ✓
   - Decision: ♻️ REUSE EXISTING CHARTS
4. PRESENTATION AGENT
   1. Validate data sufficiency
      - Has 2 charts from session artifacts ✓
      - Sufficient for report generation
   2. Detect format intent: PDF + PPTX
   3. Generate AI Executive Summary
      - Gemini generates business-focused summary: "This analysis reveals strong revenue growth with Electronics leading..."
   4. Generate Reports
      - PDF GENERATION (ReportLab): title page with timestamp, executive summary section, chart images (PNG via Kaleido), key metrics and insights
      - PPTX GENERATION (python-pptx): title slide, executive summary slide, one slide per chart, key findings slide
   5. Save to /data/reports/{session_id}/
5. SSE Events Emitted
   - agent_switch: "Presentation Agent..."
   - text: "I'm generating the PDF and PowerPoint reports..."
   - report_files: [{pdf_url}, {pptx_url}]
   - done
6. END OF TURN 2
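
The second turn depends on the orchestrator telling the model which charts already exist. One way this system-prompt fragment could be built is sketched below; `describe_session_charts` is a hypothetical helper, and only the "Session has 2 charts: ..." wording is taken from the flow above.

```python
def describe_session_charts(chart_titles: list[str]) -> str:
    """Render stored chart titles as a numbered block for the system prompt."""
    if not chart_titles:
        return "Session has no charts yet."
    noun = "chart" if len(chart_titles) == 1 else "charts"
    lines = [f"Session has {len(chart_titles)} {noun}:"]
    lines += [f"{i}. {title}" for i, title in enumerate(chart_titles, start=1)]
    return "\n".join(lines)


prompt_fragment = describe_session_charts(
    ["Monthly Revenue Trend", "Revenue by Category"]
)
print(prompt_fragment)
```

Listing titles rather than full Plotly specs keeps the prompt short while still giving the model enough context to choose the presentation agent over the visualization agent.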
User Query → Is there an active dataset? OR analytical keywords? OR existing session artifacts?
- YES → Tool Loop (Agents)
- NO → Direct Chat Response
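
The routing check above is a three-way OR. A minimal sketch, assuming a simple substring match for the keyword test: `needs_tool_loop` and the contents of `ANALYTICAL_KEYWORDS` are hypothetical, since the README does not list the actual keywords.

```python
# Hypothetical keyword list; the real one is not given in this README.
ANALYTICAL_KEYWORDS = ("analyze", "chart", "report", "summarize", "plot")


def needs_tool_loop(query: str, has_active_file: bool, artifact_count: int) -> bool:
    """Return True when the query should go to the agents, False for direct chat."""
    q = query.lower()
    has_keyword = any(keyword in q for keyword in ANALYTICAL_KEYWORDS)
    return has_active_file or has_keyword or artifact_count > 0


print(needs_tool_loop("hello there", False, 0))   # False: direct chat
print(needs_tool_loop("plot revenue", False, 0))  # True: analytical keyword
```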
Visualization Agent Called → Do we have existing session charts?
- NO → Generate new charts via Gemini + Plotly
- YES → Analyze Query: Reuse or New?
  - REUSE → Return existing charts, skip regeneration
  - NEW → Generate new charts via Gemini + Plotly

| REUSE INDICATORS | NEW CHART INDICATORS |
|---|---|
| "these charts" | "new chart" |
| "the charts" | "different chart" |
| "create a report" | "show me a histogram" |
| "generate presentation" | "create a pie chart" |
| "pdf with" | "visualize X" |
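
The reuse decision can be sketched as a phrase check over the query. This is an illustration under assumptions, not the project's implementation: `should_reuse_charts` is a hypothetical name, "visualize X" is approximated by the substring "visualize", new-chart phrases are given priority over reuse phrases, and a query that matches neither list falls back to generating new charts.

```python
REUSE_INDICATORS = (
    "these charts", "the charts", "create a report",
    "generate presentation", "pdf with",
)
NEW_CHART_INDICATORS = (
    "new chart", "different chart", "show me a histogram",
    "create a pie chart", "visualize",  # "visualize X" approximated by substring
)


def should_reuse_charts(query: str, existing_chart_count: int) -> bool:
    """Decide whether stored session charts can be returned as-is."""
    if existing_chart_count == 0:
        return False  # nothing to reuse
    q = query.lower()
    if any(phrase in q for phrase in NEW_CHART_INDICATORS):
        return False  # an explicit request for a new chart wins (assumption)
    return any(phrase in q for phrase in REUSE_INDICATORS)


print(should_reuse_charts("Create a PDF and PPT report from these charts", 2))  # True
print(should_reuse_charts("Show me a histogram of order sizes", 2))             # False
```

Substring matching is brittle (for example, "a new pie chart" matches none of the listed phrases exactly), which is one reason the routing decision is also left to the model via function calling.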
Execute Code → Success?
- YES → Return Results: stdout, result, code
- NO → Attempt < 3?
  - YES → Feed error to Gemini for fix → Execute Fixed Code → loop back to "Success?"
  - NO → Return Error Result
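
The retry flow above amounts to one initial run plus up to two corrections. A minimal sketch with stand-in callables follows; `run_with_self_correction`, `fake_generate`, and `fake_execute` are hypothetical, and the real system would call Gemini and the executor sidecar in their place.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # one initial run plus up to two retries


def run_with_self_correction(
    generate_code: Callable[[Optional[str]], str],
    execute: Callable[[str], tuple[bool, str]],
) -> dict:
    """Run generated code, feeding each error back into the next generation."""
    error: Optional[str] = None
    code = ""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        code = generate_code(error)  # error is None on the first attempt
        ok, output = execute(code)
        if ok:
            return {"success": True, "stdout": output, "code": code, "attempts": attempt}
        error = output  # becomes context for the next attempt
    return {"success": False, "error": error, "code": code, "attempts": MAX_ATTEMPTS}


def fake_generate(error: Optional[str]) -> str:
    """Stand-in for Gemini: first draft is buggy, the fix is correct."""
    return "print(1/0)" if error is None else "print(42)"


def fake_execute(code: str) -> tuple[bool, str]:
    """Stand-in for the sandbox: report success flag and output or error text."""
    if "1/0" in code:
        return (False, "ZeroDivisionError: division by zero")
    return (True, "42")


outcome = run_with_self_correction(fake_generate, fake_execute)
print(outcome["attempts"], outcome["stdout"])  # 2 42
```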
The chat endpoint streams Server-Sent Events with typed JSON payloads:
| Event Type | Description | Example |
|---|---|---|
| `agent_switch` | Agent started working | `{"type": "agent_switch", "content": "Code Interpreter Agent is working..."}` |
| `text` | Markdown text chunk | `{"type": "text", "content": "The analysis shows..."}` |
| `code` | Generated Python code | `{"type": "code", "content": "df.groupby('category')..."}` |
| `chart_plotly` | Interactive chart JSON | `{"type": "chart_plotly", "content": "{\"data\": [...]}"}` |
| `report_files` | Generated report URLs | `{"type": "report_files", "content": "[{\"type\": \"pdf\", \"url\": \"...\"}]"}` |
| `error` | Error message | `{"type": "error", "content": "Execution failed: ..."}` |
| `done` | Stream complete | `{"type": "done", "content": ""}` |
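
As a rough illustration of how such payloads could be serialized on the server side, the sketch below wraps each event in a standard SSE `data:` line. `format_sse` is a hypothetical helper, and the exact framing the backend uses is an assumption; only the `type`/`content` payload shape comes from the table.

```python
import json


def format_sse(event_type: str, content: str) -> str:
    """Serialize one typed event as a Server-Sent Events message."""
    payload = json.dumps({"type": event_type, "content": content})
    return f"data: {payload}\n\n"  # a blank line terminates each SSE message


# Chart and report payloads are themselves JSON strings inside "content".
chart_event = format_sse("chart_plotly", json.dumps({"data": [], "layout": {}}))
done_event = format_sse("done", "")
print(done_event, end="")
```

Because `content` is always a string, structured payloads such as chart specs are JSON-encoded twice, which matches the escaped quotes in the examples above.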
- Python 3.12+
- Docker + Docker Compose
- Gemini API key from Google AI Studio
# 1. Clone the repository
git clone <repo>
cd agentic-data-analysis
# 2. Create .env file
echo "GEMINI_API_KEY=your_key_here" > .env
# 3. Start all services
docker compose up --build
# 4. Open browser
open http://localhost:8000

This example demonstrates the key feature of cross-request artifact persistence. Charts generated in the first request are automatically reused in the follow-up request for report generation.
Sample Data: Upload tests/sample_data/sales_transactions.csv
TURN 1
- User: "Analyse the data and create some insightful charts from this data"
- Agent Flow: Code Interpreter → Visualization
  - Code Interpreter: computes metrics, analyzes trends, aggregates data
  - Visualization: creates "Monthly Revenue Trend" chart, creates "Revenue by Category" chart, charts saved to session ✓
- Response: Interactive Plotly charts displayed + analysis summary

↓ Session persists 2 charts in Redis

TURN 2 (Follow-up)
- User: "Can you create a PDF and PPT report from these charts? Do include some descriptive executive summary in each"
- Agent Flow: Load Session Artifacts → Presentation Agent
  - Load Session Artifacts: finds 2 existing charts, skips regeneration ♻️, passes to presentation
  - Presentation Agent: generates AI executive summary, creates PDF with charts + summary, creates PPTX with charts + summary
- Response: PDF and PowerPoint download links + brief summary
- ✨ KEY: Charts from Turn 1 were REUSED, not regenerated!
Check out example outputs generated by the system in the sample_reports/ folder:
| File | Description |
|---|---|
| Executive Data Report.pdf | PDF report with charts, executive summary, and key insights |
| Executive Data Summary Report.pptx | PowerPoint presentation with chart slides and summary |
These were generated using the demo conversation above with sales_transactions.csv.
| Query | Agent Flow |
|---|---|
| "Summarize this dataset" | Code Interpreter → Presentation |
| "What are the top 5 products by revenue?" | Code Interpreter |
| "Show a bar chart of sales by category" | Code Interpreter → Visualization |
| "Create a new pie chart of market share" | Visualization (generates new chart) |
| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Chat UI |
| `/api/sessions` | POST | Create new session |
| `/api/sessions/{id}/messages` | GET | Get conversation history |
| `/api/upload` | POST | Upload CSV file |
| `/api/chat` | POST | Chat with SSE streaming |
| `/api/health` | GET | Health check |
| `/api/reports/{session_id}/{filename}` | GET | Download generated report |
curl -X POST "http://localhost:8000/api/chat?session_id=xxx&file_id=xxx" \
-H "Content-Type: application/json" \
-d '{"query": "Analyze this data and create charts"}'

Response: SSE stream with typed events (see SSE Event Types above).
The session_id is returned in the X-Session-ID response header on first request and should be passed on subsequent requests to maintain conversation context.
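
On the client side, the stream can be consumed line by line. The sketch below parses already-decoded lines and stops at the `done` event; it assumes each event arrives as a `data: <json>` line, which this README does not state explicitly, and `iter_events` is a hypothetical helper. With the `requests` library, the lines could plausibly come from `response.iter_lines(decode_unicode=True)` on a streamed POST.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield typed events from SSE lines until the 'done' event is seen."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        event = json.loads(line[len("data:"):].strip())
        yield event
        if event.get("type") == "done":
            break


sample_stream = [
    'data: {"type": "agent_switch", "content": "Code Interpreter Agent is working..."}',
    "",
    'data: {"type": "text", "content": "The analysis shows..."}',
    "",
    'data: {"type": "done", "content": ""}',
    'data: {"type": "text", "content": "never reached"}',
]
events = list(iter_events(sample_stream))
print([event["type"] for event in events])  # ['agent_switch', 'text', 'done']
```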
| Variable | Default | Description |
|---|---|---|
| `GEMINI_API_KEY` | (required) | Google AI Studio API key |
| `GEMINI_MODEL` | `gemini-2.5-flash` | Gemini model to use |
| `REDIS_URL` | `redis://localhost:6379` | Redis connection URL |
| `EXECUTOR_URL` | `http://localhost:8080` | Code executor URL |
| `DATA_DIR` | `./uploads` | Upload directory |
| `MAX_UPLOAD_SIZE_MB` | `50` | Max file size (MB) |
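
These variables and defaults could be loaded along the following lines. This is a standalone sketch, not the contents of the project's `app/config.py`; `Settings` and `load_settings` are hypothetical names, while the variable names and default values are taken from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    gemini_model: str
    redis_url: str
    executor_url: str
    data_dir: str
    max_upload_size_mb: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY is required")
    return Settings(
        gemini_api_key=api_key,
        gemini_model=env.get("GEMINI_MODEL", "gemini-2.5-flash"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        executor_url=env.get("EXECUTOR_URL", "http://localhost:8080"),
        data_dir=env.get("DATA_DIR", "./uploads"),
        max_upload_size_mb=int(env.get("MAX_UPLOAD_SIZE_MB", "50")),
    )


settings = load_settings({"GEMINI_API_KEY": "your_key_here", "MAX_UPLOAD_SIZE_MB": "10"})
print(settings.gemini_model, settings.max_upload_size_mb)  # gemini-2.5-flash 10
```

Passing the environment as a mapping makes the loader easy to exercise in tests without touching the real process environment.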
agentic-data-analysis/
├── app/
│   ├── agents/
│   │   ├── base.py               # BaseAgent abstract class
│   │   ├── orchestrator.py       # Main coordinator + tool loop
│   │   ├── code_interpreter.py   # Python code generation & execution
│   │   ├── visualization.py      # Plotly chart generation
│   │   └── presentation.py       # Report synthesis & PDF/PPTX
│   ├── api/routes/
│   │   ├── chat.py               # SSE streaming chat endpoint
│   │   ├── files.py              # File upload endpoint
│   │   └── sessions.py           # Session management
│   ├── models/
│   │   ├── handoff.py            # AgentHandoff, AgentResult, GeneratedArtifact
│   │   ├── file.py               # UploadedFile model
│   │   └── schemas.py            # API schemas
│   ├── services/
│   │   ├── gemini_client.py      # Gemini API wrapper
│   │   ├── redis_client.py       # Session & artifact storage
│   │   ├── executor_client.py    # Executor HTTP client
│   │   ├── file_manager.py       # CSV → Parquet conversion
│   │   ├── report_manager.py     # Report orchestration
│   │   ├── pdf_generator.py      # ReportLab PDF generation
│   │   └── pptx_generator.py     # python-pptx PowerPoint generation
│   ├── static/
│   │   └── index.html            # Single-page chat UI
│   ├── config.py                 # Settings
│   ├── dependencies.py           # FastAPI DI
│   └── main.py                   # App entry point
├── executor/
│   ├── server.py                 # Flask sandbox server
│   └── Dockerfile                # Executor container
├── docker-compose.yml
├── Dockerfile
└── pyproject.toml
- Sandboxed Execution: Code runs in isolated Docker container
- 256MB RAM limit
- 0.5 CPU limit
- No network access
- Read-only filesystem (except /tmp)
- 30-second timeout
- Whitelisted imports only (pandas, numpy, etc.)
- Session Isolation: Each session has isolated artifacts
- File Validation: CSV-only uploads with size limits
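
One way an import whitelist like the one above could be enforced is a static check over the generated code before it is executed. The sketch below uses Python's `ast` module; `disallowed_imports` and the exact contents of `ALLOWED_IMPORTS` are assumptions (the README names only pandas and numpy), and a static check of this kind does not catch dynamic imports such as `__import__`, so it would complement the container limits rather than replace them.

```python
import ast

# pandas and numpy come from the README; the other entries are illustrative.
ALLOWED_IMPORTS = {"pandas", "numpy", "math", "json"}


def disallowed_imports(code: str) -> set[str]:
    """Return top-level modules imported by `code` that are not whitelisted."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found - ALLOWED_IMPORTS


snippet = "import pandas as pd\nimport os\nfrom subprocess import run\n"
print(sorted(disallowed_imports(snippet)))  # ['os', 'subprocess']
```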
| Concern | Decision |
|---|---|
| Agent Dispatch | Gemini function calling in AUTO mode; the model decides routing |
| Chart Persistence | Redis session artifacts, so charts survive across requests |
| Smart Reuse | Query analysis determines whether to reuse or regenerate charts |
| Self-Correction | Up to 2 automatic retry attempts on code execution errors |
| Streaming | SSE with typed chunks for real-time UX |
| Reports | ReportLab (PDF) + python-pptx (PPTX) with Kaleido PNG conversion |
| Executive Summary | AI-generated business-focused summary via Gemini |