- Project: RethinkAI Dorchester community assistant combining SQL, RAG, and hybrid LLM routing.
- Key code: `main_chat/main_chat.py` (routing + cache), `api/api.py` (Flask API v2.0), static frontends in `public/` (prod-style)
- Data: MySQL tables (311/911/events) and Chroma-style vector DB in `main_chat/vectordb`. Ingestion lives in `main_chat/data_ingestion/`.
- Auth: API key header `RethinkAI-API-Key` is required.
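The header check above can be sketched as a small helper, assuming `RETHINKAI_API_KEYS` is a comma-separated list in the environment (function names here are illustrative, not the project's actual code):

```python
import os

def load_allowed_keys() -> set:
    """Parse the comma-separated RETHINKAI_API_KEYS env var into a set."""
    raw = os.getenv("RETHINKAI_API_KEYS", "")
    return {k.strip() for k in raw.split(",") if k.strip()}

def is_authorized(headers: dict) -> bool:
    """True when the RethinkAI-API-Key header matches a configured key."""
    return headers.get("RethinkAI-API-Key") in load_allowed_keys()
```

In the Flask app this logic would run in a `before_request` hook and return 401 on failure.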
- Client: Vanilla HTML/CSS/JS frontends (`public`) call the REST API at `http://127.0.0.1:8888` with the API key; they show chat, events, and an API tester.
- API layer: Flask app (`api/api.py`) exposes `/chat`, `/log`, `/events`, `/health`. Uses a session cookie plus a per-session in-memory cache. CORS enabled for all origins; credentials allowed.
- Orchestration: `chat_route.py` routes a user message to SQL, RAG, or hybrid using an LLM-based planner. Maintains a retrieval cache (SQL rows, RAG chunks, metadata, answer) and can answer from history/cache.
- Structured data path (SQL): `_run_sql` generates/executes SQL (MySQL via `mysql-connector-python`) against the `rethink_ai_boston` DB; returns rows/columns plus a generated answer.
- Unstructured path (RAG): `retrieval` module (in `main_chat/rag_pipeline`) queries the vector DB (`vectordb`) and feeds chunks/metadata to Gemini to compose the answer.
- Hybrid path: Executes both SQL and RAG; merges answers and sources.
- Logging: `/log` and `/chat` call `log_interaction` to store query/response/mode in the MySQL `interaction_log` table; `/events` reads `weekly_events`.
- Event data: `/events` returns upcoming events (start_date within a configurable days-ahead window), limited by query params.
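The routing decision above can be sketched end to end. The real planner is a Gemini prompt; here a keyword heuristic stands in so the flow runs offline (the hint lists and function name are illustrative):

```python
# Stand-in for the LLM-based planner: pick a data path from surface cues.
SQL_HINTS = ("how many", "count", "average", "per month", "statistics")
RAG_HINTS = ("why", "describe", "explain", "what is")

def plan_route(question: str) -> str:
    """Return 'sql', 'rag', or 'hybrid' for a user question."""
    q = question.lower()
    wants_sql = any(h in q for h in SQL_HINTS)   # aggregate/structured ask
    wants_rag = any(h in q for h in RAG_HINTS)   # open-ended/contextual ask
    if wants_sql and wants_rag:
        return "hybrid"
    return "sql" if wants_sql else "rag"
```

In the project the same three-way choice drives `_run_sql`, `_run_rag`, or `_run_hybrid`.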
- Backend: Flask app (`api/api.py`), MySQL connection pool, session cache, endpoints `/chat`, `/log`, `/events`, `/health`.
- Core logic: `main_chat/chat_route.py` (env bootstrap, vectordb path fix, LLM client setup, routing, cache mgmt, history reuse).
- Data ingestion: `main_chat/data_ingestion/` (Google Drive/email sync, DB setup, vector rebuild); not part of the runtime API but seeds data.
- Frontends: `public/` (production-like static UI); configurable `API_BASE_URL` and `API_KEY` in JS.
- Python 3.11+, Flask 3.x, mysql-connector-python, google-genai, dotenv, pandas/numpy, plotly/dash (legacy), httpx/requests.
- MySQL 8.x for structured data.
- Vector store: Chroma-style files under `main_chat/vectordb`.
- Frontend: vanilla JS/CSS, served via `python -m http.server` (no build).
- Browser sends chat message with API key → `/chat`.
- Middleware validates the key, sets `session_id` (cookie).
- Per-session cache loaded; `_check_if_needs_new_data` may reuse cached SQL/RAG data for follow-ups.
- `_route_question` selects `sql|rag|hybrid`; executes `_run_sql` (MySQL), `_run_rag` (vector DB + Gemini), or `_run_hybrid` (both).
- Response assembled with answer + sources (table names or RAG metadata); cache stored per session.
- Interaction logged to the MySQL `interaction_log` table. Frontend displays response, sources, and mode; events panel uses `/events`.
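The response-assembly step in the flow above can be sketched as follows, assuming sources are SQL table names and/or RAG chunk metadata dicts (the function name is illustrative):

```python
def build_response(answer: str, mode: str, sql_tables=None, rag_metadata=None) -> dict:
    """Assemble a /chat-style payload: answer, routing mode, and sources.

    RAG chunks missing a 'source' field fall back to "Unknown", matching
    the fallback behavior noted elsewhere in these notes.
    """
    sources = list(sql_tables or []) if mode in ("sql", "hybrid") else []
    if mode in ("rag", "hybrid"):
        sources += [m.get("source", "Unknown") for m in (rag_metadata or [])]
    return {"answer": answer, "mode": mode, "sources": sources}
```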
- `.env` at repo root provides `GEMINI_API_KEY`, `RETHINKAI_API_KEYS`, MySQL creds, `VECTORDB_DIR` (overridden to `main_chat/vectordb_new`), etc.
- `config.py` loads all `.env` constants and is imported where needed.
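A `config.py`-style sketch of the pattern described above; the project uses python-dotenv's `load_dotenv()` to read the repo-root `.env` before these lookups, and the `MYSQL_HOST` name and the defaults shown are assumptions for illustration:

```python
import os

# Variable names mirror the ones documented above; defaults are illustrative.
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY", "")
RETHINKAI_API_KEYS = os.getenv("RETHINKAI_API_KEYS", "")
VECTORDB_DIR = os.getenv("VECTORDB_DIR", "main_chat/vectordb_new")
GEMINI_MODEL = os.getenv("GEMINI_MODEL", "gemini-2.5-flash-lite")
MYSQL_HOST = os.getenv("MYSQL_HOST", "127.0.0.1")  # assumed name for MySQL creds
```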
- Vector DB already populated by ingestion; MySQL has required tables (`weekly_events`, 311/911 tables, `interaction_log`).
- API and frontend run on the same machine; CORS is open but auth is enforced by the header key.
- Session cookies persist for a week; in-memory caches are acceptable for single-instance runs.
- Missing/invalid API key → 401.
- MySQL unavailable → degraded health, 500s on `/chat`/`/events`/`/log`.
- Vector DB path mis-set if `main_chat/vectordb_new` is missing; `_fix_retrieval_vectordb_path` attempts to correct it.
- Cache growth: `_cleanup_old_cache` trims stale sessions (>60 min) and caps at 100; still bound by process memory.
- Large conversations: history limited to the last 10 messages when checking reuse.
- RAG metadata may lack `source`/`doc_type`; sources fall back to "Unknown".
- `/events` clamps `limit` (1–100) and `days_ahead` (1–30).
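The eviction policy in the edge cases above (drop sessions idle for over 60 minutes, then cap at 100 entries) can be sketched as follows, assuming the cache is a dict of `session_id -> (last_access_timestamp, data)`; the function name mirrors `_cleanup_old_cache` but the implementation is illustrative:

```python
import time

def cleanup_old_cache(cache: dict, max_age_s: float = 3600, max_entries: int = 100) -> None:
    """Drop sessions idle longer than max_age_s, then evict oldest-first to the cap."""
    now = time.time()
    # Pass 1: age-based trim (>60 min by default).
    for sid in [s for s, (ts, _) in cache.items() if now - ts > max_age_s]:
        del cache[sid]
    # Pass 2: hard cap, evicting the least recently written sessions first.
    while len(cache) > max_entries:
        oldest = min(cache, key=lambda s: cache[s][0])
        del cache[oldest]
```

As the notes say, this still bounds memory only per process.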
- Dev: `python api/api.py` (debug=True). Frontend: `cd public && python -m http.server 8000`.
- Production: run Flask via WSGI (e.g., gunicorn), set `FLASK_SESSION_COOKIE_SECURE=True`, restrict CORS origins, store secrets via env vars, and add an HTTPS terminator/reverse proxy. Configure DB backups/monitoring.
- Scaling: the in-memory session cache is per-process; use a shared cache (Redis) or disable caching for multi-instance deployments. Ensure the vector DB path is accessible to all workers.
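The scaling note above implies a seam between the route logic and the cache store. A minimal sketch, assuming the cache maps session IDs to dicts (class name illustrative); a Redis-backed class exposing the same `get`/`put` could replace it for multi-instance runs:

```python
import time

class InProcessSessionCache:
    """Per-process session cache with a TTL; fine for single-instance runs."""

    def __init__(self, ttl_seconds: int = 3600):
        self._store = {}  # session_id -> (last_write_ts, data)
        self.ttl = ttl_seconds

    def get(self, session_id: str):
        entry = self._store.get(session_id)
        if entry is None or time.time() - entry[0] > self.ttl:
            return None  # treat expired entries as misses
        return entry[1]

    def put(self, session_id: str, data: dict) -> None:
        self._store[session_id] = (time.time(), data)
```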
- Externalize session cache to Redis and add TTL metrics; evict by LRU.
- Add rate limiting per API key/session and structured audit logging.
- Strengthen the `/events` query with pagination and date filters by end_date/timezone.
- Add unit/integration tests for routing decisions and cache reuse; stub the LLM.
- Harden CORS to allowed origins and rotate API keys; consider JWT instead of static keys.
- Add health subchecks (vector DB, Gemini key validity).
- Provide Docker Compose for full stack (API + MySQL + static server) beyond demo DB.
- Document the schema for 311/911 tables in `dataset-documentation/` and link it from the API docs.
- LLM defaults to `gemini-2.5-flash-lite`; override via env `GEMINI_MODEL`.