PytorchConnectomics · akgohain · Apr 7, 2026 · Apr 3, 2026 · Apr 7, 2026 · Apr 7, 2026
diff --git a/.gitignore b/.gitignore
@@ -13,3 +13,4 @@ __pycache__
 sql_app.db
 uploads
 pytorch_connectomics
+server_api/chatbot/faiss_index/
diff --git a/README.md b/README.md
@@ -37,6 +37,23 @@ PYTC_ALLOWED_ORIGINS=http://localhost:3000,http://127.0.0.1:3000,null
 PYTC_NEUROGLANCER_PUBLIC_BASE=http://localhost:4244
 ```
 
+## Chatbot Docs Index
+
+The chatbot's FAISS index is generated locally from the markdown files in
+`server_api/chatbot/file_summaries/` and should not be committed to git.
+
+When you update those markdown docs, rebuild the generated index with:
+
+```
+uv run python server_api/chatbot/update_faiss.py
+```
+
+To point the script at a different Ollama embeddings endpoint, set `OLLAMA_BASE_URL`:
+
+```
+OLLAMA_BASE_URL=http://cscigpu08.bc.edu:4443 uv run python server_api/chatbot/update_faiss.py
+```
+
 If restarting after a crash or interrupted session, kill any lingering processes first:
 
 ```

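Because the index is no longer committed, a fresh checkout has to generate it before the chatbot can load it. The `chatbot.py` changes in this diff call `ensure_faiss_index` for that purpose, but its implementation in `update_faiss.py` is not shown here. The following is only a minimal sketch of a "build if missing" check, assuming the index consists of the two files this diff references (`index.faiss` and `index.pkl`). The names `ensure_index` and `build` are hypothetical stand-ins, not the project's API.

```python
from pathlib import Path
from typing import Callable

# The two files this diff lists under server_api/chatbot/faiss_index/.
INDEX_FILES = ("index.faiss", "index.pkl")


def ensure_index(index_dir: Path, build: Callable[[Path], None]) -> bool:
    """Build the index only when it is missing.

    Hypothetical helper: `build` stands in for whatever embeds the markdown
    docs and writes the index files into `index_dir`.
    Returns True when a build ran, False when the index already existed.
    """
    if all((index_dir / name).is_file() for name in INDEX_FILES):
        return False
    index_dir.mkdir(parents=True, exist_ok=True)
    build(index_dir)
    return True
```

Returning a boolean lets the caller log only when a build actually ran, which matches how `build_chain` prints a message when `ensure_faiss_index` returns a truthy value. Note that this sketch does not detect stale indexes; after editing the markdown docs, the explicit rebuild command from the README is still needed.
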
diff --git a/scripts/dev.sh b/scripts/dev.sh
@@ -4,7 +4,7 @@ set -euo pipefail
 
 ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
 CLIENT_DIR="${ROOT_DIR}/client"
-DEFAULT_OLLAMA_BASE_URL="http://localhost:11434"
+DEFAULT_OLLAMA_BASE_URL="http://cscigpu08.bc.edu:4443"
 DEFAULT_OLLAMA_MODEL="llama3.1:8b"
 
 if ! command -v uv >/dev/null 2>&1; then

diff --git a/scripts/start.sh b/scripts/start.sh
@@ -4,7 +4,7 @@ set -euo pipefail
 
 # Prefer Homebrew's Node over nvm to avoid version conflicts
 export PATH="/opt/homebrew/bin:$PATH"
-export OLLAMA_BASE_URL="http://cscigpu08.bc.edu:11434"
+export OLLAMA_BASE_URL="http://cscigpu08.bc.edu:4443"
 export OLLAMA_MODEL="gpt-oss:20b"
 export OLLAMA_EMBED_MODEL="qwen3-embedding:8b"
 

diff --git a/server_api/chatbot/chatbot.py b/server_api/chatbot/chatbot.py
@@ -14,6 +14,7 @@
 from langchain_core.tools import tool
 from langchain.agents import create_agent
 from server_api.utils.utils import process_path
+from server_api.chatbot.update_faiss import ensure_faiss_index
 from server_api.chatbot.tools import (
     list_training_configs,
     read_config,
@@ -92,18 +93,19 @@
 You help end users navigate and use the PyTC Client application.
 
 ROUTING — decide which tool to use BEFORE calling anything:
-- **UI, navigation, features, shortcuts, workflows** → search_documentation
-- **Training config, hyperparameters, training commands** → delegate_to_training_agent
-- **Inference, checkpoints, evaluation commands** → delegate_to_inference_agent
+- **UI, navigation, features, shortcuts, workflows, "how do I..." questions** → search_documentation
+- **Generate a specific training/inference command** → delegate_to_training_agent or delegate_to_inference_agent
 - **General/greeting/off-topic** → answer directly, no tool needed
 
 CRITICAL RULES:
-1. **For application questions, ground answers in retrieved documentation.** Call search_documentation and base your answer on the returned text. Do NOT invent features, shortcuts, buttons, or workflows.
-2. **Do not fabricate specifics.** Never make up keyboard shortcuts, button labels, or step-by-step instructions unless they come from retrieved docs or a sub-agent response.
-3. **Answer every part of the user's question.** If they ask about two things, address both.
-4. **Use retrieved content even if wording differs.** If the documentation describes relevant features or workflows, use that information to answer the question. Don't claim something isn't documented just because it uses different terminology than the user's question.
-5. **HARD LIMIT: You may call search_documentation EXACTLY 2 times per user question.** After the second call, you MUST answer with the information already retrieved. Do NOT attempt a third search. If the tool returns "Search limit reached", immediately stop and answer based on what you already have.
-6. **Delegate, don't search, for training/inference tasks.** If the user asks for a training command or inference command, use the appropriate sub-agent directly.
+1. **ALWAYS search documentation first for UI questions.** If the user asks "how do I train", "what do I need to provide", "how do I start", "where do I configure", etc., use search_documentation to find the UI workflow in the docs. Do NOT delegate to sub-agents for these questions.
+2. **Only delegate to sub-agents when the user explicitly asks you to generate a command.** Examples: "give me the training command for...", "write the inference command for...", "what's the CLI command to...". If they're asking HOW to use the UI, search the docs instead.
+3. **For application questions, ground answers in retrieved documentation.** Call search_documentation and base your answer on the returned text. Do NOT invent features, shortcuts, buttons, or workflows.
+4. **Do not fabricate specifics.** Never make up keyboard shortcuts, button labels, or step-by-step instructions unless they come from retrieved docs or a sub-agent response.
+5. **NEVER use command-line instructions for UI questions.** The PyTC Client is a desktop GUI application. If the user asks how to do something, explain the UI workflow (buttons, tabs, forms) from the documentation. Do NOT provide Python scripts, bash commands, or CLI examples unless a sub-agent explicitly generates them.
+6. **Answer every part of the user's question.** If they ask about two things, address both.
+7. **Use retrieved content even if wording differs.** If the documentation describes relevant features or workflows, use that information to answer the question. Don't claim something isn't documented just because it uses different terminology than the user's question.
+8. **HARD LIMIT: You may call search_documentation EXACTLY 2 times per user question.** After the second call, you MUST answer with the information already retrieved. Do NOT attempt a third search. If the tool returns "Search limit reached", immediately stop and answer based on what you already have.
 
 Sub-agents:
 - **Training Agent**: Config selection, training job setup, hyperparameter overrides
@@ -117,12 +119,17 @@
 
 def build_chain():
     """Build the multi-agent system with supervisor, training, and inference agents."""
-    ollama_base_url = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
+    ollama_base_url = os.getenv("OLLAMA_BASE_URL", "http://cscigpu08.bc.edu:4443")
     ollama_model = os.getenv("OLLAMA_MODEL", "gpt-oss:20b")
     ollama_embed_model = os.getenv("OLLAMA_EMBED_MODEL", "qwen3-embedding:8b")
     llm = ChatOllama(model=ollama_model, base_url=ollama_base_url, temperature=0)
     embeddings = OllamaEmbeddings(model=ollama_embed_model, base_url=ollama_base_url)
     faiss_path = process_path("server_api/chatbot/faiss_index")
+    if ensure_faiss_index(
+        model=ollama_embed_model,
+        base_url=ollama_base_url,
+    ):
+        print(f"[SEARCH] Generated chatbot FAISS index at {faiss_path}")
     vectorstore = FAISS.load_local(
         faiss_path,
         embeddings,
@@ -288,18 +295,23 @@ def delegate_to_inference_agent(task: str) -> str:
 
 
 def build_helper_chain():
-    """Build a lightweight RAG-only agent for inline field help.
-
-    Shares the same FAISS vectorstore and keyword-fallback docs as the main
+    """
+    Build a lightweight helper agent for inline field-level help.
+    This agent has access to the same search_documentation tool as the main
     chatbot but has NO access to training/inference sub-agents.
     Returns ``(agent, reset_search_counter)`` — same interface as ``build_chain``.
     """
-    ollama_base_url = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
+    ollama_base_url = os.getenv("OLLAMA_BASE_URL", "http://cscigpu08.bc.edu:4443")
     ollama_model = os.getenv("OLLAMA_MODEL", "gpt-oss:20b")
     ollama_embed_model = os.getenv("OLLAMA_EMBED_MODEL", "qwen3-embedding:8b")
     llm = ChatOllama(model=ollama_model, base_url=ollama_base_url, temperature=0)
     embeddings = OllamaEmbeddings(model=ollama_embed_model, base_url=ollama_base_url)
     faiss_path = process_path("server_api/chatbot/faiss_index")
+    if ensure_faiss_index(
+        model=ollama_embed_model,
+        base_url=ollama_base_url,
+    ):
+        print(f"[SEARCH] Generated chatbot FAISS index at {faiss_path}")
     vectorstore = FAISS.load_local(
         faiss_path,
         embeddings,

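The prompt above caps `search_documentation` at two calls per question, and both builders return a `reset_search_counter` alongside the agent. The code that enforces the cap is outside this diff, so the following is only a sketch of one way such a limit could be implemented, using a closure over a call counter. `make_limited_search` and the `search` callable are hypothetical; only the "Search limit reached" message and the limit of 2 come from the prompt text.

```python
from typing import Callable, Tuple

MAX_SEARCHES = 2  # mirrors the "EXACTLY 2 times" rule in the system prompt


def make_limited_search(
    search: Callable[[str], str],
    limit: int = MAX_SEARCHES,
) -> Tuple[Callable[[str], str], Callable[[], None]]:
    """Wrap `search` so it refuses to run more than `limit` times.

    Returns the wrapped search function and a reset function that should be
    called at the start of each new user question.
    """
    calls = 0

    def limited(query: str) -> str:
        nonlocal calls
        if calls >= limit:
            # Signal the model to stop searching instead of raising an error.
            return "Search limit reached. Answer with the information already retrieved."
        calls += 1
        return search(query)

    def reset() -> None:
        nonlocal calls
        calls = 0

    return limited, reset
```

Enforcing the cap in the tool rather than only in the prompt means a model that ignores rule wording still cannot loop on searches. The counter in this sketch is shared by everyone using the same wrapped function, so a real server handling concurrent requests would need per-request state.
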
diff --git a/server_api/chatbot/faiss_index/index.faiss b/server_api/chatbot/faiss_index/index.faiss
diff --git a/server_api/chatbot/faiss_index/index.pkl b/server_api/chatbot/faiss_index/index.pkl
diff --git a/server_api/chatbot/file_summaries/ErrorHandlingTool.md b/server_api/chatbot/file_summaries/ErrorHandlingTool.md
diff --git a/server_api/chatbot/file_summaries/GettingStarted.md b/server_api/chatbot/file_summaries/GettingStarted.md
@@ -27,13 +27,33 @@ The application has three main areas:
 
 The chat panel appears as a sliding drawer on the right. To use it:
 
-1. Click the **AI Chat** button in the top navigation bar to open the drawer.
+1. Click the **AI Chat** button (message icon) in the top navigation bar to open the drawer.
 2. Type your question in the text input at the bottom and press **Enter** or click **Send**.
 3. The assistant will respond with guidance based on the application's documentation.
-4. Click **Clear Chat** to start a new conversation.
 
 The chat supports markdown formatting, including tables, code blocks, and lists.
 
+### Conversation History
+
+The chat drawer includes a **sidebar** on the left that lists your saved conversations:
+
+- **New chat (+)** — Click the **+** button at the top to start a fresh conversation.
+- **Switch conversations** — Click any conversation in the sidebar to load it.
+- **Delete a conversation** — Click the trash icon next to a conversation and confirm.
+- **Collapse/expand sidebar** — Use the fold/unfold button to hide or show the conversation list.
+
+Conversations are saved automatically as you chat. When you reopen the drawer, your past chats are still available.
+
+### Inline Help ("?" Buttons)
+
+Throughout the training and inference configuration forms, you will see small **?** buttons next to input fields. Clicking a **?** button opens a floating chat panel that:
+
+1. Automatically asks the AI to explain that specific setting and recommend a value.
+2. Lets you ask follow-up questions about the setting.
+3. Can be dragged around the screen and resized.
+
+This provides context-aware help without leaving the configuration page.
+
 ## Keyboard Shortcuts (Global)
 
 These standard editing shortcuts work throughout the application: