FileShot
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
Lines changed: 18 additions & 0 deletions
diff --git a/main/agenticChat.js b/main/agenticChat.js
Lines changed: 8 additions & 1 deletion
diff --git a/main/constants.js b/main/constants.js
Lines changed: 1 addition & 0 deletions
diff --git a/main/llmEngine.js b/main/llmEngine.js
Lines changed: 24 additions & 11 deletions
diff --git a/package.json b/package.json
Lines changed: 1 addition & 1 deletion
diff --git a/preload.js b/preload.js
Lines changed: 1 addition & 0 deletions
diff --git a/src/components/Chat/ChatPanel.tsx b/src/components/Chat/ChatPanel.tsx
Lines changed: 71 additions & 1 deletion
@@ -20,6 +20,23 @@ When the user says "read your instructions", "read the instructions", "read copi
 
 ---
 
+## ⚠ CRITICAL — THIS MACHINE IS THE DEV MACHINE ONLY — NEVER TOUCH THE WEBSITE SERVER FROM HERE
+
+**This computer (`C:\Users\brend\IDE`) is the DEVELOPMENT machine. It is NOT the web server.**
+
+- The live website (`graysoft.dev`) runs on a SEPARATE server.
+- That server is kept in sync via **Syncthing** — file changes pushed from this machine are automatically picked up by the server.
+- **NEVER run `npm run build` in `website/` from this machine** — the server handles its own build.
+- **NEVER run `pm2 restart`, `pm2 start`, or any PM2 command from a local terminal** on this machine.
+- **NEVER run `start-graysoft.bat` or `restart-graysoft.bat`** from this machine.
+- **NEVER attempt to restart the Cloudflare tunnel** from this machine.
+- To deploy a website change: edit the source file here → Syncthing syncs it to the server → then trigger rebuild via the control panel below.
+- **AUTHORIZED server control panel: https://cp.graysoft.dev** — login password: `diggabyte2026`
+- Use the control panel to trigger npm build, PM2 restart, or any server-side action needed to deploy changes.
+- After triggering via the panel, verify graysoft.dev visually to confirm the change is live.
+
+---
+
 ## ⚠ MANDATORY — CLEAR LOGS AFTER EVERY BUILD/TEST ITERATION
 
 After EVERY build, test run, or iteration where the user is about to test the app:
@@ -36,6 +53,7 @@ After EVERY build, test run, or iteration where the user is about to test the ap
 Read this list first. Every item has a full section below.
 
 - **TRIPWIRE** — First line of EVERY response must be `[Task: X | Last: Y]`
+- **DEV MACHINE ONLY** — NEVER run `npm run build`, PM2, or any server command in `website/` — server is separate, updated via Syncthing
 - **Read full instructions** — SEE TOP OF FILE. Every single time, no exceptions, no "I already remember them"
 - **No green checkmarks** — NEVER use ✅ ✔️ or say "ready", "working", "all set" to describe a fix
 - **Read code before responding** — Never assume. Verify everything with actual file reads
 
@@ -346,7 +346,7 @@ function register(ctx) {
                     const toolPrompt = mcpToolServer.getToolPromptForTask(cloudTaskType);
           const isBundledCloudProvider = cloudLLM._isBundledProvider(context.cloudProvider) && !cloudLLM.isUsingOwnKey(context.cloudProvider);
           const _brevityDirective = isBundledCloudProvider
-            ? '\n\nStyle rules (apply silently — never mention these rules to the user):\n- Always respond in a professional, clear, and articulate style with proper grammar, capitalization, and punctuation regardless of how the user writes.\n- Keep responses concise. For conversational or informational questions, use no more than 3 paragraphs. Never exceed 3 paragraphs for non-code responses.\n- For code or technical output, always provide the complete solution without padding or filler text.'
+            ? `\n\n## Style Rules (apply silently — never mention, reference, or apologize for these rules to the user)\n\n### Response length — hard limit\n- **Maximum 3 paragraphs** for any prose response. This limit is unconditional and applies to ALL non-code content: explanations, answers, summaries, stories, essays, creative writing, descriptions, and conversational replies.\n- If the user asks for something long or detailed (e.g. "write me a long story", "explain in depth", "be thorough") — write the best possible 3-paragraph version and stop. Do NOT explain the length, apologize for it, or mention that you are constrained. Simply deliver the best complete answer in 3 paragraphs.\n- Bullet lists count as prose when each bullet is a full sentence or longer. Keep bullet lists to a maximum of 5 items unless they are discrete technical items (file names, commands, error codes, parameters). Never use bullets as a way to extend past the 3-paragraph limit.\n- Code blocks, terminal output, file contents, structured data (tables, JSON, numbered technical steps), and inline code snippets are fully exempt from this limit. Always provide complete and correct code — never truncate.\n\n### Tone and style\n- Always write in a professional, clear, and articulate style with proper grammar, capitalization, and punctuation — regardless of how the user writes. Never mirror informal tone, typos, or lowercase writing.\n- Be direct. Lead with the answer. Never open with filler phrases like "That's a great question!", "Certainly!", "Of course!", "Absolutely!", or "Sure!".\n- Never end a response with hollow sign-offs like "I hope this helps!", "Let me know if you need anything else!", or "Feel free to ask!".`
             : '';
           const cloudSystemPrompt = systemPrompt + (toolPrompt ? '\n\n' + toolPrompt : '') + _brevityDirective;
 
@@ -1415,6 +1415,13 @@ function register(ctx) {
                   // llm-token text stream caused duplicate code bubbles when parseToolCall failed
                   // on aliased or alternate-format tool calls from small models. Suppressed here.
                   void funcCall;
+                },
+                (toolChunk) => {
+                  // Stream tool generation progress to the renderer for live bubble display.
+                  // The renderer shows a CollapsibleToolBlock with partial params as they stream in.
+                  if (mainWindow && !mainWindow.isDestroyed()) {
+                    mainWindow.webContents.send('llm-tool-generating', toolChunk);
+                  }
                 }
               );
               result = nativeResult;
 
@@ -88,6 +88,7 @@ const DEFAULT_COMPACT_PREAMBLE = `You are a local AI coding assistant with tools
 - **You do not know today's date or current real-world state. If asked for the date, time, or any live or time-sensitive information — call web_search immediately. Never state a current date, time, or real-world value from memory.**
 - Acknowledge the user's request, then call the tools needed — you have no knowledge of file contents until you read them
 - After tools return, explain what you found — don't just say a tool ran
+- After completing a tool call, always write at least one sentence confirming what was done — never end your response on a bare tool call with no acknowledgment
 - Never copy or repeat sentences you have already written in this response.
 - Ask a specific follow-up if you need more context
 
 
@@ -73,9 +73,10 @@ class LLMEngine extends EventEmitter {
       seed: -1,
     };
 
-    // User-configurable generation timeout (ms). Default 120s.
+    // User-configurable generation timeout (ms). Default 0 = no timeout (user cancels manually).
     // Can be updated live via Settings without reloading the model.
-    this.generationTimeoutMs = 120_000;
+    // Set > 0 in Settings to re-enable a hard timeout.
+    this.generationTimeoutMs = 0;
   }
 
   /**
@@ -973,13 +974,13 @@ After your brief acknowledgment, output ONLY the tool call blocks — no extra t
     this.abortController = new AbortController();
 
     // Generation safety timeout: abort if generation exceeds configured limit.
-    // Configurable via Settings UI — default 120s. Updates live without model reload.
+    // 0 = no timeout (users can cancel manually). Configurable in Settings.
     const GEN_TIMEOUT_MS = this.generationTimeoutMs;
-    const genTimeoutTimer = setTimeout(() => {
+    const genTimeoutTimer = GEN_TIMEOUT_MS > 0 ? setTimeout(() => {
       console.log(`[LLM] Generation timeout (${GEN_TIMEOUT_MS / 1000}s) — aborting to prevent hang`);
       this._lastAbortReason = 'timeout';
       this.cancelGeneration('timeout');
-    }, GEN_TIMEOUT_MS);
+    }, GEN_TIMEOUT_MS) : null;
 
     let fullResponse = '';
     let rawResponse = '';
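
The hunk above makes the safety timer conditional: a limit of `0` creates no timer at all, and any positive value arms one. A minimal sketch of that pattern in isolation, assuming a hypothetical helper named `startOptionalTimeout` (it does not exist in `llmEngine.js`; the engine inlines the ternary instead):

```javascript
// Hypothetical helper illustrating the "0 = no timeout" pattern from the hunk above.
// Returns the timer handle, or null when the limit is disabled.
function startOptionalTimeout(timeoutMs, onTimeout) {
  return timeoutMs > 0 ? setTimeout(onTimeout, timeoutMs) : null;
}

let aborted = false;
const disabled = startOptionalTimeout(0, () => { aborted = true; });
const enabled = startOptionalTimeout(60_000, () => { aborted = true; });

// clearTimeout accepts null without throwing, so cleanup code that always
// calls clearTimeout(handle) needs no extra guard for the disabled case.
clearTimeout(disabled);
clearTimeout(enabled);
```

Because `clearTimeout(null)` is a no-op, any existing cleanup path that clears the handle unconditionally should keep working when the timeout is disabled.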
@@ -1722,7 +1723,7 @@ After your brief acknowledgment, output ONLY the tool call blocks — no extra t
    * @param {Function} onFunctionCall - Called when a function call is generated
    * @returns {Object} {text, functionCalls: [{functionName, params}], stopReason}
    */
-  async generateWithFunctions(input, functions, params = {}, onToken, onThinkingToken, onFunctionCall) {
+  async generateWithFunctions(input, functions, params = {}, onToken, onThinkingToken, onFunctionCall, onToolGenerating) {
     if (!this.isReady || !this.chat) {
       throw new Error('Model not loaded. Please load a model first.');
     }
@@ -1758,15 +1759,18 @@ After your brief acknowledgment, output ONLY the tool call blocks — no extra t
     this.abortController = new AbortController();
 
     // Safety timeout — uses same configurable limit as generateStream()
+    // 0 = no timeout (user can cancel manually).
     const GEN_TIMEOUT_MS = this.generationTimeoutMs;
-    const genTimeoutTimer = setTimeout(() => {
+    const genTimeoutTimer = GEN_TIMEOUT_MS > 0 ? setTimeout(() => {
       console.log(`[LLM] Function-calling generation timeout — aborting`);
       this._lastAbortReason = 'timeout';
       this.cancelGeneration('timeout');
-    }, GEN_TIMEOUT_MS);
+    }, GEN_TIMEOUT_MS) : null;
 
     let fullResponse = '';
     let collectedFunctionCalls = [];
+    // Accumulate paramsChunk text per callIndex for live streaming to UI
+    const _paramsChunkBufs = {};
 
     try {
       this._compactHistory();
@@ -1821,9 +1825,18 @@ After your brief acknowledgment, output ONLY the tool call blocks — no extra t
             if (onFunctionCall) onFunctionCall(funcCall);
           },
           onFunctionCallParamsChunk: (chunk) => {
-            // Stream function call params as they generate (for UI feedback)
-            if (chunk.done && onToken) {
-              onToken(`\n\`\`\`json\n{"tool":"${chunk.functionName}","params":...}\n\`\`\`\n`);
+            // Accumulate paramsChunk text per callIndex and stream live to UI.
+            // This powers the streaming tool generation bubble in the renderer
+            // so users can see what the model is writing instead of a blank screen.
+            if (!_paramsChunkBufs[chunk.callIndex]) _paramsChunkBufs[chunk.callIndex] = '';
+            if (chunk.paramsChunk) _paramsChunkBufs[chunk.callIndex] += chunk.paramsChunk;
+            if (onToolGenerating) {
+              onToolGenerating({
+                callIndex: chunk.callIndex,
+                functionName: chunk.functionName,
+                paramsText: _paramsChunkBufs[chunk.callIndex],
+                done: !!chunk.done,
+              });
             }
           },
         } : {}),
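
The `onFunctionCallParamsChunk` handler above keeps one text buffer per `callIndex` and reports the accumulated text on every chunk. The sketch below restates that logic as a stand-alone function so the behavior can be checked outside the engine. The factory name `createParamsAccumulator` and the sample tool name are hypothetical, and the chunk fields (`callIndex`, `functionName`, `paramsChunk`, `done`) are assumed to arrive as the diff uses them.

```javascript
// Hypothetical stand-alone version of the per-call buffer shown in the diff.
function createParamsAccumulator(onToolGenerating) {
  const buffers = {}; // callIndex -> params text received so far
  return function onChunk(chunk) {
    if (!buffers[chunk.callIndex]) buffers[chunk.callIndex] = '';
    if (chunk.paramsChunk) buffers[chunk.callIndex] += chunk.paramsChunk;
    // Always report the full accumulated text, not just the latest fragment.
    onToolGenerating({
      callIndex: chunk.callIndex,
      functionName: chunk.functionName,
      paramsText: buffers[chunk.callIndex],
      done: !!chunk.done,
    });
  };
}

const events = [];
const onChunk = createParamsAccumulator((e) => events.push(e));
onChunk({ callIndex: 0, functionName: 'read_file', paramsChunk: '{"filePath":' });
onChunk({ callIndex: 0, functionName: 'read_file', paramsChunk: '"a.js"}' });
onChunk({ callIndex: 0, functionName: 'read_file', done: true });
```

Sending the whole buffer each time keeps the receiver stateless, at the cost of larger messages as the parameters grow.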
 
@@ -1,6 +1,6 @@
 {
   "name": "guide-ide",
-  "version": "1.6.10",
+  "version": "1.6.11",
   "description": "guIDE - AI-Powered Offline IDE with local LLM, RAG, MCP tools, browser automation, and integrated terminal",
   "author": {
     "name": "Brendan Gray",
 
@@ -91,6 +91,7 @@ contextBridge.exposeInMainWorld('electronAPI', {
   onLlmReplaceLast: (callback) => _on('llm-replace-last', callback),
   onLlmStreamReset: (callback) => _on('llm-stream-reset', callback),
   onLlmIterationBegin: (callback) => _on('llm-iteration-begin', callback),
+  onLlmToolGenerating: (callback) => _on('llm-tool-generating', callback),
   onDevLog: (callback) => _on('dev-log', callback),
 
   // ── Model Management ──
 
@@ -153,7 +153,8 @@ export const ChatPanel: React.FC<ChatPanelProps> = ({
 
   const streaming = useChatStreaming();
   const { streamingText, thinkingSegments, setStreamingText, setThinkingSegments,
-    streamBufferRef, thinkingSegmentsRef, wasRespondingRef, streamEpochRef, activeEpochRef } = streaming;
+    streamBufferRef, thinkingSegmentsRef, wasRespondingRef, streamEpochRef, activeEpochRef,
+    waitForTypewriterDone } = streaming;
 
   const addSystemMessage = useCallback((content: string) => {
     setMessages(prev => [...prev, {
@@ -346,6 +347,9 @@ export const ChatPanel: React.FC<ChatPanelProps> = ({
     let executingTimeout: ReturnType<typeof setTimeout> | null = null;
 
     const cleanupExecuting = api.onToolExecuting?.((data: { tool: string; params: any }) => {
+      // Tool is now executing — clear the generating-phase bubble
+      generatingToolCallsRef.current = [];
+      setGeneratingToolCalls([]);
       const updated = [...executingToolsRef.current, { tool: data.tool, params: data.params }];
       executingToolsRef.current = updated;
       setExecutingTools(updated);
@@ -359,6 +363,9 @@ export const ChatPanel: React.FC<ChatPanelProps> = ({
 
     const cleanupResults = api.onMcpToolResults?.(() => {
       if (executingTimeout) clearTimeout(executingTimeout);
+      // Clear any lingering generating-phase bubbles
+      generatingToolCallsRef.current = [];
+      setGeneratingToolCalls([]);
       // BUG-NEW-A: Move currently-executing tools to completed so their pills stay visible
       // as ✓ checkmarks instead of vanishing the instant the tool finishes.
       const finished = executingToolsRef.current;
@@ -371,6 +378,16 @@ export const ChatPanel: React.FC<ChatPanelProps> = ({
       refreshPendingChanges();
     });
 
+    const cleanupToolGenerating = api.onLlmToolGenerating?.((data: { callIndex: number; functionName: string; paramsText: string; done: boolean }) => {
+      // Update or remove the entry for this callIndex
+      const filtered = generatingToolCallsRef.current.filter(t => t.callIndex !== data.callIndex);
+      if (!data.done) {
+        filtered.push({ callIndex: data.callIndex, functionName: data.functionName, paramsText: data.paramsText });
+      }
+      generatingToolCallsRef.current = filtered;
+      setGeneratingToolCalls([...filtered]);
+    });
+
     const cleanupProgress = api.onAgenticProgress?.((data: { iteration: number; maxIterations: number }) => {
       setAgenticProgress(data);
     });
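
The `onLlmToolGenerating` listener above replaces the entry for a given `callIndex` on every event and drops it once `done` is set. A sketch of the same rule as a pure function, assuming a hypothetical name `applyToolGenerating` and made-up tool names (the component itself works on `generatingToolCallsRef.current` rather than a returned array):

```javascript
// Hypothetical pure function mirroring the renderer's update rule.
function applyToolGenerating(current, data) {
  // Drop any existing entry for this call, then re-add it unless it is finished.
  const next = current.filter((t) => t.callIndex !== data.callIndex);
  if (!data.done) {
    next.push({
      callIndex: data.callIndex,
      functionName: data.functionName,
      paramsText: data.paramsText,
    });
  }
  return next;
}

let calls = [];
calls = applyToolGenerating(calls, { callIndex: 0, functionName: 'read_file', paramsText: '{"fi', done: false });
calls = applyToolGenerating(calls, { callIndex: 1, functionName: 'web_search', paramsText: '{"q', done: false });
calls = applyToolGenerating(calls, { callIndex: 0, functionName: 'read_file', paramsText: '{"filePath":"a.js"}', done: false });
const orderAfterUpdates = calls.map((t) => t.callIndex);
calls = applyToolGenerating(calls, { callIndex: 1, functionName: 'web_search', paramsText: '', done: true });
const remaining = calls.map((t) => t.callIndex);
```

One consequence of filter-then-push is that an updated entry moves to the end of the array, so with several calls streaming at once the display order may follow the most recent update rather than `callIndex`.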
@@ -430,6 +447,7 @@ export const ChatPanel: React.FC<ChatPanelProps> = ({
       if (executingTimeout) clearTimeout(executingTimeout);
       cleanupExecuting?.();
       cleanupResults?.();
+      cleanupToolGenerating?.();
       cleanupProgress?.();
       cleanupPhase?.();
       cleanupTodo?.();
@@ -708,6 +726,8 @@ export const ChatPanel: React.FC<ChatPanelProps> = ({
     setStreamingText('');
     setThinkingSegments([]);
     setCompletedStreamingTools([]);
+    generatingToolCallsRef.current = [];
+    setGeneratingToolCalls([]);
     executingToolsRef.current = [];
 
     try {
@@ -781,6 +801,16 @@ export const ChatPanel: React.FC<ChatPanelProps> = ({
         await new Promise(r => setTimeout(r, 0));
       }
 
+      // Wait for the typewriter to finish revealing all buffered chars before committing
+      // the assistant message bubble. Prevents wall-of-text flash on fast cloud responses
+      // where dispose() flushes all remaining chars in a single IPC call, delivering them
+      // to the renderer faster than the 100 chars/sec typewriter can reveal them.
+      // For local models this resolves instantly (typewriter always caught up in real-time).
+      // Only runs when a buffer is present and the generation epoch is still valid.
+      if (result.success && streamBufferRef.current.length > 0 && streamEpochRef.current === activeEpochRef.current) {
+        await waitForTypewriterDone();
+      }
+
       // BUG-026: If model is unavailable, clear the queue — retrying queued messages
       // is pointless until the user loads a model, and draining them causes a stampede.
       if (!result.success && /model not loaded|no model loaded/i.test(result.error || '')) {
@@ -878,6 +908,8 @@ export const ChatPanel: React.FC<ChatPanelProps> = ({
       setStreamingText('');
       setThinkingSegments([]);
       setCompletedStreamingTools([]);
+      generatingToolCallsRef.current = [];
+      setGeneratingToolCalls([]);
       executingToolsRef.current = [];
       setAgenticProgress(null);
       setAgenticPhases([]);
@@ -2448,6 +2480,44 @@ ${e.message}`,
                         </div>
                       </div>
                     )}
+                    {generatingToolCalls.length > 0 && (
+                      <div className="mt-2">
+                        <ToolCallGroup count={generatingToolCalls.length}>
+                          {generatingToolCalls.map((tc) => {
+                            // Extract meaningful detail from partial params text as it streams
+                            let partialDetail = '';
+                            try {
+                              const fpMatch = tc.paramsText.match(/"filePath"\s*:\s*"([^"]+)"/);
+                              const urlMatch = tc.paramsText.match(/"url"\s*:\s*"([^"]+)"/);
+                              const qMatch = tc.paramsText.match(/"query"\s*:\s*"([^"]+)"/);
+                              if (fpMatch) {
+                                const fp = fpMatch[1];
+                                partialDetail = fp.includes('/') ? fp.split('/').pop() || fp : fp.includes('\\') ? fp.split('\\').pop() || fp : fp;
+                              } else if (urlMatch) {
+                                try { partialDetail = new URL(urlMatch[1]).hostname; } catch { partialDetail = urlMatch[1].substring(0, 30); }
+                              } else if (qMatch) {
+                                partialDetail = qMatch[1].substring(0, 25) + (qMatch[1].length > 25 ? '\u2026' : '');
+                              }
+                            } catch {}
+                            const genLabel = partialDetail ? `${tc.functionName}: ${partialDetail}` : tc.functionName;
+                            const displayText = tc.paramsText.length > 1500 ? tc.paramsText.substring(0, 1500) + '\n\u2026[truncated]' : tc.paramsText;
+                            return (
+                              <CollapsibleToolBlock key={`gen-${tc.callIndex}`} label={genLabel} icon={'\u29d7'}>
+                                <div>
+                                  <div className="flex items-center gap-2 mb-2">
+                                    <Loader2 size={12} className="animate-spin text-[#007acc]" />
+                                    <span className="text-[11px] text-[#858585]">{'Generating tool call\u2026'}</span>
+                                  </div>
+                                  {tc.paramsText && (
+                                    <pre className="whitespace-pre-wrap text-[11px] font-mono text-[#d4d4d4] bg-[#1e1e1e] rounded-md p-2 max-h-[180px] overflow-y-auto">{displayText}</pre>
+                                  )}
+                                </div>
+                              </CollapsibleToolBlock>
+                            );
+                          })}
+                        </ToolCallGroup>
+                      </div>
+                    )}
                     {(completedStreamingTools.length > 0 || executingTools.length > 0) && (
                       <div className="mt-2">
                         <ToolCallGroup count={completedStreamingTools.length + executingTools.length}>
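
The generating-phase block above derives a short label from partially streamed parameter text by matching `filePath`, `url`, and `query` fields with regular expressions. The sketch below pulls that logic into a hypothetical function, `partialDetailFrom`, using the same three patterns. One deliberate simplification: it splits the path on either separator in a single step instead of checking `/` and `\` in turn.

```javascript
// Hypothetical extraction of the label logic; the three regexes are copied from the diff.
function partialDetailFrom(paramsText) {
  const fpMatch = paramsText.match(/"filePath"\s*:\s*"([^"]+)"/);
  const urlMatch = paramsText.match(/"url"\s*:\s*"([^"]+)"/);
  const qMatch = paramsText.match(/"query"\s*:\s*"([^"]+)"/);
  if (fpMatch) {
    // Keep only the last path segment, for either separator style.
    const fp = fpMatch[1];
    return fp.split(/[\\/]/).pop() || fp;
  }
  if (urlMatch) {
    // Fall back to a 30-character prefix when the value is not a valid URL.
    try { return new URL(urlMatch[1]).hostname; } catch { return urlMatch[1].substring(0, 30); }
  }
  if (qMatch) {
    // Truncate long queries to 25 characters plus an ellipsis.
    return qMatch[1].substring(0, 25) + (qMatch[1].length > 25 ? '\u2026' : '');
  }
  return '';
}

const fromPath = partialDetailFrom('{"filePath":"src/components/Chat/ChatPanel.tsx"');
const fromUnfinished = partialDetailFrom('{"filePath":"src/comp');
const fromUrl = partialDetailFrom('{"url":"https://graysoft.dev/docs"}');
const fromQuery = partialDetailFrom('{"query":"' + 'a'.repeat(30) + '"}');
```

Each pattern requires the closing quote of the value, so a field that is still streaming yields no detail and the label falls back to the bare function name until the value is complete.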