This document exists because Brendan has had to repeat himself hundreds of times. If you are an AI agent working on this project, READ THIS BEFORE DOING ANYTHING.
guIDE is a desktop IDE where users load ANY local GGUF model and it just works. Chat, tool calling, browsing, code generation: all powered by whatever model the user chose. Model-agnostic. The app adapts at runtime via dynamic model profiles.
- User loads any model. Asks it to do something. It does it coherently to the best of that model's actual ability.
- If a model produces good output in LM Studio, it must produce equally good or better output in guIDE. The pipeline helps, never hinders.
- Works out of the box. No hand-tuning per model.
- No tailoring code to specific model names
- No benchmarking one model and declaring victory
- No guardrails, quality gates, or kill switches that prevent models from working
- No timeouts that mask underlying problems (timeouts = failure)
The profile system (family + size tier) IS the right approach. Different size models genuinely need different parameters. A 0.6B model needs different sampling than a 30B. This is NOT "hand-tuning per model" — it's per-family-per-size-tier configuration, which scales. The profile system is NOT a fallback — it IS the runtime.
Unknown models get sensible defaults derived from the closest matching tier.
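A minimal sketch of family + size-tier resolution, to make the lookup concrete. Everything here is an assumption for illustration: the tier names, the boundaries, the parameter values, and `exampleFamily` are hypothetical and are not the contents of `modelProfiles.js`.

```javascript
// Hypothetical tiers. Boundaries are illustrative, not the project's real cutoffs.
const TIERS = [
  { name: 'tiny', maxParamsB: 1 },
  { name: 'small', maxParamsB: 4 },
  { name: 'large', maxParamsB: Infinity },
];

// Hypothetical per-tier defaults, used for every model including unknown ones.
const DEFAULTS = {
  tiny: { temperature: 0.6, repeatPenalty: 1.15 },
  small: { temperature: 0.7, repeatPenalty: 1.1 },
  large: { temperature: 0.7, repeatPenalty: 1.05 },
};

// Hypothetical per-family overrides, keyed by family and then by tier.
const FAMILY_OVERRIDES = {
  exampleFamily: { tiny: { temperature: 0.5 } },
};

// Pick the first tier whose upper bound covers the parameter count (in billions).
function tierFor(paramsB) {
  return TIERS.find((t) => paramsB <= t.maxParamsB).name;
}

// Tier defaults first, family overrides on top. Unknown families get tier defaults only.
function resolveProfile(family, paramsB) {
  const tier = tierFor(paramsB);
  const known = Object.prototype.hasOwnProperty.call(FAMILY_OVERRIDES, family);
  const overrides = known ? FAMILY_OVERRIDES[family][tier] || {} : {};
  return { family: known ? family : 'default', tier, ...DEFAULTS[tier], ...overrides };
}
```

No model name appears anywhere in the lookup. Adding a family means adding one table entry, not a code branch.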
- 0.6B models: CAN make tool calls, CAN chain a couple of them. They hallucinate and repeat themselves but they ARE capable. Don't restrict them to single calls without testing first. They've proven they can do it.
- 1-4B models: Should handle multi-step tasks reliably.
- 4B+: Should handle complex chains.
- ALL models must produce COHERENT output. Even if smaller ones do less, they must not produce gibberish.
- Test before implementing. Prove a problem exists before fixing it.
- When shown a failing interaction, analyze what ACTUALLY happened.
- If something works, leave it alone. "Looks good" is a valid answer.
- Say "Brendan you're wrong" or "there's nothing else to do" when that's the truth.
- Give honest opinions, even if they disagree with what Brendan said.
- Find ROOT CAUSES, not bandaids.
- Be concise. Do the work. Stop narrating.
- Don't manufacture problems. If there's nothing to fix, SAY SO.
- Don't use cheerleader language: "smoking gun", "this changes everything", "game changer"
- Don't agree with everything. Brendan needs honest pushback.
- Don't run audit/fix loops that create new problems to fix later.
- Don't implement changes based on hypotheses. Test first.
- Don't reference specific model names when discussing general architecture.
- Don't apologize repeatedly. Just work.
- Don't throw bandaids. If you can't find the root cause, say so.
- Root cause found and fixed: `projectPath` was null at startup because it's only set when the user opens a folder via File > Open Folder. `_writeFile` joined basename with `''` → wrote to process CWD. Orphaned files confirmed at D:\models\models, C:\Users\brend\IDE, etc.
- Fix: `_writeFile` and `_createDirectory` now return a clear error when no project is open. Removed `|| ''` fallback. Added `files-changed` IPC notification so FileTree auto-refreshes.
- Note: File Explorer New Folder/New File buttons: not yet investigated.
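The shape of that guard, as a sketch. `resolveProjectPath` and its return shape are hypothetical names for illustration; the real `_writeFile` and `_createDirectory` may be structured differently.

```javascript
const path = require('node:path');

// Hypothetical helper: resolve a tool-supplied relative path against the open project.
// With no project open it returns an error instead of falling back to '' (which
// would silently resolve against the process CWD).
function resolveProjectPath(projectPath, relativePath) {
  if (!projectPath) {
    return { ok: false, error: 'No project is open. Open a folder before writing files.' };
  }
  return { ok: true, path: path.join(projectPath, relativePath) };
}
```

The point is the missing `|| ''`: a null project is reported to the caller, never papered over.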
- Root cause: `onHeadersReceived` callback was `async` with `await` inside, which caused timing issues with Electron's webRequest callback mechanism. The `callback()` was delayed while `activateWithToken` ran, potentially blocking the OAuth redirect. Multiple strategies (4) all failed due to race conditions.
- Fix (v4): Replaced `onHeadersReceived` with `session.cookies.on('changed')` event. This is Electron's native cookie change event; it fires synchronously when any cookie is set in the session, no timing race possible. Fallback: if cookie event doesn't fire within 2s of landing on /account, tries direct cookie read.
- Caveat: Cannot test OAuth end-to-end in this environment. If it fails again, check logs at %APPDATA%/guIDE/logs/guide-main.log for `[OAuth]` entries.
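A sketch of the cookie-event approach. It assumes `cookies` behaves like Electron's `session.cookies`, whose `'changed'` listeners receive `(event, cookie, cause, removed)`. The function name, the cookie name passed in, and the `onToken` callback are hypothetical. The 2s direct-read fallback is left out. Not tested against a real OAuth flow.

```javascript
// Hypothetical watcher: calls onToken once, the first time the named cookie is set.
// `cookies` is expected to be an emitter like Electron's session.cookies.
function watchForAuthCookie(cookies, cookieName, onToken) {
  let done = false;
  const listener = (_event, cookie, _cause, removed) => {
    // Ignore deletions, unrelated cookies, and anything after the first hit.
    if (done || removed || cookie.name !== cookieName) return;
    done = true;
    cookies.removeListener('changed', listener);
    onToken(cookie.value);
  };
  cookies.on('changed', listener);
  // Return a cancel function so the caller can stop watching.
  return () => {
    done = true;
    cookies.removeListener('changed', listener);
  };
}
```

Nothing in the listener is awaited, so no request callback is held up while the token is handled.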
- Root cause: chatHistory persisted intermediate agentic turns (injected tool feedback, continue instructions) across separate user messages. For 0.6B models with limited attention, the pattern `user: [tool feedback]` → `model: "No further action"` was strongly reinforced, causing the model to repeat it regardless of new input.
- Fix: After agentic loop completes, chatHistory is condensed to system + original user message + final model response. KV cache invalidated.
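A sketch of the condensation step. The `{ role, text }` message shape, the function name, and the `turnStartIndex` parameter are assumptions. The source does not say how earlier completed exchanges are handled; this sketch keeps everything before the loop's user message untouched. KV cache invalidation is left to the caller.

```javascript
// Hypothetical condensation: drop the intermediate turns of one agentic loop.
// history: array of { role, text }; turnStartIndex: index of the original user message.
function condenseAgenticTurn(history, turnStartIndex) {
  if (turnStartIndex < 0 || turnStartIndex >= history.length) {
    throw new RangeError('turnStartIndex is outside the history');
  }
  const before = history.slice(0, turnStartIndex); // system prompt and earlier turns
  const originalUser = history[turnStartIndex];
  // The last model message after the user message is the final response.
  const finalModel = history
    .slice(turnStartIndex + 1)
    .reverse()
    .find((m) => m.role === 'model');
  return finalModel ? [...before, originalUser, finalModel] : [...before, originalUser];
}
```

For a single-turn history this leaves exactly system + original user message + final model response.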
- Root cause: `thinkTokens.mode = 'none'` in llama profile suppressed thinking tokens for ALL llama models. Thinking-variant models (trained with chain-of-thought) NEED to generate `<think>...</think>` before answering; without it, their logits produce gibberish.
- Fix: `_getModelSpecificParams()` now detects "thinking", "cot", "r1-distill", "reasoning" in the model name and overrides thinkTokens to budget mode.
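A sketch of that override. The four markers come from the fix above. The function name, the params shape, and the `'budget'` string are assumptions; the real `_getModelSpecificParams()` may differ.

```javascript
// Markers listed in the fix description.
const THINKING_MARKERS = ['thinking', 'cot', 'r1-distill', 'reasoning'];

// Hypothetical override: returns params with thinkTokens switched to budget mode
// when the model name contains one of the markers. The input is not mutated.
function applyThinkingOverride(modelName, baseParams) {
  const name = modelName.toLowerCase();
  const isThinkingVariant = THINKING_MARKERS.some((marker) => name.includes(marker));
  if (!isThinkingVariant) return baseParams;
  return {
    ...baseParams,
    thinkTokens: { ...baseParams.thinkTokens, mode: 'budget' }, // 'budget' value is assumed
  };
}
```

Plain substring matching can false-match: "cot" is a short marker and may appear inside unrelated names. Whether that matters in practice has not been tested here.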
- Root cause: Grammar-constrained generation hung (0 tokens in rejection sampling). After 2 grammar timeouts + 1 text-mode timeout, rollback budget exhaustion RESET `consecutiveEmptyGrammarRetries` to 0, re-enabling grammar for the next iteration. With the 15s grammar timeout then in effect, 3 nudges × (15s+15s+120s) = 7.5+ minutes of dead time.
- Fix: Don't reset `consecutiveEmptyGrammarRetries` on rollback budget exhaustion. Once grammar fails, it stays disabled. Grammar timeout reduced from 15s → 5s.
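A sketch of the counter behaviour after the fix. Only `consecutiveEmptyGrammarRetries` is a name from the source. The class, its methods, and the threshold of 2 (taken from "2 grammar timeouts" above) are assumptions about how the real loop is wired.

```javascript
// Assumed threshold: grammar is dropped after this many consecutive empty attempts.
const MAX_EMPTY_GRAMMAR_RETRIES = 2;

class GrammarState {
  constructor() {
    this.consecutiveEmptyGrammarRetries = 0;
  }

  get grammarEnabled() {
    return this.consecutiveEmptyGrammarRetries < MAX_EMPTY_GRAMMAR_RETRIES;
  }

  // Record the token count of a grammar-constrained attempt.
  recordGrammarAttempt(tokenCount) {
    if (!this.grammarEnabled) return; // once disabled, nothing re-enables it
    if (tokenCount === 0) this.consecutiveEmptyGrammarRetries += 1;
    else this.consecutiveEmptyGrammarRetries = 0;
  }

  // The bug was a reset to 0 here. After the fix this leaves the counter alone.
  onRollbackBudgetExhausted() {}
}
```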
- Root cause: `initialize()` called `loadModel()` (180s timeout) but had no way to know it was superseded. Second `initialize()` call ran concurrently, both wrote to `this.model`/`this.context`, wrong model ended up loaded.
- Fix: Added `_loadGeneration` monotonic counter. Each `initialize()` gets a unique ID and calls `checkSuperseded()` after every heavy await. Superseded loads throw immediately.
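A sketch of the generation-counter pattern. `_loadGeneration`, `initialize()`, and `checkSuperseded()` are names from the fix above. `ModelLoader`, `SupersededError`, `_beginLoad()`, and the injected `loadModel` function are hypothetical stand-ins for the real engine.

```javascript
class SupersededError extends Error {}

class ModelLoader {
  // loadModel: injected async function standing in for the real model load.
  constructor(loadModel) {
    this._loadModel = loadModel;
    this._loadGeneration = 0;
    this.model = null;
  }

  // Claim a new generation and return a check bound to it.
  _beginLoad() {
    const id = ++this._loadGeneration;
    return () => {
      if (id !== this._loadGeneration) {
        throw new SupersededError(`load ${id} superseded by load ${this._loadGeneration}`);
      }
    };
  }

  async initialize(modelPath) {
    const checkSuperseded = this._beginLoad();
    const model = await this._loadModel(modelPath);
    checkSuperseded(); // throws if a newer initialize() started during the await
    this.model = model; // only the newest load ever writes shared state
    return model;
  }
}
```

A superseded load fails loudly instead of overwriting `this.model`, so the caller that lost the race sees an error rather than a silently wrong model.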
- File Explorer New Folder / New File buttons don't work
- Tool call dropdowns expanding during streaming (code defaults to collapsed — may be streaming render issue where JSON isn't parsed as a tool call block)
- System may be over-engineered. Brendan suspects there are too many moving parts and that they are actively hindering the models.
- When investigating issues, consider whether existing code is CAUSING the problem before adding more code on top.
- Simplicity > cleverness. If a simpler approach works, use it.
- Only implement fixes you are CERTAIN will solve the problem.
- If you cannot determine the root cause, say "I don't know" — this is always acceptable.
- Never implement a guess and call it a fix. Bandaids waste Brendan's time.
- If a fix requires testing you can't do (e.g., OAuth), SAY SO explicitly.
- When asked to find problems, genuinely look. If there are none, say "I found nothing."
- Do not fabricate issues to appear helpful. Brendan catches this every time.
- "I don't know" is always better than a wrong answer.
- "There's nothing to fix" is always better than a fake fix.
- "I can't test this" is always better than claiming something works when you haven't verified it.
- Never claim a fix works unless you have proof (build output, test result, etc.).
- Persistent file logs exist at %APPDATA%/guIDE/logs/guide-main.log
- All info/warn/error logs are written to file automatically
- Set LOG_LEVEL=debug for verbose output
- Always check log files first when diagnosing issues
- Electron + Vite + React
- node-llama-cpp for local inference
- Main process: main/ directory (agenticChat.js, llmEngine.js, modelProfiles.js, etc.)
- Frontend: src/ directory
- Website: website/ directory (Next.js)
- Models on D:\models
- LM Studio: simple prompt, no grammar constraining, default sampling → coherent output.
- guIDE: system prompt + tool definitions + few-shot examples + grammar constraining + custom sampling → potentially degraded output.
The pipeline must HELP models, not fight them.