A powerful VS Code extension for chatting with local Large Language Models directly in your editor—completely private and secure
- Privacy-First: All data stays on your machine—no cloud APIs required
- Multiple LLM Support: Works with Ollama, LM Studio, vLLM, and OpenAI-compatible endpoints
- Persistent Conversations: Chat history is maintained throughout your session
- Streaming Responses: See AI responses in real-time as they're generated
- Read Files: Load any workspace file into the conversation context
- List Directories: Browse your project structure directly from chat
- Search Files: Find files using glob patterns (e.g., `**/*.ts`)
- Workspace Info: Get metadata about your project (Git status, dependencies, etc.)
- Smart Context: Send active file or selected code to the AI instantly
- AI-Powered File Generation: Let the AI create complete files based on your requirements
- Syntax Detection: Automatically detects file types and applies proper formatting
- Safe Operations: Confirmation prompts before creating/overwriting files
- Selection to File: Convert any code selection into a new file
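To make the Streaming Responses feature above more concrete, here is a minimal sketch of how text deltas can be pulled out of an OpenAI-compatible streaming response, which arrives as server-sent-event lines of the form `data: {json}` and ends with `data: [DONE]`. The function name `parseSseChunk` and the `StreamResult` type are hypothetical and not taken from this extension's source; the extension's actual parsing may differ.

```typescript
// Hypothetical helper (not from the extension's source): extracts text deltas
// from one chunk of an OpenAI-style server-sent-event stream.
interface StreamResult {
  text: string;  // concatenated content deltas found in this chunk
  done: boolean; // true once the "[DONE]" sentinel has been seen
}

function parseSseChunk(chunk: string): StreamResult {
  let text = "";
  let done = false;
  for (const rawLine of chunk.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    try {
      const parsed = JSON.parse(payload);
      const delta = parsed?.choices?.[0]?.delta?.content;
      if (typeof delta === "string") text += delta;
    } catch {
      // Simplification: malformed or partial JSON lines are ignored here.
    }
  }
  return { text, done };
}
```

A real client would also buffer incomplete lines between network chunks, since a JSON payload can be split across two reads; this sketch assumes each chunk contains whole lines.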
Built-in slash commands for quick actions:
- `/read <file-path>` - Read a file and add it to the conversation
- `/list [directory]` - List files in a directory
- `/search <pattern>` - Search for files with glob patterns
- `/workspace` - Show workspace information
- `/help` - Display all available commands
- Download the `.vsix` file from the releases page
- Open VS Code
- Go to Extensions view (`Ctrl+Shift+X` or `Cmd+Shift+X`)
- Click the `...` menu → `Install from VSIX...`
- Select the downloaded `.vsix` file
git clone https://github.com/markusbegerow/local-llm-vscode.git
cd local-llm-vscode
npm install
npm run compile

Then press F5 in VS Code to launch the extension in debug mode.
Option A: Ollama (Recommended)
# Install Ollama from https://ollama.ai
ollama pull llama3.1
ollama serve

Option B: LM Studio
- Download from lmstudio.ai
- Load a model
- Start the local server (default: `http://localhost:1234`)
Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P) and run:
Local LLM: Configure API Settings
Or manually configure in VS Code settings:
{
"localLLM.apiUrl": "http://localhost:11434",
"localLLM.model": "llama3.1",
"localLLM.apiCompat": "openai",
"localLLM.temperature": 0.7
}

- Open Command Palette
- Run: `Local LLM: Open Chat`
- Start asking questions!
/read src/extension.ts
Can you explain how this extension works and suggest improvements?
/workspace
/list src
What is the architecture of this project?
Create a TypeScript utility function that validates email addresses with proper error handling and unit tests.
The AI will suggest a file with path and content. Click "Create" to save it!
/search **/*.ts
Find all TypeScript files, then help me refactor the error handling patterns across the project.
- Open any file in the editor
- Press `Ctrl+Shift+P` (or `Cmd+Shift+P`)
- Run: `Local LLM: Send Active File to Chat`
- Ask questions about the code
| Command | Description |
|---|---|
| `Local LLM: Open Chat` | Open the chat panel |
| `Local LLM: Configure API Settings` | Set up your LLM connection |
| `Local LLM: Send Active File to Chat` | Send current file to chat |
| `Local LLM: Send File to Chat` | Browse and send any file to chat |
| `Local LLM: List Workspace Files` | List files in workspace |
| `Local LLM: Get Workspace Info` | Show workspace metadata |
| `Local LLM: Search Files` | Search files with glob patterns |
| `Local LLM: Clear Conversation History` | Reset the chat |
| `Local LLM: Create File From Selection` | Create new file from selection |
| Command | Description | Example |
|---|---|---|
| `/read <path>` | Read file contents | `/read package.json` |
| `/list [dir]` | List directory files | `/list src` |
| `/search <pattern>` | Find files by pattern | `/search **/*.json` |
| `/workspace` | Show workspace info | `/workspace` |
| `/help` | Show all commands | `/help` |
| `/write <path>` | Create file inline | `/write test.js` |
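As an illustration of how chat input could be split into a command name and its argument, here is a minimal parser sketch. The command names come from the table above; the `parseSlashCommand` function and its types are hypothetical and are not taken from the extension's source.

```typescript
// Command names are from the table above; everything else is an illustrative assumption.
type SlashCommand = "read" | "list" | "search" | "workspace" | "help" | "write";

const KNOWN_COMMANDS: readonly SlashCommand[] = [
  "read", "list", "search", "workspace", "help", "write",
];

interface ParsedCommand {
  name: SlashCommand;
  arg: string; // empty string when the command has no argument
}

// Returns null for ordinary chat messages and for unknown slash commands.
function parseSlashCommand(input: string): ParsedCommand | null {
  const trimmed = input.trim();
  if (!trimmed.startsWith("/")) return null;

  // Split at the first whitespace: "/read src/a.ts" -> "read" + "src/a.ts".
  const spaceIdx = trimmed.search(/\s/);
  const rawName = spaceIdx === -1 ? trimmed.slice(1) : trimmed.slice(1, spaceIdx);
  const name = rawName.toLowerCase();
  const arg = spaceIdx === -1 ? "" : trimmed.slice(spaceIdx).trim();

  if (!(KNOWN_COMMANDS as readonly string[]).includes(name)) return null;
  return { name: name as SlashCommand, arg };
}
```

With this sketch, `parseSlashCommand("/search **/*.json")` yields the name `search` with the argument `**/*.json`, while a plain question returns `null` and would be sent to the model unchanged.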
| Setting | Default | Description |
|---|---|---|
| `localLLM.apiUrl` | `http://localhost:11434` | Base URL of your LLM API |
| `localLLM.model` | `llama3.1` | Model name to use |
| `localLLM.apiCompat` | `openai` | API compatibility (`openai` or `ollama`) |
| `localLLM.customEndpoint` | `""` | Full endpoint URL (optional) |
| `localLLM.temperature` | `0.7` | Sampling temperature (0.0-2.0) |
| `localLLM.maxTokens` | `2048` | Maximum response tokens |
| `localLLM.systemPrompt` | (default) | System prompt for the AI |
| `localLLM.maxHistoryMessages` | `50` | Max messages in history |
| `localLLM.requestTimeout` | `120000` | Request timeout (ms) |
| `localLLM.maxFileSize` | `1048576` | Max file size (bytes) |
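The sketch below shows one way `apiUrl`, `apiCompat`, and `customEndpoint` could combine into the URL that chat requests are sent to. The default values are copied from the table above. The paths `/v1/chat/completions` (OpenAI-compatible servers) and `/api/chat` (Ollama's native API) are the standard chat routes for those APIs, but the `resolveChatEndpoint` function and the rule that `customEndpoint` takes precedence are assumptions, not confirmed behavior of this extension.

```typescript
type ApiCompat = "openai" | "ollama";

// Subset of the settings table above, with the documented defaults.
interface LocalLlmSettings {
  apiUrl: string;
  model: string;
  apiCompat: ApiCompat;
  customEndpoint: string;
  temperature: number;
  maxTokens: number;
}

const DEFAULTS: LocalLlmSettings = {
  apiUrl: "http://localhost:11434",
  model: "llama3.1",
  apiCompat: "openai",
  customEndpoint: "",
  temperature: 0.7,
  maxTokens: 2048,
};

// Hypothetical resolution rule: a non-empty customEndpoint wins,
// otherwise the path is chosen from the compatibility mode.
function resolveChatEndpoint(s: LocalLlmSettings): string {
  const custom = s.customEndpoint.trim();
  if (custom !== "") return custom;

  const base = s.apiUrl.replace(/\/+$/, ""); // drop trailing slashes
  return s.apiCompat === "ollama"
    ? `${base}/api/chat`
    : `${base}/v1/chat/completions`;
}
```

Under these assumptions the defaults resolve to `http://localhost:11434/v1/chat/completions`, which works because Ollama also serves an OpenAI-compatible route on its default port.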
OpenAI Compatible (Recommended)
- LM Studio
- vLLM
- text-generation-webui
- Most modern LLM servers
{
"localLLM.apiCompat": "openai",
"localLLM.apiUrl": "http://localhost:1234"
}

Ollama Native
{
"localLLM.apiCompat": "ollama",
"localLLM.apiUrl": "http://localhost:11434"
}

- ✅ Path Traversal Protection: Prevents access outside workspace
- ✅ File Size Limits: Configurable maximum file sizes
- ✅ Content Security Policy: XSS protection in webviews
- ✅ Secure Token Storage: API keys stored in VS Code secrets
- ✅ Confirmation Prompts: Review before creating/overwriting files
- ✅ Local-Only: No external API calls unless you configure them
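To illustrate the path traversal and file size checks listed above, here is a minimal sketch using Node's `path` module. The function names `resolveInsideWorkspace` and `isWithinSizeLimit` are hypothetical; the extension's actual checks may be implemented differently. The 1048576-byte limit is the documented default of `localLLM.maxFileSize`.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns the absolute path if `relativePath` stays
// inside `workspaceRoot`, or null if it is absolute or escapes the root.
function resolveInsideWorkspace(
  workspaceRoot: string,
  relativePath: string,
): string | null {
  if (path.isAbsolute(relativePath)) return null; // only relative paths allowed

  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, relativePath);

  // If the path from root to target starts by going up, it left the workspace.
  const rel = path.relative(root, target);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    return null;
  }
  return target;
}

// Hypothetical helper: default matches localLLM.maxFileSize (1 MiB).
function isWithinSizeLimit(sizeBytes: number, maxFileSize = 1048576): boolean {
  return sizeBytes >= 0 && sizeBytes <= maxFileSize;
}
```

This sketch compares paths lexically and does not follow symbolic links; a stricter check would resolve the real path on disk before comparing.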
- VS Code: Version 1.85.0 or higher
- Node.js: Version 20.x or higher (for development)
- Local LLM: Ollama, LM Studio, or compatible server
# Clone repository
git clone https://github.com/markusbegerow/local-llm-vscode.git
cd local-llm-vscode
# Install dependencies
npm install
# Compile TypeScript
npm run compile
# Watch mode for development
npm run watch
# Package as VSIX
npm install -g @vscode/vsce
vsce package

local-llm-vscode/
├── src/
│ ├── extension.ts # Extension entry point
│ ├── chatPanel.ts # Chat UI and logic
│ ├── llm.ts # LLM API integration
│ ├── utils.ts # Workspace utilities
│ └── types.ts # TypeScript types
├── media/
│ └── webview.js # Chat UI JavaScript
├── out/ # Compiled JavaScript
├── package.json # Extension manifest
└── tsconfig.json # TypeScript config
- Verify your LLM server is running: `curl http://localhost:11434/api/tags` (Ollama) or `curl http://localhost:1234/v1/models` (LM Studio)
- Check the API URL in settings matches your server
- Try switching API compatibility mode
- Check the Output panel (View → Output → Local LLM Chat) for errors
- Increase the request timeout in settings
- Verify your model is loaded in the LLM server
- Ensure you have a workspace/folder open in VS Code
- Check file paths are relative (no absolute paths allowed)
- Verify file size limits in settings
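The connection check from the troubleshooting steps above can also be scripted. The sketch below builds the same URLs as the `curl` commands (`/api/tags` for Ollama, `/v1/models` for OpenAI-compatible servers such as LM Studio) and probes them with a timeout. The function names are hypothetical, and the code relies on the global `fetch` and `AbortSignal.timeout` available in Node.js 20.

```typescript
type ApiCompat = "openai" | "ollama";

// Builds the model-listing URL used by the curl checks above.
function modelListUrl(apiUrl: string, compat: ApiCompat): string {
  const base = apiUrl.replace(/\/+$/, ""); // drop trailing slashes
  return compat === "ollama" ? `${base}/api/tags` : `${base}/v1/models`;
}

// Hypothetical helper: resolves to true if the server answers with a 2xx
// status before the timeout, and false on any network error or timeout.
async function isServerReachable(
  apiUrl: string,
  compat: ApiCompat,
  timeoutMs = 5000,
): Promise<boolean> {
  try {
    const res = await fetch(modelListUrl(apiUrl, compat), {
      signal: AbortSignal.timeout(timeoutMs),
    });
    return res.ok;
  } catch {
    return false;
  }
}
```

For example, `await isServerReachable("http://localhost:11434", "ollama")` returning `false` suggests the server is not running or the URL in settings is wrong.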
- Llama 3.1 8B: Fast, good for general coding
- CodeLlama 13B: Specialized for code generation
- Qwen 2.5 Coder: Excellent code understanding
- DeepSeek Coder: Strong at complex algorithms
- Llama 3.1: Best all-around performance
- Mistral 7B: Fast and efficient
- Phi-3: Compact but capable
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Thanks to the Ollama team for making local LLMs accessible
- LM Studio for providing an excellent local inference platform
- The VS Code extension API team for comprehensive documentation
If you encounter any issues or have questions:
- 🐛 Report bugs
- 💡 Request features
- ⭐ Star the repo if you find it useful!
If you like this project, support further development with a repost or coffee:
- 🧑💻 Markus Begerow
- 💾 GitHub
Privacy Notice: This extension operates entirely locally. No data is sent to external servers unless you explicitly configure it to use a remote API endpoint.