Complete guide to using Cursor IDE with Lynkr for cost savings, provider flexibility, and local model support.
Lynkr provides full Cursor IDE support through OpenAI-compatible API endpoints, enabling you to use Cursor with any provider (Databricks, Bedrock, OpenRouter, Ollama, etc.) while maintaining all Cursor features.
- 💰 60-80% cost savings vs Cursor's default GPT-4 pricing
- 🔓 Provider choice - Use Claude, local models, or any supported provider
- 🏠 Self-hosted - Full control over your AI infrastructure
- ✅ Full compatibility - All Cursor features work (chat, autocomplete, @Codebase search)
- 🔒 Privacy - Option to run 100% locally with Ollama
# Navigate to Lynkr directory
cd /path/to/Lynkr
# Start with any provider (Databricks, Bedrock, OpenRouter, Ollama, etc.)
npm start
# Wait for: "Server listening at http://0.0.0.0:8081" (or your configured PORT)Note: Lynkr runs on port 8081 by default (configured in .env as PORT=8081)
-
Open Cursor Settings
- Mac: Click Cursor menu → Settings (or press
Cmd+,) - Windows/Linux: Click File → Settings (or press
Ctrl+,)
- Mac: Click Cursor menu → Settings (or press
-
Navigate to Models Section
- In the Settings sidebar, find Features section
- Click on Models
-
Configure OpenAI API Settings
Fill in these three fields:
API Key:
sk-lynkr(Cursor requires a non-empty value, but Lynkr ignores it. You can use any text like "dummy" or "lynkr")
Base URL:
http://localhost:8081/v1⚠️ Critical:- Use port 8081 (or your configured PORT in .env)
- Must end with
/v1 - Include
http://prefix - ✅ Correct:
http://localhost:8081/v1 - ❌ Wrong:
http://localhost:8081(missing/v1) - ❌ Wrong:
localhost:8081/v1(missinghttp://)
Model:
Choose based on your
MODEL_PROVIDERin.env:- Bedrock:
claude-3.5-sonnetorclaude-sonnet-4.5 - Databricks:
claude-sonnet-4.5 - OpenRouter:
anthropic/claude-3.5-sonnet - Ollama:
qwen2.5-coder:latest(or your OLLAMA_MODEL) - Azure OpenAI:
gpt-4oor your deployment name - OpenAI:
gpt-4oor your model
-
Save Settings (auto-saves in Cursor)
┌─────────────────────────────────────────────────────────┐
│ Cursor Settings → Models → OpenAI API │
├─────────────────────────────────────────────────────────┤
│ │
│ API Key: sk-lynkr │
│ (or any non-empty value) │
│ │
│ Base URL: http://localhost:8081/v1 │
│ ⚠️ Must include /v1 │
│ │
│ Model: claude-3.5-sonnet │
│ (or your provider's model) │
│ │
└─────────────────────────────────────────────────────────┘
Test 1: Basic Chat (Cmd+L / Ctrl+L)
You: "Hello, can you see this?"
Expected: Response from your provider via Lynkr ✅
Test 2: Inline Edits (Cmd+K / Ctrl+K)
Select code → Press Cmd+K → "Add error handling"
Expected: Code modifications from your provider ✅
Test 3: Verify Health
curl http://localhost:8081/v1/health
# Expected response:
{
"status": "ok",
"provider": "bedrock",
"openai_compatible": true,
"cursor_compatible": true,
"timestamp": "2026-01-11T12:00:00.000Z"
}| Feature | Without Embeddings | With Embeddings |
|---|---|---|
| Cmd+L chat | ✅ Works | ✅ Works |
| Inline autocomplete | ✅ Works | ✅ Works |
| Cmd+K edits | ✅ Works | ✅ Works |
| Manual @file references | ✅ Works | ✅ Works |
| Terminal commands | ✅ Works | ✅ Works |
| @Codebase semantic search | ❌ Requires embeddings | ✅ Works |
| Automatic context | ❌ Requires embeddings | ✅ Works |
| Find similar code | ❌ Requires embeddings | ✅ Works |
Autocomplete Behavior:
- Cursor's inline autocomplete uses Cursor's built-in models (fast, local)
- Autocomplete does NOT go through Lynkr
- Only these features use Lynkr:
- ✅ Chat (
Cmd+L/Ctrl+L) - ✅ Cmd+K inline edits
- ✅ @Codebase search (with embeddings)
- ❌ Autocomplete (uses Cursor's models)
- ✅ Chat (
For Cursor's @Codebase semantic search, you need embeddings support.
If you configured MODEL_PROVIDER=openrouter, embeddings work automatically with the same OPENROUTER_API_KEY - no additional setup needed! OpenRouter handles both chat AND embeddings with one key.
If you're using Databricks, Bedrock, Ollama, or other providers for chat, add ONE of these for embeddings (ordered by privacy):
Best for: Privacy, offline work, zero cloud dependencies
# Pull embedding model
ollama pull nomic-embed-text
# Add to .env
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
OLLAMA_EMBEDDINGS_ENDPOINT=http://localhost:11434/api/embeddingsPopular models:
nomic-embed-text(768 dim, 137M params) - Recommended, best all-aroundmxbai-embed-large(1024 dim, 335M params) - Higher qualityall-minilm(384 dim, 23M params) - Fastest/smallest
Cost: 100% FREE 🔒 Privacy: All data stays on your machine
Best for: Performance, GGUF models, GPU acceleration
# Download embedding model (example: nomic-embed-text GGUF)
wget https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q4_K_M.gguf
# Start llama-server with embedding model
./llama-server -m nomic-embed-text-v1.5.Q4_K_M.gguf --port 8080 --embedding
# Add to .env
LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddingsPopular models:
nomic-embed-text-v1.5.Q4_K_M.gguf- Recommended, 768 dimall-MiniLM-L6-v2.Q4_K_M.gguf- Smallest, fastest, 384 dimbge-large-en-v1.5.Q4_K_M.gguf- Highest quality, 1024 dim
Cost: 100% FREE 🔒 Privacy: All data stays on your machine Performance: Faster than Ollama, optimized C++
Best for: Simplicity, quality, one key for everything
# Add to .env (uses same key as chat if you're already using OpenRouter)
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-smallPopular models:
openai/text-embedding-3-small- $0.02 per 1M tokens (80% cheaper!) Recommendedopenai/text-embedding-ada-002- $0.10 per 1M tokens (standard)openai/text-embedding-3-large- $0.13 per 1M tokens (best quality, 3072 dim)voyage/voyage-code-2- $0.12 per 1M tokens (specialized for code)
Cost: ~$0.01-0.10/month for typical usage Privacy: Cloud-based
Best for: Best quality, direct OpenAI access
# Add to .env
OPENAI_API_KEY=sk-your-openai-api-key
# Optionally specify model (defaults to text-embedding-ada-002)
# OPENAI_EMBEDDINGS_MODEL=text-embedding-3-smallPopular models:
text-embedding-3-small- $0.02 per 1M tokens Recommendedtext-embedding-ada-002- $0.10 per 1M tokenstext-embedding-3-large- $0.13 per 1M tokens (best quality)
Cost: ~$0.01-0.10/month for typical usage Privacy: Cloud-based
By default, Lynkr uses the same provider as MODEL_PROVIDER for embeddings. To use a different provider:
# Use Databricks for chat, but Ollama for embeddings (privacy + cost savings)
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key
# Override embeddings provider
EMBEDDINGS_PROVIDER=ollama
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-textRecommended setups:
- 100% Local/Private: Ollama chat + Ollama embeddings (zero cloud dependencies)
- Hybrid: Databricks/Bedrock chat + Ollama embeddings (private search, cloud chat)
- Simple Cloud: OpenRouter chat + OpenRouter embeddings (one key for both)
After configuration, restart Lynkr and @Codebase will work!
Lynkr implements all 4 OpenAI API endpoints for full Cursor compatibility:
Chat with streaming support
- Handles all chat/completion requests
- Converts OpenAI format ↔ Anthropic format automatically
- Full tool calling support
- Streaming responses
List available models
- Returns models based on configured provider
- Updates dynamically when you change providers
Generate embeddings for @Codebase search
- Supports 4 providers: Ollama, llama.cpp, OpenRouter, OpenAI
- Automatic provider detection
- Falls back gracefully if not configured (returns 501)
Health check
- Verify Lynkr is running
- Check provider status
- Returns status, provider info, and compatibility flags
Scenario: 100K requests/month, typical Cursor usage
| Setup | Monthly Cost | Embeddings Setup | Features | Privacy |
|---|---|---|---|---|
| Cursor native (GPT-4) | $20-50 | Built-in | All features | Cloud |
| Lynkr + OpenRouter | $5-10 | ⚡ Same key for both | All features, simplest setup | Cloud |
| Lynkr + Databricks | $15-30 | +Ollama/OpenRouter | All features | Cloud chat, local/cloud search |
| Lynkr + Ollama + Ollama embeddings | 100% FREE 🔒 | Ollama (local) | All features, 100% local | 100% Local |
| Lynkr + Ollama + llama.cpp embeddings | 100% FREE 🔒 | llama.cpp (local) | All features, 100% local | 100% Local |
| Lynkr + Ollama + OpenRouter embeddings | $0.01-0.10 | OpenRouter (cloud) | All features, hybrid | Local chat, cloud search |
| Lynkr + Ollama (no embeddings) | FREE | None | Chat/Cmd+K only, no @Codebase | 100% Local |
Ollama + Ollama embeddings
- Cost: 100% FREE
- Privacy: All data stays on your machine
- Features: Full @Codebase support with local embeddings
- Perfect for: Sensitive codebases, offline work, privacy requirements
MODEL_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-textOpenRouter
- Cost: $5-10/month
- Setup: ONE key for chat + embeddings, no extra setup
- Features: 100+ models, automatic fallbacks
- Perfect for: Easy setup, flexibility, cost optimization
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet
# Embeddings work automatically with same key!Databricks or Azure Anthropic
- Cost: $15-30/month (enterprise pricing)
- Features: Claude Sonnet 4.5, enterprise SLA
- Perfect for: Production use, enterprise compliance
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key
# Add Ollama embeddings for privacy
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-textAWS Bedrock
- Cost: $10-20/month (100+ models)
- Features: Claude + DeepSeek + Qwen + Nova + Titan + Llama
- Perfect for: AWS integration, multi-model flexibility
MODEL_PROVIDER=bedrock
AWS_BEDROCK_API_KEY=your-bearer-token
AWS_BEDROCK_REGION=us-east-1
AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0Ollama or llama.cpp
- Latency: 100-500ms (local inference)
- Cost: 100% FREE
- Perfect for: Fast iteration, local development
Symptoms: Cursor shows connection errors, can't reach Lynkr
Solutions:
-
Verify Lynkr is running:
# Check if Lynkr process is running on port 8081 lsof -i :8081 # Should show node process
-
Test health endpoint:
curl http://localhost:8081/v1/health # Should return: {"status":"ok"} -
Check port number:
- Verify Cursor Base URL uses correct port:
http://localhost:8081/v1 - Check
.envfile:PORT=8081 - If you changed PORT, update Cursor settings to match
- Verify Cursor Base URL uses correct port:
-
Verify URL format:
- ✅ Correct:
http://localhost:8081/v1 - ❌ Wrong:
http://localhost:8081(missing/v1) - ❌ Wrong:
localhost:8081/v1(missinghttp://)
- ✅ Correct:
Symptoms: Cursor says API key is invalid
Solutions:
- Lynkr doesn't validate API keys from Cursor
- This error means Cursor isn't reaching Lynkr at all
- Double-check Base URL in Cursor:
http://localhost:8081/v1 - Make sure you included
/v1at the end - Try clearing and re-entering the Base URL
Symptoms: Cursor can't find the model you specified
Solutions:
-
Match model name to your provider:
- Bedrock: Use
claude-3.5-sonnetorclaude-sonnet-4.5 - Databricks: Use
claude-sonnet-4.5 - OpenRouter: Use
anthropic/claude-3.5-sonnet - Ollama: Use your actual model name like
qwen2.5-coder:latest
- Bedrock: Use
-
Try generic names:
- Lynkr translates generic names, so try:
claude-3.5-sonnetgpt-4o- These work across most providers
-
Check provider logs:
# In Lynkr terminal # Look for "Unknown model" errors
Symptoms: @Codebase doesn't return results or shows error
Solutions:
-
Verify embeddings are configured:
curl http://localhost:8081/v1/embeddings \ -H "Content-Type: application/json" \ -d '{"input":"test","model":"text-embedding-ada-002"}' # Should return embeddings, not 501 error
-
Check embeddings provider:
# In .env, verify one of these is set: OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text # OR LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings # OR OPENROUTER_API_KEY=sk-or-v1-your-key # OR OPENAI_API_KEY=sk-your-key
-
Restart Lynkr after adding embeddings config
-
This is a Cursor indexing issue, not Lynkr:
- Cursor needs to re-index your codebase
- Try closing and reopening the workspace
Symptoms: Responses take 5+ seconds
Solutions:
-
Check provider latency:
- Local (Ollama/llama.cpp): Should be 100-500ms
- Cloud (OpenRouter/Databricks): Should be 500ms-2s
- Distant regions: Can be 2-5s
-
Enable tier-based routing for speed:
# Use Ollama for simple requests (fast), cloud for complex requests # Set all 4 TIER_* env vars to enable tier-based routing TIER_SIMPLE=ollama:llama3.2 TIER_MEDIUM=openrouter:openai/gpt-4o-mini TIER_COMPLEX=azure-openai:gpt-4o TIER_REASONING=azure-openai:gpt-4o FALLBACK_ENABLED=true
-
Check Lynkr logs:
- Look for actual response times
- Example:
Response time: 2500ms
Symptoms: @Codebase returns irrelevant files
Solutions:
-
Try better embedding models:
# For Ollama - upgrade to larger model ollama pull mxbai-embed-large # Better quality than nomic-embed-text OLLAMA_EMBEDDINGS_MODEL=mxbai-embed-large
-
Use cloud embeddings for better quality:
# OpenRouter has excellent embeddings OPENROUTER_API_KEY=sk-or-v1-your-key OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2 -
This is a Cursor indexing issue, not Lynkr:
- Cursor needs to re-index your codebase
- Try closing and reopening the workspace
Symptoms: Provider returns 429 errors
Solutions:
-
Enable fallback provider:
FALLBACK_ENABLED=true FALLBACK_PROVIDER=databricks
-
Switch to Ollama (no rate limits):
MODEL_PROVIDER=ollama OLLAMA_MODEL=llama3.1:8b
-
Use OpenRouter (pooled rate limits across providers):
MODEL_PROVIDER=openrouter
For detailed troubleshooting:
# In .env
LOG_LEVEL=debug
# Restart Lynkr
npm start
# Check logs for detailed request/response infoCursor IDE
↓ OpenAI API format
Lynkr Proxy
↓ Converts to Anthropic format
Your Provider (Databricks/Bedrock/OpenRouter/Ollama/etc.)
↓ Returns response
Lynkr Proxy
↓ Converts back to OpenAI format
Cursor IDE (displays result)
# Chat + Embeddings: OpenRouter handles both with ONE key
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key-here
# Done! Everything works with one keyBenefits:
- ✅ ONE key for chat + embeddings
- ✅ 100+ models available
- ✅ Automatic fallbacks
- ✅ Competitive pricing
# Chat: Ollama (local)
MODEL_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b
# Embeddings: Ollama (local)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
# Everything runs on your machine, zero cloud dependenciesBenefits:
- ✅ 100% FREE
- ✅ 100% private (all data stays local)
- ✅ Works offline
- ✅ Full @Codebase support
# Chat: Tier-based routing (set all 4 to enable)
TIER_SIMPLE=ollama:llama3.2
TIER_MEDIUM=openrouter:openai/gpt-4o-mini
TIER_COMPLEX=databricks:databricks-claude-sonnet-4-5
TIER_REASONING=databricks:databricks-claude-sonnet-4-5
FALLBACK_ENABLED=true
FALLBACK_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key
# Embeddings: Ollama (local, private)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
# Cost: Mostly FREE (Ollama handles 70-80% of simple requests)
# Only complex/reasoning requests go to DatabricksBenefits:
- ✅ Mostly FREE (70-80% of requests on Ollama via TIER_SIMPLE)
- ✅ Private embeddings (local search)
- ✅ Cloud quality for complex tasks
- ✅ Automatic intelligent tier-based routing
| Aspect | Cursor Native | Lynkr + Cursor |
|---|---|---|
| Providers | OpenAI only | 12+ providers (Bedrock, Databricks, OpenRouter, Ollama, llama.cpp, Moonshot, etc.) |
| Costs | OpenAI pricing | 60-80% cheaper (or 100% FREE with Ollama) |
| Privacy | Cloud-only | Can run 100% locally (Ollama + local embeddings) |
| Embeddings | Built-in (cloud) | 4 options: Ollama (local), llama.cpp (local), OpenRouter (cloud), OpenAI (cloud) |
| Control | Black box | Full observability, logs, metrics |
| Features | All Cursor features | All Cursor features (chat, Cmd+K, @Codebase) |
| Flexibility | Fixed setup | Mix providers (e.g., Bedrock chat + Ollama embeddings) |
- Embeddings Configuration - Detailed embeddings setup guide
- Provider Configuration - Configure all providers
- Installation Guide - Install Lynkr
- Troubleshooting - More troubleshooting tips
- FAQ - Frequently asked questions
- GitHub Discussions - Community Q&A
- GitHub Issues - Report bugs
- Troubleshooting Guide - Common issues and solutions