openecon-data now supports flexible LLM backends, allowing you to switch between OpenRouter, local models (Ollama, LM Studio), and other LLM providers without changing code.
- Cost Savings: Use local models instead of paid APIs
- Privacy: Keep all data processing on-premises
- Flexibility: Switch providers easily via configuration
- Future-Proof: Add new providers without refactoring
backend/services/llm/
├── base.py # Abstract LLM provider interface
├── openrouter_provider.py # OpenRouter implementation
├── local_provider.py # Local model support (Ollama/LM Studio)
├── factory.py # Provider selection logic
└── __init__.py # Package exports
- Configuration loads from .env:
  - LLM_PROVIDER: Which provider to use
  - LLM_MODEL: Which specific model
  - LLM_BASE_URL: For local providers
- Factory creates appropriate provider instance
- OpenRouterService uses provider for query parsing (backward compatible)
- Queries processed identically regardless of backend
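The flow above can be sketched roughly as follows. This is an illustrative outline, not the project's actual factory: `LLMSettings`, `load_llm_settings`, `PROVIDERS`, and `create_provider` are hypothetical names, the defaults are assumptions, and plain strings stand in for real provider objects so the example runs on its own.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical settings container; the real settings live in backend/config.py."""
    provider: str
    model: str
    base_url: Optional[str] = None


def load_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read the three variables listed above (defaults here are assumptions)."""
    return LLMSettings(
        provider=env.get("LLM_PROVIDER", "openrouter").strip().lower(),
        model=env.get("LLM_MODEL", "openai/gpt-4o-mini"),
        base_url=env.get("LLM_BASE_URL") or None,
    )


# Registry mapping a provider name to a constructor. Strings stand in for
# provider instances to keep the sketch self-contained.
PROVIDERS: Dict[str, Callable[[LLMSettings], str]] = {
    "openrouter": lambda s: f"OpenRouterProvider(model={s.model})",
    "ollama": lambda s: f"LocalProvider(model={s.model}, base_url={s.base_url})",
    "lm-studio": lambda s: f"LocalProvider(model={s.model}, base_url={s.base_url})",
}


def create_provider(settings: LLMSettings) -> str:
    """Pick the constructor for the configured provider, or fail loudly."""
    try:
        build = PROVIDERS[settings.provider]
    except KeyError:
        raise ValueError(f"Unknown LLM_PROVIDER: {settings.provider!r}") from None
    return build(settings)
```

Keeping the name-to-constructor mapping in one dictionary means an unknown `LLM_PROVIDER` value fails at startup with a clear message instead of surfacing later during a query.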
Add to your .env file:
# LLM Provider Configuration
LLM_PROVIDER=openrouter # Options: openrouter, vllm, ollama, lm-studio
LLM_MODEL=openai/gpt-4o-mini
# LLM_BASE_URL=http://localhost:11434 # For local providers
# LLM_TIMEOUT=30

LLM_PROVIDER=openrouter
LLM_MODEL=openai/gpt-4o-mini
OPENROUTER_API_KEY=sk-or-...
# Hosted last-resort used when a local provider fails at runtime:
# LLM_FALLBACK_MODEL=openai/gpt-oss-120b

Production note: the hosted app runs LLM_PROVIDER=vllm with a local gpt-oss-120b (SSH-tunneled vLLM at localhost:8000) and falls back to LLM_FALLBACK_MODEL on OpenRouter when the tunnel is down.
Available Models: GPT-4, Claude, Llama, Mistral, and 100+ others via OpenRouter
LLM_PROVIDER=ollama
LLM_MODEL=llama2
LLM_BASE_URL=http://localhost:11434

Setup: Install Ollama from https://ollama.ai/
ollama pull llama2
ollama serve

LLM_PROVIDER=lm-studio
LLM_MODEL=local-model
LLM_BASE_URL=http://localhost:1234

Setup: Download LM Studio from https://lmstudio.ai/
No changes needed - works out of the box with current configuration.
- Install and start Ollama:
  curl -fsSL https://ollama.ai/install.sh | sh
  ollama pull llama2
  ollama serve
- Update .env:
  LLM_PROVIDER=ollama
  LLM_MODEL=llama2
  LLM_BASE_URL=http://localhost:11434
- Restart the backend. Queries now use the local model.
- Download and start LM Studio with a model loaded
- Update .env:
  LLM_PROVIDER=lm-studio
  LLM_MODEL=your-model-name
  LLM_BASE_URL=http://localhost:1234
- Restart backend
All providers implement this interface:
from abc import ABC, abstractmethod
from typing import Optional

# LLMResponse is defined alongside this class in base.py.


class BaseLLMProvider(ABC):
    @abstractmethod
    async def generate(
        self,
        prompt: str,
        system_prompt: Optional[str] = None,
        temperature: float = 0.2,
        max_tokens: int = 2000,
        json_mode: bool = False,
        **kwargs
    ) -> LLMResponse:
        """Generate completion from LLM"""
        pass

    @abstractmethod
    async def health_check(self) -> bool:
        """Check if provider is accessible"""
        pass

    @property
    @abstractmethod
    def name(self) -> str:
        """Return provider name"""
        pass

    @property
    @abstractmethod
    def model_name(self) -> str:
        """Return specific model being used"""
        pass

To add a new provider:
- Create backend/services/llm/custom_provider.py:

from .base import BaseLLMProvider, LLMResponse


class CustomProvider(BaseLLMProvider):
    def __init__(self, config):
        super().__init__(config)
        self.api_key = config.get("api_key")

    async def generate(self, prompt, system_prompt=None, **kwargs):
        # Your implementation here
        pass

    async def health_check(self):
        # Check if your API is accessible
        pass

    @property
    def name(self):
        return "Custom"

    @property
    def model_name(self):
        return self._model

- Register in factory.py:

from .custom_provider import CustomProvider


def create_llm_provider(provider_type, config):
    if provider_type == "custom":
        return CustomProvider(config)
    # ... existing providers

- Use it:

LLM_PROVIDER=custom

Before (hardcoded OpenRouter):
service = OpenRouterService(api_key="...")
result = await service.parse_query(query)

After (flexible providers):
# Automatically uses configured provider
service = OpenRouterService(api_key="...", settings=settings)
result = await service.parse_query(query)  # Same interface!

Old code continues to work - settings parameter is optional:
# Still works without settings
service = OpenRouterService(api_key="...")

OpenRouter:
- Latency: ~500-2000ms depending on model
- Cost: Pay per token
- Availability: 99.9% uptime

Local models:
- Latency: ~100-500ms (local network)
- Cost: Free (compute costs only)
- Availability: Depends on your hardware
- Production: Use OpenRouter for reliability
- Development: Use local models to save costs
- High Volume: Consider local models for cost savings
- Privacy-Critical: Use local models for data security
- Check OPENROUTER_API_KEY is valid
- Verify you have credits at https://openrouter.ai
- Ensure Ollama/LM Studio is running
- Check LLM_BASE_URL is correct
- Verify firewall settings
- Pull the model: ollama pull llama2
- Check available models: ollama list
- Ensure adequate RAM (8GB minimum for most models)
- Consider smaller models (llama2:7b vs llama2:70b)
- Check CPU/GPU usage
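For the connection checks above, a quick script can confirm whether a local server answers at LLM_BASE_URL. The sketch below relies on Ollama's model-listing endpoint (GET /api/tags) and LM Studio's OpenAI-compatible GET /v1/models; `PROBE_PATHS`, `probe_url`, and `is_reachable` are hypothetical helper names and not part of the project.

```python
import urllib.error
import urllib.request

# Model-listing endpoints used as a lightweight probe.
PROBE_PATHS = {"ollama": "/api/tags", "lm-studio": "/v1/models"}


def probe_url(provider: str, base_url: str) -> str:
    """Join the base URL and the probe path without doubling the slash."""
    return base_url.rstrip("/") + PROBE_PATHS[provider]


def is_reachable(provider: str, base_url: str, timeout: float = 3.0) -> bool:
    """Return True if the local server answers the probe with HTTP 200."""
    try:
        with urllib.request.urlopen(probe_url(provider, base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

For example, `is_reachable("ollama", "http://localhost:11434")` returning False suggests the server is not running or a firewall is blocking the port.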
Test your LLM provider:
# Health check
curl http://localhost:3001/api/health
# Test query
curl -X POST http://localhost:3001/api/query \
-H "Content-Type: application/json" \
-d '{"query": "Show me US GDP for 2023"}'

Planned features:
- Direct OpenAI API support (bypass OpenRouter)
- Direct Anthropic Claude API support
- Azure OpenAI support
- Hugging Face Inference API support
- LLM response caching for common queries
- Automatic fallback between providers
- Cost tracking per provider
The LLM abstraction makes openecon-data:
- ✅ Flexible: Switch providers via config
- ✅ Cost-Effective: Use free local models
- ✅ Private: Keep data on-premises
- ✅ Professional: Clean, maintainable architecture
- ✅ Future-Proof: Easy to add new providers
October 19, 2025
- backend/services/llm/ - LLM abstraction layer
- backend/config.py - Configuration settings
- backend/services/openrouter.py - Updated to use abstraction
- backend/services/query.py - Updated to pass settings
- backend/main.py - Updated to inject settings
- .env.example - Configuration examples