Skip to content

Latest commit

 

History

History
268 lines (209 loc) · 7.24 KB

File metadata and controls

268 lines (209 loc) · 7.24 KB

OpenAI-Compatible Provider Implementation Summary

Implementation Complete ✅

The openai_compatible provider type has been successfully added to the Document-Analyzer-Operator Platform.


Files Updated

Backend

  1. backend/app/models/llm_provider.py

    • Added OPENAI_COMPATIBLE = "openai_compatible" to the ProviderType enum
    • Updated enum documentation to clarify support for custom OpenAI-compatible APIs
  2. backend/app/services/llm_client.py

    • Added OpenAICompatibleProvider class extending BaseProvider
    • Implemented all required methods:
      • chat_completion() - Chat completions with streaming support
      • text_completion() - Text completions
      • embeddings() - Embedding generation
      • list_models() - List available models
      • estimate_cost() - Cost estimation with configurable pricing
    • Added provider to LLMClient.PROVIDER_CLASSES registry
    • Proper error handling for authentication, rate limits, and server errors

Frontend

  1. frontend/src/types/index.ts

    • Added 'openai_compatible' to ProviderType type union
  2. frontend/src/app/dashboard/settings/llm-providers/create-dialog.tsx

    • Added "OpenAI-Compatible" option to provider type selector
    • Added description: "Custom OpenAI-compatible APIs (LocalAI, FastChat, Together AI, Anyscale, etc.)"
    • Made API key optional (displayed as optional for all compatible providers)
    • Added comprehensive preset dropdown with 8 options:
      • LocalAI
      • FastChat
      • Together AI
      • Anyscale Endpoints
      • Groq
      • DeepInfra
      • Lepton AI
      • Custom
    • Added contextual help text for OpenAI-compatible providers

Documentation

  1. backend/docs/LLM_PROVIDERS.md
    • Added comprehensive "OpenAI-Compatible Providers" section
    • Listed 12+ popular OpenAI-compatible services (local and cloud)
    • Provided configuration examples for each major service
    • Added pricing comparison table
    • Included setup guides for LocalAI, Together AI, Groq, and Anyscale
    • Added troubleshooting section for OpenAI-compatible providers

Preset Configurations Included

Local Deployment Presets

Preset Base URL API Key Required Description
LocalAI http://localhost:8080/v1 No Self-hosted LocalAI instance
FastChat http://localhost:8000/v1 No FastChat local server

Cloud Service Presets

Preset Base URL API Key Required Description
Together AI https://api.together.xyz/v1 Yes Cloud hosted open-source models
Anyscale Endpoints https://api.endpoints.anyscale.com/v1 Yes Ray-powered model serving
Groq https://api.groq.com/openai/v1 Yes Ultra-fast LPU inference
DeepInfra https://api.deepinfra.com/v1 Yes Serverless model inference
Lepton AI https://<workspace>.lepton.run/api/v1 Yes Lepton AI cloud platform
Custom (user enters) Optional Custom URL entry

Usage Examples

Example 1: LocalAI (Local)

{
  "name": "LocalAI",
  "provider_type": "openai_compatible",
  "base_url": "http://localhost:8080/v1",
  "model_name": "llama-2-7b",
  "is_active": true,
  "is_default": false,
  "config": {
    "temperature": 0.7,
    "max_tokens": 4096
  }
}

Example 2: Together AI (Cloud)

{
  "name": "Together AI",
  "provider_type": "openai_compatible",
  "base_url": "https://api.together.xyz/v1",
  "api_key": "your-together-api-key",
  "model_name": "togethercomputer/llama-2-70b-chat",
  "is_active": true,
  "is_default": false,
  "config": {
    "temperature": 0.7,
    "max_tokens": 4096,
    "pricing": {
      "input": 0.0009,
      "output": 0.0009
    }
  }
}

Example 3: Groq (Cloud - Ultra Fast)

{
  "name": "Groq",
  "provider_type": "openai_compatible",
  "base_url": "https://api.groq.com/openai/v1",
  "api_key": "your-groq-api-key",
  "model_name": "llama3-70b-8192",
  "is_active": true,
  "is_default": false,
  "config": {
    "temperature": 0.7,
    "max_tokens": 4096
  }
}

Example 4: Anyscale Endpoints (Cloud)

{
  "name": "Anyscale Endpoints",
  "provider_type": "openai_compatible",
  "base_url": "https://api.endpoints.anyscale.com/v1",
  "api_key": "your-anyscale-api-key",
  "model_name": "meta-llama/Llama-2-7b-chat-hf",
  "is_active": true,
  "is_default": false,
  "config": {
    "temperature": 0.7,
    "max_tokens": 4096
  }
}

Additional Configuration Notes

Pricing Configuration

For cloud services, you can add pricing information to the config to enable cost estimation:

{
  "config": {
    "pricing": {
      "input": 0.0009,
      "output": 0.0009
    }
  }
}

Pricing is specified in USD per 1,000 tokens.

Custom Timeouts and Retries

The default timeout is 60 seconds with 3 retry attempts. These can be configured in the LLMClient initialization if needed.

Model Names

Model names are service-specific. Check your provider's documentation for available models:

  • Together AI: togethercomputer/llama-2-70b-chat, togethercomputer/CodeLlama-34b-Instruct, etc.
  • Anyscale: meta-llama/Llama-2-7b-chat-hf, mistralai/Mixtral-8x7B-Instruct-v0.1, etc.
  • Groq: llama3-70b-8192, mixtral-8x7b-32768, gemma-7b-it, etc.
  • LocalAI: Depends on models you've installed locally

Supported Services

Local (Self-Hosted)

  • ✅ LocalAI
  • ✅ FastChat
  • ✅ vLLM API Server
  • ✅ Text Generation Inference

Cloud Services

  • ✅ Together AI
  • ✅ Anyscale Endpoints
  • ✅ Groq
  • ✅ DeepInfra
  • ✅ Lepton AI
  • ✅ Replicate (OpenAI-compatible mode)

Testing

To test the implementation:

  1. Backend Validation

    python -m py_compile backend/app/models/llm_provider.py
    python -m py_compile backend/app/services/llm_client.py
  2. Frontend Build

    cd frontend
    npm run build
  3. API Test

    curl -X POST "http://localhost:8000/api/v1/llm-providers" \
      -H "Authorization: Bearer YOUR_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "Test LocalAI",
        "provider_type": "openai_compatible",
        "base_url": "http://localhost:8080/v1",
        "model_name": "llama-2-7b",
        "is_active": true
      }'

Next Steps

  1. Database Migration: If using SQLAlchemy migrations, create a migration to add the new provider type to any database constraints.

  2. Update API Documentation: The OpenAPI/Swagger docs at /docs will automatically reflect the new provider type.

  3. User Documentation: Consider adding a blog post or announcement about the new provider support.

  4. Testing: Test with actual services (LocalAI, Together AI, Groq, etc.) to verify end-to-end functionality.


Benefits

Cost-Effective: Access to models at lower prices than official APIs
Model Variety: Hundreds of open-source models available
Privacy: Self-hosted options keep data on-premises
Performance: Specialized hardware options (Groq LPU)
Flexibility: Easy to switch between providers
No Vendor Lock-in: Standard API format across providers


Implementation Date: 2026-03-13
Status: ✅ Complete and Validated