@@ -130,14 +130,14 @@ Lightspeed Core Stack is based on the FastAPI framework (Uvicorn). The service i
 
 Lightspeed Stack supports multiple LLM providers.
 
-| Provider | Setup Documentation |
-| ----------------| -----------------------------------------------------------------------|
-| OpenAI | https://platform.openai.com |
-| Azure OpenAI | https://azure.microsoft.com/en-us/products/ai-services/openai-service |
-| Google VertexAI| https://cloud.google.com/vertex-ai |
-| IBM WatsonX | https://www.ibm.com/products/watsonx |
-| RHOAI (vLLM) | See tests/e2e-prow/rhoai/configs/run.yaml |
-| RHEL AI (vLLM) | See tests/e2e/configs/run-rhelai.yaml |
+| Provider        | Setup Documentation                                                   |
+|-----------------|-----------------------------------------------------------------------|
+| OpenAI          | https://platform.openai.com                                           |
+| Azure OpenAI    | https://azure.microsoft.com/en-us/products/ai-services/openai-service |
+| Google VertexAI | https://cloud.google.com/vertex-ai                                    |
+| IBM WatsonX     | https://www.ibm.com/products/watsonx                                  |
+| RHOAI (vLLM)    | See tests/e2e-prow/rhoai/configs/run.yaml                             |
+| RHEL AI (vLLM)  | See tests/e2e/configs/run-rhelai.yaml                                 |
 
 See `docs/providers.md` for configuration details.
 
@@ -200,17 +200,18 @@ To quickly get hands on LCS, we can run it using the default configurations prov
 Lightspeed Core Stack (LCS) provides support for Large Language Model providers. The models listed in the table below represent specific examples that have been tested within LCS.
 __Note__: Support for individual models is dependent on the specific inference provider's implementation within the currently supported version of Llama Stack.
 
-| Provider | Model | Tool Calling | provider_type | Example |
-| -------- | ---------------------------------------------- | ------------ | -------------- | -------------------------------------------------------------------------- |
-| OpenAI | gpt-5, gpt-4o, gpt4-turbo, gpt-4.1, o1, o3, o4 | Yes | remote::openai | [1](examples/openai-faiss-run.yaml) [2](examples/openai-pgvector-run.yaml) |
-| OpenAI | gpt-3.5-turbo, gpt-4 | No | remote::openai | |
-| RHOAI (vLLM)| meta-llama/Llama-3.2-1B-Instruct | Yes | remote::vllm | [1](tests/e2e-prow/rhoai/configs/run.yaml) |
-| RHAIIS (vLLM)| meta-llama/Llama-3.1-8B-Instruct | Yes | remote::vllm | [1](tests/e2e/configs/run-rhaiis.yaml) |
-| RHEL AI (vLLM)| meta-llama/Llama-3.1-8B-Instruct | Yes | remote::vllm | [1](tests/e2e/configs/run-rhelai.yaml) |
-| Azure | gpt-5, gpt-5-mini, gpt-5-nano, gpt-4o-mini, o3-mini, o4-mini, o1| Yes | remote::azure | [1](examples/azure-run.yaml) |
-| Azure | gpt-5-chat, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o1-mini | No or limited | remote::azure | |
-| VertexAI | google/gemini-2.0-flash, google/gemini-2.5-flash, google/gemini-2.5-pro [^1] | Yes | remote::vertexai | [1](examples/vertexai-run.yaml) |
-| WatsonX | meta-llama/llama-3-3-70b-instruct | Yes | remote::watsonx | [1](examples/watsonx-run.yaml) |
+| Provider       | Model                                                                        | Tool Calling  | provider_type    | Example                                                                    |
+|----------------|------------------------------------------------------------------------------|---------------|------------------|----------------------------------------------------------------------------|
+| OpenAI         | gpt-5, gpt-4o, gpt4-turbo, gpt-4.1, o1, o3, o4                               | Yes           | remote::openai   | [1](examples/openai-faiss-run.yaml) [2](examples/openai-pgvector-run.yaml) |
+| OpenAI         | gpt-3.5-turbo, gpt-4                                                         | No            | remote::openai   |                                                                            |
+| RHOAI (vLLM)   | meta-llama/Llama-3.2-1B-Instruct                                             | Yes           | remote::vllm     | [1](tests/e2e-prow/rhoai/configs/run.yaml)                                 |
+| RHAIIS (vLLM)  | meta-llama/Llama-3.1-8B-Instruct                                             | Yes           | remote::vllm     | [1](tests/e2e/configs/run-rhaiis.yaml)                                     |
+| RHEL AI (vLLM) | meta-llama/Llama-3.1-8B-Instruct                                             | Yes           | remote::vllm     | [1](tests/e2e/configs/run-rhelai.yaml)                                     |
+| Azure          | gpt-5, gpt-5-mini, gpt-5-nano, gpt-4o-mini, o3-mini, o4-mini, o1             | Yes           | remote::azure    | [1](examples/azure-run.yaml)                                               |
+| Azure          | gpt-5-chat, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o1-mini                     | No or limited | remote::azure    |                                                                            |
+| VertexAI       | google/gemini-2.0-flash, google/gemini-2.5-flash, google/gemini-2.5-pro [^1] | Yes           | remote::vertexai | [1](examples/vertexai-run.yaml)                                            |
+| WatsonX        | meta-llama/llama-3-3-70b-instruct                                            | Yes           | remote::watsonx  | [1](examples/watsonx-run.yaml)                                             |
+
 
 [^1]: List of models is limited by design in llama-stack, future versions will probably allow to use more models (see [here](https://github.com/llamastack/llama-stack/blob/release-0.3.x/llama_stack/providers/remote/inference/vertexai/vertexai.py#L54))
 
@@ -492,12 +493,13 @@ mcp_servers:
 
 ##### Authentication Method Comparison
 
-| Method | Use Case | Configuration | Token Scope | Example |
-| --------| ----------| ---------------| -------------| ---------|
-| **Static File** | Service tokens, API keys | File path in config | Global (all users) | `"/var/secrets/token"` |
-| **Kubernetes** | K8s service accounts | `"kubernetes"` keyword | Per-user (from auth) | `"kubernetes"` |
-| **Client** | User-specific tokens | `"client"` keyword + HTTP header | Per-request | `"client"` |
-| **OAuth** | OAuth-protected MCP servers | `"oauth"` keyword + HTTP header | Per-request (from OAuth flow) | `"oauth"` |
+| Method          | Use Case                    | Configuration                    | Token Scope                   | Example                |
+|-----------------|-----------------------------|----------------------------------|-------------------------------|------------------------|
+| **Static File** | Service tokens, API keys    | File path in config              | Global (all users)            | `"/var/secrets/token"` |
+| **Kubernetes**  | K8s service accounts        | `"kubernetes"` keyword           | Per-user (from auth)          | `"kubernetes"`         |
+| **Client**      | User-specific tokens        | `"client"` keyword + HTTP header | Per-request                   | `"client"`             |
+| **OAuth**       | OAuth-protected MCP servers | `"oauth"` keyword + HTTP header  | Per-request (from OAuth flow) | `"oauth"`              |
+
 
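As a rough illustration of the comparison table in this hunk, the sketch below maps a configured value to a token source. It is not LCS code: `RequestContext`, its fields, and `resolve_token` are hypothetical names chosen for the example. Only the `kubernetes`, `client`, and `oauth` keywords and the file-path case come from the table, and treating any non-keyword value as a file path is an assumption.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, Optional


@dataclass
class RequestContext:
    """Hypothetical per-request data; not part of the LCS API."""

    # Token obtained from the user's Kubernetes authentication, if any.
    user_token: Optional[str] = None
    # Tokens supplied with the request, keyed by HTTP header name.
    request_tokens: Dict[str, str] = field(default_factory=dict)


def resolve_token(value: str, header: str, ctx: RequestContext) -> Optional[str]:
    """Return the token for one configured header, or None if unavailable."""
    if value == "kubernetes":
        # Per-user scope: reuse the token from the user's authentication.
        return ctx.user_token
    if value in ("client", "oauth"):
        # Per-request scope: the token arrives with the request itself.
        return ctx.request_tokens.get(header)
    # Assumption: any other value is a path to a static secret file (global scope).
    path = Path(value)
    return path.read_text().strip() if path.is_file() else None
```

A caller would invoke `resolve_token("client", "Authorization", ctx)` once per configured header and decide what to do when the result is `None`.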
 ##### Important: Automatic Server Skipping
 
@@ -804,7 +806,7 @@ verify Run all linters
 distribution-archives Generate distribution archives to be uploaded into Python registry
 upload-distribution-archives Upload distribution archives into Python registry
 konflux-requirements generate hermetic requirements.*.txt file for konflux build
-konflux-rpm-lock generate rpm.lock.yaml file for konflux build
+konflux-rpm-lock generate rpm.lock.yaml file for konflux build
 ```
 
 ## Running Linux container image