docs/providers.md
Lines changed: 17 additions & 24 deletions
@@ -92,51 +92,44 @@ azure_entra_id:
92
92
93
93
#### Llama Stack Configuration Requirements
94
94
95
-
Because Lightspeed builds on top of Llama Stack, certain configuration fields are required to satisfy the base Llama Stack schema. The config block for the Azure inference provider **must** include `api_key`, `api_base`, and `api_version` — Llama Stack will fail to start if any of these are missing.
95
+
Because Lightspeed builds on top of Llama Stack, certain configuration fields are required to satisfy the base Llama Stack schema. The config block for the Azure inference provider **must** include `base_url` and `api_version`. With Entra ID authentication, `api_key` does not need to be set, because the API key is acquired and passed automatically at runtime.
96
96
97
-
**Important:** The `api_key` field must be set to `${env.AZURE_API_KEY}` exactly as shown below. This is not optional — Lightspeed uses this specific environment variable name as a placeholder for injection of the Entra ID access token. Using a different variable name will break the authentication flow.
97
+
When `azure_entra_id` is configured in Lightspeed, config enrichment automatically sets `model_validation: false` on the `remote::azure` provider so Llama Stack can start without validating models against Azure at startup.
98
98
99
99
```yaml
100
100
inference:
101
101
- provider_id: azure
102
102
provider_type: remote::azure
103
103
config:
104
-
api_key: ${env.AZURE_API_KEY} # Must be exactly this - placeholder for Entra ID token
105
-
api_base: ${env.AZURE_API_BASE}
104
+
#api_key: ${env.AZURE_API_KEY} # Can be omitted when Entra ID is configured in LCORE
105
+
base_url: ${env.AZURE_API_BASE}
106
106
api_version: 2025-01-01-preview
107
+
model_validation: false # added automatically by Lightspeed enrichment
107
108
```
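The enrichment step described above can be pictured with a minimal Python sketch. The function name `enrich_azure_providers` and the plain-dict config shape are assumptions made for illustration; the actual Lightspeed enrichment code is not shown in this diff and may differ.

```python
from copy import deepcopy


def enrich_azure_providers(config: dict) -> dict:
    """Return a copy of `config` with model validation disabled on Azure providers.

    Hypothetical helper: it mirrors the documented behavior (set
    `model_validation: false` on every `remote::azure` inference provider),
    not the real Lightspeed implementation.
    """
    enriched = deepcopy(config)  # leave the caller's config untouched
    for provider in enriched.get("inference", []):
        if provider.get("provider_type") == "remote::azure":
            provider.setdefault("config", {})["model_validation"] = False
    return enriched


example = {
    "inference": [
        {
            "provider_id": "azure",
            "provider_type": "remote::azure",
            "config": {
                "base_url": "${env.AZURE_API_BASE}",
                "api_version": "2025-01-01-preview",
            },
        }
    ]
}
enriched_example = enrich_azure_providers(example)
```

Working on a deep copy keeps the user-supplied configuration unchanged, so the enriched values only appear in the config handed to Llama Stack.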
108
109
109
-
**How it works:** At startup, Lightspeed acquires an Entra ID access token and stores it in the `AZURE_API_KEY` environment variable. When Llama Stack initializes, it reads the config, substitutes `${env.AZURE_API_KEY}` with the token value, and uses it to authenticate with Azure OpenAI. Llama Stack also calls `models.list()` during initialization to validate provider connectivity, which is why the token must be available before client initialization.
110
+
**How it works:** Llama Stack defers Azure authentication to inference time. Lightspeed acquires Entra ID tokens at runtime and passes them via the `X-LlamaStack-Provider-Data` header (`azure_api_key`, `azure_api_base`).
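As a rough illustration of the header described above, the sketch below builds an `X-LlamaStack-Provider-Data` header as a JSON-encoded object carrying `azure_api_key` and `azure_api_base`. The helper name `build_provider_data_header` and the sample values are hypothetical; how Lightspeed assembles the header internally is not shown in this diff.

```python
import json


def build_provider_data_header(token: str, api_base: str) -> dict[str, str]:
    """Build the per-request provider-data header for an Azure inference call.

    Hypothetical helper: the field names come from the documentation above;
    the JSON encoding is how Llama Stack provider data is normally sent.
    """
    payload = {"azure_api_key": token, "azure_api_base": api_base}
    return {"X-LlamaStack-Provider-Data": json.dumps(payload)}


# Placeholder values for illustration only.
headers = build_provider_data_header(
    token="example-entra-id-token",
    api_base="https://example.openai.azure.com",
)
```

Because the credentials travel with each request, nothing Azure-specific has to be present when Llama Stack starts.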
110
111
111
112
#### Access Token Lifecycle and Management
112
113
113
-
**Library mode startup:**
114
+
**Lightspeed startup (library and service mode):**
114
115
1. Lightspeed reads your Entra ID configuration
115
-
2. Acquires an initial access token from Microsoft Entra ID
116
-
3. Stores the token in the `AZURE_API_KEY` environment variable
117
-
4. **Then** initializes the Llama Stack library client
116
+
2. Does not acquire or cache access tokens at startup; authentication is deferred until request time
117
+
3. Initializes the Llama Stack client without Azure credentials; credentials are supplied later via `X-LlamaStack-Provider-Data` when an Azure model is used
118
118
119
-
This ordering is critical because Llama Stack calls `models.list()` during initialization to validate provider connectivity. If the token is not set before client initialization, Azure requests will fail with authentication errors.
120
-
121
-
**Service mode startup:**
122
-
123
-
When running Llama Stack as a separate service, Lightspeed runs a pre-startup script that:
124
-
1. Reads the Entra ID configuration
125
-
2. Acquires an initial access token
126
-
3. Writes the token to the `AZURE_API_KEY` environment variable
127
-
4. **Then** Llama Stack service starts
128
-
129
-
This initial token is used solely for the `models.list()` validation call during Llama Stack startup. After startup, Lightspeed manages token refresh independently and passes fresh tokens via request headers.
119
+
**Llama Stack service startup (container mode):**
120
+
1. Config enrichment sets `model_validation: false` on the Azure provider
121
+
2. Llama Stack starts without authenticating models against Azure
122
+
3. Lightspeed connects to this service at startup without Azure credentials; tokens are added only for Azure inference requests
130
123
131
124
**During inference requests:**
132
125
1. Before each request, Lightspeed checks if the token has expired
133
-
2. If expired, a new token is automatically acquired and the environment variable is updated
134
-
3. For library mode: the Llama Stack client is reloaded to pick up the new token
135
-
4. For service mode: the token is passed via `X-LlamaStack-Provider-Data` request headers
126
+
2. If expired, a new token is automatically acquired and cached in memory
127
+
3. The token is passed via `X-LlamaStack-Provider-Data` (library and service mode)
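The request-time steps above amount to a check-then-refresh cache. The following is a minimal sketch under stated assumptions: `TokenCache`, `CachedToken`, and the injected `fetch` callable are hypothetical names, and the real `AzureEntraIDManager` may be structured differently.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedToken:
    value: str
    expires_at: float  # Unix timestamp in seconds


class TokenCache:
    """In-memory token cache that refreshes only when the token has expired.

    `fetch` stands in for the call that acquires a token from Microsoft
    Entra ID; it returns a (token, expires_at) pair.
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._cached: Optional[CachedToken] = None

    def get(self) -> str:
        # Step 1: check whether a cached token exists and is still valid.
        if self._cached is None or self._clock() >= self._cached.expires_at:
            # Step 2: acquire a new token and keep it in memory.
            value, expires_at = self._fetch()
            self._cached = CachedToken(value, expires_at)
        # Step 3: the caller places this value in X-LlamaStack-Provider-Data.
        return self._cached.value
```

Injecting the clock and the fetch function keeps the sketch testable without contacting Entra ID; a production version would likely also refresh slightly before expiry to avoid sending a token that lapses in flight.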
136
128
137
129
**Token security:**
138
130
- Access tokens are wrapped in `SecretStr` to prevent accidental logging
139
-
- Tokens are stored only in the `AZURE_API_KEY` environment variable (single source of truth)
131
+
- Tokens are cached in the `AzureEntraIDManager` singleton class