docs/azure-ai-citations.md (192 additions, 0 deletions)
@@ -71,6 +71,7 @@ Each citation is emitted as a separate event to ensure all sources appear in the
```
Key points:

- Each source document gets its own citation event
- The `source.name` includes the doc index (`[doc1]`, `[doc2]`, etc.) to prevent grouping
- The `distances` array contains relevance scores from Azure AI Search, which OpenWebUI displays as a percentage on the citation cards
@@ -159,23 +160,210 @@ This ensures every citation has a meaningful display name.
Citations are filtered to only show documents that are actually referenced in the response content. For example, if Azure returns 5 citations but the response only references `[doc1]` and `[doc3]`, only those 2 citations will appear in the UI.
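The filtering described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name `filter_referenced_citations` is hypothetical, and it assumes that the first citation in the list corresponds to `[doc1]`, the second to `[doc2]`, and so on.

```python
import re

# Matches inline references such as [doc1] or [doc12] in the response text.
DOC_REF_PATTERN = re.compile(r"\[doc(\d+)\]")


def filter_referenced_citations(content: str, citations: list[dict]) -> list[tuple[int, dict]]:
    """Return (doc_number, citation) pairs for citations referenced in content.

    Assumption: citations[0] corresponds to [doc1], citations[1] to [doc2], etc.
    """
    referenced = {int(number) for number in DOC_REF_PATTERN.findall(content)}
    return [
        (number, citation)
        for number, citation in enumerate(citations, start=1)
        if number in referenced
    ]


# Azure returned 5 citations, but the response only references two of them.
citations = [{"title": f"Document {i}"} for i in range(1, 6)]
content = "The service scales horizontally [doc1] and uses a queue [doc3]."
kept = filter_referenced_citations(content, citations)
print([number for number, _ in kept])  # [1, 3]
```

Keeping the original doc number alongside each citation lets the UI label still match the `[docX]` reference in the text after unreferenced citations are dropped.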
## Index Schema Requirements for Citations
For citations to work correctly, your Azure AI Search index must contain the right fields with the right attributes. This section explains exactly which fields the pipeline reads and how they map to citation cards in OpenWebUI.
### Required and Recommended Index Fields
| Index Field | Type | Required? | Must Be Retrievable? | Citation Purpose |
|---|---|---|---|---|
| `content` | `Edm.String` | Yes | Yes | Provides the text snippet shown in the citation preview |
| `title` | `Edm.String` | Recommended | Yes | Displayed as the citation card title |
| `filepath` | `Edm.String` | Recommended | Yes | Used as the citation name in the response; fallback for title |
| `url` | `Edm.String` | Recommended | Yes | Makes `[docX]` references into clickable links |
| `chunk_id` | `Edm.String` | Optional | Yes | Helps match citations with relevance scores |

> **Key point**: The `title`, `filepath`, and `url` fields must be marked as **retrievable** in your index schema. If they are not retrievable, Azure will not include them in the citation response, and the pipeline cannot display them.
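As a rough way to check an index definition against the table above, the sketch below scans a list of field definitions for citation fields that are missing or not marked retrievable. The field dictionaries use the `name`, `type`, `key`, `searchable`, and `retrievable` attribute names from the Azure AI Search index definition. The `id` key field, the deliberately misconfigured `url` field, and the helper `find_citation_field_problems` are assumptions made for illustration only.

```python
# Hypothetical index definition, shaped like the "fields" array of an
# Azure AI Search index. The "id" key field is an assumption.
index_fields = [
    {"name": "id", "type": "Edm.String", "key": True, "retrievable": True},
    {"name": "content", "type": "Edm.String", "searchable": True, "retrievable": True},
    {"name": "title", "type": "Edm.String", "searchable": True, "retrievable": True},
    {"name": "filepath", "type": "Edm.String", "retrievable": True},
    {"name": "url", "type": "Edm.String", "retrievable": False},  # misconfigured on purpose
    {"name": "chunk_id", "type": "Edm.String", "retrievable": True},
]

# Fields the citation cards depend on, per the table above.
CITATION_FIELDS = ("content", "title", "filepath", "url")


def find_citation_field_problems(fields: list[dict]) -> dict[str, str]:
    """Map each problematic citation field name to a short description."""
    by_name = {field["name"]: field for field in fields}
    problems = {}
    for name in CITATION_FIELDS:
        field = by_name.get(name)
        if field is None:
            problems[name] = "missing from index"
        elif field.get("retrievable") is not True:
            # Fields without an explicit retrievable=True are flagged for review.
            problems[name] = "not marked retrievable"
    return problems


problems = find_citation_field_problems(index_fields)
print(problems)  # {'url': 'not marked retrievable'}
```

Running a check like this against an exported index definition before deploying can catch the case where `[docX]` references render without links because `url` was never returned.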
### Title Fallback Chain
The pipeline determines each citation's display title using this fallback chain:
1. `title` field → if present and non-empty
2. `filepath` field → if title is empty
3. `url` field → if both title and filepath are empty
4. `"Unknown Document"` → if all are empty

To avoid seeing "Unknown Document", ensure at least one of `title`, `filepath`, or `url` is populated in your index documents.
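The fallback chain above can be expressed as a short function. This is a sketch under stated assumptions rather than the pipeline's actual code: the name `resolve_citation_title` is hypothetical, and treating whitespace-only values as empty is an assumption about what "non-empty" means.

```python
def resolve_citation_title(citation: dict) -> str:
    """Pick a display title using the title -> filepath -> url fallback chain."""
    for key in ("title", "filepath", "url"):
        value = citation.get(key)
        # Assumption: None, "", and whitespace-only strings all count as empty.
        if isinstance(value, str) and value.strip():
            return value.strip()
    return "Unknown Document"


print(resolve_citation_title({"title": "Architecture Overview", "filepath": "docs/arch.md"}))
print(resolve_citation_title({"title": "", "filepath": "docs/arch.md"}))
print(resolve_citation_title({"title": None, "url": "https://example.com/arch"}))
print(resolve_citation_title({}))
```

The four calls exercise each step of the chain in order: a populated title, a filepath fallback, a URL fallback, and the final "Unknown Document" default.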
### Custom Field Names and `fields_mapping`
If your index uses different field names (e.g., `body` instead of `content`, or `doc_title` instead of `title`), you must tell Azure OpenAI how to map them using the `fields_mapping` parameter in your `AZURE_AI_DATA_SOURCES` configuration.
**`fields_mapping` properties:**

| Property | Type | Maps To |
|---|---|---|
| `content_fields` | `string[]` | The index fields to use as document content |
| `title_field` | `string` | The index field to use as the document title |
| `filepath_field` | `string` | The index field to use as the file path/name |
| `url_field` | `string` | The index field to use as the document URL |
| `vector_fields` | `string[]` | The index fields containing vector embeddings |
| `content_fields_separator` | `string` | Separator pattern between content fields (default: `\n`) |
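To make the mapping concrete, the sketch below builds a data source entry with a `fields_mapping` block and serializes it to JSON. It assumes that `AZURE_AI_DATA_SOURCES` holds a JSON array of data source objects in the Azure OpenAI On Your Data `azure_search` format. The endpoint, index name, and API key are placeholders, and the index field names `source_file` and `source_url` are hypothetical (`body` and `doc_title` come from the example above).

```python
import json

# Hypothetical data source entry; replace the placeholders with your own values.
data_sources = [
    {
        "type": "azure_search",
        "parameters": {
            "endpoint": "https://example-search.search.windows.net",  # placeholder
            "index_name": "example-index",  # placeholder
            "authentication": {"type": "api_key", "key": "<search-api-key>"},
            "fields_mapping": {
                "content_fields": ["body"],        # index uses `body` instead of `content`
                "title_field": "doc_title",        # index uses `doc_title` instead of `title`
                "filepath_field": "source_file",   # hypothetical index field name
                "url_field": "source_url",         # hypothetical index field name
                "content_fields_separator": "\n",
            },
        },
    }
]

# Assumption: the configuration value is the JSON-encoded array.
azure_ai_data_sources = json.dumps(data_sources)
print(azure_ai_data_sources)
```

Building the value in code and serializing it with `json.dumps` avoids hand-escaping the `\n` separator inside the JSON string.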
> **Note**: The `content` field is automatically mapped when the source and target field names match. The blob indexer also **automatically** maps `metadata_storage_path` (base64-encoded) to the `id` key field — no explicit mapping is needed for `id`. Mapping `metadata_storage_name` → `title` gives citation cards a readable name from the blob filename.
### How the Pipeline Reads Citation Fields
When Azure OpenAI returns a response with citations, each citation object looks like this:
```json
{
  "title": "Architecture Overview",
  "content": "The system uses a microservices architecture...",
```
- [Azure AI Search Indexer Field Mappings](https://learn.microsoft.com/en-us/azure/search/search-indexer-field-mappings)
- [Azure OpenAI On Your Data - Index Field Mapping](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/use-your-data#index-field-mapping)