Commit e6b07de

Merge pull request #101 from owndev/copilot/support-openwebui-citations
Add native OpenWebUI citation support for Azure AI Search responses
2 parents b74eeae + 8bfb7c0 commit e6b07de

4 files changed

Lines changed: 1058 additions & 306 deletions

README.md

Lines changed: 3 additions & 0 deletions
@@ -99,6 +99,7 @@ The functions include a built-in encryption mechanism for sensitive information:
9999
100100
- Enables interaction with **Azure OpenAI** and other **Azure AI** models.
101101
- Supports Azure Search integration for enhanced document retrieval.
102+
- **Native OpenWebUI Citations Support** 🎯: Rich citation cards, source previews, and inline citation correlations for Azure AI Search responses (Azure OpenAI only).
102103
- Supports multiple Azure AI models selection via the `AZURE_AI_MODEL` environment variable (e.g. `gpt-4o;gpt-4o-mini`).
103104
- Customizable pipeline display with configurable prefix via `AZURE_AI_PIPELINE_PREFIX`.
104105
- Azure AI Search / RAG integration with enhanced collapsible citation display (Azure OpenAI only).
@@ -112,6 +113,8 @@ The functions include a built-in encryption mechanism for sensitive information:
112113
113114
🔗 [Learn More About Azure AI](https://azure.microsoft.com/en-us/solutions/ai)
114115
116+
📖 [Azure AI Citations Documentation](./docs/azure-ai-citations.md)
117+
115118
### **2. [N8N Pipeline](./pipelines/n8n/n8n.py)**
116119
117120
> [!TIP]

docs/azure-ai-citations.md

Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
1+
# Azure AI Foundry Pipeline - Native OpenWebUI Citations
2+
3+
This document describes the native OpenWebUI citation support in the Azure AI Foundry Pipeline, which enables rich citation cards and source previews in the OpenWebUI frontend.
4+
5+
## Overview
6+
7+
The Azure AI Foundry Pipeline supports **native OpenWebUI citations** for Azure AI Search (RAG) responses. This feature is **automatically enabled** when you configure Azure AI Search data sources (`AZURE_AI_DATA_SOURCES`). The OpenWebUI frontend will display:
8+
9+
- **Citation cards** with source information and relevance scores
10+
- **Source previews** with content snippets
11+
- **Relevance percentage** displayed on citation cards (requires `AZURE_AI_INCLUDE_SEARCH_SCORES=true`)
12+
- **Clickable `[docX]` references** that link directly to document URLs
13+
- **Interactive citation UI** with expandable source details
14+
15+
## Features
16+
17+
### Automatic Citation Support
18+
19+
When Azure AI Search is configured, the pipeline automatically:
20+
21+
1. Emits citation events via `__event_emitter__` for the OpenWebUI frontend
22+
2. Converts `[docX]` references in the response to clickable markdown links
23+
3. Filters citations to only show documents actually referenced in the response
24+
4. Extracts relevance scores from Azure Search when available
25+
26+
### Configuration Options
27+
28+
| Environment Variable | Default | Description |
29+
|---------------------|---------|-------------|
30+
| `AZURE_AI_DATA_SOURCES` | `""` | JSON configuration for Azure AI Search (required for citations) |
31+
| `AZURE_AI_INCLUDE_SEARCH_SCORES` | `true` | Enable relevance score extraction from Azure Search |
32+
33+
### How It Works
34+
35+
#### Streaming Responses
36+
37+
When Azure AI Search returns citations in a streaming response:
38+
39+
1. The pipeline detects citations in the SSE (Server-Sent Events) stream
40+
2. `[docX]` references in each chunk are converted to markdown links with document URLs
41+
3. After the stream ends, citation events are emitted via `__event_emitter__`
42+
4. Citations are filtered to only include documents referenced in the response
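
The ordering in the streaming steps above (relay text first, emit citation events once the stream has ended) can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: `relay_stream` and its parameters are hypothetical names, the event payload is reduced to the `source` field, and link conversion is omitted for brevity.

```python
import asyncio
import re
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, List

DOC_REF = re.compile(r"\[doc(\d+)\]")
Emitter = Callable[[Dict[str, Any]], Awaitable[None]]


async def relay_stream(
    chunks: AsyncIterator[str],
    citations: List[Dict[str, Any]],
    emit: Emitter,
) -> AsyncIterator[str]:
    """Yield text chunks unchanged, then emit one event per referenced citation."""
    seen_text = ""
    async for chunk in chunks:
        seen_text += chunk
        yield chunk

    # Scan the accumulated text so a reference split across two chunks still counts.
    referenced = {int(n) for n in DOC_REF.findall(seen_text)}
    for index, citation in enumerate(citations, start=1):
        if index in referenced:
            name = f"[doc{index}] {citation.get('title', '')}"
            await emit({"type": "citation", "data": {"source": {"name": name}}})
```

Scanning the accumulated text rather than each chunk is a design choice of this sketch: a token boundary can fall inside `[doc1]`, so per-chunk matching alone could miss a reference.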
43+
44+
#### Non-Streaming Responses
45+
46+
When Azure AI Search returns citations in a non-streaming response:
47+
48+
1. The pipeline extracts citations from the response context
49+
2. `[docX]` references in the content are converted to markdown links
50+
3. Individual citation events are emitted via `__event_emitter__` for each referenced source
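
For the non-streaming case, step 1 amounts to reading the citation list out of the message context. The sketch below assumes the response shape `choices[0].message.context.citations` used by Azure "On Your Data" responses; `extract_citations` is a hypothetical name, not the pipeline's helper.

```python
from typing import Any, Dict, List


def extract_citations(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the citation dicts from a non-streaming response, or [] if absent."""
    choices = response.get("choices") or []
    if not choices:
        return []
    message = choices[0].get("message") or {}
    context = message.get("context") or {}
    citations = context.get("citations") or []
    # Ignore malformed entries instead of failing the whole response.
    return [c for c in citations if isinstance(c, dict)]
```

Each lookup falls back to an empty value, so a response without a `context` (for example, when no data source is configured) yields an empty list rather than raising.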
51+
52+
## Citation Format
53+
54+
### OpenWebUI Citation Event Structure
55+
56+
Each citation is emitted as a separate event to ensure all sources appear in the UI. Citation events follow the official OpenWebUI specification (see [OpenWebUI Events Documentation](https://docs.openwebui.com/features/plugin/development/events#source-or-citation-and-code-execution)):
57+
58+
```python
59+
{
60+
    "type": "citation",
61+
    "data": {
62+
        "document": ["Document content..."],  # Content from this citation
63+
        "metadata": [{"source": "https://..."}],  # Metadata with source URL
64+
        "source": {
65+
            "name": "[doc1] Document Title",  # Unique name with index
66+
            "url": "https://..."  # Source URL if available
67+
        },
68+
        "distances": [0.95]  # Relevance score (displayed as percentage)
69+
    }
70+
}
71+
```
72+
73+
Key points:
74+
- Each source document gets its own citation event
75+
- The `source.name` includes the doc index (`[doc1]`, `[doc2]`, etc.) so that every source has a unique name and is not grouped with other sources in the UI
76+
- The `distances` array contains relevance scores from Azure AI Search, which OpenWebUI displays as a percentage on the citation cards
77+
78+
### Azure Citation Format (Input)
79+
80+
Azure AI Search returns citations in this format:
81+
82+
```python
83+
{
84+
    "title": "Document Title",
85+
    "content": "Full or partial content",
86+
    "url": "https://...",
87+
    "filepath": "/path/to/file",
88+
    "chunk_id": "chunk-123",
89+
    "score": 0.95,
90+
    "metadata": {}
91+
}
92+
```
93+
94+
The pipeline automatically converts Azure citations to OpenWebUI format.
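
As a rough illustration of that conversion, the sketch below maps the Azure input fields onto the event structure shown earlier. It is an assumption-laden example: `to_openwebui_citation` is a hypothetical name, and details such as omitting `distances` when no score is present are choices of this sketch, not confirmed pipeline behavior.

```python
from typing import Any, Dict


def to_openwebui_citation(citation: Dict[str, Any], index: int) -> Dict[str, Any]:
    """Build one OpenWebUI citation event from one Azure citation (1-based index)."""
    url = citation.get("url") or ""
    title = citation.get("title") or "Unknown Document"
    data: Dict[str, Any] = {
        "document": [citation.get("content") or ""],
        "metadata": [{"source": url or title}],
        # The [docN] prefix keeps each source name unique.
        "source": {"name": f"[doc{index}] {title}"},
    }
    if url:
        data["source"]["url"] = url
    score = citation.get("score")
    if score is not None:
        data["distances"] = [float(score)]
    return {"type": "citation", "data": data}
```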
95+
96+
## Usage
97+
98+
### Basic Setup
99+
100+
Configure Azure AI Search to enable citation support:
101+
102+
```bash
103+
# Azure AI Search configuration (required for citations)
104+
AZURE_AI_DATA_SOURCES='[{"type":"azure_search","parameters":{"endpoint":"https://YOUR-SEARCH-SERVICE.search.windows.net","index_name":"YOUR-INDEX-NAME","authentication":{"type":"api_key","key":"YOUR-SEARCH-API-KEY"}}}]'
105+
106+
# Enable relevance scores (default: true)
107+
AZURE_AI_INCLUDE_SEARCH_SCORES=true
108+
```
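
Because `AZURE_AI_DATA_SOURCES` is a single-line JSON string, a quoting mistake is easy to make. The following sketch checks both variables before the pipeline starts. The function names and the accepted boolean spellings are assumptions for illustration; the pipeline's own parsing may differ.

```python
import json
import os
from typing import Any, Dict, List, Optional


def load_data_sources(raw: Optional[str] = None) -> List[Dict[str, Any]]:
    """Parse AZURE_AI_DATA_SOURCES; an empty value means citations are disabled."""
    if raw is None:
        raw = os.environ.get("AZURE_AI_DATA_SOURCES", "")
    if not raw.strip():
        return []
    sources = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(sources, list):
        raise ValueError("AZURE_AI_DATA_SOURCES must be a JSON array")
    return sources


def include_search_scores(raw: Optional[str] = None) -> bool:
    """Interpret AZURE_AI_INCLUDE_SEARCH_SCORES (assumed default: true)."""
    if raw is None:
        raw = os.environ.get("AZURE_AI_INCLUDE_SEARCH_SCORES", "true")
    return raw.strip().lower() in {"1", "true", "yes"}
```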
109+
110+
### Clickable Document Links
111+
112+
The pipeline automatically converts `[docX]` references to clickable markdown links:
113+
114+
```markdown
115+
# Input from Azure AI
116+
The answer can be found in [doc1] and [doc2].
117+
118+
# Output (converted by pipeline)
119+
The answer can be found in [[doc1]](https://example.com/doc1.pdf) and [[doc2]](https://example.com/doc2.pdf).
120+
```
121+
122+
This works for both streaming and non-streaming responses.
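
A minimal sketch of this conversion, assuming a mapping from citation index to URL is already available. `convert_doc_refs` is a hypothetical name, and leaving references without a known URL unchanged is an assumption of this example.

```python
import re
from typing import Dict

# Skip references already wrapped in an extra pair of brackets,
# so running the conversion twice does not nest links.
DOC_REF = re.compile(r"(?<!\[)\[doc(\d+)\](?!\])")


def convert_doc_refs(content: str, urls: Dict[int, str]) -> str:
    """Replace [docN] with [[docN]](url) when a URL is known for N."""

    def replace(match: "re.Match[str]") -> str:
        url = urls.get(int(match.group(1)))
        return f"[{match.group(0)}]({url})" if url else match.group(0)

    return DOC_REF.sub(replace, content)


urls = {1: "https://example.com/doc1.pdf", 2: "https://example.com/doc2.pdf"}
print(convert_doc_refs("The answer can be found in [doc1] and [doc2].", urls))
```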
123+
124+
### Relevance Scores
125+
126+
When `AZURE_AI_INCLUDE_SEARCH_SCORES=true` (default), the pipeline:
127+
128+
1. Automatically adds `include_contexts: ["citations", "all_retrieved_documents"]` to Azure Search requests
129+
2. Extracts scores based on the `filter_reason` field:
130+
- `filter_reason="rerank"` → uses `rerank_score`
131+
- `filter_reason="score"` or not present → uses `original_search_score`
132+
3. Displays the score as a percentage on citation cards
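
The score selection in step 2 can be expressed in a few lines. This is a sketch with a hypothetical function name; treating any `filter_reason` other than `"rerank"` like `"score"` is an assumption, since the rule above only names those two values and the missing case.

```python
from typing import Any, Dict, Optional


def select_score(doc: Dict[str, Any]) -> Optional[float]:
    """Pick the relevance score for one entry of all_retrieved_documents."""
    if doc.get("filter_reason") == "rerank":
        key = "rerank_score"
    else:
        # Covers filter_reason == "score" and a missing filter_reason.
        key = "original_search_score"
    value = doc.get(key)
    return float(value) if value is not None else None
```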
133+
134+
## Implementation Details
135+
136+
### Helper Functions
137+
138+
The pipeline includes these helper functions for citation processing:
139+
140+
1. **`_extract_citations_from_response()`**: Extracts citations from Azure responses
141+
2. **`_normalize_citation_for_openwebui()`**: Converts Azure citations to OpenWebUI format
142+
3. **`_emit_openwebui_citation_events()`**: Emits citation events via `__event_emitter__`
143+
4. **`_merge_score_data()`**: Matches citations with score data from `all_retrieved_documents`
144+
5. **`_build_citation_urls_map()`**: Builds mapping of citation indices to URLs
145+
6. **`_format_citation_link()`**: Creates markdown links for `[docX]` references
146+
7. **`_convert_doc_refs_to_links()`**: Converts all `[docX]` references in content to markdown links
147+
148+
### Title Fallback Logic
149+
150+
The pipeline resolves each citation's display title in this order:
151+
152+
1. Use `title` field if available
153+
2. Fall back to the filename extracted from `filepath` or `url`
154+
3. Fall back to `"Unknown Document"` if all of these are empty
155+
156+
This ensures every citation has a non-empty display name.
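
The fallback order can be sketched as below. `citation_title` is a hypothetical name, and trying `filepath` before `url` is an assumption, since the list above does not rank the two.

```python
import posixpath
from typing import Any, Dict
from urllib.parse import unquote, urlparse


def citation_title(citation: Dict[str, Any]) -> str:
    """Return title, else a filename from filepath/url, else 'Unknown Document'."""
    title = (citation.get("title") or "").strip()
    if title:
        return title
    for key in ("filepath", "url"):
        value = (citation.get(key) or "").strip()
        if not value:
            continue
        # For URLs, drop the scheme, host, and query string before taking the basename.
        path = urlparse(value).path if key == "url" else value
        name = posixpath.basename(unquote(path).rstrip("/"))
        if name:
            return name
    return "Unknown Document"
```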
157+
158+
### Citation Filtering
159+
160+
Citations are filtered to only show documents that are actually referenced in the response content. For example, if Azure returns 5 citations but the response only references `[doc1]` and `[doc3]`, only those 2 citations will appear in the UI.
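
A minimal sketch of this filter, assuming `[docN]` refers to the N-th citation in the order Azure returned it (1-based). `filter_referenced` is a hypothetical name; it returns `(index, citation)` pairs so the original numbering survives the filtering.

```python
import re
from typing import Any, Dict, List, Tuple

DOC_REF = re.compile(r"\[doc(\d+)\]")


def filter_referenced(
    citations: List[Dict[str, Any]], content: str
) -> List[Tuple[int, Dict[str, Any]]]:
    """Keep only citations whose [docN] marker appears in the response content."""
    referenced = {int(n) for n in DOC_REF.findall(content)}
    return [
        (index, citation)
        for index, citation in enumerate(citations, start=1)
        if index in referenced
    ]
```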
161+
162+
## Troubleshooting
163+
164+
### Citations Not Appearing
165+
166+
**Problem**: Citations don't appear in the OpenWebUI frontend
167+
168+
**Solutions**:
169+
1. Check that Azure AI Search is properly configured (`AZURE_AI_DATA_SOURCES`)
170+
2. Ensure you're using an Azure OpenAI endpoint (not a generic Azure AI endpoint)
171+
3. Verify the response contains `[docX]` references, since citations that are not referenced in the response are filtered out
172+
4. Check browser console and server logs for errors
173+
174+
### Relevance Scores Showing 0%
175+
176+
**Problem**: All citation cards show 0% relevance
177+
178+
**Solutions**:
179+
1. Verify `AZURE_AI_INCLUDE_SEARCH_SCORES=true` is set
180+
2. Check that your Azure Search index supports scoring
181+
3. Enable DEBUG logging to see the raw score values from Azure
182+
183+
### Links Not Working
184+
185+
**Problem**: `[docX]` references are not clickable
186+
187+
**Solutions**:
188+
1. Ensure citations have valid `url` or `filepath` fields
189+
2. Check that the document URL is accessible
190+
3. Verify the markdown link format is being generated correctly
191+
192+
## References
193+
194+
- [OpenWebUI Pipelines Citation Feature Discussion](https://github.com/open-webui/pipelines/issues/229)
195+
- [OpenWebUI Event Emitter Documentation](https://docs.openwebui.com/features/plugin/development/events)
196+
- [Azure AI Search Documentation](https://learn.microsoft.com/en-us/azure/search/)
197+
- [Azure On Your Data API Reference](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/references/on-your-data)
198+
199+
## Version History
200+
201+
- **v2.6.0**: Major refactor - removed `AZURE_AI_ENHANCE_CITATIONS` and `AZURE_AI_OPENWEBUI_CITATIONS` valves; citation support is now always enabled when `AZURE_AI_DATA_SOURCES` is configured; added clickable `[docX]` markdown links; improved score extraction using `filter_reason` field
202+
- **v2.5.x**: Dual citation modes (OpenWebUI events + markdown/HTML)

docs/azure-ai-integration.md

Lines changed: 26 additions & 58 deletions
@@ -60,8 +60,9 @@ AZURE_AI_ENDPOINT="https://<deployment>.openai.azure.com/openai/deployments/<mod
6060
# Complete JSON configuration for Azure Search - copy exactly and replace placeholder values
6161
AZURE_AI_DATA_SOURCES='[{"type":"azure_search","parameters":{"endpoint":"https://<your-search-service>.search.windows.net","index_name":"<your-index-name>","authentication":{"type":"api_key","key":"<your-search-api-key>"}}}]'
6262

63-
# Enable enhanced citation display for better readability (default: true)
64-
AZURE_AI_ENHANCE_CITATIONS=true
63+
# Enable relevance score extraction from Azure Search (default: true)
64+
# When enabled, automatically adds include_contexts to get original_search_score and rerank_score
65+
AZURE_AI_INCLUDE_SEARCH_SCORES=true
6566
```
6667

6768
### Azure AI Search / RAG Integration
@@ -155,73 +156,40 @@ For advanced use cases, you can include additional parameters:
155156
- **Missing API key**: Ensure your Azure Search API key has proper permissions
156157
- **Index not found**: Verify your index name matches exactly (case-sensitive)
157158

158-
#### Enhanced Citation Display
159+
#### Native OpenWebUI Citation Support
159160

160-
The pipeline automatically enhances Azure AI Search responses to make citations and source documents more accessible and readable. When Azure AI Search is configured, the pipeline transforms the raw citation data into a user-friendly format.
161+
The pipeline automatically provides native OpenWebUI citation support for Azure AI Search responses. When Azure AI Search is configured, the pipeline:
161162

162-
**Original Azure AI Response:**
163+
1. **Emits citation events** via `__event_emitter__` for the OpenWebUI frontend to display interactive citation cards
164+
2. **Converts `[docX]` references** to clickable markdown links that link directly to document URLs
165+
3. **Extracts relevance scores** when `AZURE_AI_INCLUDE_SEARCH_SCORES=true`
166+
4. **Filters citations** to only show documents actually referenced in the response
163167

164-
```json
165-
{
166-
"choices": [
167-
{
168-
"message": {
169-
"content": "**Docker container actions** are a type of GitHub Actions [doc1]...",
170-
"context": {
171-
"citations": [
172-
{
173-
"content": "environment variable. The token can be used to authenticate...",
174-
"title": "README.md",
175-
"chunk_id": "0"
176-
}
177-
]
178-
}
179-
}
180-
}
181-
]
182-
}
183-
```
184-
185-
**Enhanced Response with Collapsible Citations:**
168+
**Example: Clickable Document Links**
186169

187-
```html
170+
```markdown
171+
# Original Azure AI response
188172
**Docker container actions** are a type of GitHub Actions [doc1]...
189173

190-
<details>
191-
<summary>📚 Sources and References</summary>
192-
193-
<details>
194-
<summary>[doc1] - README.md</summary>
195-
196-
📁 **File:** `README.md`
197-
📄 **Chunk ID:** 0
198-
**Content:**
199-
> environment variable. The token can be used to authenticate the workflow when accessing GitHub resources...
200-
201-
</details>
202-
203-
<details>
204-
<summary>[doc2] - Documentation.md</summary>
174+
# Enhanced response (with clickable links)
175+
**Docker container actions** are a type of GitHub Actions [[doc1]](https://example.com/README.md)...
176+
```
205177

206-
📁 **File:** `Documentation.md`
207-
📄 **Chunk ID:** 1
208-
**Content:**
209-
> Docker container actions contain all their dependencies in the container and are therefore very consistent...
178+
**Citation Card Features:**
210179

211-
</details>
180+
- **Source information** with `[docX]` prefix for easy identification
181+
- **Relevance percentage** displayed on citation cards (requires `AZURE_AI_INCLUDE_SEARCH_SCORES=true`)
182+
- **Document preview** with content snippets
183+
- **Clickable links** to source documents when URLs are available
184+
- **Streaming support** with links converted inline as content streams
212185

213-
</details>
214-
```
186+
**Relevance Score Selection:**
215187

216-
**Enhanced Citation Features:**
188+
The pipeline uses the `filter_reason` field from Azure Search to select the appropriate score:
189+
- `filter_reason="rerank"` → uses `rerank_score`
190+
- `filter_reason="score"` or not present → uses `original_search_score`
217191

218-
- **Collapsible interface** with expandable sections for clean presentation
219-
- **Two-level organization** - main sources section and individual document details
220-
- **Complete content display** - full document content, not just previews
221-
- **Document references** with clear [doc1], [doc2] labels for easy cross-referencing
222-
- **Source metadata** including file paths, URLs, and chunk IDs for precise tracking
223-
- **Streaming support** with citations properly formatted for both streaming and non-streaming responses
224-
- **Space efficient** - collapsed by default to avoid overwhelming the main response
192+
For more details, see the [Azure AI Citations Documentation](azure-ai-citations.md).
225193

226194
> [!TIP]
227195
> To use **Azure OpenAI** and other **Azure AI** models **simultaneously**, you can use the following URL: `https://<your project>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview`
