Azure AI Foundry Pipeline - Native OpenWebUI Citations

This document describes the native OpenWebUI citation support in the Azure AI Foundry Pipeline, which enables rich citation cards and source previews in the OpenWebUI frontend.

Overview

The Azure AI Foundry Pipeline supports native OpenWebUI citations for Azure AI Search (RAG) responses. This feature is automatically enabled when you configure Azure AI Search data sources (AZURE_AI_DATA_SOURCES). The OpenWebUI frontend will display:

Citation cards with source information and relevance scores
Source previews with content snippets
Relevance percentage displayed on citation cards (requires AZURE_AI_INCLUDE_SEARCH_SCORES=true)
Clickable [docX] references that link directly to document URLs
Interactive citation UI with expandable source details

Features

Automatic Citation Support

When Azure AI Search is configured, the pipeline automatically:

Emits citation events via __event_emitter__ for the OpenWebUI frontend
Converts [docX] references in the response to clickable markdown links
Filters citations to only show documents actually referenced in the response
Extracts relevance scores from Azure Search when available

Configuration Options

Environment Variable	Default	Description
`AZURE_AI_DATA_SOURCES`	`""`	JSON configuration for Azure AI Search (required for citations)
`AZURE_AI_INCLUDE_SEARCH_SCORES`	`true`	Enable relevance score extraction from Azure Search

How It Works

Streaming Responses

When Azure AI Search returns citations in a streaming response:

The pipeline detects citations in the SSE (Server-Sent Events) stream
[docX] references in each chunk are converted to markdown links with document URLs
After the stream ends, citation events are emitted via __event_emitter__
Citations are filtered to only include documents referenced in the response

Non-Streaming Responses

When Azure AI Search returns citations in a non-streaming response:

The pipeline extracts citations from the response context
[docX] references in the content are converted to markdown links
Individual citation events are emitted via __event_emitter__ for each referenced source

Citation Format

OpenWebUI Citation Event Structure

Each citation is emitted as a separate event to ensure all sources appear in the UI. Citation events follow the official OpenWebUI specification (see OpenWebUI Events Documentation):

{
    "type": "citation",
    "data": {
        "document": ["Document content..."],  # Content from this citation
        "metadata": [{"source": "https://..."}],  # Metadata with source URL
        "source": {
            "name": "[doc1] Document Title",  # Unique name with index
            "url": "https://..."  # Source URL if available
        },
        "distances": [0.95]  # Relevance score (displayed as percentage)
    }
}

Key points:

Each source document gets its own citation event
The source.name includes the doc index ([doc1], [doc2], etc.) to prevent grouping
The distances array contains relevance scores from Azure AI Search, which OpenWebUI displays as a percentage on the citation cards

Azure Citation Format (Input)

Azure AI Search returns citations in this format:

{
    "title": "Document Title",
    "content": "Full or partial content",
    "url": "https://...",
    "filepath": "/path/to/file",
    "chunk_id": "chunk-123",
    "score": 0.95,
    "metadata": {}
}

The pipeline automatically converts Azure citations to OpenWebUI format.

Usage

Basic Setup

Configure Azure AI Search to enable citation support:

# Azure AI Search configuration (required for citations)
AZURE_AI_DATA_SOURCES='[{"type":"azure_search","parameters":{"endpoint":"https://YOUR-SEARCH-SERVICE.search.windows.net","index_name":"YOUR-INDEX-NAME","authentication":{"type":"api_key","key":"YOUR-SEARCH-API-KEY"}}}]'

# Enable relevance scores (default: true)
AZURE_AI_INCLUDE_SEARCH_SCORES=true

Clickable Document Links

The pipeline automatically converts [docX] references to clickable markdown links:

# Input from Azure AI
The answer can be found in [doc1] and [doc2].

# Output (converted by pipeline)
The answer can be found in [[doc1]](https://example.com/doc1.pdf) and [[doc2]](https://example.com/doc2.pdf).

This works for both streaming and non-streaming responses.

Relevance Scores

When AZURE_AI_INCLUDE_SEARCH_SCORES=true (default), the pipeline:

Automatically adds include_contexts: ["citations", "all_retrieved_documents"] to Azure Search requests
Extracts scores based on the filter_reason field:
- filter_reason="rerank" → uses rerank_score
- filter_reason="score" or not present → uses original_search_score
Displays the score as a percentage on citation cards

Implementation Details

Helper Functions

The pipeline includes these helper functions for citation processing:

_extract_citations_from_response(): Extracts citations from Azure responses
_normalize_citation_for_openwebui(): Converts Azure citations to OpenWebUI format
_emit_openwebui_citation_events(): Emits citation events via __event_emitter__
_merge_score_data(): Matches citations with score data from all_retrieved_documents
_build_citation_urls_map(): Builds mapping of citation indices to URLs
_format_citation_link(): Creates markdown links for [docX] references
_convert_doc_refs_to_links(): Converts all [docX] references in content to markdown links

Title Fallback Logic

The pipeline uses intelligent title fallback:

Use title field if available
Fallback to filename extracted from filepath or url
Fallback to "Unknown Document" if all are empty

This ensures every citation has a meaningful display name.

Citation Filtering

Citations are filtered to only show documents that are actually referenced in the response content. For example, if Azure returns 5 citations but the response only references [doc1] and [doc3], only those 2 citations will appear in the UI.

Index Schema Requirements for Citations

For citations to work correctly, your Azure AI Search index must contain the right fields with the right attributes. This section explains exactly which fields the pipeline reads and how they map to citation cards in OpenWebUI.

Required and Recommended Index Fields

Index Field	Type	Required?	Must Be Retrievable?	Citation Purpose
`content`	`Edm.String`	Yes	Yes	Provides the text snippet shown in the citation preview
`title`	`Edm.String`	Recommended	Yes	Displayed as the citation card title
`filepath`	`Edm.String`	Recommended	Yes	Used as the citation name in the response; fallback for title
`url`	`Edm.String`	Recommended	Yes	Makes `[docX]` references into clickable links
`chunk_id`	`Edm.String`	Optional	Yes	Helps match citations with relevance scores
`contentVector`	`Collection(Edm.Single)`	For vector search	N/A	Enables vector/hybrid search

Key point: The title, filepath, and url fields must be marked as retrievable in your index schema. If they are not retrievable, Azure will not include them in the citation response, and the pipeline cannot display them.

Title Fallback Chain

The pipeline determines each citation's display title using this fallback chain:

title field → if present and non-empty
filepath field → if title is empty
url field → if both title and filepath are empty
"Unknown Document" → if all are empty

To avoid seeing "Unknown Document", ensure at least one of title, filepath, or url is populated in your index documents.

Custom Field Names and `fields_mapping`

If your index uses different field names (e.g., body instead of content, or doc_title instead of title), you must tell Azure OpenAI how to map them using the fields_mapping parameter in your AZURE_AI_DATA_SOURCES configuration.

fields_mapping properties:

Property	Type	Maps To
`content_fields`	`string[]`	The index fields to use as document content
`title_field`	`string`	The index field to use as the document title
`filepath_field`	`string`	The index field to use as the file path/name
`url_field`	`string`	The index field to use as the document URL
`vector_fields`	`string[]`	The index fields containing vector embeddings
`content_fields_separator`	`string`	Separator pattern between content fields (default: `\n`)

Example with custom field names:

[
  {
    "type": "azure_search",
    "parameters": {
      "endpoint": "https://my-search.search.windows.net",
      "index_name": "my-custom-index",
      "authentication": {
        "type": "api_key",
        "key": "YOUR-SEARCH-API-KEY"
      },
      "fields_mapping": {
        "content_fields": ["body", "summary"],
        "title_field": "doc_title",
        "filepath_field": "source_file",
        "url_field": "source_url",
        "vector_fields": ["embedding"]
      }
    }
  }
]

Creating an Index with the Right Fields

If you are creating a new index manually, here is a minimal schema that supports all citation features:

{
  "name": "my-docs-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true, "filterable": true },
    { "name": "content", "type": "Edm.String", "searchable": true, "retrievable": true },
    { "name": "title", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true },
    { "name": "filepath", "type": "Edm.String", "retrievable": true, "filterable": true },
    { "name": "url", "type": "Edm.String", "retrievable": true },
    { "name": "chunk_id", "type": "Edm.String", "retrievable": true, "filterable": true }
  ]
}

For vector/hybrid search, add a vector field:

{ "name": "contentVector", "type": "Collection(Edm.Single)", "searchable": true, "dimensions": 1536, "vectorSearchProfile": "my-vector-profile" }

Indexer Field Mappings (Blob Storage)

If you index documents from Azure Blob Storage using an indexer, you need to map blob metadata to your index fields. Common blob metadata fields:

Blob Metadata Field	Description	Typical Index Mapping
`metadata_storage_name`	Blob filename (e.g., `report.pdf`)	`title`
`metadata_storage_path`	Full blob URL (e.g., `https://account.blob.core.windows.net/container/file.pdf`)	`filepath` and `url`
`metadata_storage_last_modified`	Last modified timestamp	`last_modified` (optional, useful for sorting)
`metadata_storage_content_type`	MIME type	(optional, useful for filtering)
`content`	Extracted text from the document	`content` (auto-mapped if names match)

Example indexer with field mappings:

{
  "name": "my-blob-indexer",
  "dataSourceName": "my-blob-datasource",
  "targetIndexName": "my-docs-index",
  "fieldMappings": [
    {
      "sourceFieldName": "metadata_storage_name",
      "targetFieldName": "title"
    },
    {
      "sourceFieldName": "metadata_storage_path",
      "targetFieldName": "filepath"
    },
    {
      "sourceFieldName": "metadata_storage_path",
      "targetFieldName": "url"
    },
    {
      "sourceFieldName": "metadata_storage_last_modified",
      "targetFieldName": "last_modified"
    }
  ],
  "parameters": {
    "configuration": {
      "dataToExtract": "contentAndMetadata"
    }
  }
}

Note: The content field is automatically mapped when the source and target field names match. The blob indexer also automatically maps metadata_storage_path (base64-encoded) to the id key field — no explicit mapping is needed for id. Mapping metadata_storage_name → title gives citation cards a readable name from the blob filename.

How the Pipeline Reads Citation Fields

When Azure OpenAI returns a response with citations, each citation object looks like this:

{
  "title": "Architecture Overview",
  "content": "The system uses a microservices architecture...",
  "url": "https://storageaccount.blob.core.windows.net/docs/architecture.pdf",
  "filepath": "architecture.pdf",
  "chunk_id": "0",
  "original_search_score": 12.5,
  "rerank_score": 3.2,
  "filter_reason": "rerank"
}

The pipeline maps these fields to the OpenWebUI citation event:

Azure Citation Field	OpenWebUI Citation Property	Display
`title`	`source.name`	`[doc1] - Architecture Overview`
`content`	`document[0]`	Preview text in citation card
`url` / `filepath`	`source.url`	Clickable link
`rerank_score` / `original_search_score`	`distances[0]`	Relevance percentage

Troubleshooting

Citations Not Appearing

Problem: Citations don't appear in the OpenWebUI frontend

Solutions:

Check that Azure AI Search is properly configured (AZURE_AI_DATA_SOURCES)
Ensure you're using an Azure OpenAI endpoint (not a generic Azure AI endpoint)
Verify the response contains [docX] references
Check browser console and server logs for errors

Citations Showing "Unknown Document"

Problem: Citation cards display "Unknown Document" instead of a meaningful title

Solutions:

Verify your index has title, filepath, or url fields and that they are marked as retrievable
If using custom field names, add fields_mapping with title_field, filepath_field, and url_field to your AZURE_AI_DATA_SOURCES JSON
Verify the fields are actually populated in your indexed documents (empty fields cause fallback to "Unknown Document")

No Clickable Links on [docX] References

Problem: [docX] references appear as plain text, not clickable links

Solutions:

Your index needs a url field (or mapped url_field) that contains valid URLs
If your index stores URLs in a field with a different name, map it using "url_field": "your_field_name" in fields_mapping
Verify that the url field is marked as retrievable in your index schema

Relevance Scores Showing 0%

Problem: All citation cards show 0% relevance

Solutions:

Verify AZURE_AI_INCLUDE_SEARCH_SCORES=true is set
Check that your Azure Search index supports scoring
Enable DEBUG logging to see the raw score values from Azure

Links Not Working

Problem: [docX] references are not clickable

Solutions:

Ensure citations have valid url or filepath fields
Check that the document URL is accessible
Verify the markdown link format is being generated correctly

References

Version History

v2.6.0: Major refactor - removed AZURE_AI_ENHANCE_CITATIONS and AZURE_AI_OPENWEBUI_CITATIONS valves; citation support is now always enabled when AZURE_AI_DATA_SOURCES is configured; added clickable [docX] markdown links; improved score extraction using filter_reason field
v2.5.x: Dual citation modes (OpenWebUI events + markdown/HTML)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Azure AI Foundry Pipeline - Native OpenWebUI Citations

Overview

Features

Automatic Citation Support

Configuration Options

How It Works

Streaming Responses

Non-Streaming Responses

Citation Format

OpenWebUI Citation Event Structure

Azure Citation Format (Input)

Usage

Basic Setup

Clickable Document Links

Relevance Scores

Implementation Details

Helper Functions

Title Fallback Logic

Citation Filtering

Index Schema Requirements for Citations

Required and Recommended Index Fields

Title Fallback Chain

Custom Field Names and `fields_mapping`

Creating an Index with the Right Fields

Indexer Field Mappings (Blob Storage)

How the Pipeline Reads Citation Fields

Troubleshooting

Citations Not Appearing

Citations Showing "Unknown Document"

No Clickable Links on [docX] References

Relevance Scores Showing 0%

Links Not Working

References

Version History

Uh oh!

FilesExpand file tree

azure-ai-citations.md

Latest commit

History

azure-ai-citations.md

File metadata and controls

Azure AI Foundry Pipeline - Native OpenWebUI Citations

Overview

Features

Automatic Citation Support

Configuration Options

How It Works

Streaming Responses

Non-Streaming Responses

Citation Format

OpenWebUI Citation Event Structure

Azure Citation Format (Input)

Usage

Basic Setup

Clickable Document Links

Relevance Scores

Implementation Details

Helper Functions

Title Fallback Logic

Citation Filtering

Index Schema Requirements for Citations

Required and Recommended Index Fields

Title Fallback Chain

Custom Field Names and fields_mapping

Creating an Index with the Right Fields

Indexer Field Mappings (Blob Storage)

How the Pipeline Reads Citation Fields

Troubleshooting

Citations Not Appearing

Citations Showing "Unknown Document"

No Clickable Links on [docX] References

Relevance Scores Showing 0%

Links Not Working

References

Version History

Custom Field Names and `fields_mapping`