Merged
18 commits
9d27e67
docs: rewrite external knowledge base overview page
RiskeyL Apr 9, 2026
b26993b
docs: rewrite external knowledge API specification
RiskeyL Apr 9, 2026
8c54a79
style: add adjustable parameter guidance pattern
RiskeyL Apr 9, 2026
7ac0849
fix: standardize "similarity score" terminology across docs and glossary
RiskeyL Apr 9, 2026
a047f7b
fix: address Copilot's PR review feedback on external knowledge base …
RiskeyL Apr 10, 2026
d7dff83
translate: sync zh/ja external knowledge base pages with English rewrite
RiskeyL Apr 10, 2026
0a10a46
style: add standard phrase translation convention for zh docs
RiskeyL Apr 10, 2026
f6f124e
fix: address Copilot's PR review feedback on external knowledge base …
RiskeyL Apr 10, 2026
4a62638
docs: strengthen codebase verification rule for rewrites
RiskeyL Apr 10, 2026
0f3c118
docs: elevate cross-reference anchor rule in translation guides
RiskeyL Apr 10, 2026
534b2e1
style: add vague cross-references pattern to style guide
RiskeyL Apr 10, 2026
7bb6da7
fix: remove filler content from external knowledge base pages
RiskeyL Apr 10, 2026
69997d0
fix: correct anchor examples in translation formatting guides
RiskeyL Apr 10, 2026
f624449
fix: translate code block examples and remove invalid operators in zh…
RiskeyL Apr 10, 2026
f2e3e07
docs: add translation quality rules for EN→ZH/JA
RiskeyL Apr 14, 2026
3664fad
docs: move LlamaCloud setup to tip and remove Connection Example section
RiskeyL Apr 14, 2026
79a189a
translate: sync zh/ja with LlamaCloud restructure and apply new quali…
RiskeyL Apr 14, 2026
1fd4be4
Merge remote-tracking branch 'origin/main' into docs/external-knowled…
RiskeyL Apr 14, 2026
2 changes: 1 addition & 1 deletion AGENTS.md
@@ -15,7 +15,7 @@ For task-specific guidance, see `writing-guides/index.md`.
- Write in English only, except when specifically optimizing Chinese or Japanese translations.
- Only edit the English section in `docs.json`. Translation sections sync automatically.
- MDX files require `title` and `description` in YAML frontmatter.
- When writing about a feature, verify behavior against the Dify codebase, not just existing docs. Existing docs may be outdated.
- When writing about a feature, verify behavior against the Dify codebase, not just existing docs. Existing docs may be outdated or completely wrong. When rewriting a page, treat every claim in the original as unverified. Check field names, types, required/optional status, and behavior descriptions against the current code. Never carry forward details from legacy docs without independent verification.
- For new features, the user may specify a development branch. Code on development branches may be in flux—when behavior is ambiguous, ask rather than assume.
- When adding or updating internal-only instructions, tooling, configs, or other non-public files, ensure all paths that should not be exposed by Mintlify are covered in `.mintignore`.
- Never use `--no-verify` when committing.
2 changes: 1 addition & 1 deletion en/api-reference/openapi_chat.json
@@ -3130,7 +3130,7 @@
"score": {
"type": "number",
"format": "float",
"description": "Relevance score of the resource."
"description": "Similarity score of the resource."
},
"hit_count": {
"type": "integer",
2 changes: 1 addition & 1 deletion en/api-reference/openapi_chatflow.json
@@ -3352,7 +3352,7 @@
"score": {
"type": "number",
"format": "float",
"description": "Relevance score of the resource."
"description": "Similarity score of the resource."
},
"hit_count": {
"type": "integer",
2 changes: 1 addition & 1 deletion en/api-reference/openapi_completion.json
@@ -1783,7 +1783,7 @@
"score": {
"type": "number",
"format": "float",
"description": "Relevance score of the resource."
"description": "Similarity score of the resource."
},
"hit_count": {
"type": "integer",
14 changes: 7 additions & 7 deletions en/api-reference/openapi_knowledge.json
@@ -612,7 +612,7 @@
},
"score_threshold": {
"type": "number",
"description": "Minimum relevance score threshold for filtering results."
"description": "Minimum similarity score threshold for filtering results."
},
"score_threshold_enabled": {
"type": "boolean",
@@ -3838,7 +3838,7 @@
},
"score_threshold": {
"type": "number",
"description": "Minimum relevance score threshold for filtering results."
"description": "Minimum similarity score threshold for filtering results."
},
"score_threshold_enabled": {
"type": "boolean",
@@ -4034,14 +4034,14 @@
},
"score": {
"type": "number",
"description": "Relevance score of the child chunk."
"description": "Similarity score of the child chunk."
}
}
}
},
"score": {
"type": "number",
"description": "Relevance score."
"description": "Similarity score."
},
"tsne_position": {
"type": "object",
@@ -6218,7 +6218,7 @@
},
"score_threshold": {
"type": "number",
"description": "Minimum relevance score for results. Only effective when `score_threshold_enabled` is `true`."
"description": "Minimum similarity score for results. Only effective when `score_threshold_enabled` is `true`."
}
}
},
@@ -6304,7 +6304,7 @@
},
"score_threshold": {
"type": "number",
"description": "Minimum relevance score threshold."
"description": "Minimum similarity score threshold."
},
"score_threshold_enabled": {
"type": "boolean",
@@ -6772,7 +6772,7 @@
"score_threshold": {
"type": "number",
"nullable": true,
"description": "Minimum relevance score for results. Only effective when `score_threshold_enabled` is `true`."
"description": "Minimum similarity score for results. Only effective when `score_threshold_enabled` is `true`."
},
"weights": {
"type": "object",
114 changes: 75 additions & 39 deletions en/use-dify/knowledge/connect-external-knowledge-base.mdx
@@ -1,65 +1,101 @@
---
title: Connect to External Knowledge Base
description: Integrate external knowledge sources with Dify applications through API connections to leverage custom RAG systems or third-party knowledge services
sidebarTitle: Overview
---

> To make a distinction, knowledge bases independent of the Dify platform are collectively referred to as **"external knowledge bases"** in this article.
If your team maintains its own RAG system or hosts content in a third-party knowledge service like [AWS Bedrock](https://aws.amazon.com/bedrock/), you can connect these external sources to Dify instead of migrating content into Dify's built-in knowledge base.

## Functional Introduction
This lets your AI applications retrieve information directly from your existing infrastructure while you retain full control over the retrieval logic and content management.

For developers with advanced content retrieval requirements, **the built-in knowledge base functionality and text retrieval mechanisms of the Dify platform may have limitations, particularly in terms of customizing recall results.**
<Frame>
![External Knowledge Base Architecture](https://assets-docs.dify.ai/2025/03/f5fb91d18740c1e2d3938d4d106c4d3c.png)
</Frame>

Due to the requirement of higher accuracy of text retrieval and recall, as well as the need to manage internal materials, some developer teams choose to independently develop RAG algorithms and independently maintain text retrieval systems, or uniformly host content to cloud vendors' knowledge base services (such as [AWS Bedrock](https://aws.amazon.com/bedrock/)).
**Connecting an external knowledge base involves three steps**:

As a neutral platform for LLM application development, Dify is committed to providing developers with a wider range of options.
1. [Build an API service that Dify can query](#step-1-build-the-retrieval-api).
2. [Register the API endpoint in Dify](#step-2-register-an-external-knowledge-api).
3. [Connect a specific knowledge source through the registered API](#step-3-create-an-external-knowledge-base).

The **Connect to External Knowledge Base** feature enables integration between the Dify platform and external knowledge bases. Through API services, AI applications can access a broader range of information sources. This capability offers two key advantages:
When your application runs, Dify sends retrieval requests to your endpoint and uses the returned chunks as context for LLM responses.

* The Dify platform can directly obtain the text content hosted in the cloud service provider's knowledge base, so that developers do not need to repeatedly move the content to the knowledge base in Dify;
* The Dify platform can directly obtain the text content processed by algorithms in the self-built knowledge base. Developers only need to focus on the information retrieval mechanism of the self-built knowledge base and continuously optimize and improve the accuracy of information retrieval.
<Tip>
If you're connecting to LlamaCloud, install the [LlamaCloud plugin](https://marketplace.dify.ai/plugin/langgenius/llamacloud) instead of building a custom API. See the [video walkthrough](https://www.youtube.com/watch?v=FaOzKZRS-2E) for a complete setup demo.

<Frame caption="Principle of external knowledge base connection">
<img src="https://assets-docs.dify.ai/2025/03/f5fb91d18740c1e2d3938d4d106c4d3c.png" alt="" />
</Frame>
If you're building a plugin for another knowledge service, the LlamaCloud plugin's [source code](https://github.com/langgenius/dify-official-plugins/tree/main/extensions/llamacloud) is available for reference.
</Tip>

<Info>
Dify only has retrieval access to external knowledge bases—it cannot modify or manage your external content. You maintain the knowledge base and its retrieval logic independently.
</Info>

## Step 1: Build the Retrieval API

Build an API service that implements the [External Knowledge API specification](/en/use-dify/knowledge/external-knowledge-api). Your service needs a single `POST` endpoint that accepts a search query and returns matching text chunks with similarity scores.
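
As a rough illustration of the endpoint's job, the sketch below maps one request body to a list of scored chunks. Only `knowledge_id`, `content`, `score`, and `metadata` are named on this page; the `query` request field, the `records` response envelope, the in-memory `CORPUS`, and the word-overlap scoring are assumptions made for this example, so check the field names against the specification before relying on them.

```python
from typing import Any

# Hypothetical in-memory corpus standing in for a real retrieval backend.
CORPUS = {
    "kb-demo": [
        "Dify sends retrieval requests to your endpoint.",
        "Squid is a caching proxy.",
    ]
}


def overlap_score(query: str, text: str) -> float:
    """Toy similarity: fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def handle_retrieval(body: dict[str, Any]) -> dict[str, Any]:
    """Return scored chunks for one retrieval request body."""
    chunks = CORPUS.get(body["knowledge_id"], [])
    query = body["query"]  # assumed request field name
    records = [
        # metadata is an empty object rather than null
        {"content": chunk, "score": overlap_score(query, chunk), "metadata": {}}
        for chunk in chunks
    ]
    records.sort(key=lambda record: record["score"], reverse=True)
    return {"records": records}  # assumed response envelope
```

A real service would replace `overlap_score` with its own retriever and wrap `handle_retrieval` in whatever web framework it already uses.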

## Step 2: Register an External Knowledge API

An External Knowledge API stores your endpoint URL and authentication credentials. Multiple knowledge bases can share one API connection.

1. Go to **Knowledge**, click **External Knowledge API** in the upper-right corner, then click **Add an External Knowledge API**.

2. Fill in the following fields:

- **Name**: A label to distinguish this API connection from others.
- **API Endpoint**: The base URL of your external knowledge service. Dify appends `/retrieval` automatically when sending requests.
- **API Key**: The authentication credential for your service. Dify sends this as a Bearer token in the `Authorization` header.

Dify validates the connection by sending a test request to your endpoint when you save.
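
The two fields above determine the URL and header your service will see. The sketch below shows one plausible way to reproduce that request shape and to check the token on the service side. The function names are hypothetical, and the trailing-slash handling and the `Content-Type` header are assumptions for illustration, not confirmed Dify behavior.

```python
import hmac


def build_retrieval_request(api_endpoint: str, api_key: str) -> tuple[str, dict[str, str]]:
    """Approximate the URL and headers a retrieval request would carry."""
    # Assumption: a trailing slash on the base URL is normalized away.
    url = api_endpoint.rstrip("/") + "/retrieval"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumed for a JSON POST body
    }
    return url, headers


def is_authorized(headers: dict[str, str], expected_key: str) -> bool:
    """Service-side check: accept only a well-formed Bearer token that matches."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.encode(), expected_key.encode())
```

For example, `build_retrieval_request("https://kb.example.com/api", "secret")` yields the URL `https://kb.example.com/api/retrieval`, which is the path your service must expose.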

## Step 3: Create an External Knowledge Base

With the API registered, connect an external knowledge source to Dify. This creates a knowledge base in Dify that is linked to your external system.

1. Go to **Knowledge** and click **Connect to an External Knowledge Base**.

<Frame>
![Connect to External Knowledge Base](/images/connect-to-external-knowledge-base.png)
</Frame>

2. Fill in the following fields:
- **External Knowledge Name** and **Knowledge Description** (optional).
- **External Knowledge API**: Select the API connection you registered.
- **External Knowledge ID**: The identifier of the specific knowledge source within your external system, passed to your API as the `knowledge_id` field.

This is whatever ID your external service uses to distinguish between different knowledge bases. For example, a Bedrock knowledge base ARN or an ID you defined in your own system.

### Connection Examples
<Note>
The **External Knowledge API** and **External Knowledge ID** cannot be changed after creation. To use a different API or knowledge source, create a new external knowledge base.
</Note>

#### LlamaCloud
- **Retrieval Settings**:
- **Top K**: Maximum number of chunks to retrieve per query. Higher values return more results but may include less relevant content.
- **Score Threshold**: Minimum similarity score for returned chunks. Enable this to filter out low-relevance results. Use a higher value for stricter relevance or a lower value to include broader matches.

Dify provides an official LlamaCloud plugin that helps you quickly connect to LlamaCloud knowledge bases.
When disabled, all results up to the Top K limit are returned regardless of score.
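
The interaction between the two settings can be sketched as a small filter. This is a minimal illustration, not Dify's implementation: the function name is hypothetical, `score_threshold=None` stands in for the disabled state, and treating the threshold as inclusive (`>=`) is an assumption.

```python
from typing import Any, Optional


def apply_retrieval_settings(
    records: list[dict[str, Any]],
    top_k: int,
    score_threshold: Optional[float] = None,
) -> list[dict[str, Any]]:
    """Keep at most top_k records, optionally dropping low-scoring ones first."""
    if score_threshold is not None:
        # Assumption: a record scoring exactly at the threshold is kept.
        records = [r for r in records if r["score"] >= score_threshold]
    ranked = sorted(records, key=lambda r: r["score"], reverse=True)
    return ranked[:top_k]
```

With scores of 0.9, 0.4, and 0.7 and a Top K of 2, the disabled case returns the 0.9 and 0.7 chunks, while a threshold of 0.8 returns only the 0.9 chunk.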

##### Plugin Installation
Once created, the external knowledge base is available for use in your applications just like any built-in knowledge base. See [Integrate Knowledge Within Application](/en/use-dify/knowledge/integrate-knowledge-within-application) for details.

1. Visit the Dify [Marketplace](https://marketplace.dify.ai/) and search for `LlamaCloud`
2. Install and configure the LlamaCloud plugin according to the instructions
3. Enable the plugin in the Dify platform
4. Fill in the LlamaCloud API key and other necessary information following the plugin configuration wizard
5. After configuration is complete, you can see the connected external knowledge base in your knowledge base list
## Troubleshoot

With the LlamaCloud plugin, you can directly use LlamaCloud's powerful retrieval capabilities in the Dify platform without writing custom APIs.
### Connection Refused or Timeout (Self-Hosted)

For more information about how it works, please refer to the plugin's [GitHub repository](https://github.com/langgenius/dify-official-plugins/tree/main/extensions/llamacloud).
Dify routes outbound HTTP requests through a Squid-based SSRF proxy. If your external knowledge service runs on the same host as Dify or its domain is not allowlisted, the proxy blocks the request.

#### Video Tutorial
To allow connections, add your service's domain to the `allowed_domains` ACL in `docker/ssrf_proxy/squid.conf.template`:

The following video demonstrates in detail how to use the LlamaCloud plugin to connect to external knowledge bases:
```text
acl allowed_domains dstdomain .marketplace.dify.ai .your-kb-service.com
```

<iframe
src="https://www.youtube.com/embed/FaOzKZRS-2E"
width="100%"
height="315"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
/>
Restart the SSRF proxy container after editing.

## FAQ
### API Response Format Issues

**How to Fix the Errors Occurring When Connecting to External Knowledge API?**
If retrieval fails or returns unexpected results, verify your API response against the [External Knowledge API specification](/en/use-dify/knowledge/external-knowledge-api#response).

Solutions corresponding to each error code in the return information:
Common issues:

| Error Code | Result | Solutions |
| ---------- | ----------------------------------- | ----------------------------------------------------------- |
| 1001 | Invalid Authorization header format | Please check the Authorization header format of the request |
| 1002 | Authorization failed | Please check whether the API Key you entered is correct. |
| 2001 | The knowledge is not exist | Please check the external repository |
- The `metadata` field in each record must be an object (`{}`), not `null`. A `null` value causes errors in the retrieval pipeline.
- The `content` and `score` fields must be present in every record.
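
A quick pre-flight check on your own responses can catch both issues before Dify sees them. The helper below is a hypothetical sketch. It also flags a missing `metadata` key, which is a conservative assumption: this page only states that the value must not be `null`.

```python
from typing import Any


def find_record_problems(record: dict[str, Any]) -> list[str]:
    """Return human-readable problems for one response record (empty if none)."""
    problems = []
    for field in ("content", "score"):
        if field not in record:
            problems.append(f"missing required field: {field}")
    # Assumption: an absent metadata key is treated the same as null.
    if not isinstance(record.get("metadata"), dict):
        problems.append("metadata must be an object ({}), not null")
    return problems
```

Running this over every record in a sample response and logging any non-empty result is usually enough to spot a malformed payload.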