feat: support external embedding service for scaling (#248)
tested with (using both config.yaml and env vars):
- internal main_em.py service
- external local embedding service
- remotely hosted embedding service (IONOS)
---------
Signed-off-by: Anupam Kumar <kyteinsky@gmail.com>
README.md (7 additions & 0 deletions)
@@ -116,6 +116,13 @@ Make sure to restart the app after changing the config file. For docker, this wo

This is a file copied from one of the two configurations (config.cpu.yaml or config.gpu.conf) during app startup if `config.yaml` is not already present to the persistent storage. See [Repair section](#repair) on details on the repair step that removes the config if you have a custom config.

+The default way is to spawn an embedding server backed by llama.cpp, where the local model runs on either CPU or GPU. The other option is to use a remote model from a OpenAI-compatible API. The configuration for the remote model is also present in the sample config files.
+API key or username/password for the remote API can be stored in the config file itself or environment variables can be used. `CCB_EM_APIKEY` for the API key and `CCB_EM_USERNAME` and `CCB_EM_PASSWORD` for the username and password respectively.
+To indicate the use of environment variables, set the value of `auth` in the config file to `from_env`, like so:
+```yaml
+auth: from_env
+```
+
## Repair
v2.1.0 introduces repair steps. These run on app startup.
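The `auth: from_env` behaviour described in the README hunk above could look roughly like the following. This is a minimal sketch, not the code from this commit: the helper name `resolve_auth`, the returned dictionary shape, and the choice to prefer the API key over username/password are all assumptions. The variable prefix is a parameter because the README hunk writes `CCB_EM_` while the `appinfo/info.xml` hunk writes `CC_EM_`.

```python
import os
from typing import Mapping, Optional


def resolve_auth(auth, env: Optional[Mapping[str, str]] = None, prefix: str = "CCB_EM_"):
    """Hypothetical helper: turn the config's `auth` value into credentials."""
    env = os.environ if env is None else env

    if auth != "from_env":
        # Credentials were stored in the config file itself; pass them through.
        return auth

    # Assumption: an API key, when present, wins over username/password.
    apikey = env.get(prefix + "APIKEY")
    if apikey:
        return {"apikey": apikey}

    username = env.get(prefix + "USERNAME")
    password = env.get(prefix + "PASSWORD")
    if username and password:
        return {"username": username, "password": password}

    # Nothing usable in the environment: treat the endpoint as unauthenticated.
    return None
```

Passing a plain dictionary as `env` keeps the helper easy to exercise without touching the real process environment.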
appinfo/info.xml (25 additions & 0 deletions)
@@ -56,6 +56,31 @@ Setup background job workers as described here: https://docs.nextcloud.com/serve
<display-name>Auto-download models from Huggingface</display-name>
<description>When set to "false", "0" or "no", initial download of the Huggingface models will be skipped in the init phase. They would have to be downloaded and placed in the persistent storage manually or through a mount point.</description>
<description>Set this to an OpenAI-compatible endpoint like https://api.my-local-llm.lan/v1. When set, the internal embedding server is not started. For authentication, set CC_EM_APIKEY or CC_EM_USERNAME and CC_EM_PASSWORD as needed.</description>
+</variable>
+<variable>
+<name>CC_EM_MODEL_NAME</name>
+<display-name>External embedding model name</display-name>
+<description>Model name to be used with the OpenAI-compatible endpoint set in CC_EM_BASE_URL. For example, "text-embedding-3-small" or any other model supported by the endpoint. If unset, no model name is sent in the requests.</description>
+</variable>
+<variable>
+<name>CC_EM_APIKEY</name>
+<display-name>API key for authentication to CC_EM_BASE_URL</display-name>
+<description>API key to be used for authenticating requests to the OpenAI-compatible endpoint set in CC_EM_BASE_URL. Either this or CC_EM_USERNAME and CC_EM_PASSWORD should be set if the endpoint requires authentication.</description>
+</variable>
+<variable>
+<name>CC_EM_USERNAME</name>
+<display-name>Username for authentication to CC_EM_BASE_URL</display-name>
+<description>Username to be used for authenticating requests to the OpenAI-compatible endpoint set in CC_EM_BASE_URL.</description>
+</variable>
+<variable>
+<name>CC_EM_PASSWORD</name>
+<display-name>Password for authentication to CC_EM_BASE_URL</display-name>
+<description>Password to be used for authenticating requests to the OpenAI-compatible endpoint set in CC_EM_BASE_URL.</description>
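To illustrate how the variables described in this hunk might combine into a single request, here is a rough sketch using only the Python standard library. It is not the implementation from this commit. The function name `build_embedding_request` is hypothetical, the `/embeddings` path and `Bearer` header follow the usual OpenAI API convention, and mapping username/password to HTTP Basic auth is an assumption. The request is only constructed, never sent.

```python
import base64
import json
import os
import urllib.request


def build_embedding_request(texts, env=None):
    """Hypothetical helper: build (but do not send) an embeddings request."""
    env = os.environ if env is None else env

    base_url = env.get("CC_EM_BASE_URL")
    if not base_url:
        # No external endpoint configured; the internal server would be used.
        return None

    payload = {"input": list(texts)}
    model = env.get("CC_EM_MODEL_NAME")
    if model:
        # Per the description above, the model name is omitted when unset.
        payload["model"] = model

    headers = {"Content-Type": "application/json"}
    apikey = env.get("CC_EM_APIKEY")
    username = env.get("CC_EM_USERNAME")
    password = env.get("CC_EM_PASSWORD")
    if apikey:
        headers["Authorization"] = "Bearer " + apikey
    elif username and password:
        # Assumption: username/password are sent as HTTP Basic credentials.
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        headers["Authorization"] = "Basic " + token

    return urllib.request.Request(
        base_url.rstrip("/") + "/embeddings",
        data=json.dumps(payload).encode(),
        headers=headers,
        method="POST",
    )
```

Sending the result would be a matter of passing it to `urllib.request.urlopen`; that step is left out so the sketch stays free of network access.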