GITBOOK-282: ThiloteE's 2026-01-20 rewrite of the AI section

ThiloteE · gitbook-bot · commit 003cec9a86d2 · 2026-01-20T19:54:59.000Z
diff --git a/en/SUMMARY.md b/en/SUMMARY.md
@@ -49,7 +49,7 @@
   * [AI providers and API keys](ai/ai-providers-and-api-keys.md)
   * [AI preferences](ai/preferences.md)
   * [AI troubleshooting](ai/troubleshooting.md)
-  * [Running a local LLM model](ai/local-llm.md)
+  * [Running a local language model](ai/local-llm.md)
 * [Configuration](setup/README.md)
   * [Customize the citation key generator](setup/citationkeypatterns.md)
   * [Customize entry types](setup/customentrytypes.md)
diff --git a/en/ai/ai-providers-and-api-keys.md b/en/ai/ai-providers-and-api-keys.md
@@ -19,9 +19,7 @@ You can find more information about providers in the [`langchain4j` documentatio
 
 We cannot give a clear recommendation. Providers change their service and their prices regularly and our documentation page is too static to keep up with daily changes. It is recommended to look up LLM benchmarks on the internet or to use the trial and error method. To date, remote AI providers like OpenAI, Google, Mistral and others offer state of the art quality.
 
-If you want to [run a model locally](local-llm.md), choose GPT4All or Ollama. In comparison to remote AI providers, open weight local models that are compatible with average consumer devices offer less capabilities. There are state of the art local models available, but they are very large (in terms of number of parameters) and the higher the number of parameters, the more memory is needed. To run the largest models, very expensive and capable hardware (preferably VRAM in GPU's or ASICs) is required. That said, even small models can be sufficient for the [add entry using refrence text](../collect/newentryfromplaintext.md) workflow.\
-\
-Hugging Face is a special case. One the one hand, it serves as a free hosting platform from where you can download numerous models to host them yourself locally. On the other hand Huggingface also is a remote AI provider that offers running numerous large and small open weight models for you.
+If you want to [run a model locally](local-llm.md), choose GPT4All or Ollama, or make use of the OpenAI API. In comparison to remote AI providers, open-weight local models that are compatible with average consumer devices offer fewer capabilities. There are state-of-the-art local models available, but they are very large (in terms of number of parameters), and the higher the number of parameters, the more memory is needed. Running the largest models requires very expensive and capable hardware. That said, even small models can be sufficient for the [add entry using reference text](../collect/newentryfromplaintext.md) workflow.
 
 ## Why do I need an API key?
 
@@ -92,4 +90,4 @@ Make the subscription on [their website](https://admin.mistral.ai/organization/b
 
 ### Hugging Face
 
-You do not have to pay anything for Hugging Face in order to send requests to LLMs. Though, the speed is very slow by default. It may take a long time to allocate free compute resources to your instance, resulting in longer response times. You can switch to faster inference by [upgrading your user account](https://huggingface.co/pricing#pro) or by [running a space on GPU](https://huggingface.co/docs/hub/spaces-gpus).
+You may not have to pay anything for Hugging Face in order to send requests to LLMs. However, the speed is very slow by default. It may take a long time to allocate free compute resources to your instance, resulting in longer response times. You can switch to faster inference by [upgrading your user account](https://huggingface.co/pricing#pro) or by [running a space on GPU](https://huggingface.co/docs/hub/spaces-gpus).
diff --git a/en/ai/how-to-enable-and-use-ai-features.md b/en/ai/how-to-enable-and-use-ai-features.md
@@ -2,32 +2,44 @@
 
 Thank you for checking out JabRef AI features! We believe you can find them useful in your research or brainstorming process.
 
-## 1. Locate new entry editor tabs and accept AI Privacy Policy
+## 1. Locate and accept the AI Privacy Policy
 
-As you run the latest JabRef version, you will find 2 new entry editor tabs ("AI Chat" and "AI Summary"):
+1.  Run JabRef, open a library, select an entry and open the [entry editor](../advanced/entryeditor/). There you will see tabs that have AI in their name.<br>
 
-![New entry editor tabs](../.gitbook/assets/ai-new-entries.png)
+    <figure><img src="../.gitbook/assets/ai-new-entries.png" alt="AI related entry editor tabs (AI Summary and AI Chat)"><figcaption><p>AI related entry editor tabs</p></figcaption></figure>
+2.  Open the **AI Chat** or the **AI Summary** tab. The first time you open either of these tabs, JabRef will ask you to accept the privacy notice. In order to enable all AI features, you need to accept it by pressing the **I agree** button. If you do not accept it, none of your information will be transmitted to external services.<br>
 
-However, the first time you open these tabs, JabRef will ask your permission for using AI features and accepting AI Privacy Policy:
+    <figure><img src="../.gitbook/assets/ai-preferences-connection.png" alt="AI privacy notice"><figcaption><p>AI privacy notice<br></p></figcaption></figure>
 
-![AI Privacy Policy](../.gitbook/assets/ai-privacy-policy.png)
+    In the AI privacy notice, you can find links to the privacy policies of supported external services and an explanation of what data is sent to them.
 
-In AI Privacy Policy you can find links to Privacy Policies of external services like OpenAI, Mistral AI, etc.
+## 2. Attach a file to your entry
 
-In order to enable all AI features, you need to accept this Privacy Policy. If you do not accept it, none of your information will be transmitted to external services.
+In order to use the following AI features in the entry editor, you need to [add PDFs to an entry](../collect/add-pdfs-to-an-entry.md):
 
-## 2. Obtain API key
+* AI Chat tab
+* AI Summary tab
 
-After clicking "I agree" button, there is only one crucial step left for using AI features. You need to setup connection to AI provider. Please refer to [AI providers and API keys](ai-providers-and-api-keys.md) page to understand what is an AI provider and how to get an API key.
+This in turn requires you to [set a main file directory](../finding-sorting-and-cleaning-entries/filelinks.md#directories-for-files). JabRef supports other AI features that do not require you to attach a file to your entry, such as [using a language model to turn plain reference text into an entry](../collect/newentryfromplaintext.md#llm). If that is all you need, you can skip this step.
 
-## 3. Enter API key in JabRef
+## 3. Connect to an external AI provider
 
-After you got your API key, you need to enter it in JabRef preferences. Open Preferences using menu `File -> Preferences`. Locate tab `AI`:
+There is only one crucial step left for using AI features. You need to set up a connection to an external AI provider. By _external_, we mean a provider outside of JabRef, regardless of whether that entails connecting to an [AI app on your local device](local-llm.md) or to a remote online service.
 
-![AI preferences](../.gitbook/assets/ai-preferences-connection.png)
+While the former may or may not require an API key, online services will most definitely require you to enter one. Here is some guidance:
 
-Choose the AI provider you have the API key from and enter the API key (in this order, because JabRef stores several API keys tied to specific AI provider). Additionally you can choose chat model of the AI provider.[AI providers and API keys](ai-providers-and-api-keys.md) page to understand what is an AI provider and how to get an API key.
+#### 1. Obtain an API key
 
-Choose the AI provider you have the API key from and enter the API key (in this order, because JabRef stores several API keys tied to specific AI provider). Additionally, you can choose chat model of the AI provider that you want to use in JabRef.
+Please look at the [AI providers and API keys](ai-providers-and-api-keys.md) documentation page to understand what an AI provider is and how to get an API key.
 
-Save the preferences and after that you are able to use JabRef's AI features on full power!
+#### 2. Enter an API key
+
+Once you have your API key, you need to enter it in JabRef's [preferences.md](preferences.md "mention").
+
+1. Open the preferences menu via `File > Preferences`.
+2. Locate the `AI` tab.
+3. Choose the AI provider that issued your API key, then enter the API key (in this order, because JabRef can store several API keys, each tied to a specific AI provider).
+
+Finally, you can choose the chat model of the AI provider.
+
+Save the preferences. From then on, you can use JabRef's AI features as you see fit!
diff --git a/en/ai/local-llm.md b/en/ai/local-llm.md
@@ -1,13 +1,12 @@
-# Running a local Large Language Model (LLM)
+# Running a local language model
 
-Notice:
+## Hardware recommendations
 
-1. LLMs require a lot of computational power and therefore lots of electricity.
-2. Smaller models typically respond qualitatively worse than bigger ones, but they are faster, need less memory and might already be sufficient for your use case.
+1. Large Language Models (LLMs) require a lot of computational power and therefore lots of electricity and dedicated hardware. The following advice assumes a small-scale project and the availability of consumer hardware.
+2. Smaller models typically give lower-quality responses than bigger ones, but they are faster, need less memory and might already be sufficient for your use case (so start with a small model and scale up if need be).
 3. The size of a model can be measured in number of parameters in its neural network. The "b" in the model name typically stands for **b**illion parameters. It also can be measured in terms of gigabytes required to load the model into your devices RAM/VRAM.
-4. The model should always completely fit into VRAM (fast), otherwise layers will be offloaded to RAM (slower) and if it doesn't fit in there either, it will use SSD (abysmally slow).
-5. Hardware recommendation for maximize prompt processing and token generation speed: A device with high *bandwidth*. A modern GPU with lots of VRAM will satisfy this requirement best.
-
+4. The model should always completely fit into VRAM (fast), otherwise layers will typically be offloaded to RAM (very slow) and if it doesn't fit in there either, it will use your hard drive, typically an SSD or HDD (abysmally slow).
+5. To maximize prompt processing and token generation speed, the recommended hardware is a device with high _bandwidth_. To date, a modern GPU with lots of VRAM satisfies this requirement best.
 
 ## High-level explanation
 
@@ -25,7 +24,7 @@ Voilà! You can use a local LLM right away in JabRef.
 The following steps guide you on how to use `ollama` to download and run local LLMs.
 
 1. Install `ollama` from [their website](https://ollama.com/download)
-2. Select a model that you want to run. The `ollama` provides [a large list of models](https://ollama.com/library) to choose from. Some popular models are for instance [qwen3:30b-a3b](https://ollama.com/library/qwen3), [`granite3.1-moe:3b`](https://ollama.com/library/granite3.1-moe), [`devkit/L1-Qwen-1.5B-Max`](https://ollama.com/devkit/L1-Qwen-1.5B-Max), [`mistral:7b`](https://ollama.com/library/mistral) or [`mistral-small3.1:24b`](https://ollama.com/library/mistral-small3.1).
+2. Select a model that you want to run. `ollama` provides [a large list of models](https://ollama.com/library) to choose from. Some popular models are, for instance, [`qwen3:30b-a3b`](https://ollama.com/library/qwen3), [`granite3.1-moe:3b`](https://ollama.com/library/granite3.1-moe), [`devkit/L1-Qwen-1.5B-Max`](https://ollama.com/devkit/L1-Qwen-1.5B-Max), [`mistral:7b`](https://ollama.com/library/mistral) or [`mistral-small3.1:24b`](https://ollama.com/library/mistral-small3.1).
 3. When you have selected your model, type `ollama pull <MODEL>:<PARAMETERS>` in your terminal. `<MODEL>` refers to the model name like `gemma2` or `mistral`, and `<PARAMETERS>` refers to parameters count like `2b` or `9b`.
 4. `ollama` will download the model for you
 5. After that, you can run ollama serve to start a local web server. This server will accept requests and respond with LLM output. Note: The ollama server may already be running, so do not be alarmed by a cannot bind error. If it is not yet running, use the following command: `ollama run <MODEL>:<PARAMETERS>`
@@ -46,4 +45,3 @@ The following steps guide you on how to use `GPT4All`to download and run local L
 4. Set the "AI provider" to "GPT4All"
 5. Set the "Chat model" to the name (including the `.gguf`part) of the model you have downloaded in GPT4All.
 6. Set the "API base URL" in "Expert Settings" to `http://localhost:4891/v1/chat/completions`.
-
diff --git a/en/ai/preferences.md b/en/ai/preferences.md
@@ -36,7 +36,7 @@ The embedding model transforms a document (or a piece of text) into a vector (an
 
 Different embedding models have varying performance, including accuracy and the speed of computing embeddings. The `_q` at the end of the model name usually denotes _quantized_ (meaning reduced or simplified). These models are faster and smaller than their original counterparts but provide slightly less accuracy.
 
-Currently, only local embedding models are supported. This means you do not need to provide a new API key, as all the processing will be done on your machine.
+Currently, only local embedding models are supported. This means you do not need to provide a separate API key for them, as all the processing will be done on your machine.
 
 ### Instruction
 
@@ -74,6 +74,10 @@ The "chunk size" parameter in document splitting refers to the size of segments
 
 These segments are then passed to the AI model for processing. This approach helps optimize performance by breaking down large documents into smaller, more digestible parts, allowing for more efficient handling and analysis by the AI.
 
+{% hint style="warning" %}
+The chunk size should not exceed the capabilities of the embedding model; otherwise, embeddings may fail to be generated. Users have to set the chunk size to the value given for the embedding model under `"max_position_embeddings": *,`. Model makers usually publish this information as part of the `config.json` (mostly at [https://huggingface.co](https://huggingface.co)), for example [https://huggingface.co/intfloat/multilingual-e5-large/blob/main/config.json](https://huggingface.co/intfloat/multilingual-e5-large/blob/main/config.json).
+{% endhint %}
+
 ### Document splitter chunk overlap
 
 **Type**: integer
diff --git a/en/ai/troubleshooting.md b/en/ai/troubleshooting.md
@@ -1,13 +1,13 @@
-# Troubleshooting
+# AI troubleshooting
 
 ## "Failed to load PyTorch native library" while trying the AI chat
 
-If you encounter this error, download the latest [Visual C++ redistributable from Microsoft](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170#latest-microsoft-visual-c-redistributable-version). This installation is only required for AI features in JabRef, all other features can work without it. Also, if multiple installations of CUDA are installed, JabRef's Version needs to be added to the PATH first. For example, on Windows this would be adding `C:\Users\USER\.djl.ai\pytorch\CUDA-VERSION` to the Environment Variables. See [how to edit environment variables on Windows 10 or 11](https://www.howtogeek.com/787217/how-to-edit-environment-variables-on-windows-10-or-11/).
+If you encounter this error, download the latest [Visual C++ redistributable from Microsoft](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170#latest-microsoft-visual-c-redistributable-version). This installation is only required for AI features in JabRef; all other features work without it. Also, if multiple CUDA installations are present, JabRef's version first needs to be added to the PATH. For example, on Windows this would be adding `C:\Users\USER\.djl.ai\pytorch\CUDA-VERSION` to the Environment Variables. See [how to edit environment variables on Windows 10 or 11](https://www.howtogeek.com/787217/how-to-edit-environment-variables-on-windows-10-or-11/).
 
 If you still have issues, the [DJL documentation](https://docs.djl.ai/master/docs/development/troubleshooting.html#unsatisfiedlinkerror-issue) might be of help.
 
 ## JabRef closed or crashed in the middle of downloading the embedding model
 
-Do not worry! You simply need to delete the embedding model cache.
+Do not worry! It may be enough to simply delete the embedding model cache.
 
 The name of the folder is `.djl.ai`, and it is located in your home directory.
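
Note on the last troubleshooting hunk: the cache reset it describes (deleting the `.djl.ai` folder in the home directory) could be scripted roughly as follows. This is only a minimal sketch and not part of JabRef or of this commit; `clear_djl_cache` is a hypothetical helper name, and it assumes JabRef is closed and that nothing else you need lives in that folder. By default it performs a dry run and deletes nothing.

```python
from pathlib import Path
from typing import Optional
import shutil


def clear_djl_cache(home: Optional[Path] = None, dry_run: bool = True) -> Optional[Path]:
    """Locate the `.djl.ai` folder under `home` and optionally delete it.

    Hypothetical helper for illustration only. Returns the folder path if it
    exists (whether or not it was deleted), or None if there is no such folder.
    """
    base = home if home is not None else Path.home()
    cache_dir = base / ".djl.ai"
    if not cache_dir.is_dir():
        return None
    if not dry_run:
        # Removes the whole folder, including everything stored inside it.
        shutil.rmtree(cache_dir)
    return cache_dir


if __name__ == "__main__":
    found = clear_djl_cache(dry_run=True)
    print(f"Would delete: {found}" if found else "No .djl.ai folder found")
```

Calling `clear_djl_cache(dry_run=False)` performs the actual deletion. Keeping the dry run as the default is a deliberate choice here, since the folder may also hold other downloaded files that would then have to be fetched again.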