Skip to content

Commit 003cec9

Browse files
ThiloteEgitbook-bot
authored andcommitted
GITBOOK-282: ThiloteE's 2026-01-20 rewrite of the AI section
1 parent 568ea76 commit 003cec9

6 files changed

Lines changed: 45 additions & 33 deletions

File tree

en/SUMMARY.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,7 @@
4949
* [AI providers and API keys](ai/ai-providers-and-api-keys.md)
5050
* [AI preferences](ai/preferences.md)
5151
* [AI troubleshooting](ai/troubleshooting.md)
52-
* [Running a local LLM model](ai/local-llm.md)
52+
* [Running a local language model](ai/local-llm.md)
5353
* [Configuration](setup/README.md)
5454
* [Customize the citation key generator](setup/citationkeypatterns.md)
5555
* [Customize entry types](setup/customentrytypes.md)

en/ai/ai-providers-and-api-keys.md

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -19,9 +19,7 @@ You can find more information about providers in the [`langchain4j` documentatio
1919

2020
We cannot give a clear recommendation. Providers change their service and their prices regularly and our documentation page is too static to keep up with daily changes. It is recommended to look up LLM benchmarks on the internet or to use the trial and error method. To date, remote AI providers like OpenAI, Google, Mistral and others offer state of the art quality.
2121

22-
If you want to [run a model locally](local-llm.md), choose GPT4All or Ollama. In comparison to remote AI providers, open weight local models that are compatible with average consumer devices offer less capabilities. There are state of the art local models available, but they are very large (in terms of number of parameters) and the higher the number of parameters, the more memory is needed. To run the largest models, very expensive and capable hardware (preferably VRAM in GPU's or ASICs) is required. That said, even small models can be sufficient for the [add entry using refrence text](../collect/newentryfromplaintext.md) workflow.\
23-
\
24-
Hugging Face is a special case. One the one hand, it serves as a free hosting platform from where you can download numerous models to host them yourself locally. On the other hand Huggingface also is a remote AI provider that offers running numerous large and small open weight models for you.
22+
If you want to [run a model locally](local-llm.md), choose GPT4All or Ollama or make use of the OpenAI API. In comparison to remote AI providers, open weight local models that are compatible with average consumer devices offer less capabilities. There are state of the art local models available, but they are very large (in terms of number of parameters) and the higher the number of parameters, the more memory is needed. To run the largest models, very expensive and capable hardware is required. That said, even small models can be sufficient for the [add entry using refrence text](../collect/newentryfromplaintext.md) workflow.
2523

2624
## Why do I need an API key?
2725

@@ -92,4 +90,4 @@ Make the subscription on [their website](https://admin.mistral.ai/organization/b
9290

9391
### Hugging Face
9492

95-
You do not have to pay anything for Hugging Face in order to send requests to LLMs. Though, the speed is very slow by default. It may take a long time to allocate free compute resources to your instance, resulting in longer response times. You can switch to faster inference by [upgrading your user account](https://huggingface.co/pricing#pro) or by [running a space on GPU](https://huggingface.co/docs/hub/spaces-gpus).
93+
You possibly may not have to pay anything for Hugging Face in order to send requests to LLMs. Though, the speed is very slow by default. It may take a long time to allocate free compute resources to your instance, resulting in longer response times. You can switch to faster inference by [upgrading your user account](https://huggingface.co/pricing#pro) or by [running a space on GPU](https://huggingface.co/docs/hub/spaces-gpus).

en/ai/how-to-enable-and-use-ai-features.md

Lines changed: 27 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -2,32 +2,44 @@
22

33
Thank you for checking out JabRef AI features! We believe you can find them useful in your research or brainstorming process.
44

5-
## 1. Locate new entry editor tabs and accept AI Privacy Policy
5+
## 1. Locate and accept the AI Privacy Policy
66

7-
As you run the latest JabRef version, you will find 2 new entry editor tabs ("AI Chat" and "AI Summary"):
7+
1. Run JabRef, open a library, select an entry and open the [entry editor](../advanced/entryeditor/). There you will see tabs that have AI in their name.<br>
88

9-
![New entry editor tabs](../.gitbook/assets/ai-new-entries.png)
9+
<figure><img src="../.gitbook/assets/ai-new-entries.png" alt="AI related entry editor tabs (AI Summary and AI Chat)"><figcaption><p>AI related entry editor tabs</p></figcaption></figure>
10+
2. Open the **AI Chat** or the **AI Summary** tab. The first time you open any of these tabs, JabRef will ask for your permission to accept the Privacy notice. In order to enable all AI features, you need to accept it, by pressing the **I agree** button. If you do not accept it, none of your information will be transmitted to external services.<br>
1011

11-
However, the first time you open these tabs, JabRef will ask your permission for using AI features and accepting AI Privacy Policy:
12+
<figure><img src="../.gitbook/assets/ai-preferences-connection.png" alt="AI privacy notice"><figcaption><p>AI privacy notice<br></p></figcaption></figure>
1213

13-
![AI Privacy Policy](../.gitbook/assets/ai-privacy-policy.png)
14+
In the AI Privacy notice you can find links to Privacy Policies of supported external services and an explanation what data is sent to external services.
1415

15-
In AI Privacy Policy you can find links to Privacy Policies of external services like OpenAI, Mistral AI, etc.
16+
## 2. Attach a file to your entry
1617

17-
In order to enable all AI features, you need to accept this Privacy Policy. If you do not accept it, none of your information will be transmitted to external services.
18+
In order to use the following AI features in the entry editor, you need to [add PDFs to an entry](../collect/add-pdfs-to-an-entry.md):
1819

19-
## 2. Obtain API key
20+
* AI Chat and&#x20;
21+
* AI Summary tabs
2022

21-
After clicking "I agree" button, there is only one crucial step left for using AI features. You need to setup connection to AI provider. Please refer to [AI providers and API keys](ai-providers-and-api-keys.md) page to understand what is an AI provider and how to get an API key.
23+
This in turn requires you to [set a main file directory](../finding-sorting-and-cleaning-entries/filelinks.md#directories-for-files). JabRef supports other AI features that do not require you to attach a file to your entry, such as [using a language model to turn plain reference text into an entry](../collect/newentryfromplaintext.md#llm) and if that's all you need, you can skip this step.
2224

23-
## 3. Enter API key in JabRef
25+
## 3. Connect to an external AI provider
2426

25-
After you got your API key, you need to enter it in JabRef preferences. Open Preferences using menu `File -> Preferences`. Locate tab `AI`:
27+
There is only one crucial step left for using AI features. You need to setup a connection to an external AI provider. With _external_, we mean a provider outside of JabRef, regardless, if that entails connecting to an [AI app on your local device](local-llm.md) or connecting to a remote online service.
2628

27-
![AI preferences](../.gitbook/assets/ai-preferences-connection.png)
29+
While the former may or may not require an API key, online services most definitely will require you to enter one, therefore here is some guidance:
2830

29-
Choose the AI provider you have the API key from and enter the API key (in this order, because JabRef stores several API keys tied to specific AI provider). Additionally you can choose chat model of the AI provider.[AI providers and API keys](ai-providers-and-api-keys.md) page to understand what is an AI provider and how to get an API key.
31+
#### 1. Obtain an API key
3032

31-
Choose the AI provider you have the API key from and enter the API key (in this order, because JabRef stores several API keys tied to specific AI provider). Additionally, you can choose chat model of the AI provider that you want to use in JabRef.
33+
Please look at the [AI providers and API keys](ai-providers-and-api-keys.md) documentation page to understand what is an AI provider and how to get an API key.
3234

33-
Save the preferences and after that you are able to use JabRef's AI features on full power!
35+
#### 2. Enter an API key
36+
37+
After you got your API key, you need to enter it in JabRef's [preferences.md](preferences.md "mention").
38+
39+
1. Open the preferences menu via `File > Preferences`.
40+
2. Locate the `AI` tab.
41+
3. Choose the AI provider you have the API key from and enter the API key (in this order, because JabRef can store several API keys, tied to specific AI providers).
42+
43+
Finally, you can choose the chat model of the AI provider.
44+
45+
Save the preferences and henceforth you are able to use JabRef's AI features as you see fit!

en/ai/local-llm.md

Lines changed: 7 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,12 @@
1-
# Running a local Large Language Model (LLM)
1+
# Running a local language model
22

3-
Notice:
3+
## Hardware Recommendations
44

5-
1. LLMs require a lot of computational power and therefore lots of electricity.
6-
2. Smaller models typically respond qualitatively worse than bigger ones, but they are faster, need less memory and might already be sufficient for your use case.
5+
1. Large Language Models (LLMs) require a lot of computational power and therefore lots of electricity and dedicated hardware. This following advise assumes a small scale project and availability of consumer hardware.
6+
2. Smaller models typically respond qualitatively worse than bigger ones, but they are faster, need less memory and might already be sufficient for your use case (so start out with the small ones and if need be, scale up).
77
3. The size of a model can be measured in number of parameters in its neural network. The "b" in the model name typically stands for **b**illion parameters. It also can be measured in terms of gigabytes required to load the model into your devices RAM/VRAM.
8-
4. The model should always completely fit into VRAM (fast), otherwise layers will be offloaded to RAM (slower) and if it doesn't fit in there either, it will use SSD (abysmally slow).
9-
5. Hardware recommendation for maximize prompt processing and token generation speed: A device with high *bandwidth*. A modern GPU with lots of VRAM will satisfy this requirement best.
10-
8+
4. The model should always completely fit into VRAM (fast), otherwise layers will typically be offloaded to RAM (very slow) and if it doesn't fit in there either, it will use your harddrive, typically a SSD or HDD (abysmally slow).
9+
5. The Hardware recommendation to maximize prompt processing and token generation speed is a device with high _bandwidth_. To date, modern GPU with lots of VRAM will satisfy this requirement best.
1110

1211
## High-level explanation
1312

@@ -25,7 +24,7 @@ Voilà! You can use a local LLM right away in JabRef.
2524
The following steps guide you on how to use `ollama` to download and run local LLMs.
2625

2726
1. Install `ollama` from [their website](https://ollama.com/download)
28-
2. Select a model that you want to run. The `ollama` provides [a large list of models](https://ollama.com/library) to choose from. Some popular models are for instance [qwen3:30b-a3b](https://ollama.com/library/qwen3), [`granite3.1-moe:3b`](https://ollama.com/library/granite3.1-moe), [`devkit/L1-Qwen-1.5B-Max`](https://ollama.com/devkit/L1-Qwen-1.5B-Max), [`mistral:7b`](https://ollama.com/library/mistral) or [`mistral-small3.1:24b`](https://ollama.com/library/mistral-small3.1).
27+
2. Select a model that you want to run. `ollama` provides [a large list of models](https://ollama.com/library) to choose from. Some popular models are for instance [qwen3:30b-a3b](https://ollama.com/library/qwen3), [`granite3.1-moe:3b`](https://ollama.com/library/granite3.1-moe), [`devkit/L1-Qwen-1.5B-Max`](https://ollama.com/devkit/L1-Qwen-1.5B-Max), [`mistral:7b`](https://ollama.com/library/mistral) or [`mistral-small3.1:24b`](https://ollama.com/library/mistral-small3.1).
2928
3. When you have selected your model, type `ollama pull <MODEL>:<PARAMETERS>` in your terminal. `<MODEL>` refers to the model name like `gemma2` or `mistral`, and `<PARAMETERS>` refers to parameters count like `2b` or `9b`.
3029
4. `ollama` will download the model for you
3130
5. After that, you can run ollama serve to start a local web server. This server will accept requests and respond with LLM output. Note: The ollama server may already be running, so do not be alarmed by a cannot bind error. If it is not yet running, use the following command: `ollama run <MODEL>:<PARAMETERS>`
@@ -46,4 +45,3 @@ The following steps guide you on how to use `GPT4All`to download and run local L
4645
4. Set the "AI provider" to "GPT4All"
4746
5. Set the "Chat model" to the name (including the `.gguf`part) of the model you have downloaded in GPT4All.
4847
6. Set the "API base URL" in "Expert Settings" to `http://localhost:4891/v1/chat/completions`.
49-

en/ai/preferences.md

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,7 +36,7 @@ The embedding model transforms a document (or a piece of text) into a vector (an
3636

3737
Different embedding models have varying performance, including accuracy and the speed of computing embeddings. The `_q` at the end of the model name usually denotes _quantized_ (meaning reduced or simplified). These models are faster and smaller than their original counterparts but provide slightly less accuracy.
3838

39-
Currently, only local embedding models are supported. This means you do not need to provide a new API key, as all the processing will be done on your machine.
39+
Currently, only local embedding models are supported. This means you do not need to provide a separate API key for them, as all the processing will be done on your machine.
4040

4141
### Instruction
4242

@@ -74,6 +74,10 @@ The "chunk size" parameter in document splitting refers to the size of segments
7474

7575
These segments are then passed to the AI model for processing. This approach helps optimize performance by breaking down large documents into smaller, more digestible parts, allowing for more efficient handling and analysis by the AI.
7676

77+
{% hint style="warning" %}
78+
The chunk size should not exceed the capabilities of the embedding model, otherwise embeddings may fail to be generated. Users have to set the chunk size of the embedding model to `"max_position_embeddings": *,`. Model makers publish this info usually as part of the config.json (mostly at [https://huggingface.co](https://huggingface.co)). For example [https://huggingface.co/intfloat/multilingual-e5-large/blob/main/config.json](https://huggingface.co/intfloat/multilingual-e5-large/blob/main/config.json)
79+
{% endhint %}
80+
7781
### Document splitter chunk overlap
7882

7983
**Type**: integer

en/ai/troubleshooting.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,13 @@
1-
# Troubleshooting
1+
# AI troubleshooting
22

33
## "Failed to load PyTorch native library" while trying the AI chat
44

5-
If you encounter this error, download the latest [Visual C++ redistributable from Microsoft](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170#latest-microsoft-visual-c-redistributable-version). This installation is only required for AI features in JabRef, all other features can work without it. Also, if multiple installations of CUDA are installed, JabRef's Version needs to be added to the PATH first. For example, on Windows this would be adding `C:\Users\USER\.djl.ai\pytorch\CUDA-VERSION` to the Environment Variables. See [how to edit environment variables on Windows 10 or 11](https://www.howtogeek.com/787217/how-to-edit-environment-variables-on-windows-10-or-11/).
5+
If you encounter this error, download the latest [Visual C++ redistributable from Microsoft](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170#latest-microsoft-visual-c-redistributable-version). This installation is only required for AI features in JabRef, all other features can work without it. Also, if multiple installations of CUDA are installed, JabRef's Version first needs to be added to the PATH. For example, on Windows this would be adding `C:\Users\USER\.djl.ai\pytorch\CUDA-VERSION` to the Environment Variables. See [how to edit environment variables on Windows 10 or 11](https://www.howtogeek.com/787217/how-to-edit-environment-variables-on-windows-10-or-11/).
66

77
If you still have issues, the [DJL documentation](https://docs.djl.ai/master/docs/development/troubleshooting.html#unsatisfiedlinkerror-issue) might be of help.
88

99
## JabRef closed or crashed in the middle of downloading the embedding model
1010

11-
Do not worry! You simply need to delete the embedding model cache.
11+
Do not worry! It could be as simple as only having to delete the embedding model cache.
1212

1313
The name of the folder is `.djl.ai`, and it is located in your home directory.

0 commit comments

Comments
 (0)