Allowing other LLMs and custom prompts in evaluation (specifically, deepeval) #1872

**Is your feature request related to a problem? Please describe.**
I cannot use a (small) local LLM or customized prompts to evaluate the RAG pipeline output. Smaller models (e.g., MiniCheck) have become about as good as GPT models for this kind of evaluation.

**Describe the solution you'd like**
I would like to use a small local LLM, such as those available via Ollama, to evaluate the RAG pipeline output. At this time, it seems that only GPT models are supported. There also does not seem to be a way to customize the prompts used in haystack-deepeval.
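
For illustration, here is a minimal sketch of the kind of evaluator this request describes, written without haystack-deepeval or deepeval. It assumes an Ollama server on its default local port and its `/api/generate` endpoint. The model name `llama3`, the prompt template, and the `VERDICT:` output convention are all hypothetical choices made for the example, not part of any existing integration.

```python
import json
import urllib.request
from typing import Callable, Sequence

# Hypothetical prompt template: the wording and the VERDICT convention are
# illustrative assumptions, not a prompt used by deepeval or Haystack.
DEFAULT_TEMPLATE = (
    "You are grading whether an answer is supported by the context.\n"
    "Context:\n{context}\n\n"
    "Answer:\n{answer}\n\n"
    "Think step by step, then end with a line 'VERDICT: yes' or 'VERDICT: no'."
)


def ollama_generate(prompt: str, model: str = "llama3",
                    host: str = "http://localhost:11434") -> str:
    """Send one non-streaming generation request to a local Ollama server."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def judge_faithfulness(answer: str, contexts: Sequence[str],
                       generate: Callable[[str], str] = ollama_generate,
                       template: str = DEFAULT_TEMPLATE) -> bool:
    """Fill the prompt template, query the model, and parse its verdict."""
    prompt = template.format(context="\n---\n".join(contexts), answer=answer)
    reply = generate(prompt)
    # Use the last VERDICT line so earlier reasoning text is ignored.
    verdicts = [line.strip() for line in reply.splitlines()
                if line.strip().upper().startswith("VERDICT:")]
    if not verdicts:
        raise ValueError("model reply contained no VERDICT line")
    return verdicts[-1].split(":", 1)[1].strip().lower() == "yes"
```

Both the model call (`generate`) and the prompt (`template`) are plain parameters here, which is the flexibility the request asks the integration to expose: swapping the judge model or trying a chain-of-thought prompt would not require changing the evaluator itself.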

**Describe alternatives you've considered**
Use deepeval "offline", i.e., save the questions, contexts (chunks), and answers, then run deepeval on them locally. This is not very convenient, since I would like to run the evaluation while fine-tuning the model.
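
The offline workaround could look roughly like the sketch below, which writes each question, its retrieved contexts, and the generated answer to a JSON Lines file for a separate evaluation run. The file layout and the field names (`question`, `contexts`, `answer`) are assumptions made for this example.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List


def save_records(path: Path, records: Iterable[Dict]) -> None:
    """Write one JSON object per line (JSON Lines)."""
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_records(path: Path) -> List[Dict]:
    """Read the records back, skipping blank lines."""
    with path.open("r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Every evaluation then needs an export step, a separate process, and an import step, which is what makes this awkward to repeat at each checkpoint of a fine-tuning run.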

**Additional context**
The ability to use deepeval to evaluate a model during fine-tuning is very useful. It would also be good to be able to customize the prompt, since it looks like chain-of-thought (CoT) or other prompting techniques can improve evaluation outputs.