Can't use Hugging Face Inference API (serverless) due to hardcoded /generate path #849

Spring AI supports Hugging Face Inference Endpoints. However, this does not work with the 'serverless' version of the Inference API because a hardcoded '/generate' subpath is used.

**Bug description**
Configure Hugging Face with a serverless inference endpoint, such as:
```spring.ai.huggingface.chat.url=https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct```

This will result in an exception with the following error:
```404 Not Found: "{"error":"Model meta-llama/Meta-Llama-3-8B-Instruct/generate does not exist"}"```

This is because the chat model calls the `generate` method (which makes the generated OpenAPI client call `/generate`), whereas it seems it should call the `compatGenerate` method, which invokes the endpoint at the `/` path.

https://github.com/spring-projects/spring-ai/blob/v1.0.0-M1/models/spring-ai-huggingface/src/main/java/org/springframework/ai/huggingface/HuggingfaceChatModel.java#L97
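To illustrate where the model name in the 404 message comes from, here is a minimal sketch. It assumes the client simply appends the operation path to the configured base URL, and that the serverless API treats everything after `/models/` as the model id. The class and method names are made up for this example and are not part of spring-ai.

```java
import java.net.URI;

// Illustration only: these helpers are hypothetical and not spring-ai code.
class GeneratePathDemo {

    // Assumption: the client appends the operation path to the configured base URL.
    static URI operationUri(String baseUrl, String operationPath) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + operationPath);
    }

    // Assumption: the serverless API reads everything after "/models/" as the model id.
    static String modelIdSeenByServer(URI uri) {
        String marker = "/models/";
        String path = uri.getPath();
        return path.substring(path.indexOf(marker) + marker.length());
    }

    public static void main(String[] args) {
        String base = "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct";

        // With the hardcoded "/generate" suffix, the suffix becomes part of the model id.
        System.out.println(modelIdSeenByServer(operationUri(base, "/generate")));

        // With no suffix (the root path), the model id is the one that was configured.
        System.out.println(modelIdSeenByServer(operationUri(base, "")));
    }
}
```

Under these assumptions, the first call yields the same `meta-llama/Meta-Llama-3-8B-Instruct/generate` name that appears in the error above.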

**Environment**
Spring AI 1.0.0-M1

**Expected behavior**
I would expect this library to work with the serverless version of the inference endpoints (much cheaper 😅). If the path has to differ between the two variants, it should be configurable.
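One possible shape for a configurable path is sketched below. This is only an idea, not existing spring-ai behavior: the `resolve` helper and the notion of a separate "generate path" setting are hypothetical, and `https://my-endpoint.example.com` is a placeholder for a dedicated endpoint URL.

```java
import java.net.URI;

// Hypothetical sketch: spring-ai does not currently expose a configurable generate path.
class ConfigurableGeneratePath {

    // Joins the configured URL with an optional sub-path.
    // An empty, null, or "/" path means "call the configured URL as-is".
    static URI resolve(String url, String generatePath) {
        String base = url.replaceAll("/+$", "");            // drop trailing slashes
        String path = generatePath == null
                ? ""
                : generatePath.replaceAll("^/+|/+$", "");   // drop leading/trailing slashes
        return URI.create(path.isEmpty() ? base : base + "/" + path);
    }

    public static void main(String[] args) {
        // Dedicated endpoint (placeholder URL): keep the current "/generate" behavior.
        System.out.println(resolve("https://my-endpoint.example.com", "/generate"));

        // Serverless endpoint: use the root path so the model URL is left untouched.
        System.out.println(resolve(
                "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct", "/"));
    }
}
```

Keeping `/generate` as the default would leave existing setups unchanged, while serverless users could opt into the root path.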


