Error 404 on input detections when using an openai server other than vLLM

## Describe the bug

When using either the completions or chat completions endpoint with an `openai` server that is not a vLLM instance (e.g. Ollama), a request with input detections returns 404:

Request
```
curl --location 'http://localhost:8033/api/v2/text/completions-detection' \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen3:0.6b",
    "prompt": "I hate aliens",
    "detectors": {
        "input": {
            "hap": {}
        }
    }
}'
```

Response
```
{"code":404,"details":"tokenize request failed for `qwen3:0.6b`: unknown error occurred"}
```

This happens because `/tokenize` is invoked to gather `usage` data when there are input detections (e.g. [here](https://github.com/foundation-model-stack/fms-guardrails-orchestrator/blob/main/src/orchestrator/handlers/completions_detection/unary.rs#L125-L135)). However, this endpoint is not part of the OpenAI API; it is specific to vLLM.

## Discussion

I'm wondering what would be the best approach in this scenario. I've thought of two ideas, but both have limitations:

1. **When the `/tokenize` endpoint returns 404, return `usage` as empty.** - the limitation of this idea is that the missing data might not be obvious to the caller unless a warning is provided (and I think there was an ongoing discussion about deprecating the `warnings` field in the orchestrator response).
2. **Create an additional config to accept `404` responses from `/tokenize`.** - this would behave the same as the previous option, except that it would only apply when the additional config parameter is set to allow such responses. The drawback here is that the orchestrator would gain an extra config option.
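
To make the two options easier to compare, here is a minimal sketch of the fallback logic. Every name in it (`TokenizeError`, `Usage`, `TokenizeConfig`, `allow_not_found`, `usage_from_tokenize`) is hypothetical and chosen for illustration only; none of them is taken from the orchestrator's code. Option 1 corresponds to always behaving as if `allow_not_found` were `true`; option 2 makes it opt-in.

```rust
/// Hypothetical stand-in for a failed `/tokenize` call (not the orchestrator's real error type).
#[derive(Debug, PartialEq)]
struct TokenizeError {
    /// HTTP status code returned by the server.
    status: u16,
    /// Error message returned by the server.
    message: String,
}

/// Hypothetical usage summary; `None` means the token count is unavailable.
#[derive(Debug, Default, PartialEq)]
struct Usage {
    prompt_tokens: Option<u32>,
}

/// Hypothetical config flag described in option 2.
struct TokenizeConfig {
    /// When true, a 404 from `/tokenize` yields empty usage instead of an error.
    allow_not_found: bool,
}

/// Converts the outcome of a tokenize call into usage data.
/// Only a 404 is tolerated, and only when the config allows it;
/// every other error is propagated unchanged.
fn usage_from_tokenize(
    result: Result<u32, TokenizeError>,
    config: &TokenizeConfig,
) -> Result<Usage, TokenizeError> {
    match result {
        Ok(count) => Ok(Usage { prompt_tokens: Some(count) }),
        Err(TokenizeError { status: 404, .. }) if config.allow_not_found => Ok(Usage::default()),
        Err(other) => Err(other),
    }
}
```

With `allow_not_found: false` the behavior matches what is reported above (the 404 is surfaced to the caller), so existing vLLM deployments would be unaffected.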
