'vllm-server' verification tasks did not detect failed 'podman run vLLM' #63

@jharriga


I think this logic needs to be revised:

TASK [vllm_server : Wait for vLLM initialization (longer for LLM models)] ******
Pausing for 20 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
[vllm_server : Wait for vLLM initialization (longer for LLM models)]
Waiting for vLLM to download model and initialize (this may take a while for large models)...:
ok: [vllm-server] => {"changed": false, "delta": 20, "echo": true, "rc": 0, "start": "2026-03-23 15:18:40.688415", "stderr": "", "stdout": "Paused for 20.0 seconds", "stop": "2026-03-23 15:19:00.691384", "user_input": ""}

I had a run in which the vLLM container failed to start on my system, but the playbook carried on anyway. The failure only surfaced later as a timeout in the health check on [guidellm-client]:
```
PLAY [Health Check - Wait for vLLM Server] *************************************

TASK [Display health check configuration] **************************************
ok: [guidellm-client] => {
    "msg": [
        "vLLM Server: http://10.26.10.19:8000",
        "Timeout: 300s",
        "Check Interval: 5s"
    ]
}

TASK [Wait for vLLM health endpoint] *******************************************
FAILED - RETRYING: [guidellm-client]: Wait for vLLM health endpoint (60 retries left).
```
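One way the server-side verification could catch this earlier is to ask podman for the container state instead of only pausing for a fixed time. The following is a minimal sketch, not the playbook's actual logic: the container name `vllm-server` and the helper names are assumptions for illustration, and the check relies on `podman inspect --format '{{.State.Status}}'` reporting `running` for a live container.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# Hypothetical container name; the name used by the playbook is not shown in this issue.
CONTAINER_NAME = "vllm-server"

Runner = Callable[[Sequence[str]], Tuple[int, str]]


def run_cmd(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run a command and return (exit code, stripped stdout)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def container_is_running(name: str, runner: Runner = run_cmd) -> bool:
    """Return True only if podman reports the container state as 'running'."""
    rc, out = runner(["podman", "inspect", "--format", "{{.State.Status}}", name])
    # A non-zero exit code means podman could not inspect the container at all.
    return rc == 0 and out == "running"
```

Calling `container_is_running(CONTAINER_NAME)` right after the pause would let the server play fail immediately when the container has exited or was never created, rather than leaving the client to time out.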

In my own scripts I verify vLLM startup by probing the IEserver for the 'model' endpoint. For reference, you can see that technique here: https://github.com/jharriga/Test_LLMs/blob/main/run_tests.sh (line 40 for the function, line 153 for the caller).
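A rough Python sketch of that kind of probe is shown here. It is not a port of `run_tests.sh`: it assumes the server exposes the OpenAI-compatible `/v1/models` route, the function names are hypothetical, and the 300 s timeout and 5 s interval are simply the values from the health check output above. The `still_alive` callable is a placeholder for any check that reports whether the container is still up, so the wait can stop early instead of running out the clock.

```python
import http.client
import json
import time
import urllib.error
import urllib.request
from typing import Callable


def models_endpoint_ready(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/v1/models lists at least one model."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            if resp.status != 200:
                return False
            payload = json.load(resp)
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body: not ready yet.
        return False
    return isinstance(payload, dict) and bool(payload.get("data"))


def wait_until_ready(
    probe: Callable[[], bool],
    still_alive: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe() until it succeeds, the server dies, or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True
        if not still_alive():
            # Fail fast: nothing will ever answer if the container is gone.
            return False
        sleep(interval_s)
    return False
```

For example, `wait_until_ready(lambda: models_endpoint_ready("http://10.26.10.19:8000"), lambda: True)` polls the server from the log above with no liveness check; passing a real liveness callable is what turns a 300 s timeout into an immediate failure.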
