* fix(vllm): implement shared backend to prevent GPU OOM errors
- Add session-scoped shared_vllm_backend fixture using Granite 4 Micro
- Update test_vllm.py and test_vllm_tools.py to use shared backend
- Fall back to module-scoped backends when --isolate-heavy flag is set
- Both modules now use consistent Granite 4 Micro model
- Enhance CUDA OOM error message with actionable solutions
- Maintains backward compatibility with existing isolation mechanism
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* reduce vllm GPU allocation for tests
* implement backend test grouping via reordering
* add gpu cleanup between backend groups
* delay vllm backend creation until after openai vllm group
* adding explicit served model name for vllm openai test
* fix: rag intrinsics are not for the hybrid model (I think)
* testing a fix in tests for all the gpu issues
* more gpu cleaning
* adding docs tooling to mypy exclude
* removing kv cache also from GPU in cleanup for tests
* moving test order around and also fixing a fixture bug
* rolling back some changes from exclusive process
* some changes to the error message in vllm and also conftest cleaning
* adding an end-to-end script for tests with ollama
* adding a port finder (just in case)
* adding direct download of ollama binary from github
* warm starting ollama
* adding cuda paths for ollama
* some extra checks for vllm and teardown
* making group by backend default
* making the script executable
* test: remove heavy ram pytest marks added in #623
Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
* ruff formatting
* small changes to script and adding cleanup to guardian and core
* making log dir easier to set
* increasing ollama startup to 2 mins
* adding pytest-json-report
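
For readers unfamiliar with the "backend test grouping via reordering" item, here is a minimal sketch of the idea, not the code from this commit. It reorders test IDs so that tests sharing a backend run back to back, which would let one heavy backend be torn down and the GPU cleaned before the next one starts. The names `KNOWN_BACKENDS`, `backend_of`, and `group_by_backend` are hypothetical, and inferring the backend from the test ID is an assumption made for illustration; in a real suite this logic would typically live in a `pytest_collection_modifyitems` hook in conftest.py.

```python
from collections import OrderedDict

# Hypothetical backend names, used only for this illustration.
KNOWN_BACKENDS = ("ollama", "vllm", "huggingface")


def backend_of(test_id):
    """Guess the backend from the test ID (an assumption, not the repo's rule)."""
    for name in KNOWN_BACKENDS:
        if name in test_id:
            return name
    return "other"


def group_by_backend(test_ids):
    """Stable reorder: tests that share a backend become contiguous.

    Groups appear in the order their backend is first seen, and tests keep
    their original relative order inside each group.
    """
    groups = OrderedDict()
    for test_id in test_ids:
        groups.setdefault(backend_of(test_id), []).append(test_id)
    return [test_id for group in groups.values() for test_id in group]


if __name__ == "__main__":
    collected = [
        "test/test_vllm.py::a",
        "test/test_ollama.py::b",
        "test/test_vllm_tools.py::c",
        "test/test_core.py::d",
        "test/test_ollama.py::e",
    ]
    for test_id in group_by_backend(collected):
        print(test_id)
```

Keeping the reorder stable means the only observable change is which tests are adjacent, so a cleanup step can be triggered whenever the backend of the next test differs from the current one.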
---------
Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: avinash.bala@us.ibm.com;0J8455897;AVINASH BALAKRISHNAN <avinashbala@p5-r03-n2.bluevela.rmf.ibm.com>
Co-authored-by: Alex Bozarth <ajbozart@us.ibm.com>
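
The "port finder" item above can be illustrated with a common standard-library pattern. This is a sketch under assumptions, not the script added in this commit: the function name `find_free_port` is hypothetical, and the commit's own script may be written differently (for example, in shell).

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on the given host.

    Binding to port 0 lets the kernel pick a free ephemeral port. The socket
    is closed before returning, so another process could claim the port in
    the meantime; callers should start their server promptly and be ready
    to retry.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

A test harness could pass the returned port to a locally started server instead of hard-coding one, which avoids collisions with a server that is already running on the default port.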