OpenVINO GenAI tests NPU support and Windows fixes #1660
helena-intel wants to merge 3 commits into huggingface:main
Pull request overview
Updates the OpenVINO GenAI integration tests to improve cross-device stability (notably Windows/GPU) and introduce initial NPU coverage.
Changes:
- Add a pytest-based temp directory/traceback cleanup mechanism to avoid Windows file-handle issues after failures.
- Add initial NPU support by restricting the tested model sets and skipping unsupported test classes.
- Make LLM comparisons more robust on GPU by comparing generated token IDs rather than detokenized text; align VLM generation with chat templates.
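As a toy illustration of why token IDs can be a steadier comparison target than decoded text, the sketch below feeds the same ID sequence through two detokenizers that differ only in whitespace cleanup. The vocabulary and both `detok_*` helpers are hypothetical and are not taken from the PR or from any tokenizer library.

```python
# Hypothetical toy vocabulary, used only for this illustration.
VOCAB = {0: "Hello", 1: ",", 2: " world", 3: " "}


def detok_plain(ids):
    """Join token strings exactly as they are."""
    return "".join(VOCAB[i] for i in ids)


def detok_cleaned(ids):
    """Join token strings, then strip surrounding whitespace."""
    return "".join(VOCAB[i] for i in ids).strip()


reference_ids = [0, 1, 2, 3]  # e.g. IDs from the reference pipeline
candidate_ids = [0, 1, 2, 3]  # e.g. IDs from the pipeline under test

# The generated IDs are identical...
ids_match = reference_ids == candidate_ids
# ...but the decoded strings differ because the detokenizers disagree.
text_match = detok_plain(reference_ids) == detok_cleaned(candidate_ids)
```

Here `ids_match` is true while `text_match` is false, so a text comparison would report a mismatch that has nothing to do with the model output.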
# NPU does not support f32 inference
TEST_CONFIG = {"CACHE_DIR": ""} if OPENVINO_DEVICE == "NPU" else {**F32_CONFIG, "CACHE_DIR": ""}
TEST_CONFIG sets CACHE_DIR to an empty string. In Optimum's OpenVINO integration, CACHE_DIR is treated as an actual directory path when present, so passing "" can lead to an invalid cache path (or unexpected caching behavior) when compiling models. Prefer omitting CACHE_DIR entirely, or set it to a real directory under self.temp_dir if you need deterministic caching behavior in these tests.
Suggested change:
- TEST_CONFIG = {"CACHE_DIR": ""} if OPENVINO_DEVICE == "NPU" else {**F32_CONFIG, "CACHE_DIR": ""}
+ TEST_CONFIG = {} if OPENVINO_DEVICE == "NPU" else {**F32_CONFIG}
CACHE_DIR may default to a particular directory, and setting CACHE_DIR="" prevents that. We do not want model caching to be used in these tests, even if a particular device enables model caching by default.
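A minimal sketch of the intent, assuming `F32_CONFIG` holds an f32 precision hint (its real contents are not shown in this thread) and using a hypothetical helper name `build_test_config`:

```python
# Assumed contents; the real F32_CONFIG lives in the test utilities.
F32_CONFIG = {"INFERENCE_PRECISION_HINT": "f32"}


def build_test_config(device):
    """Return the OpenVINO config used for a test run on `device`.

    NPU does not support f32 inference, so the precision hint is
    dropped there. CACHE_DIR="" is always set so that a device-specific
    default cache directory is never used during testing.
    """
    base = {} if device == "NPU" else dict(F32_CONFIG)
    return {**base, "CACHE_DIR": ""}


npu_config = build_test_config("NPU")
gpu_config = build_test_config("GPU")
```

Keeping `CACHE_DIR` in both branches makes the "no caching" decision explicit instead of relying on whatever default the device plugin applies.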
@helena-intel, please resolve the merge conflict so that we can re-run CI. In the meantime, @popovaan, please take a look at this PR.
eef8826 to 0137597
- Fix TemporaryDirectory issues on Windows
- Compare model output tokens instead of detokenized outputs for LLMs
- Initial NPU support
- Use chat template for VLM test
0137597 to dee6428
- Change supported versions for deepseek and qwen
- ChatGLM issue is caused by NaN in tiny model outputs, tracked by internal ticket. For now, remove chatglm from genai tests. This only affects chatglm, not chatglm4.
e180fc1 to 57eef28
@anatyrova, @regisss, please take a look at this PR.
Update OpenVINO GenAI tests
The solution for the temporary directory looks convoluted, but handling it turned out to be trickier than expected because we also want to delete the directory when a test fails.
I tested GPU and NPU on LNL 258V with Linux and Windows.