
[wwb] move to dtype auto #3793

Draft
sbalandi wants to merge 1 commit into openvinotoolkit:master from sbalandi:dtype_auto

Contributor

@sbalandi sbalandi commented May 1, 2026

Description

CVS-###

Fixes #(issue)

Checklist:

  • This PR follows GenAI Contributing guidelines.
  • Tests have been updated or added to cover the new code.
  • This PR fully addresses the ticket.
  • I have made corresponding changes to the documentation.

Copilot AI review requested due to automatic review settings May 1, 2026 13:59
@github-actions (bot) added the "category: WWB" label (PR changes WWB) May 1, 2026
Contributor

Copilot AI left a comment


Pull request overview

Updates WWB’s default HuggingFace model loading dtype behavior to rely on Transformers’ automatic dtype selection rather than forcing fp32.

Changes:

  • Switch PYTORCH_MODEL_DTYPE_KWARG from torch.float32 to "auto" for HF model loading in WWB.

Comment on lines +45 to 47
PYTORCH_MODEL_DTYPE_KWARG = {"torch_dtype": "auto"}



Copilot AI May 1, 2026


Setting torch_dtype to "auto" for all HuggingFace loads can break CPU execution: many models advertise fp16/bf16 in config, and loading them on CPU frequently leads to runtime errors (e.g., Half ops not implemented) or unexpected slowdowns. Consider keeping torch.float32 when device is CPU / CUDA is unavailable, and only using "auto" on GPU (or make dtype configurable via CLI).

Suggested change
- PYTORCH_MODEL_DTYPE_KWARG = {"torch_dtype": "auto"}
+ def _get_pytorch_model_dtype_kwarg():
+     if torch.cuda.is_available():
+         return {"torch_dtype": "auto"}
+     return {"torch_dtype": torch.float32}
+
+ PYTORCH_MODEL_DTYPE_KWARG = _get_pytorch_model_dtype_kwarg()
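
The comment above also mentions making the dtype configurable via CLI. A minimal sketch of that option follows. The `--torch-dtype` flag and the helper names `choose_dtype_name`, `build_dtype_kwarg`, and `parse_args` are hypothetical and do not exist in WWB today; the fallback rule simply mirrors the suggested change.

```python
import argparse

# Hypothetical set of accepted values; WWB does not define this flag today.
DTYPE_CHOICES = ("auto", "float32", "float16", "bfloat16")


def choose_dtype_name(requested, cuda_available):
    """Pick the dtype name to load with.

    An explicit request wins. Otherwise use "auto" when CUDA is available
    and "float32" on CPU, as in the suggested change.
    """
    if requested is not None:
        return requested
    return "auto" if cuda_available else "float32"


def build_dtype_kwarg(dtype_name):
    """Translate a dtype name into the kwarg dict passed to from_pretrained()."""
    if dtype_name == "auto":
        return {"torch_dtype": "auto"}
    import torch  # imported lazily so argument parsing works without torch

    return {"torch_dtype": getattr(torch, dtype_name)}


def parse_args(argv=None):
    """Parse the (hypothetical) --torch-dtype flag; None means 'not set'."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--torch-dtype", choices=DTYPE_CHOICES, default=None)
    return parser.parse_args(argv)
```

A caller would combine these as `build_dtype_kwarg(choose_dtype_name(parse_args().torch_dtype, torch.cuda.is_available()))`. Keeping the decision in a function of plain strings makes the CPU fallback testable without loading a model.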

logger = logging.getLogger(__name__)

- PYTORCH_MODEL_DTYPE_KWARG = {"torch_dtype": torch.float32}
+ PYTORCH_MODEL_DTYPE_KWARG = {"torch_dtype": "auto"}

Copilot AI May 1, 2026


PR description still contains placeholders (e.g., CVS-###, Fixes #(issue)) and the checklist is not filled out. Please update the PR description to match the repository template before merging so reviewers can confirm scope, tests, and docs impact.

logger = logging.getLogger(__name__)

- PYTORCH_MODEL_DTYPE_KWARG = {"torch_dtype": torch.float32}
+ PYTORCH_MODEL_DTYPE_KWARG = {"torch_dtype": "auto"}

Copilot AI May 1, 2026


This change alters the default dtype selection for HF models, but the WWB test suite doesn’t appear to cover the new behavior (e.g., loading a model whose config defaults to fp16/bf16 on CPU). Please add/update a WWB test to exercise HF loading with a non-fp32 default dtype (ideally using a tiny-random model) so regressions are caught.

Copilot generated this review using guidance from repository custom instructions.
Collaborator

@rkazants rkazants left a comment


Before moving to the auto dtype, please make sure that it causes no problems on CPU.
I expect multiple JIRA tickets to be assigned to optimum-intel after this change. However, the problem may be in the CPU plugin.
So I would ask you to do validation runs and identify all existing issues on CPU, so that we know about them in advance.
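
One way to scope such validation runs is to list the models whose `config.json` declares a non-fp32 dtype, since those are the ones whose load dtype would change under `"auto"`. The sketch below is illustrative only: the helper names are hypothetical, it assumes each model sits in a local directory with a `config.json`, and it checks only the top-level `torch_dtype` key (plus `dtype`, which newer Transformers releases may write instead). Models without a declared dtype are skipped, even though `"auto"` may still pick a non-fp32 dtype for them from the checkpoint weights.

```python
import json
from pathlib import Path


def declared_dtype(model_dir):
    """Return the dtype string declared in config.json, or None if absent."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    # Assumption: the dtype lives at the top level under one of these keys.
    return config.get("torch_dtype") or config.get("dtype")


def models_affected_by_auto(model_dirs):
    """List (path, dtype) pairs whose declared dtype is not float32."""
    affected = []
    for model_dir in model_dirs:
        dtype = declared_dtype(model_dir)
        if dtype not in (None, "float32"):
            affected.append((str(model_dir), dtype))
    return affected
```

The resulting list could serve as the starting set for CPU validation, with undeclared models checked separately.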

Contributor Author

sbalandi commented May 5, 2026

> Before moving to the auto dtype, please make sure that it causes no problems on CPU. I expect multiple JIRA tickets to be assigned to optimum-intel after this change. However, the problem may be in the CPU plugin. So I would ask you to do validation runs and identify all existing issues on CPU, so that we know about them in advance.

Yes, this PR was created for validation purposes.
