Commit 24ceba6

Bug fix: 6012573 (#1131)
### What does this PR do?

Type of change: ?

### Usage

```python
# Add a code snippet demonstrating how to use this
```

### Testing

### Before your PR is "*Ready for review*"

Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md) and your commits are signed (`git commit -s -S`).

Make sure you read and follow the [Security Best Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors) (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).

- Is this change backward compatible?: ✅ / ❌ / N/A
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ / ❌ / N/A
- Did you write any new necessary tests?: ✅ / ❌ / N/A
- Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?: ✅ / ❌ / N/A

### Additional Information

## Summary by CodeRabbit

* **Chores**
  * Standardized the configuration key for model precision.
  * Model loading now defaults to bfloat16 precision instead of float32, aligning configs and runtime behavior.

---------

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
1 parent 610707a commit 24ceba6

3 files changed

Lines changed: 3 additions & 3 deletions

examples/gpt-oss/configs/sft_full.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Model
 model_name_or_path: openai/gpt-oss-20b
 attn_implementation: eager
-torch_dtype: bfloat16
+dtype: bfloat16
 
 # Dataset
 dataset_name: HuggingFaceH4/Multilingual-Thinking

examples/gpt-oss/configs/sft_lora.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Model
 model_name_or_path: openai/gpt-oss-20b
 attn_implementation: eager
-torch_dtype: bfloat16
+dtype: bfloat16
 
 # Dataset
 dataset_name: HuggingFaceH4/Multilingual-Thinking
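
Configs written before this change may still carry the old `torch_dtype` key. The following is a minimal sketch of one way to migrate them, assuming the config has already been loaded into a plain dict. `normalize_dtype_key` is a hypothetical helper for illustration and is not part of this repository.

```python
def normalize_dtype_key(config: dict) -> dict:
    """Return a copy of `config` with the legacy `torch_dtype` key renamed to `dtype`.

    Hypothetical helper: the commit itself only renames the key in the YAML files.
    """
    normalized = dict(config)  # shallow copy so the caller's dict is untouched
    if "torch_dtype" in normalized:
        legacy_value = normalized.pop("torch_dtype")
        # If both keys are present, keep the explicit `dtype` value.
        normalized.setdefault("dtype", legacy_value)
    return normalized


old_config = {
    "model_name_or_path": "openai/gpt-oss-20b",
    "attn_implementation": "eager",
    "torch_dtype": "bfloat16",
}
new_config = normalize_dtype_key(old_config)
```

When both keys are present, the sketch keeps `dtype`. That precedence is an assumption chosen for the example, not behavior defined by this commit.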

examples/gpt-oss/sft.py

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ def main(script_args, training_args, model_args, quant_args):
         "revision": model_args.model_revision,
         "trust_remote_code": model_args.trust_remote_code,
         "attn_implementation": model_args.attn_implementation,
-        "torch_dtype": getattr(model_args, "dtype", "float32"),
+        "torch_dtype": getattr(model_args, "dtype", "bfloat16"),
         "use_cache": not training_args.gradient_checkpointing,
     }
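
The effect of the changed fallback can be checked in isolation. This is a minimal sketch that uses `types.SimpleNamespace` as a stand-in for `model_args` (an assumption; the real object comes from the script's argument parsing), and `resolve_dtype` is a hypothetical wrapper around the `getattr` call in the diff. Note that `getattr` applies the fallback only when the attribute is missing, not when it is present and set to `None`.

```python
from types import SimpleNamespace


def resolve_dtype(model_args, default="bfloat16"):
    """Mirror the lookup in the diff: use `model_args.dtype`, else the default."""
    return getattr(model_args, "dtype", default)


# Stand-ins for parsed model arguments (hypothetical values).
args_with_dtype = SimpleNamespace(dtype="float16")
args_without_dtype = SimpleNamespace()        # no `dtype` attribute at all
args_with_none = SimpleNamespace(dtype=None)  # attribute exists but is None

resolved = [
    resolve_dtype(args_with_dtype),
    resolve_dtype(args_without_dtype),
    resolve_dtype(args_with_none),
]
print(resolved)  # ['float16', 'bfloat16', None]
```

If the argument object always defines `dtype` (possibly as `None`), the fallback never applies, so the effective precision is whatever the config or the downstream loader decides.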
