Commit 9c24e2c
Fix Deepseek transformers model loading (#740)
## What does this PR do?
**Type of change:** Bug fix

**Overview:** For Deepseek models, require the user to pass `trust_remote_code` and use `AutoModelForCausalLM` to load the model.
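As a rough sketch of the behavior described above (the helper names `is_deepseek_ckpt` and `build_load_kwargs` are illustrative, not the repo's actual functions), loading could be guarded like this:

```python
# Hedged sketch: refuse to load Deepseek-family checkpoints unless the
# caller opted into trust_remote_code, since they ship custom modeling code.

def is_deepseek_ckpt(ckpt_path: str) -> bool:
    # Crude path-based heuristic; a real check might inspect config.json instead.
    name = ckpt_path.lower()
    return "deepseek" in name or "kimi" in name

def build_load_kwargs(ckpt_path: str, trust_remote_code: bool) -> dict:
    # Force the user to acknowledge remote code for Deepseek checkpoints.
    if is_deepseek_ckpt(ckpt_path) and not trust_remote_code:
        raise ValueError(
            "Deepseek models require custom modeling code; "
            "re-run with --trust_remote_code"
        )
    return {"trust_remote_code": trust_remote_code}

# The actual load would then dispatch through AutoModelForCausalLM, e.g.:
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained(
#       ckpt_path, **build_load_kwargs(ckpt_path, trust_remote_code)
#   )
```

With this guard in place, a Deepseek path without the flag fails fast with a clear error instead of a confusing loading failure deeper in `transformers`.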
## Testing
```shell
python hf_ptq.py --pyt_ckpt_path <Kimi-K2-Thinking_path> --qformat nvfp4 \
    --export_path <quantized_ckpt> --kv_cache_qformat none --calib_size 64 \
    --trust_remote_code --dataset cnn_dailymail
```
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes/No <!--- If No, explain
why. -->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>

1 parent 68d604d
1 file changed: +6 −5 lines (original lines 352–356 replaced by new lines 352–357; diff content not preserved in this capture)