fix(evaluate): Remove ModelPackageConfig from EvaluateBaseModel steps

jam-jee · jam-jee · commit 4f765c2f45aa · 2026-03-17T14:39:39.000-07:00
When evaluate_base_model=True, the EvaluateBaseModel step in both DETERMINISTIC_TEMPLATE and CUSTOM_SCORER_TEMPLATE incorrectly included ModelPackageConfig with SourceModelPackageArn, causing the base model evaluation to load fine-tuned model weights instead of using only the base model from the public hub. This made both evaluations identical, leading users to believe fine-tuning had no effect.

Remove ModelPackageConfig from the EvaluateBaseModel step in both templates so it only uses BaseModelArn from ServerlessJobConfig. The EvaluateCustomModel step retains ModelPackageConfig to correctly load fine-tuned weights. This is consistent with the fix already applied to the LLMAJ_TEMPLATE.

---
X-AI-Prompt: Fix BenchMarkEvaluator evaluate_base_model bug from D406780217
X-AI-Tool: Kiro
sim: https://t.corp.amazon.com/D406780217
diff --git a/sagemaker-train/src/sagemaker/train/evaluate/pipeline_templates.py b/sagemaker-train/src/sagemaker/train/evaluate/pipeline_templates.py
@@ -94,10 +94,6 @@
             "Type": "Training",
             "Arguments": {
                 "RoleArn": "{{ role_arn }}",
-                "ModelPackageConfig": {
-                    "ModelPackageGroupArn": "{{ model_package_group_arn }}",
-                    "SourceModelPackageArn": "{{ source_model_package_arn }}"
-                },
                 "ServerlessJobConfig": {
                     "BaseModelArn": "{{ base_model_arn }}",
                     "AcceptEula": true,
@@ -614,10 +610,6 @@
             "Type": "Training",
             "Arguments": {
                 "RoleArn": "{{ role_arn }}",
-                "ModelPackageConfig": {
-                    "ModelPackageGroupArn": "{{ model_package_group_arn }}",
-                    "SourceModelPackageArn": "{{ source_model_package_arn }}"
-                },
                 "ServerlessJobConfig": {
                     "BaseModelArn": "{{ base_model_arn }}",
                     "AcceptEula": true,
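
The invariant this change establishes (only EvaluateCustomModel carries ModelPackageConfig) could be guarded with a check along the following lines. This is a minimal sketch, not code from the repository: the trimmed-down pipeline definition, the top-level "Steps"/"Name" layout, the placeholder ARNs, and the helper name steps_with_model_package are all assumptions made for illustration. Only the step names and the ModelPackageConfig, ServerlessJobConfig, and BaseModelArn keys come from the commit.

```python
import copy
import json

# Hypothetical, trimmed-down stand-in for a rendered pipeline definition.
# The real templates contain many more fields; the ARNs are placeholders.
RENDERED_PIPELINE = json.loads("""
{
  "Steps": [
    {
      "Name": "EvaluateBaseModel",
      "Type": "Training",
      "Arguments": {
        "RoleArn": "arn:example:role",
        "ServerlessJobConfig": {
          "BaseModelArn": "arn:example:base-model",
          "AcceptEula": true
        }
      }
    },
    {
      "Name": "EvaluateCustomModel",
      "Type": "Training",
      "Arguments": {
        "RoleArn": "arn:example:role",
        "ModelPackageConfig": {
          "ModelPackageGroupArn": "arn:example:group",
          "SourceModelPackageArn": "arn:example:package"
        },
        "ServerlessJobConfig": {
          "BaseModelArn": "arn:example:base-model",
          "AcceptEula": true
        }
      }
    }
  ]
}
""")


def steps_with_model_package(pipeline):
    """Return the names of steps whose Arguments include ModelPackageConfig."""
    return [
        step["Name"]
        for step in pipeline["Steps"]
        if "ModelPackageConfig" in step.get("Arguments", {})
    ]


# After the fix, only the fine-tuned evaluation step references the package.
print(steps_with_model_package(RENDERED_PIPELINE))

# Re-adding the block to the base step reproduces the pre-fix shape,
# which the same helper then reports.
regressed = copy.deepcopy(RENDERED_PIPELINE)
regressed["Steps"][0]["Arguments"]["ModelPackageConfig"] = {
    "SourceModelPackageArn": "arn:example:package"
}
print(steps_with_model_package(regressed))
```

Checking the rendered JSON rather than the template text keeps the check independent of template formatting, under the assumption that the templates render to valid JSON once their placeholders are filled in.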