Skip to content

fix: [Bug] Pipeline parameters (ParameterInteger, ParameterString) fail in ModelTrain (5504)#36

Closed
aviruthen wants to merge 2 commits intomasterfrom
fix/bug-pipeline-parameters-parameterinteger-5504-2
Closed

fix: [Bug] Pipeline parameters (ParameterInteger, ParameterString) fail in ModelTrain (5504)#36
aviruthen wants to merge 2 commits intomasterfrom
fix/bug-pipeline-parameters-parameterinteger-5504-2

Conversation

@aviruthen
Copy link
Copy Markdown
Owner

Description

Add tests for PipelineVariable support in ModelTrainer hyperparameters

The safe_serialize function in sagemaker-train/src/sagemaker/train/utils.py already correctly handles PipelineVariable objects (including ParameterInteger, ParameterString, ParameterFloat) by checking isinstance(data, PipelineVariable) and returning the object as-is, avoiding the TypeError from json.dumps().

The downstream serialize() function in sagemaker-core also handles PipelineVariable objects correctly by returning them as-is.

This PR adds comprehensive test coverage for the exact bug scenario described in the issue:

  • TestSafeSerializePipelineVariable: Direct unit tests for safe_serialize() with ParameterInteger, ParameterString, ParameterFloat, and plain values
  • TestModelTrainerHyperparametersPipelineVariable: Integration tests verifying that ModelTrainer accepts PipelineVariable objects in hyperparameters and that _create_training_job_args() preserves them correctly through safe_serialize()

Testing

  • All new tests pass and verify the fix is working correctly
  • Existing tests continue to pass (regression coverage maintained)

Related Issue

Related issue: 5504

Changes Made

The issue described in aws#5504 has already been fixed in the current codebase. The safe_serialize function in sagemaker-train/src/sagemaker/train/utils.py already handles PipelineVariable objects correctly. Specifically:

  1. safe_serialize() (lines 179-201) has an explicit elif isinstance(data, PipelineVariable): return data branch that returns PipelineVariable objects as-is, without attempting JSON serialization.

  2. The import at the top of the file (from sagemaker.core.workflow.parameters import PipelineVariable) correctly imports the base PipelineVariable class that ParameterInteger, ParameterString, etc. all inherit from.

  3. The serialize() function in sagemaker-core/src/sagemaker/core/utils/utils.py also handles PipelineVariable objects by returning them as-is, so downstream serialization in _create_training_job_args() (when using PipelineSession) also works correctly.

  4. The _create_training_job_args() method in model_trainer.py iterates over hyperparameters and calls safe_serialize(value) for each value. When the session is a PipelineSession, the training request goes through serialize() which preserves PipelineVariable objects for proper pipeline JSON rendering.

Additionally, the existing test file test_model_trainer_pipeline_variable.py already contains tests for PipelineVariable acceptance in various ModelTrainer fields (training_image, algorithm_name, training_input_mode, environment), confirming the pattern is established.

AI-Generated PR

This PR was automatically generated by the PySDK Issue Agent.

  • Confidence score: 95%
  • Classification: bug
  • SDK version target: V3

Merge Checklist

  • Changes are backward compatible
  • Commit message follows prefix: description format
  • Unit tests added/updated
  • Integration tests added (if applicable)
  • Documentation updated (if applicable)

Copy link
Copy Markdown

@sagemaker-bot sagemaker-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review

This PR adds test coverage for PipelineVariable support in ModelTrainer hyperparameters, verifying that the existing safe_serialize function correctly handles ParameterInteger, ParameterString, and ParameterFloat objects. The tests are well-structured and follow SDK conventions. A few minor improvements could be made.

"""ModelTrainer hyperparameters should accept ParameterInteger (GH#5504).

This is the exact bug scenario: ParameterInteger in hyperparameters
caused TypeError in safe_serialize before the fix.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test name includes via_safe_serialize but it's only testing that the ModelTrainer constructor stores the hyperparameter — it doesn't actually exercise safe_serialize. The safe_serialize call happens in _create_training_job_args(), which is tested separately below. Consider renaming to something like test_hyperparameters_accept_parameter_integer to avoid implying safe_serialize is being tested here.


def test_hyperparameters_accept_parameter_string_via_safe_serialize(self):
"""ModelTrainer hyperparameters should accept ParameterString (GH#5504)."""
objective = ParameterString(name="Objective", default_value="reg:squarederror")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same naming concern as above — via_safe_serialize is misleading since this test only checks constructor assignment, not serialization. Consider test_hyperparameters_accept_parameter_string.

"""
max_depth = ParameterInteger(name="MaxDepth", default_value=5)
trainer = ModelTrainer(
training_image=DEFAULT_IMAGE,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good test — this is the most valuable one in the PR as it actually exercises the code path (_create_training_job_argssafe_serialize) that was broken in the original bug report. Consider also asserting that args["hyper_parameters"]["max_depth"] is an instance of PipelineVariable (in addition to the identity check) for extra clarity:

assert isinstance(args["hyper_parameters"]["max_depth"], PipelineVariable)

Repository owner deleted a comment from sagemaker-bot Mar 30, 2026
@aviruthen
Copy link
Copy Markdown
Owner Author

🤖 Iteration #1 — Review Comments Addressed

Add tests for PipelineVariable support in ModelTrainer hyperparameters

The safe_serialize function in sagemaker-train/src/sagemaker/train/utils.py already correctly handles PipelineVariable objects (including ParameterInteger, ParameterString, ParameterFloat) by checking isinstance(data, PipelineVariable) and returning the object as-is, avoiding the TypeError from json.dumps().

This PR adds comprehensive test coverage for the exact bug scenario described in the issue:

  • TestSafeSerializePipelineVariable: Direct unit tests for safe_serialize() with ParameterInteger, ParameterString, ParameterFloat, and plain values
  • TestModelTrainerHyperparametersPipelineVariable: Tests verifying that ModelTrainer accepts PipelineVariable objects in hyperparameters and that _create_training_job_args() preserves them correctly through safe_serialize()

Changes from review feedback

  • Renamed `test_hyperparameters_accept_parameter_integer_via_safe_

Comments reviewed: 3
Files modified: sagemaker-train/tests/unit/train/test_model_trainer_pipeline_variable.py

  • sagemaker-train/tests/unit/train/test_model_trainer_pipeline_variable.py: Rename misleading test methods and add isinstance assertion per reviewer feedback

@aviruthen aviruthen closed this Mar 30, 2026
@aviruthen aviruthen deleted the fix/bug-pipeline-parameters-parameterinteger-5504-2 branch March 30, 2026 21:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants