Skip to content

fix: [Bug] Pipeline parameters (ParameterInteger, ParameterString) fail in ModelTrain (5504)#37

Closed
aviruthen wants to merge 2 commits intomasterfrom
fix/bug-pipeline-parameters-parameterinteger-5504-2
Closed

fix: [Bug] Pipeline parameters (ParameterInteger, ParameterString) fail in ModelTrain (5504)#37
aviruthen wants to merge 2 commits intomasterfrom
fix/bug-pipeline-parameters-parameterinteger-5504-2

Conversation

@aviruthen
Copy link
Copy Markdown
Owner

Description

Add test coverage for PipelineVariable support in ModelTrainer hyperparameters

Problem

The safe_serialize function in sagemaker-train/src/sagemaker/train/utils.py already correctly handles PipelineVariable objects (returning them as-is), and the _create_training_job_args method properly preserves PipelineVariable instances in hyperparameters. However, there was no test coverage for this specific use case — passing ParameterInteger, ParameterString, or mixed pipeline/plain values as hyperparameter values to ModelTrainer.

Changes

  • Added ParameterInteger and ParameterFloat imports to the test file
  • Added test_hyperparameters_accept_parameter_integer: verifies that ParameterInteger can be used as a hyperparameter value
  • Added test_hyperparameters_accept_parameter_string: verifies that ParameterString can be used as a hyperparameter value
  • Added test_hyperparameters_accept_mixed_pipeline_and_plain_values: verifies that a mix of PipelineVariable objects and plain values (int, str, float) work correctly together in hyperparameters

Testing

All new tests validate that PipelineVariable objects are preserved as-is when passed through ModelTrainer hyperparameters, ensuring they can be properly serialized by safe_serialize during pipeline execution.

Related Issue

Related issue: 5504

Changes Made

The safe_serialize function in sagemaker-train/src/sagemaker/train/utils.py already handles PipelineVariable objects correctly — it has an isinstance(data, PipelineVariable) check that returns the PipelineVariable as-is (line ~182). The serialize function in sagemaker-core/src/sagemaker/core/utils/utils.py also properly handles PipelineVariable by returning it as-is. The _create_training_job_args method in model_trainer.py calls safe_serialize on each hyperparameter value, which correctly preserves PipelineVariable objects. The existing test file test_model_trainer_pipeline_variable.py tests PipelineVariable support for fields like training_image, algorithm_name, training_input_mode, and environment, but does NOT test hyperparameters with PipelineVariable values. The core bug described in the issue has already been fixed in the current codebase, but test coverage for the specific hyperparameters use case is missing.

AI-Generated PR

This PR was automatically generated by the PySDK Issue Agent.

  • Confidence score: 88%
  • Classification: bug
  • SDK version target: V3

Merge Checklist

  • Changes are backward compatible
  • Commit message follows prefix: description format
  • Unit tests added/updated
  • Integration tests added (if applicable)
  • Documentation updated (if applicable)

Copy link
Copy Markdown

@sagemaker-bot sagemaker-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review

This PR adds test coverage for PipelineVariable support in ModelTrainer hyperparameters, addressing issue aws#5504. The tests are well-structured and follow existing patterns in the file. However, there are a few issues: the tests only verify assignment (not serialization behavior), an unused import is added, and one line exceeds the 100-character limit.

from sagemaker.core.helper.session_helper import Session
from sagemaker.core.helper.pipeline_variable import PipelineVariable, StrPipeVar
from sagemaker.core.workflow.parameters import ParameterString
from sagemaker.core.workflow.parameters import ParameterString, ParameterInteger, ParameterFloat
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParameterFloat is imported but never used in any of the new tests. Please remove the unused import to keep the code clean.

Suggested change
from sagemaker.core.workflow.parameters import ParameterString, ParameterInteger, ParameterFloat
from sagemaker.core.workflow.parameters import ParameterString, ParameterInteger

assert trainer.hyperparameters["algorithm"] is param

def test_hyperparameters_accept_mixed_pipeline_and_plain_values(self):
"""ModelTrainer.hyperparameters should accept a mix of PipelineVariable and plain values (GH#5504)."""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line exceeds the 100-character line length limit (PEP 8 / project convention). Please wrap the docstring:

Suggested change
"""ModelTrainer.hyperparameters should accept a mix of PipelineVariable and plain values (GH#5504)."""
"""ModelTrainer.hyperparameters should accept a mix of PipelineVariable and plain values.
Regression test for GH#5504.
"""

@aviruthen
Copy link
Copy Markdown
Owner Author

🤖 Iteration #1 — Review Comments Addressed

Add test coverage for PipelineVariable support in ModelTrainer hyperparameters

Problem

The safe_serialize function in sagemaker-train/src/sagemaker/train/utils.py already correctly handles PipelineVariable objects (returning them as-is), and the _create_training_job_args method properly preserves PipelineVariable instances in hyperparameters. However, there was no test coverage for this specific use case — passing ParameterInteger, ParameterString, or mixed pipeline/plain values as hyperparameter values to ModelTrainer.

Changes

  • Removed unused ParameterFloat import
  • Fixed long docstring line to comply with 100-character line length limit
  • Added test_hyperparameters_accept_parameter_integer: verifies that ParameterInteger can be used as a hyperparameter value
  • Added test_hyperparameters_accept_parameter_string: verifies that ParameterString can be used as a hyperparameter value
  • Added test_hyperparameters_accept_mixed_pipeline_and_plain_values: verifies that a mix of PipelineVariable objects and plain values work correctly together
  • Added TestSafeSerializePipelineVariable test class with tests that verify:
    • safe_serialize preserves ParameterInteger as-is
    • safe_serialize preserves ParameterString as-is
    • safe_serialize still correctly JSON-serializes plain values
    • _create_training_job_args preserves PipelineVariable objects in the resulting hyperparameters dict while serializing plain values

Testing

All new tests validate both the storage and serialization paths for PipelineVariable objects in ModelTrainer hyperparameters, ensuring they survive through safe_serialize and _create_training_job_args for proper pipeline execution.

Comments reviewed: 3
Files modified: sagemaker-train/tests/unit/train/test_model_trainer_pipeline_variable.py

  • sagemaker-train/tests/unit/train/test_model_trainer_pipeline_variable.py: Remove unused ParameterFloat import, fix long docstring, add safe_serialize serialization tests

Copy link
Copy Markdown

@sagemaker-bot sagemaker-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review

This PR adds test coverage for PipelineVariable support in ModelTrainer hyperparameters, addressing GH#5504. The tests are well-structured and cover the key scenarios (ParameterInteger, ParameterString, mixed values, safe_serialize behavior, and _create_training_job_args preservation). A few minor issues worth addressing.

)
assert trainer.hyperparameters["max_depth"] is param_int
assert trainer.hyperparameters["objective"] is param_str
assert trainer.hyperparameters["eta"] == 0.1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Importing safe_serialize inside the test method body is unconventional. Since this is used in multiple tests within TestSafeSerializePipelineVariable, consider importing it at the module level (with the other imports at the top of the file) to keep imports consistent and avoid repeated inline imports.

# At the top of the file, with other imports:
from sagemaker.train.utils import safe_serialize

"""Test that safe_serialize correctly preserves PipelineVariable objects (GH#5504)."""

def test_safe_serialize_preserves_parameter_integer(self):
"""safe_serialize should return PipelineVariable as-is, not stringify it."""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same inline import issue — safe_serialize is imported again here. Moving it to the module-level imports would clean this up.

result = safe_serialize(param)
assert result is param
assert isinstance(result, PipelineVariable)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And again here — third inline import of safe_serialize.

@@ -26,7 +26,7 @@

from sagemaker.core.helper.session_helper import Session
from sagemaker.core.helper.pipeline_variable import PipelineVariable, StrPipeVar
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: ParameterInteger is imported but ParameterFloat is mentioned in the PR description as being imported. The PR description says "Added ParameterInteger and ParameterFloat imports" but only ParameterInteger is actually imported here. The description is slightly misleading, though the code itself is correct since ParameterFloat isn't used in any test.

assert isinstance(result, PipelineVariable)

def test_safe_serialize_still_serializes_plain_values(self):
"""safe_serialize should still JSON-serialize plain values."""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test_create_training_job_args_preserves_pipeline_hyperparameters test calls trainer._create_training_job_args() which is a private method. This is acceptable for a regression test verifying internal behavior, but be aware this test will break if the private method signature changes. Consider adding a brief comment noting this tests internal implementation details intentionally.

Also, does _create_training_job_args() require a session or make any API calls? If so, this test may need mocking to avoid failures in CI. If it works without mocking (as implied by the existing test patterns), then this is fine.

@aviruthen aviruthen closed this Mar 30, 2026
@aviruthen aviruthen deleted the fix/bug-pipeline-parameters-parameterinteger-5504-2 branch March 30, 2026 22:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants