Conversation
…il in ModelTrain (5504)
sagemaker-bot
left a comment
There was a problem hiding this comment.
🤖 AI Code Review
This PR adds test coverage for PipelineVariable support in ModelTrainer hyperparameters, addressing issue aws#5504. The tests are well-structured and follow existing patterns in the file. However, there are a few issues: the tests only verify assignment (not serialization behavior), an unused import is added, and one line exceeds the 100-character limit.
| from sagemaker.core.helper.session_helper import Session | ||
| from sagemaker.core.helper.pipeline_variable import PipelineVariable, StrPipeVar | ||
| from sagemaker.core.workflow.parameters import ParameterString | ||
| from sagemaker.core.workflow.parameters import ParameterString, ParameterInteger, ParameterFloat |
There was a problem hiding this comment.
ParameterFloat is imported but never used in any of the new tests. Please remove the unused import to keep the code clean.
| from sagemaker.core.workflow.parameters import ParameterString, ParameterInteger, ParameterFloat | |
| from sagemaker.core.workflow.parameters import ParameterString, ParameterInteger |
| assert trainer.hyperparameters["algorithm"] is param | ||
|
|
||
| def test_hyperparameters_accept_mixed_pipeline_and_plain_values(self): | ||
| """ModelTrainer.hyperparameters should accept a mix of PipelineVariable and plain values (GH#5504).""" |
There was a problem hiding this comment.
This line exceeds the 100-character line length limit (PEP 8 / project convention). Please wrap the docstring:
| """ModelTrainer.hyperparameters should accept a mix of PipelineVariable and plain values (GH#5504).""" | |
| """ModelTrainer.hyperparameters should accept a mix of PipelineVariable and plain values. | |
| Regression test for GH#5504. | |
| """ |
🤖 Iteration #1 — Review Comments AddressedAdd test coverage for PipelineVariable support in ModelTrainer hyperparametersProblemThe Changes
TestingAll new tests validate both the storage and serialization paths for Comments reviewed: 3
|
sagemaker-bot
left a comment
There was a problem hiding this comment.
🤖 AI Code Review
This PR adds test coverage for PipelineVariable support in ModelTrainer hyperparameters, addressing GH#5504. The tests are well-structured and cover the key scenarios (ParameterInteger, ParameterString, mixed values, safe_serialize behavior, and _create_training_job_args preservation). A few minor issues worth addressing.
| ) | ||
| assert trainer.hyperparameters["max_depth"] is param_int | ||
| assert trainer.hyperparameters["objective"] is param_str | ||
| assert trainer.hyperparameters["eta"] == 0.1 |
There was a problem hiding this comment.
Importing safe_serialize inside the test method body is unconventional. Since this is used in multiple tests within TestSafeSerializePipelineVariable, consider importing it at the module level (with the other imports at the top of the file) to keep imports consistent and avoid repeated inline imports.
# At the top of the file, with other imports:
from sagemaker.train.utils import safe_serialize| """Test that safe_serialize correctly preserves PipelineVariable objects (GH#5504).""" | ||
|
|
||
| def test_safe_serialize_preserves_parameter_integer(self): | ||
| """safe_serialize should return PipelineVariable as-is, not stringify it.""" |
There was a problem hiding this comment.
Same inline import issue — safe_serialize is imported again here. Moving it to the module-level imports would clean this up.
| result = safe_serialize(param) | ||
| assert result is param | ||
| assert isinstance(result, PipelineVariable) | ||
|
|
There was a problem hiding this comment.
And again here — third inline import of safe_serialize.
| @@ -26,7 +26,7 @@ | |||
|
|
|||
| from sagemaker.core.helper.session_helper import Session | |||
| from sagemaker.core.helper.pipeline_variable import PipelineVariable, StrPipeVar | |||
There was a problem hiding this comment.
Nit: ParameterInteger is imported but ParameterFloat is mentioned in the PR description as being imported. The PR description says "Added ParameterInteger and ParameterFloat imports" but only ParameterInteger is actually imported here. The description is slightly misleading, though the code itself is correct since ParameterFloat isn't used in any test.
| assert isinstance(result, PipelineVariable) | ||
|
|
||
| def test_safe_serialize_still_serializes_plain_values(self): | ||
| """safe_serialize should still JSON-serialize plain values.""" |
There was a problem hiding this comment.
The test_create_training_job_args_preserves_pipeline_hyperparameters test calls trainer._create_training_job_args() which is a private method. This is acceptable for a regression test verifying internal behavior, but be aware this test will break if the private method signature changes. Consider adding a brief comment noting this tests internal implementation details intentionally.
Also, does _create_training_job_args() require a session or make any API calls? If so, this test may need mocking to avoid failures in CI. If it works without mocking (as implied by the existing test patterns), then this is fine.
Description
Add test coverage for PipelineVariable support in ModelTrainer hyperparameters
Problem
The
safe_serializefunction insagemaker-train/src/sagemaker/train/utils.pyalready correctly handlesPipelineVariableobjects (returning them as-is), and the_create_training_job_argsmethod properly preservesPipelineVariableinstances in hyperparameters. However, there was no test coverage for this specific use case — passingParameterInteger,ParameterString, or mixed pipeline/plain values as hyperparameter values toModelTrainer.Changes
ParameterIntegerandParameterFloatimports to the test filetest_hyperparameters_accept_parameter_integer: verifies thatParameterIntegercan be used as a hyperparameter valuetest_hyperparameters_accept_parameter_string: verifies thatParameterStringcan be used as a hyperparameter valuetest_hyperparameters_accept_mixed_pipeline_and_plain_values: verifies that a mix ofPipelineVariableobjects and plain values (int, str, float) work correctly together in hyperparametersTesting
All new tests validate that
PipelineVariableobjects are preserved as-is when passed throughModelTrainerhyperparameters, ensuring they can be properly serialized bysafe_serializeduring pipeline execution.Related Issue
Related issue: 5504
Changes Made
The
safe_serializefunction insagemaker-train/src/sagemaker/train/utils.pyalready handlesPipelineVariableobjects correctly — it has anisinstance(data, PipelineVariable)check that returns the PipelineVariable as-is (line ~182). Theserializefunction insagemaker-core/src/sagemaker/core/utils/utils.pyalso properly handles PipelineVariable by returning it as-is. The_create_training_job_argsmethod inmodel_trainer.pycallssafe_serializeon each hyperparameter value, which correctly preserves PipelineVariable objects. The existing test filetest_model_trainer_pipeline_variable.pytests PipelineVariable support for fields like training_image, algorithm_name, training_input_mode, and environment, but does NOT test hyperparameters with PipelineVariable values. The core bug described in the issue has already been fixed in the current codebase, but test coverage for the specific hyperparameters use case is missing.AI-Generated PR
This PR was automatically generated by the PySDK Issue Agent.
Merge Checklist
prefix: descriptionformat