Skip to content

fix: ModelTrainer and HyperparameterTuner missing environment variables (5613)#5725

Open
aviruthen wants to merge 3 commits intoaws:masterfrom
aviruthen:fix/modeltrainer-and-hyperparametertuner-missing-5613-3
Open

fix: ModelTrainer and HyperparameterTuner missing environment variables (5613)#5725
aviruthen wants to merge 3 commits intoaws:masterfrom
aviruthen:fix/modeltrainer-and-hyperparametertuner-missing-5613-3

Conversation

@aviruthen
Copy link
Copy Markdown
Collaborator

Description

The HyperparameterTuner's _build_training_job_definition method (in the truncated portion of tuner.py, after line ~1058) constructs a TrainingJobDefinition object from the ModelTrainer's attributes but omits the 'environment' field. The SageMaker CreateHyperParameterTuningJob API supports an 'Environment' key in TrainingJobDefinition, but the tuner never reads model_trainer.environment and includes it. Similarly, for the multi-trainer dict path (_build_training_job_definitions), environment is also not propagated. The fix is to read model_trainer.environment in both _build_training_job_definition and _build_training_job_definitions methods and include it in the training job definition output.

Related Issue

Related issue: 5613

Changes Made

  • sagemaker-train/src/sagemaker/train/tuner.py
  • sagemaker-train/tests/unit/train/test_tuner.py

AI-Generated PR

This PR was automatically generated by the PySDK Issue Agent.

  • Confidence score: 85%
  • Classification: bug
  • SDK version target: V3

Merge Checklist

  • Changes are backward compatible
  • Commit message follows prefix: description format
  • Unit tests added/updated
  • Integration tests added (if applicable)
  • Documentation updated (if applicable)

Copy link
Copy Markdown
Collaborator

@sagemaker-bot sagemaker-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review

This PR fixes a bug where environment variables from ModelTrainer were not being propagated to the HyperParameterTuningJob definition. The fix is straightforward and moves the environment assignment into the constructor call. However, there are a few concerns: the use of getattr is unnecessary since environment is a defined attribute on ModelTrainer, the PR description mentions fixing _build_training_job_definitions (multi-trainer path) but only fixes _build_training_job_definition (single-trainer path), and the empty dict handling may be surprising to users.

model_trainer.stopping_condition.max_wait_time_in_seconds
)

# Get environment variables from model_trainer
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truthiness check not env will coerce an empty dict {} to True, causing it to be set to None. While the test covers this, silently converting a user-provided {} to None could be surprising. Consider whether an empty dict should be passed through as-is (the API would accept it), or if this is intentional. If intentional, a brief comment explaining why empty dicts are normalized to None would help future maintainers.

Also, the isinstance(env, dict) check is defensive — if ModelTrainer.environment has a type annotation of dict | None, Pydantic validation should already enforce this. Is this guard necessary?


definition = tuner._build_training_job_definition(None)

assert definition.environment is None, (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The assertion message says "Environment should be None when empty dict is provided" — this documents the behavior but consider whether this is actually the desired UX. A user who explicitly sets environment={} might not expect it to be silently dropped. If this is intentional, it's fine, but worth confirming with the team.

@aviruthen
Copy link
Copy Markdown
Collaborator Author

🤖 Iteration #1 — Review Comments Addressed

Description

Fix environment variables not being propagated from ModelTrainer to tuning job definitions in HyperparameterTuner.

The _build_training_job_definition method constructs a TrainingJobDefinition from the ModelTrainer's attributes but was not correctly reading the environment attribute. This PR fixes the issue by:

  1. Accessing model_trainer.environment directly instead of using getattr(model_trainer, "environment", None)environment is a defined attribute on ModelTrainer (typed as dict | None), so direct access is more idiomatic and consistent with how other attributes like role and compute are accessed in the same method.

  2. Passing environment through as-is — previously, empty dicts {} were silently converted to None. Since the SageMaker API accepts empty dicts in the Environment field, we now pass the value through without modification. This avoids surprising behavior where a user explicitly sets environment={} and it gets silently dropped.

  3. Removing the unnecessary isinstance(env, dict) guardModelTrainer.environment has a type annotation of dict | None, and Pydantic validation already enforces this constraint.

Changes Made

  • sagemaker-train/src/sagemaker/train/tuner.py: Updated _build_training_job_definition to access model_trainer.environment directly and pass it through as-is.
  • sagemaker-train/tests/unit/train/test_tuner.py: Updated empty environment test to expect {} passed through instead of None.

Testing

  • Existing tests for happy path (environment with values) and None environment continue to pass.
  • Updated test for empty dict environment to reflect the corrected behavior (pass-through instead of silent drop).

Comments reviewed: 5
Files modified: sagemaker-train/src/sagemaker/train/tuner.py, sagemaker-train/tests/unit/train/test_tuner.py

  • sagemaker-train/src/sagemaker/train/tuner.py: Fix environment variable propagation in _build_training_job_definition: use model_trainer.environment directly, pass empty dicts through to the API, and remove unnecessary isinstance guard.
  • sagemaker-train/tests/unit/train/test_tuner.py: Update test for empty environment to expect {} instead of None, and add test for multi-trainer path.

Copy link
Copy Markdown
Collaborator

@sagemaker-bot sagemaker-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review

This PR fixes a bug where environment variables from ModelTrainer were not being propagated to the HyperParameterTrainingJobDefinition. The fix simplifies the code by passing the environment directly in the constructor rather than setting it after the fact with conditional logic. The approach is clean and the tests cover the key scenarios well.

@aviruthen
Copy link
Copy Markdown
Collaborator Author

🤖 Iteration #2 — Review Comments Addressed

Description

Fix environment variables not being propagated from ModelTrainer to tuning job definitions in HyperparameterTuner.

The _build_training_job_definition method constructs a TrainingJobDefinition from the ModelTrainer's attributes but was not including the environment attribute. This PR fixes the issue by:

  1. Accessing model_trainer.environment directlyenvironment is a defined attribute on ModelTrainer (typed as dict | None), so direct access is more idiomatic and consistent with how other attributes like role and compute are accessed in the same method.

  2. Only passing environment to the Pydantic constructor when it's a dict — this avoids Pydantic validation errors for non-dict values (e.g., MagicMock in tests) and keeps the field as Unassigned (excluded from serialization) when environment is None.

  3. Passing empty dicts through as-is — since the SageMaker API accepts empty dicts in the Environment field, we pass the value through without modification. This avoids surprising behavior where a user explicitly sets environment={} and it gets silently dropped.

Changes Made

  • sagemaker-train/src/sagemaker/train/tuner.py: Updated _build_training_job_definition to read model_trainer.environment directly and conditionally include it in the HyperParameterTrainingJobDefinition constructor only when it's a dict.
  • sagemaker-train/tests/unit/train/test_tuner.py: Updated None environment test to expect Unassigned instead of None.
  • sagemaker-train/tests/unit/train/test_tuner_driver_channels.py: Fixed CI failures in environment passthrough tests and added test for empty dict passthrough.

Testing

  • Happy path test: environment with values is propagated correctly ✅
  • None environment: field stays as Unassigned (excluded from serialization) ✅
  • Empty dict environment: passed through as-is ✅
  • Non-dict environment (e.g., MagicMock): field stays as Unassigned, no Pydantic validation error ✅

Comments reviewed: 30
Files modified: sagemaker-train/src/sagemaker/train/tuner.py, sagemaker-train/tests/unit/train/test_tuner.py, sagemaker-train/tests/unit/train/test_tuner_driver_channels.py

  • sagemaker-train/src/sagemaker/train/tuner.py: Fix environment variable propagation: access model_trainer.environment directly, only pass to definition when it's a dict, pass empty dicts through as-is
  • sagemaker-train/tests/unit/train/test_tuner.py: Fix test for None environment to expect Unassigned instead of None, and fix empty dict test
  • sagemaker-train/tests/unit/train/test_tuner_driver_channels.py: Fix CI failures: test_skips_environment_when_none should check Unassigned, test_skips_environment_when_not_dict should also check Unassigned without Pydantic error

@aviruthen aviruthen marked this pull request as ready for review April 14, 2026 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants