fix(eval): resolve mlflow_resource_arn in _get_base_template_context#5758
fix(eval): resolve mlflow_resource_arn in _get_base_template_context#5758mollyheamazon wants to merge 2 commits intoaws:masterfrom
Conversation
…when session was absent at construction
sagemaker-bot
left a comment
There was a problem hiding this comment.
🤖 AI Code Review
The fix correctly addresses the deferred resolution of mlflow_resource_arn when sagemaker_session is None at construction time. The approach is sound and the test covers the new path. A few minor issues: a missing blank line after the import, the test line length likely exceeds 100 characters, and the test could be slightly more robust in verifying the mock call count.
| @@ -21,6 +21,7 @@ | |||
| from sagemaker.core.helper.session_helper import Session | |||
|
|
|||
| from sagemaker.train.base_trainer import BaseTrainer | |||
There was a problem hiding this comment.
Missing blank line between this import and the # Module-level logger comment. PEP 8 and the existing file style expect a blank line separating import groups from module-level code.
from sagemaker.train.common_utils.finetune_utils import _resolve_mlflow_resource_arn
# Module-level logger| Returns: | ||
| dict: Base template context dictionary | ||
| """ | ||
| # Resolve MLflow ARN if not already resolved (e.g. session was None at construction time) |
There was a problem hiding this comment.
Minor consideration: _resolve_mlflow_resource_arn may raise exceptions (e.g., if the SageMaker API call fails). In the validator, failures are silently swallowed because the validator just returns None. Here, an exception would propagate up and fail evaluate(). Is that the desired behavior? If not, consider wrapping this in a try/except with a warning log, consistent with how the validator handles failures:
if not self.mlflow_resource_arn and self.sagemaker_session:
try:
self.mlflow_resource_arn = _resolve_mlflow_resource_arn(self.sagemaker_session)
except Exception:
_logger.warning("Failed to resolve MLflow resource ARN during deferred resolution.")| @@ -925,6 +925,37 @@ def test_get_base_template_context(self, mock_resolve, mock_session, mock_model_ | |||
| assert context['dataset_artifact_arn'] == DEFAULT_ARTIFACT_ARN | |||
| assert 'action_arn_prefix' in context | |||
|
|
|||
There was a problem hiding this comment.
This line is well over 100 characters (the project's line length limit). Please wrap it:
@patch(
"sagemaker.train.evaluate.base_evaluator._resolve_mlflow_resource_arn"
)
def test_get_base_template_context_deferred_mlflow_resolution(
self, mock_resolve_mlflow, mock_resolve, mock_session, mock_model_info
):| region=DEFAULT_REGION, | ||
| account_id="123456789012", | ||
| model_package_group_arn=DEFAULT_MODEL_PACKAGE_GROUP_ARN, | ||
| resolved_model_artifact_arn=DEFAULT_ARTIFACT_ARN, |
There was a problem hiding this comment.
Consider using mock_resolve_mlflow.assert_called_once_with(mock_session) instead of assert_called_with to verify it was called exactly once during _get_base_template_context. Since the mock is also invoked by the validator during construction, you may want to reset the mock after construction (mock_resolve_mlflow.reset_mock()) before calling _get_base_template_context, then assert assert_called_once_with. This makes the test more precise about which call path triggered the resolution.
| model=DEFAULT_MODEL, | ||
| s3_output_path=DEFAULT_S3_OUTPUT, | ||
| model_package_group=DEFAULT_MODEL_PACKAGE_GROUP_ARN, | ||
| sagemaker_session=mock_session, |
There was a problem hiding this comment.
Setting evaluator.mlflow_resource_arn = None directly after construction to simulate the deferred case works, but it's a bit fragile — it relies on Pydantic allowing direct attribute mutation. A comment explaining why this is necessary (or configuring the mock to return None only during construction) would improve clarity. Actually, you already set mock_resolve_mlflow.return_value = None before construction, so the validator should already leave it as None. Is this explicit assignment redundant? If so, removing it would simplify the test.
Problem
BaseEvaluator has a Pydantic validator that auto-resolves
mlflow_resource_arnat construction time. However, ifsagemaker_sessionis None when the object is created, the validator exits early and the ARN is never resolved — leaving MLflow tracking silently disabled even when a default tracking server exists in the account.Fix
Add a deferred resolution fallback in
_get_base_template_context: ifmlflow_resource_arnis still None whenevaluate()is called and a session is available, resolve it then using the same_resolve_mlflow_resource_arnutility used by training.Also promotes the import of
_resolve_mlflow_resource_arnto module level in base_evaluator.py (was previously an inline import inside the validator).Changes
base_evaluator.py: module-level import + 3-line deferred resolution in_get_base_template_contexttest_base_evaluator.py: one new unit test covering the deferred resolution pathBy submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.