fix: ProcessingS3Output's s3_uri to be an optional field (5559)#5730
fix: ProcessingS3Output's s3_uri to be an optional field (5559)#5730aviruthen wants to merge 2 commits intoaws:masterfrom
ProcessingS3Output's s3_uri to be an optional field (5559)#5730Conversation
sagemaker-bot
left a comment
There was a problem hiding this comment.
🤖 AI Code Review
This PR fixes an issue where ProcessingS3Output's s3_uri was required, preventing users from omitting it. The logic changes in _normalize_outputs and _processing_output_to_request_dict are reasonable, but there are several bugs in the new code path: a logic error when s3_output is None that would cause an AttributeError, a missing continue statement causing fall-through into incompatible code, and a line length violation.
| if output.s3_output and output.s3_output.s3_uri is not None and is_pipeline_variable(output.s3_output.s3_uri): | ||
| normalized_outputs.append(output) | ||
| continue | ||
| # If s3_output is None or s3_uri is None, auto-generate an S3 URI |
There was a problem hiding this comment.
The PR description mentions making s3_uri optional in the ProcessingS3Output shape class (sagemaker-core/src/sagemaker/core/shapes/shapes.py), but that file is not included in the diff. Without that change, ProcessingS3Output(s3_uri=None, ...) or ProcessingS3Output(local_path=..., s3_upload_mode=...) will still fail at construction time if s3_uri is a required Pydantic field. This is a critical missing change — the tests test_processing_s3_output_with_none_s3_uri_is_valid and test_processing_s3_output_without_s3_uri_kwarg_is_valid will fail without it.
🤖 Iteration #1 — Review Comments AddressedDescriptionMake Changes
NoteThe How it worksWhen a user creates a
Comments reviewed: 28
|
sagemaker-bot
left a comment
There was a problem hiding this comment.
🤖 AI Code Review
This PR fixes issue #5559 by making s3_uri optional in ProcessingS3Output and updating _normalize_outputs to auto-generate S3 paths when s3_uri is None. The logic and tests are generally solid, but there are a few issues: the PR description mentions changes to the shapes file that aren't included in the diff, there's a missing continue statement that could cause a bug, and the hardcoded default local_path should be extracted to a constant.
| class TestProcessingS3OutputOptionalS3Uri: | ||
| """Tests for ProcessingS3Output with optional s3_uri (issue #5559).""" | ||
|
|
||
| def test_processing_s3_output_with_none_s3_uri_is_valid(self): |
There was a problem hiding this comment.
Missing change to shapes file. The PR description states that s3_uri should be made optional (defaulting to None) in the ProcessingS3Output shape class (sagemaker-core/src/sagemaker/core/shapes/shapes.py), but this file is not included in the diff. Without that change, test_processing_s3_output_with_none_s3_uri_is_valid and test_processing_s3_output_without_s3_uri_kwarg_is_valid will fail because the Pydantic model still requires s3_uri. Please include the shapes file change or confirm it was already made separately.
There was a problem hiding this comment.
$context sagemaker-core/src/sagemaker/core/shapes/shapes.py
| role="arn:aws:iam::123456789012:role/SageMakerRole", | ||
| image_uri="test-image:latest", | ||
| instance_count=1, | ||
| instance_type="ml.m5.xlarge", |
There was a problem hiding this comment.
Test may not catch the fall-through bug. This test verifies that the generated URI starts with s3:// and contains expected substrings, but it doesn't verify that urlparse is NOT called on the auto-generated path (which would be the symptom of the missing continue/append bug noted above). Consider adding a test for the pipeline config case where s3_uri=None — if the fall-through bug exists, urlparse would be called on a Join object and raise an error, which would make test_normalize_outputs_with_none_s3_uri_and_pipeline_config fail. Make sure that test actually exercises the code path correctly.
Description
The issue has two parts: (1) The
ProcessingS3Outputshape class insagemaker-core/src/sagemaker/core/shapes/shapes.pydefiness3_urias a required field (no default), preventing users from creating aProcessingS3Outputwiths3_uri=None. (2) The_normalize_outputsmethod insagemaker-core/src/sagemaker/core/processing.pycallsurlparse(output.s3_output.s3_uri)unconditionally, which would fail withNone. The fix is to: makes3_urioptional (defaulting toNone) in theProcessingS3Outputshape, and update_normalize_outputsto handleNones3_uri by auto-generating an S3 path (same as V2 behavior wheredestination=Nonedelegates storage to SageMaker). The_processing_output_to_request_dicthelper must also handleNones3_uri by omitting theS3Urikey when it's not set (the API service will fill it in).Related Issue
Related issue: 5559
Changes Made
sagemaker-core/src/sagemaker/core/processing.pysagemaker-core/tests/unit/test_processing.pyAI-Generated PR
This PR was automatically generated by the PySDK Issue Agent.
Merge Checklist
prefix: descriptionformat