fix: ModelBuilder.deploy() should expose DataCacheConfig and other CreateInferenceCom (5750)#5753
Conversation
sagemaker-bot
left a comment
🤖 AI Code Review
This PR adds imports and two helper methods for resolving DataCacheConfig and ContainerSpecification, but is critically incomplete — the deploy() and _deploy_core_endpoint() methods are never actually modified to accept or wire through the new parameters. The helper methods also lack type annotations and the overall change doesn't achieve the stated goal.
```python
    return None


from sagemaker.core.shapes import InferenceComponentDataCacheConfig
```
Minor: The dict-to-object conversion silently defaults enable_caching to False if the key is missing. Consider raising a ValueError if the dict doesn't contain the required enable_caching key, since the error message already states it must have that key. Silent defaults can mask user mistakes, which violates the core tenet of making it hard for users to make mistakes.
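The suggested fail-loud behavior could look like the sketch below. The stand-in dataclass and the helper name `resolve_data_cache_config` are illustrative assumptions; the real shape lives in `sagemaker.core.shapes` and the real helper is a private method on `ModelBuilder`.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class InferenceComponentDataCacheConfig:
    # Stand-in for sagemaker.core.shapes.InferenceComponentDataCacheConfig
    enable_caching: bool


def resolve_data_cache_config(
    config: Optional[Union[dict, InferenceComponentDataCacheConfig]],
) -> Optional[InferenceComponentDataCacheConfig]:
    """Accept either the shape object or a dict; reject dicts missing the key."""
    if config is None:
        return None
    if isinstance(config, InferenceComponentDataCacheConfig):
        return config
    if "enable_caching" not in config:
        # Fail loudly instead of silently defaulting to False
        raise ValueError(
            "data_cache_config dict must contain the 'enable_caching' key"
        )
    return InferenceComponentDataCacheConfig(enable_caching=config["enable_caching"])
```

Raising here keeps the error message and the behavior in agreement, so a typo like `{"enableCaching": True}` surfaces immediately instead of silently disabling caching.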
🤖 Iteration #1 — Review Comments Addressed
Changes Made
sagemaker-bot
left a comment
🤖 AI Code Review
This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container specification, and VariantName parameters to ModelBuilder.deploy(). The resolver methods and tests are well-structured, but there are several issues: missing tests for the core wiring logic in _deploy_core_endpoint, a line length violation, missing from __future__ import annotations, and the DataCacheConfig dict serialization doesn't forward all possible fields from the shape.
```python
container_dict = {}
if hasattr(resolved_container, "image") and resolved_container.image:
    container_dict["Image"] = resolved_container.image
if hasattr(resolved_container, "artifact_url") and resolved_container.artifact_url:
```
Using `hasattr` checks on a Pydantic BaseModel (which `InferenceComponentContainerSpecification` likely is) is unnecessary — the attributes are always present (possibly `None`). Simplify to:

```python
container_dict = {}
if resolved_container.image:
    container_dict["Image"] = resolved_container.image
if resolved_container.artifact_url:
    container_dict["ArtifactUrl"] = resolved_container.artifact_url
if resolved_container.environment:
    container_dict["Environment"] = resolved_container.environment
```

This is cleaner and more idiomatic for Pydantic models.
🤖 Iteration #2 — Review Comments Addressed
Changes Made
sagemaker-bot
left a comment
🤖 AI Code Review
This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container, and VariantName parameters to ModelBuilder.deploy(). The approach is reasonable but has several issues: line length violations, inconsistent handling of variant_name=None vs not provided, missing type annotations on the deploy method's return path, and the resolver methods are placed in the wrong file (model_builder_utils.py) but used in model_builder.py without proper wiring.
```python
resolved_cache_config = self._resolve_data_cache_config(ic_data_cache_config)
if resolved_cache_config is not None:
    cache_dict = {"EnableCaching": resolved_cache_config.enable_caching}
    # Forward any additional fields from the shape as they become available
```
The DataCacheConfig is being manually serialized to a dict ({"EnableCaching": ...}), but the spec dict already uses PascalCase API keys. Consider whether create_inference_component expects the Pydantic shape object directly (as sagemaker-core typically handles serialization) rather than a manually constructed dict. If the session method handles serialization, passing the resolved InferenceComponentDataCacheConfig object directly would be more robust and future-proof as new fields are added to the shape.
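If the session method turns out to require a plain dict after all, a field-driven serializer avoids the "forward fields by hand" problem the comment describes. This is a generic sketch, not the sagemaker-core serialization path; whether sagemaker-core already handles PascalCase conversion internally is an assumption to verify.

```python
from types import SimpleNamespace


def to_api_dict(shape_obj) -> dict:
    """Forward every populated field of a shape-like object as PascalCase,
    so newly added shape fields are not silently dropped (sketch)."""
    def pascal(name: str) -> str:
        return "".join(part.capitalize() for part in name.split("_"))

    return {
        pascal(field): value
        for field, value in vars(shape_obj).items()
        if value is not None
    }


# SimpleNamespace stands in for the Pydantic shape here
to_api_dict(SimpleNamespace(enable_caching=True))  # {"EnableCaching": True}
```

This keeps the spec dict in sync with the shape definition instead of maintaining a hand-written key list.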
🤖 Iteration #3 — Review Comments Addressed
Changes Made
sagemaker-bot
left a comment
🤖 AI Code Review
This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container, and VariantName parameters to ModelBuilder.deploy(). While the feature is useful, there are several issues: a large block of validation logic was accidentally deleted from _deploy_core_endpoint, there's significant code duplication between deploy and update paths, the new parameters in deploy() are added before **kwargs breaking convention, and line length violations exist.
```python
mock_pv.assert_called_once()
pv_kwargs = mock_pv.call_args
assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant" or \
    (len(pv_kwargs.args) > 0 and False)  # variant_name is always a kwarg
```
The test file ends abruptly with an incomplete assertion and a backslash continuation. This line:

```python
assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant" or \
    (len(pv_kwargs.args) > 0 and False)  # variant_name is always a kwarg
```

The `or` branch always evaluates to `False`, making it dead code. Simplify to:

```python
assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant"
```

Also, the test file appears to be missing tests for `_update_inference_component` wiring, and there are no tests for the `deploy()` method's parameter forwarding logic (the kwargs population at line 4213).
🤖 Iteration #4 — Review Comments Addressed
Changes Made
Add 1-2 integration tests that verify the new IC-level parameters (data_cache_config, variant_name) work end-to-end through ModelBuilder.deploy(). Place the new test file at sagemaker-serve/tests/integ/test_ic_deploy_params_integration.py.
Follow the exact patterns from the existing integ tests for structure, cleanup, and assertions.
Test 1: Deploy with data_cache_config via the standard IC path (_deploy_core_endpoint). Use a JumpStart model (like the pattern in test_jumpstart_integration.py, ideally use the same JumpStart model) but deploy with inference_config=ResourceRequirements to trigger the IC-based path. Pass data_cache_config={"enable_caching": True} and a custom variant_name. After deployment, use boto3 sagemaker_client.describe_inference_component() to verify:
- The IC was created with DataCacheConfig.EnableCaching == True
- The variant name matches what was passed (not "AllTraffic")
Test 2: Deploy with data_cache_config via the model customization path (_deploy_model_customization). Use a TrainingJob-based ModelBuilder (like test_model_customization_deployment.py lines 131-170) and pass data_cache_config={"enable_caching": True}. After deployment, describe the inference component and verify DataCacheConfig.EnableCaching is True. Also verify the variant_name defaults to endpoint_name (backward compat) when variant_name is not explicitly provided.
Both tests should include proper cleanup (delete endpoint, endpoint config, model, inference components) in a finally block. Use unique names with uuid to avoid collisions. Mark both with @pytest.mark.slow_test.
$context sagemaker-serve/tests/integ/test_jumpstart_integration.py
$context sagemaker-serve/tests/integ/test_model_customization_deployment.py
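The describe-and-verify step both tests share could be factored into one helper like the sketch below, shown here exercised against a mocked client so no AWS resources are needed. The exact placement of `DataCacheConfig` inside the `DescribeInferenceComponent` response is an assumption to confirm against the live API before the integration tests rely on it.

```python
from unittest.mock import MagicMock


def assert_ic_deploy_params(sm_client, ic_name: str, expected_variant: str) -> None:
    """Verify the IC was created with caching enabled and the custom variant.

    Response key layout (DataCacheConfig under Specification) is an
    assumption, not verified against the API.
    """
    desc = sm_client.describe_inference_component(InferenceComponentName=ic_name)
    assert desc["VariantName"] == expected_variant
    assert desc["Specification"]["DataCacheConfig"]["EnableCaching"] is True


# Dry run against a mocked boto3 client
client = MagicMock()
client.describe_inference_component.return_value = {
    "VariantName": "CustomVariant",
    "Specification": {"DataCacheConfig": {"EnableCaching": True}},
}
assert_ic_deploy_params(client, "my-ic", "CustomVariant")
```

In the real tests the mocked client would be replaced by `boto3.client("sagemaker")` and the helper called from the `try` body before the `finally` cleanup.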
🤖 Iteration #5 — Review Comments Addressed
Changes Made
Description
The issue requests exposing additional CreateInferenceComponent API parameters through ModelBuilder.deploy(), primarily DataCacheConfig, BaseInferenceComponentName, Container specification, and VariantName. The _deploy_core_endpoint method in model_builder.py builds InferenceComponentSpecification but does not pass through these parameters. The sagemaker.core.shapes module already has InferenceComponentDataCacheConfig and related shapes. The fix requires: (1) adding new optional parameters to the deploy() method and _deploy_core_endpoint(), (2) wiring those parameters into the InferenceComponentSpecification and InferenceComponent.create() call, and (3) making variant_name configurable instead of hardcoded to 'AllTraffic'. The deploy wrappers in model_builder_servers.py pass **kwargs through to _deploy_core_endpoint, so they require no changes.
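The wiring described above can be sketched as a pure helper that folds the new optional parameters into the create call. The key names mirror the CreateInferenceComponent API, but whether `DataCacheConfig` sits at the top level of the request and `BaseInferenceComponentName` inside `Specification` is an assumption to check against the API reference; the real change lives inside `_deploy_core_endpoint`, not a free function.

```python
from typing import Optional


def build_ic_create_kwargs(
    endpoint_name: str,
    specification: dict,
    data_cache_config: Optional[dict] = None,
    base_inference_component_name: Optional[str] = None,
    variant_name: Optional[str] = None,
) -> dict:
    """Sketch of folding the new deploy() parameters into the create call."""
    kwargs = {
        "EndpointName": endpoint_name,
        # Preserve the old hardcoded default when no variant is supplied,
        # keeping backward compatibility
        "VariantName": variant_name or "AllTraffic",
        "Specification": dict(specification),
    }
    if data_cache_config is not None:
        kwargs["DataCacheConfig"] = data_cache_config
    if base_inference_component_name is not None:
        kwargs["Specification"]["BaseInferenceComponentName"] = (
            base_inference_component_name
        )
    return kwargs
```

Keeping the optional keys absent (rather than passing `None`) avoids sending explicit nulls to the API when callers don't use the new parameters.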
Related Issue
Related issue: 5750
Changes Made
- sagemaker-serve/src/sagemaker/serve/model_builder.py
- sagemaker-serve/src/sagemaker/serve/model_builder_utils.py

AI-Generated PR
This PR was automatically generated by the PySDK Issue Agent.
Merge Checklist
PR title follows the `prefix: description` format