fix: ModelBuilder.deploy() should expose DataCacheConfig and other CreateInferenceCom (5750)#5753

Open
sagemaker-bot wants to merge 6 commits into aws:master from sagemaker-bot:fix/modelbuilder-deploy-should-expose-datacacheconfig-5750

Conversation

@sagemaker-bot
Collaborator

Description

The issue requests exposing additional CreateInferenceComponent API parameters through ModelBuilder.deploy(), primarily DataCacheConfig, BaseInferenceComponentName, Container specification, and VariantName. The _deploy_core_endpoint method in model_builder.py builds InferenceComponentSpecification but does not pass through these parameters. The sagemaker.core.shapes module already has InferenceComponentDataCacheConfig and related shapes. The fix requires: (1) adding new optional parameters to the deploy() method and _deploy_core_endpoint(), (2) wiring those parameters into the InferenceComponentSpecification and InferenceComponent.create() call, and (3) making variant_name configurable instead of hardcoded to 'AllTraffic'. The deploy wrappers in model_builder_servers.py pass **kwargs through to _deploy_core_endpoint, so they require no changes.

Related Issue

Related issue: 5750

Changes Made

  • sagemaker-serve/src/sagemaker/serve/model_builder.py
  • sagemaker-serve/src/sagemaker/serve/model_builder_utils.py

AI-Generated PR

This PR was automatically generated by the PySDK Issue Agent.

  • Confidence score: 85%
  • Classification: bug
  • SDK version target: V3

Merge Checklist

  • Changes are backward compatible
  • Commit message follows prefix: description format
  • Unit tests added/updated
  • Integration tests added (if applicable)
  • Documentation updated (if applicable)

Collaborator Author

@sagemaker-bot sagemaker-bot left a comment

🤖 AI Code Review

This PR adds imports and two helper methods for resolving DataCacheConfig and ContainerSpecification, but is critically incomplete — the deploy() and _deploy_core_endpoint() methods are never actually modified to accept or wire through the new parameters. The helper methods also lack type annotations and the overall change doesn't achieve the stated goal.

return None

from sagemaker.core.shapes import InferenceComponentDataCacheConfig

Collaborator Author

Minor: The dict-to-object conversion silently defaults enable_caching to False if the key is missing. Consider raising a ValueError if the dict doesn't contain the required enable_caching key, since the error message already states it must have that key. Silent defaults can mask user mistakes, which violates the core tenet of making it hard for users to make mistakes.
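
To make the suggestion concrete, here is a minimal sketch of a resolver that fails loudly instead of defaulting. It is not the PR's implementation: `DataCacheConfig` is a hypothetical dataclass standing in for `InferenceComponentDataCacheConfig`, and `resolve_data_cache_config` is an illustrative free function rather than the actual helper method.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class DataCacheConfig:
    """Hypothetical stand-in for sagemaker.core.shapes.InferenceComponentDataCacheConfig."""

    enable_caching: bool


def resolve_data_cache_config(
    value: Union[DataCacheConfig, dict, None],
) -> Optional[DataCacheConfig]:
    """Normalize user input to a DataCacheConfig, rejecting incomplete dicts."""
    if value is None:
        return None
    if isinstance(value, DataCacheConfig):
        return value  # already typed: pass through unchanged
    if isinstance(value, dict):
        if "enable_caching" not in value:
            # No silent default: a missing key is reported to the caller.
            raise ValueError(
                "data_cache_config dict must contain the 'enable_caching' key; "
                f"got keys: {sorted(value)}"
            )
        return DataCacheConfig(enable_caching=bool(value["enable_caching"]))
    raise ValueError(
        "data_cache_config must be a dict or DataCacheConfig, "
        f"got {type(value).__name__}"
    )
```

With this shape, `resolve_data_cache_config({})` raises rather than quietly disabling caching, so a misspelled key surfaces at call time.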

@sagemaker-bot
Collaborator Author

🤖 Iteration #1 — Review Comments Addressed

Description

Expose additional CreateInferenceComponent API parameters through ModelBuilder.deploy() when deploying Inference Components. This enables users to configure:

  • data_cache_config — Cache model artifacts and container images on instances for faster auto-scaling cold starts (DataCacheConfig.EnableCaching)
  • base_inference_component_name — Adapter component deployment (e.g., LoRA adapters attached to a base model)
  • container — Custom container images, artifact URLs, and environment variables at the IC level
  • variant_name — Configurable production variant name (previously hardcoded to "AllTraffic")

Changes Made

sagemaker-serve/src/sagemaker/serve/model_builder.py

  • Added data_cache_config, base_inference_component_name, container, and variant_name as optional parameters to deploy()
  • Updated deploy() docstring with descriptions for all new parameters
  • Wired new parameters through kwargs to _deploy_core_endpoint()
  • In _deploy_core_endpoint(), integrated the new parameters into InferenceComponentSpecification and InferenceComponent.create() call:
    • data_cache_config is resolved via _resolve_data_cache_config() and added as DataCacheConfig
    • base_inference_component_name is added as BaseInferenceComponentName
    • container is resolved via _resolve_container_spec() and added as Container
    • variant_name defaults to "AllTraffic" but is now configurable

sagemaker-serve/src/sagemaker/serve/model_builder_utils.py

  • Added module-level imports for InferenceComponentDataCacheConfig and InferenceComponentContainerSpecification
  • Added proper type annotations to _resolve_data_cache_config() and _resolve_container_spec()
  • Removed redundant local imports inside helper methods
  • _resolve_data_cache_config() now raises ValueError if dict input is missing the required enable_caching key (instead of silently defaulting to False)

tests/unit/sagemaker/serve/test_resolve_ic_params.py (new)

  • Comprehensive unit tests for _resolve_data_cache_config() and _resolve_container_spec()
  • Covers: None input, dict input, already-typed input passthrough, invalid types, missing/extra keys

Example Usage

builder.deploy(
    endpoint_name="my-endpoint",
    inference_component_name="my-ic",
    instance_type="ml.g5.2xlarge",
    initial_instance_count=1,
    inference_config=ResourceRequirements(
        requests={"memory": 8192, "num_accelerators": 1, "num_cpus": 2, "copies": 1}
    ),
    data_cache_config={"enable_caching": True},
    variant_name="MyVariant",
)

Backward Compatibility

All new parameters are optional with None defaults. Existing code continues to work without changes. The variant_name defaults to "AllTraffic" matching the previous hardcoded behavior.

Comments reviewed: 6
Files modified: sagemaker-serve/src/sagemaker/serve/model_builder.py, sagemaker-serve/src/sagemaker/serve/model_builder_utils.py, tests/unit/sagemaker/serve/test_resolve_ic_params.py

  • sagemaker-serve/src/sagemaker/serve/model_builder.py: Add new parameters (data_cache_config, base_inference_component_name, container, variant_name) to deploy() and _deploy_core_endpoint(), wire them into InferenceComponentSpecification and InferenceComp
  • sagemaker-serve/src/sagemaker/serve/model_builder_utils.py: Add type annotations to _resolve_data_cache_config and _resolve_container_spec, remove local imports (use module-level from model_builder.py), raise ValueError for missing required keys
  • tests/unit/sagemaker/serve/test_resolve_ic_params.py: Unit tests for _resolve_data_cache_config and _resolve_container_spec helper methods

Collaborator Author

@sagemaker-bot sagemaker-bot left a comment

🤖 AI Code Review

This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container specification, and VariantName parameters to ModelBuilder.deploy(). The resolver methods and tests are well-structured, but there are several issues: missing tests for the core wiring logic in _deploy_core_endpoint, a line length violation, missing from __future__ import annotations, and the DataCacheConfig dict serialization doesn't forward all possible fields from the shape.

container_dict = {}
if hasattr(resolved_container, "image") and resolved_container.image:
    container_dict["Image"] = resolved_container.image
if hasattr(resolved_container, "artifact_url") and resolved_container.artifact_url:

Collaborator Author

Using hasattr checks on a Pydantic BaseModel (which InferenceComponentContainerSpecification likely is) is unnecessary — the attributes are always present (possibly None). Simplify to:

container_dict = {}
if resolved_container.image:
    container_dict["Image"] = resolved_container.image
if resolved_container.artifact_url:
    container_dict["ArtifactUrl"] = resolved_container.artifact_url
if resolved_container.environment:
    container_dict["Environment"] = resolved_container.environment

This is cleaner and more idiomatic for Pydantic models.

@sagemaker-bot
Collaborator Author

🤖 Iteration #2 — Review Comments Addressed

Description

Expose additional CreateInferenceComponent API parameters through ModelBuilder.deploy() when deploying Inference Components. This enables users to configure:

  • data_cache_config — Cache model artifacts and container images on instances for faster auto-scaling cold starts (DataCacheConfig.EnableCaching)
  • base_inference_component_name — Adapter component deployment (e.g., LoRA adapters attached to a base model)
  • container — Custom container images, artifact URLs, and environment variables at the IC level
  • variant_name — Configurable production variant name (previously hardcoded to "AllTraffic")

Changes Made

sagemaker-serve/src/sagemaker/serve/model_builder.py

  • Removed unnecessary hasattr checks on Pydantic BaseModel attributes in _deploy_core_endpoint() — attributes are always present (possibly None) on Pydantic models, so direct truthiness checks are cleaner and more idiomatic
  • Added forward-compatibility comment on DataCacheConfig serialization
  • Improved variant_name docstring to clarify that None means "AllTraffic" is used

sagemaker-serve/src/sagemaker/serve/model_builder_utils.py

  • _resolve_data_cache_config() raises ValueError if dict input is missing the required enable_caching key (instead of silently defaulting to False)
  • _resolve_container_spec() now filters input dict to only pass known keys (image, artifact_url, environment) to the Pydantic constructor, avoiding ValidationError if the model has extra='forbid' configured
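
The key-filtering behavior described above can be sketched as follows. This is an illustration under assumptions, not the PR's code: `ContainerSpec` is a hypothetical dataclass standing in for `InferenceComponentContainerSpecification` (a dataclass rejects unknown keyword arguments with a `TypeError`, loosely analogous to a Pydantic model configured with `extra='forbid'`), and `resolve_container_spec` is an illustrative free function.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional, Union


@dataclass
class ContainerSpec:
    """Hypothetical stand-in for InferenceComponentContainerSpecification."""

    image: Optional[str] = None
    artifact_url: Optional[str] = None
    environment: Optional[Dict[str, str]] = None


# Derive the allow-list from the shape so the two cannot drift apart.
KNOWN_KEYS = {f.name for f in fields(ContainerSpec)}


def resolve_container_spec(
    value: Union[ContainerSpec, dict, None],
) -> Optional[ContainerSpec]:
    """Normalize user input to a ContainerSpec, dropping unknown dict keys."""
    if value is None:
        return None
    if isinstance(value, ContainerSpec):
        return value
    if isinstance(value, dict):
        filtered = {k: v for k, v in value.items() if k in KNOWN_KEYS}
        return ContainerSpec(**filtered)
    raise ValueError(
        f"container must be a dict or ContainerSpec, got {type(value).__name__}"
    )
```

One trade-off worth weighing: silently dropping unknown keys avoids a constructor error, but it can also hide a misspelled key such as `artifact_uri`, which is the same class of mistake the `enable_caching` check guards against.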

tests/unit/sagemaker/serve/test_resolve_ic_params.py

  • Comprehensive unit tests for _resolve_data_cache_config() and _resolve_container_spec() helper methods
  • New: Tests for core wiring logic in _deploy_core_endpoint() that verify:
    • variant_name defaults to "AllTraffic" when not provided
    • Custom variant_name is correctly forwarded
    • data_cache_config dict/object is resolved and added to inference_component_spec
    • base_inference_component_name is added to inference_component_spec
    • container dict is resolved and added to inference_component_spec
    • No extra keys appear in spec when optional params are not provided
  • Added clarifying comments on extra-key tests explaining that extra keys are ignored/filtered

Example Usage

builder.deploy(
    endpoint_name="my-endpoint",
    inference_component_name="my-ic",
    instance_type="ml.g5.2xlarge",
    initial_instance_count=1,
    inference_config=ResourceRequirements(
        requests={"memory": 8192, "num_accelerators": 1, "num_cpus": 2, "copies": 1}
    ),
    data_cache_config={"enable_caching": True},
    variant_name="MyVariant",
)

Backward Compatibility

All new parameters are optional with None defaults. Existing code continues to work without changes. The variant_name defaults to "AllTraffic" matching the previous hardcoded behavior.

Comments reviewed: 7
Files modified: sagemaker-serve/src/sagemaker/serve/model_builder.py, sagemaker-serve/src/sagemaker/serve/model_builder_utils.py, tests/unit/sagemaker/serve/test_resolve_ic_params.py

  • sagemaker-serve/src/sagemaker/serve/model_builder.py: Fix DataCacheConfig serialization, remove hasattr checks on Pydantic models, improve variant_name docstring
  • sagemaker-serve/src/sagemaker/serve/model_builder_utils.py: Raise ValueError when dict is missing required enable_caching key (already done), and filter extra keys in _resolve_container_spec to avoid Pydantic validation errors
  • tests/unit/sagemaker/serve/test_resolve_ic_params.py: Comprehensive unit tests for resolver helpers AND core wiring logic in _deploy_core_endpoint

Collaborator Author

@sagemaker-bot sagemaker-bot left a comment

🤖 AI Code Review

This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container, and VariantName parameters to ModelBuilder.deploy(). The approach is reasonable but has several issues: line length violations, inconsistent handling of variant_name=None vs not provided, missing type annotations on the deploy method's return path, and the resolver methods are placed in the wrong file (model_builder_utils.py) but used in model_builder.py without proper wiring.

resolved_cache_config = self._resolve_data_cache_config(ic_data_cache_config)
if resolved_cache_config is not None:
    cache_dict = {"EnableCaching": resolved_cache_config.enable_caching}
    # Forward any additional fields from the shape as they become available

Collaborator Author

The DataCacheConfig is being manually serialized to a dict ({"EnableCaching": ...}), but the spec dict already uses PascalCase API keys. Consider whether create_inference_component expects the Pydantic shape object directly (as sagemaker-core typically handles serialization) rather than a manually constructed dict. If the session method handles serialization, passing the resolved InferenceComponentDataCacheConfig object directly would be more robust and future-proof as new fields are added to the shape.
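
If the session method does turn out to need a plain dict, one way to keep the serialization future-proof is to derive the PascalCase keys from the shape's fields instead of listing them by hand. The sketch below assumes a flat shape; `ExampleCacheShape` and its `retention_hint` field are made up purely for illustration and do not exist in sagemaker-core, and `shape_to_api_dict` is a hypothetical helper.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


def to_pascal_case(name: str) -> str:
    """Convert a lowercase snake_case name to PascalCase."""
    return "".join(part.capitalize() for part in name.split("_"))


def shape_to_api_dict(shape: Any) -> Dict[str, Any]:
    """Serialize a flat dataclass to API-style keys, skipping unset fields."""
    return {
        to_pascal_case(key): value
        for key, value in asdict(shape).items()
        if value is not None
    }


@dataclass
class ExampleCacheShape:
    """Hypothetical shape; `retention_hint` is an invented field."""

    enable_caching: bool
    retention_hint: Optional[str] = None
```

Under these assumptions a field added to the shape later is forwarded without touching the call site. Nested shapes would need recursive key conversion, which this sketch does not attempt.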

@sagemaker-bot
Collaborator Author

🤖 Iteration #3 — Review Comments Addressed

Description

Expose additional CreateInferenceComponent API parameters through ModelBuilder.deploy() when deploying Inference Components. This enables users to configure:

  • data_cache_config — Cache model artifacts and container images on instances for faster auto-scaling cold starts (DataCacheConfig.EnableCaching)
  • base_inference_component_name — Adapter component deployment (e.g., LoRA adapters attached to a base model)
  • container — Custom container images, artifact URLs, and environment variables at the IC level
  • variant_name — Configurable production variant name (previously hardcoded to "AllTraffic")
  • inference_component_name — Promoted from hidden **kwargs to explicit named parameter

Changes Made

sagemaker-serve/src/sagemaker/serve/model_builder.py

  • Added inference_component_name, data_cache_config, base_inference_component_name, container, and variant_name as explicit parameters to deploy() with full docstrings
  • Centralized variant_name defaulting to "AllTraffic" in deploy() rather than splitting between deploy() and _deploy_core_endpoint()
  • In _deploy_core_endpoint(): wired new parameters into InferenceComponentSpecification dict and create_inference_component() call; ensured variant_name is passed consistently to both production_variant() and create_inference_component()
  • Removed pre-existing copy-paste bug: duplicate validation block (routing_config, _update_params, async/serverless/explainer checks, _is_sharded_model checks) that appeared twice
  • Extended _update_inference_component() to accept and pass through data_cache_config, container, and base_inference_component_name
  • Updated _deploy_model_customization() to accept and wire data_cache_config and variant_name into InferenceComponentSpecification and ProductionVariant
  • Removed hasattr checks on Pydantic BaseModel attributes (always present, possibly None)

sagemaker-serve/src/sagemaker/serve/model_builder_utils.py

  • _resolve_data_cache_config() raises ValueError if dict input is missing the required enable_caching key
  • _resolve_container_spec() filters input dict to only pass known keys (image, artifact_url, environment) to avoid Pydantic validation errors
  • Reorganized imports: grouped sagemaker.core.shapes imports together, separate from MLflow imports

tests/unit/sagemaker/serve/test_resolve_ic_params.py (new)

  • Comprehensive unit tests for _resolve_data_cache_config() and _resolve_container_spec() helper methods
  • Tests for core wiring logic in _deploy_core_endpoint() verifying:
    • variant_name defaults to "AllTraffic" when not provided
    • Custom variant_name is correctly forwarded
    • data_cache_config dict/object is resolved and added to spec
    • base_inference_component_name is added to spec
    • container dict is resolved and added to spec
    • No extra keys appear in spec when optional params are not

Comments reviewed: 17
Files modified: sagemaker-serve/src/sagemaker/serve/model_builder.py, sagemaker-serve/src/sagemaker/serve/model_builder_utils.py, tests/unit/sagemaker/serve/test_resolve_ic_params.py

  • sagemaker-serve/src/sagemaker/serve/model_builder.py: Add new IC parameters to deploy(), fix _deploy_core_endpoint (remove duplicate block, wire variant_name consistently, add IC params), extend _update_inference_component, wire data_cache_config into _d
  • sagemaker-serve/src/sagemaker/serve/model_builder_utils.py: Fix _resolve_data_cache_config to raise ValueError on missing key and pass all dict keys through to Pydantic constructor, fix _resolve_container_spec, reorganize imports
  • tests/unit/sagemaker/serve/test_resolve_ic_params.py: Comprehensive unit tests for resolver helpers and core wiring logic, addressing all reviewer feedback

Collaborator Author

@sagemaker-bot sagemaker-bot left a comment

🤖 AI Code Review

This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container, and VariantName parameters to ModelBuilder.deploy(). While the feature is useful, there are several issues: a large block of validation logic was accidentally deleted from _deploy_core_endpoint, there's significant code duplication between deploy and update paths, the new parameters in deploy() are added before **kwargs breaking convention, and line length violations exist.

mock_pv.assert_called_once()
pv_kwargs = mock_pv.call_args
assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant" or \
    (len(pv_kwargs.args) > 0 and False)  # variant_name is always a kwarg

Collaborator Author

The test file ends abruptly with an incomplete assertion and a backslash continuation. This line:

assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant" or \
    (len(pv_kwargs.args) > 0 and False)  # variant_name is always a kwarg

The or branch always evaluates to False, making it dead code. Simplify to:

assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant"

Also, the test file appears to be missing tests for _update_inference_component wiring, and there are no tests for the deploy() method's parameter forwarding logic (the kwargs population at line 4213).
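
A small standalone check of the dead-branch claim, using only `unittest.mock` from the standard library. `mock_pv` here is a bare `MagicMock` standing in for the patched production-variant factory; the call arguments are illustrative.

```python
from unittest.mock import MagicMock

# Stand-in for the patched production_variant callable in the test.
mock_pv = MagicMock()
mock_pv(variant_name="CustomVariant", instance_type="ml.g5.2xlarge")

mock_pv.assert_called_once()
call = mock_pv.call_args

# `X and False` is False for every X, so `A or (X and False)` reduces to `A`.
dead_branch = len(call.args) > 0 and False
assert dead_branch is False

# The simplified assertion checks exactly the same thing.
assert call.kwargs["variant_name"] == "CustomVariant"
```

Indexing `call.kwargs["variant_name"]` instead of using `.get()` also makes a missing keyword fail with a `KeyError` rather than comparing `None` to the expected value.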

@sagemaker-bot
Collaborator Author

🤖 Iteration #4 — Review Comments Addressed

Description

Expose additional CreateInferenceComponent API parameters through ModelBuilder.deploy() when deploying Inference Components. This enables users to configure:

  • data_cache_config — Cache model artifacts and container images on instances for faster auto-scaling cold starts (DataCacheConfig.EnableCaching)
  • base_inference_component_name — Adapter component deployment (e.g., LoRA adapters attached to a base model)
  • container — Custom container images, artifact URLs, and environment variables at the IC level
  • variant_name — Configurable production variant name (previously hardcoded to "AllTraffic")
  • inference_component_name — Promoted from hidden **kwargs to explicit named parameter

Changes Made

sagemaker-serve/src/sagemaker/serve/model_builder.py

  • Added inference_component_name, data_cache_config, base_inference_component_name, container, and variant_name as explicit parameters to deploy() with full docstrings
  • Centralized variant_name defaulting to "AllTraffic" in deploy() and always forwarded via kwargs
  • Extracted _apply_optional_ic_params() static helper to DRY up the logic for wiring data_cache_config, base_inference_component_name, and container into inference component spec dicts — used by both _deploy_core_endpoint() and _update_inference_component()
  • Removed hasattr checks on Pydantic BaseModel attributes (always present, possibly None) — uses direct truthiness checks instead
  • Preserved backward compatibility in _deploy_model_customization(): variant_name defaults to endpoint_name (not "AllTraffic") to match previous behavior where the variant was named after the endpoint
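
As a rough illustration of what a shared helper like `_apply_optional_ic_params()` might look like, the sketch below adds the optional keys to a spec dict only when a value is supplied. The signature and the use of plain dicts for already-resolved values are assumptions made to keep the example self-contained; the PR's helper operates on resolved shape objects and may differ.

```python
from typing import Any, Dict, Optional

# Mapping from snake_case container fields to API-style keys.
_CONTAINER_KEYS = (
    ("image", "Image"),
    ("artifact_url", "ArtifactUrl"),
    ("environment", "Environment"),
)


def apply_optional_ic_params(
    spec: Dict[str, Any],
    data_cache_config: Optional[Dict[str, Any]] = None,
    base_inference_component_name: Optional[str] = None,
    container: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Add optional inference component keys to `spec` in place and return it."""
    if data_cache_config is not None:
        spec["DataCacheConfig"] = {
            "EnableCaching": data_cache_config["enable_caching"]
        }
    if base_inference_component_name:
        spec["BaseInferenceComponentName"] = base_inference_component_name
    if container:
        container_dict = {
            api_key: container[field]
            for field, api_key in _CONTAINER_KEYS
            if container.get(field)
        }
        if container_dict:
            spec["Container"] = container_dict
    return spec
```

Because both the deploy and update paths would call the same function, a test such as "no extra keys appear when optional params are omitted" only has to be written once against the helper.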

sagemaker-serve/src/sagemaker/serve/model_builder_utils.py

  • _resolve_data_cache_config() raises ValueError if dict input is missing the required enable_caching key (no silent defaults)
  • _resolve_container_spec() filters input dict to only pass known keys (image, artifact_url, environment) to avoid Pydantic validation errors with extra='forbid'

tests/unit/sagemaker/serve/test_resolve_ic_params.py (new)

  • Comprehensive unit tests for _resolve_data_cache_config() and _resolve_container_spec() helper methods
  • Tests for _apply_optional_ic_params() static helper (all combinations)
  • Tests for core wiring logic in _deploy_core_endpoint() verifying variant_name, data_cache_config, base_inference_component_name, container, and absence of extra keys
  • Tests for _update_inference_component() wiring
  • Tests for deploy() parameter forwarding logic (kwargs population)
  • Fixed dead-code assertion (or branch that always evaluated to False)

Example Usage

builder.deploy(
    endpoint_name="my-endpoint",
    inference_component_name="my-ic",
    instance_type="ml.g5.2xlarge",
    initial_instance_count=1,
    inference_config=ResourceRequirements(
        requests={"memory": 8192, "num_accelerators": 1, "

Comments reviewed: 7
Files modified: sagemaker-serve/src/sagemaker/serve/model_builder.py, sagemaker-serve/src/sagemaker/serve/model_builder_utils.py, tests/unit/sagemaker/serve/test_resolve_ic_params.py

  • sagemaker-serve/src/sagemaker/serve/model_builder.py: Add _apply_optional_ic_params helper, wire new IC params through deploy/deploy_core_endpoint/update_inference_component, fix variant_name in _deploy_model_customization
  • sagemaker-serve/src/sagemaker/serve/model_builder_utils.py: The resolver methods are already correct in the current source (raise ValueError on missing key, filter extra keys). No changes needed since the source already addresses all reviewer comments.
  • tests/unit/sagemaker/serve/test_resolve_ic_params.py: Comprehensive unit tests for resolver helpers, core wiring logic, _update_inference_component wiring, and deploy() parameter forwarding

@aviruthen aviruthen marked this pull request as ready for review April 14, 2026 20:04
mollyheamazon previously approved these changes Apr 14, 2026
Contributor

@nargokul nargokul left a comment

Missing integration tests.

Collaborator

Add 1-2 integration tests that verify the new IC-level parameters (data_cache_config, variant_name) work end-to-end through ModelBuilder.deploy(). Place the new test file at sagemaker-serve/tests/integ/test_ic_deploy_params_integration.py.

Follow the exact patterns from the existing integ tests for structure, cleanup, and assertions.

Test 1: Deploy with data_cache_config via the standard IC path (_deploy_core_endpoint). Use a JumpStart model (like the pattern in test_jumpstart_integration.py, ideally use the same JumpStart model) but deploy with inference_config=ResourceRequirements to trigger the IC-based path. Pass data_cache_config={"enable_caching": True} and a custom variant_name. After deployment, use boto3 sagemaker_client.describe_inference_component() to verify:

  • The IC was created with DataCacheConfig.EnableCaching == True
  • The variant name matches what was passed (not "AllTraffic")

Test 2: Deploy with data_cache_config via the model customization path (_deploy_model_customization). Use a TrainingJob-based ModelBuilder (like test_model_customization_deployment.py lines 131-170) and pass data_cache_config={"enable_caching": True}. After deployment, describe the inference component and verify DataCacheConfig.EnableCaching is True. Also verify the variant_name defaults to endpoint_name (backward compat) when variant_name is not explicitly provided.

Both tests should include proper cleanup (delete endpoint, endpoint config, model, inference components) in a finally block. Use unique names with uuid to avoid collisions. Mark both with @pytest.mark.slow_test.
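
The verification and cleanup steps requested above could be factored roughly as below. This is a sketch, not the test file itself: the helper names are hypothetical, `sm_client` is any object exposing the boto3 SageMaker client methods used here, and the location of the caching flag in the describe response (`Specification.DataCacheConfig.EnableCaching`) is an assumption to confirm against the API reference.

```python
import uuid
from typing import Any, List, Optional


def unique_name(prefix: str) -> str:
    """Build a collision-resistant resource name."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


def verify_inference_component(
    sm_client: Any, ic_name: str, expected_variant: str
) -> None:
    """Assert the deployed IC carries the requested variant and caching flag."""
    desc = sm_client.describe_inference_component(InferenceComponentName=ic_name)
    assert desc["VariantName"] == expected_variant
    # Assumed response layout; adjust if the API nests the flag elsewhere.
    assert desc["Specification"]["DataCacheConfig"]["EnableCaching"] is True


def cleanup(
    sm_client: Any,
    ic_name: str,
    endpoint_name: str,
    endpoint_config_name: str,
    model_name: Optional[str] = None,
) -> List[str]:
    """Best-effort teardown for a `finally` block; returns collected errors."""
    steps = [
        lambda: sm_client.delete_inference_component(InferenceComponentName=ic_name),
        lambda: sm_client.delete_endpoint(EndpointName=endpoint_name),
        lambda: sm_client.delete_endpoint_config(
            EndpointConfigName=endpoint_config_name
        ),
    ]
    if model_name:
        steps.append(lambda: sm_client.delete_model(ModelName=model_name))
    errors: List[str] = []
    for step in steps:
        try:
            step()
        except Exception as exc:  # keep going so later resources are still removed
            errors.append(str(exc))
    return errors
```

A real test would likely also need to wait for the inference component deletion to finish before deleting the endpoint; that polling is omitted here.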

$context sagemaker-serve/tests/integ/test_jumpstart_integration.py
$context sagemaker-serve/tests/integ/test_model_customization_deployment.py

@sagemaker-bot
Collaborator Author

🤖 Iteration #5 — Review Comments Addressed

Description

Expose additional CreateInferenceComponent API parameters through ModelBuilder.deploy() when deploying Inference Components. This enables users to configure:

  • data_cache_config — Cache model artifacts and container images on instances for faster auto-scaling cold starts (DataCacheConfig.EnableCaching)
  • base_inference_component_name — Adapter component deployment (e.g., LoRA adapters attached to a base model)
  • container — Custom container images, artifact URLs, and environment variables at the IC level
  • variant_name — Configurable production variant name (previously hardcoded to "AllTraffic")
  • inference_component_name — Promoted from hidden **kwargs to explicit named parameter

Changes Made

sagemaker-serve/src/sagemaker/serve/model_builder.py

  • Added inference_component_name, data_cache_config, base_inference_component_name, container, and variant_name as explicit parameters to deploy() with full docstrings
  • Centralized variant_name defaulting to "AllTraffic" in deploy() and always forwarded via kwargs
  • Extracted _apply_optional_ic_params() static helper to DRY up the logic for wiring data_cache_config, base_inference_component_name, and container into inference component spec dicts — used by both _deploy_core_endpoint() and _update_inference_component()
  • Removed hasattr checks on Pydantic BaseModel attributes (always present, possibly None) — uses direct truthiness checks instead
  • Preserved backward compatibility in _deploy_model_customization(): variant_name defaults to endpoint_name (not "AllTraffic") to match previous behavior
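
The two defaulting rules for variant_name described above can be summarized in a few lines. `resolve_variant_name` and its `model_customization` flag are hypothetical names used only for this sketch; the PR applies the defaults inside deploy() and _deploy_model_customization() respectively.

```python
from typing import Optional

DEFAULT_VARIANT_NAME = "AllTraffic"


def resolve_variant_name(
    variant_name: Optional[str],
    endpoint_name: str,
    model_customization: bool = False,
) -> str:
    """Pick the production variant name, preserving each path's old default."""
    if variant_name is not None:
        return variant_name  # an explicit value always wins
    # The model customization path historically named the variant after the endpoint.
    return endpoint_name if model_customization else DEFAULT_VARIANT_NAME
```

Resolving the name once and passing the result to both the production variant and the inference component request avoids the two drifting apart.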

sagemaker-serve/src/sagemaker/serve/model_builder_utils.py

  • _resolve_data_cache_config() raises ValueError if dict input is missing the required enable_caching key (no silent defaults)
  • _resolve_container_spec() filters input dict to only pass known keys (image, artifact_url, environment) to avoid Pydantic validation errors with extra='forbid'

tests/unit/sagemaker/serve/test_resolve_ic_params.py

  • Comprehensive unit tests for _resolve_data_cache_config() and _resolve_container_spec() helper methods
  • Tests for _apply_optional_ic_params() static helper (all combinations)
  • Tests for core wiring logic in _deploy_core_endpoint() verifying variant_name, data_cache_config, base_inference_component_name, container, and absence of extra keys
  • Tests for _update_inference_component() wiring
  • Tests for deploy() parameter forwarding logic (kwargs population)

sagemaker-serve/tests/integ/test_ic_deploy_params_integration.py (new)

  • Test 1: Deploy JumpStart model via IC path with data_cache_config={"enable_caching": True} and custom variant_name. Verifies via describe_inference_component() that both are correctly set.
  • Test 2: Deploy fine-tuned model via model customization path with data_cache_config. Verifies EnableCaching=True and that

Comments reviewed: 5
Files modified: sagemaker-serve/tests/integ/test_ic_deploy_params_integration.py

  • sagemaker-serve/tests/integ/test_ic_deploy_params_integration.py: New integration test file verifying data_cache_config and variant_name work end-to-end through ModelBuilder.deploy()

4 participants