Commit 77f91e9

Author: AWS
Amazon SageMaker Service Update: This release adds b300 and g7e instance types for SageMaker inference endpoints.
1 parent 60660d2 commit 77f91e9

2 files changed: 16 additions & 2 deletions

New file (6 additions & 0 deletions):
@@ -0,0 +1,6 @@
+{
+    "type": "feature",
+    "category": "Amazon SageMaker Service",
+    "contributor": "",
+    "description": "This release adds b300 and g7e instance types for SageMaker inference endpoints."
+}

services/sagemaker/src/main/resources/codegen-resources/service-2.json

10 additions & 2 deletions:
@@ -38041,7 +38041,7 @@
       },
       "InferenceAmiVersion":{
         "shape":"ProductionVariantInferenceAmiVersion",
-        "documentation":"<p>Specifies an option from a collection of preconfigured Amazon Machine Image (AMI) images. Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads.</p> <p>By selecting an AMI version, you can ensure that your inference environment is compatible with specific software requirements, such as CUDA driver versions, Linux kernel versions, or Amazon Web Services Neuron driver versions.</p> <p>The AMI version names, and their configurations, are the following:</p> <dl> <dt>al2-ami-sagemaker-inference-gpu-2</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535</p> </li> <li> <p>CUDA version: 12.2</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-gpu-2-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535</p> </li> <li> <p>CUDA version: 12.2</p> </li> <li> <p>NVIDIA Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-gpu-3-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 550</p> </li> <li> <p>CUDA version: 12.4</p> </li> <li> <p>NVIDIA Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-neuron-2</dt> <dd> <ul> <li> <p>Accelerator: Inferentia2 and Trainium</p> </li> <li> <p>Neuron driver version: 2.19</p> </li> </ul> </dd> </dl>"
+        "documentation":"<p>Specifies an option from a collection of preconfigured Amazon Machine Image (AMI) images. Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads.</p> <p>By selecting an AMI version, you can ensure that your inference environment is compatible with specific software requirements, such as CUDA driver versions, Linux kernel versions, or Amazon Web Services Neuron driver versions.</p> <p>The AMI version names, and their configurations, are the following:</p> <dl> <dt>al2-ami-sagemaker-inference-gpu-2</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535</p> </li> <li> <p>CUDA version: 12.2</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-gpu-2-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535</p> </li> <li> <p>CUDA version: 12.2</p> </li> <li> <p>NVIDIA Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-gpu-3-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 550</p> </li> <li> <p>CUDA version: 12.4</p> </li> <li> <p>NVIDIA Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> <dt>al2023-ami-sagemaker-inference-gpu-4-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 580</p> </li> <li> <p>CUDA version: 13.0</p> </li> <li> <p>NVIDIA Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-neuron-2</dt> <dd> <ul> <li> <p>Accelerator: Inferentia2 and Trainium</p> </li> <li> <p>Neuron driver version: 2.19</p> </li> </ul> </dd> </dl>"
       },
       "CapacityReservationConfig":{
         "shape":"ProductionVariantCapacityReservationConfig",
@@ -38132,7 +38132,8 @@
         "al2-ami-sagemaker-inference-gpu-2",
         "al2-ami-sagemaker-inference-gpu-2-1",
         "al2-ami-sagemaker-inference-gpu-3-1",
-        "al2-ami-sagemaker-inference-neuron-2"
+        "al2-ami-sagemaker-inference-neuron-2",
+        "al2023-ami-sagemaker-inference-gpu-4-1"
       ]
     },
     "ProductionVariantInstanceType":{
@@ -38266,6 +38267,12 @@
         "ml.g6e.16xlarge",
         "ml.g6e.24xlarge",
         "ml.g6e.48xlarge",
+        "ml.g7e.2xlarge",
+        "ml.g7e.4xlarge",
+        "ml.g7e.8xlarge",
+        "ml.g7e.12xlarge",
+        "ml.g7e.24xlarge",
+        "ml.g7e.48xlarge",
         "ml.p4d.24xlarge",
         "ml.c7g.large",
         "ml.c7g.xlarge",
@@ -38400,6 +38407,7 @@
         "ml.c6in.24xlarge",
         "ml.c6in.32xlarge",
         "ml.p6-b200.48xlarge",
+        "ml.p6-b300.48xlarge",
         "ml.p6e-gb200.36xlarge",
         "ml.p5.4xlarge"
       ]
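
Until a regenerated SDK release exposes these enum values as typed constants, callers pass instance types to the SageMaker APIs as plain strings. A minimal sketch, assuming nothing beyond the Java standard library (the `NewInstanceTypes` class and its `isNewType` helper are hypothetical, for illustration only), listing the instance types this commit adds:

```java
import java.util.List;

// Hypothetical helper, not part of the AWS SDK: checks whether an
// instance type string is one of the values added in this commit.
public class NewInstanceTypes {
    // New ProductionVariantInstanceType enum values from this commit
    static final List<String> ADDED = List.of(
            "ml.g7e.2xlarge",
            "ml.g7e.4xlarge",
            "ml.g7e.8xlarge",
            "ml.g7e.12xlarge",
            "ml.g7e.24xlarge",
            "ml.g7e.48xlarge",
            "ml.p6-b300.48xlarge");

    static boolean isNewType(String instanceType) {
        return ADDED.contains(instanceType);
    }

    public static void main(String[] args) {
        System.out.println(isNewType("ml.g7e.12xlarge")); // prints true
        System.out.println(isNewType("ml.g6e.12xlarge")); // prints false
    }
}
```

The new `al2023-ami-sagemaker-inference-gpu-4-1` AMI version is supplied the same way, as the string value of the variant's `InferenceAmiVersion` field.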
