To deploy the solution, you must have Administrator (or Power User) permissions to package the CloudFormation templates, upload templates in your Amazon S3 bucket, and run the deployment commands.
You must also have AWS CLI. If you do not have it, see Installing, updating, and uninstalling the AWS CLI. If you would like to use the multi-account model deployment option, you need access to minimum two AWS accounts, recommended three accounts for development, staging and production environments.
β At the time of writing there is a limit of one SageMaker domain per region per account. You cannot deploy this solution if you have already a domain in the deployment region in your AWS account. Chose a different region or delete the existing domain in the deployment region.
To check if you have a domain in the deployment region, run the following CLI command:
aws sagemaker list-domainsIf any of domains in the returned list is in the status InService, you cannot deploy a new domain in this region.
β By default you have a limit of five Amazon Virtual Private Cloud (Amazon VPC) VPCs per region per account. Check how many VPCs you have in the deployment region of your AWS account. You must have not more than four VPCs to be able to deploy the solution.
You can run the following CLI command to see all VPCs in your account in the deployment region:
aws ec2 describe-vpcsThe solution deploys a VPC in your AWS account. This VPC is not deleted if you remove solution's CloudFormation stack or if the deployment fails. Subsequent failures of the deployment or re-deployments might reach the limit of five VPCs in the deployment region and any new deployment of the solution will fail. Always check how many VPCs you have in the deployment region in the main and target accounts before launching the solution.
SageMaker uses two execution global roles AmazonSageMakerServiceCatalogProductsLaunchRole and AmazonSageMakerServiceCatalogProductsUseRole to launch SageMaker portfolio of products and operate a CI/CD model deployment pipeline. Refer to SageMaker documentation to understand these roles.
β If you have already deployed a SageMaker domain in your AWS account in any region, you might already have these roles. In this case you must delete these two roles before you start solution deployment. The solution will create two IAM roles with the same names but different IAM policies attached.
Please check if these roles exist in your account before deployment.
To check if AmazonSageMakerServiceCatalogProductsLaunchRole role exist, run the following CLI command in the deployment account:
aws iam get-role --role-name AmazonSageMakerServiceCatalogProductsLaunchRoleIf role doesn't exist, you get a NoSuchEntity error, otherwise you get information about the role.
To check if AmazonSageMakerServiceCatalogProductsUseRole role exist, run the following CLI command:
aws iam get-role --role-name AmazonSageMakerServiceCatalogProductsUseRoleIf any of the roles exists in your AWS account, delete the existing role or roles, otherwise the deployment of the solution might fail or MLOps projects will not operate properly.
Please go through these step-by-step instructions to package and upload the solution templates into a S3 bucket for the deployment.
If you would like to run a security scan on the CloudFormation templates using cfn_nag (recommended), you have to install cfn_nag:
brew install ruby brew-gem
brew gem install cfn-nagTo initiate the security scan, run the following command:
make cfn_nag_scanYou have a choice of different independent deployment options using the delivered CloudFormation templates:
- Easiest option Data science environment quickstart: deploy an end-to-end data science environment with the majority of options set to default values. This deployment type supports single-account model deployment workflow only. You can change only a few deployment options
- Most common and flexible option
Two-step deployment via CloudFormation: deploy the core infrastructure in the first step and then deploy a data science environment, both as CloudFormation templates. CLI
aws cloudformation create-stackis used for deployment. You can change any deployment option - Most advanced option
Two-step deployment via CloudFormation and AWS Service Catalog: deploy the core infrastructure in the first step via
aws cloudformation create-stackand then provision a data science environment via AWS Service Catalog. You can change any deployment option
The following sections give step-by-step deployment instructions for each of the options.
You can also find all CLI commands in the delivered shell scripts in the project folder test.
The following diagram visualizes the deployment options:
This special type of deployment is designed for an environment, where all iam: API calls, such as role and policy creation, are separated from the main deployment. All IAM roles for users and services and related IAM permission policies should be created as part of a separate process following the separation of duties principle.
The IAM part can be deployed using the delivered CloudFormation templates or completely separated out-of-stack in your own process.
You provide the ARNs for the IAM roles as CloudFormation template parameters to deploy the data science environment.
If you use this deployment option, at no point you need any of iam:* permissions in your IAM permission policy to deploy a data science environment.
See Appendix B
β Skip this section if you don't use multi-account setup and jump to Deployment types.
In the following instructions we use the these concepts:
- Main account: AWS account where you deploy the data science environment with SageMaker domain and studio
- Target accounts: Staging and production AWS accounts used for model deployment
Multi-account model deployment requires VPC infrastructure and specific execution roles in the target accounts. The provisioning of the infrastructure and the roles is done automatically during the deployment of the data science environment as a part of the overall deployment process. To enable multi-account setup you must provide the staging and production organizational unit (OUs) IDs OR staging and production account lists as CloudFormation parameters for the deployment.
This diagram shows how CloudFormation stack sets are used to deploy the needed infrastructure to the target accounts.
Two stack sets - one for the VPC infrastructure and another for the roles - are deployed for each environment type, staging and production.
One-off setup is needed to enable multi-account model deployment workflow with SageMaker MLOps projects.
β You don't need to perform this setup if you are going to use single-account deployment only.
The provisioning of a data science environment uses CloudFormation stack set to deploy the required IAM roles and VPC infrastructure into the target accounts.
The solution uses SELF_MANAGED stack set permission model and needs to bootstrap two IAM roles:
AdministratorRolein the development account where you are going to deploy the SageMaker domainSetupStackSetExecutionRolein each of the target accounts
These roles must exist before you launch the solution deployment. The AdministratorRole is automatically created during the solution deployment. For the SetupStackSetExecutionRole you can use the delivered CloudFormation template env-iam-setup-stacksest-role.yaml or your own process of provisioning of an IAM role.
To provision SetupStackSetExecutionRole launch the following CloudFormation template in each of the target accounts, but not in the main account:
# STEP 1:
# SELF_MANAGED stack set permission model:
# Deploy a stack set execution role to _EACH_ of the target accounts in both staging and prod OUs
# This stack set execution role is used to deploy the target accounts stack sets in env-main.yaml
# ENV_NAME needs to be unique to be able to create s3 buckets for the environment and consistent with service catalog
# !!!!!!!!!!!! RUN THIS COMMAND IN EACH OF THE TARGET ACCOUNTS !!!!!!!!!!!!
ENV_NAME="sm-mlops" # Or use your own unique environment names like "sm-mlops-$DEPARTMENT_NAME"
ENV_TYPE=# use your own consistent environment stage names like "staging" and "prod"
ADMIN_ACCOUNT_ID=<DATA SCIENCE DEVELOPMENT ACCOUNT ID>
#Β Change or leave on default
SETUP_STACKSET_ROLE_NAME=$ENV_NAME-setup-stackset-execution-role
STACK_NAME=$ENV_NAME-setup-stackset-role
# Delete stack if it exists
aws cloudformation delete-stack --stack-name $STACK_NAME
aws cloudformation deploy \
--template-file cfn_templates/env-iam-setup-stackset-role.yaml \
--stack-name $STACK_NAME \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides \
EnvName=$ENV_NAME \
EnvType=$ENV_TYPE \
StackSetExecutionRoleName=$SETUP_STACKSET_ROLE_NAME \
AdministratorAccountId=$ADMIN_ACCOUNT_ID
aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--output table \
--query "Stacks[0].Outputs[*].[OutputKey, OutputValue]"Take a note of StackSetExecutionRoleName in the stack output. The name, not ARN, of the provisioned IAM role must be passed to the env-main.yaml template or used in Service Catalog-based deployment as SetupStackSetExecutionRoleName parameter.
This step is only needed if you use AWS Organizations setup.
You must enable AWS CloudFormation StackSets service integration with your AWS Organizations. When you enable integration, you allow the specified service to create a service-linked role in all the accounts in your organization. This allows the service to perform operations on your behalf in your organization and its accounts.
You can check the integration status and enable it in the AWS Organizations Services console or run the following CLI commands in the management account of your AWS Organizations.
- Check if AWS CloudFormation StackSet service is enabled:
aws organizations list-aws-service-access-for-organization \
--query 'EnabledServicePrincipals[?ServicePrincipal==`member.org.stacksets.cloudformation.amazonaws.com`]'If the command returns an empty array [], you must enable StackSets. If you get AccessDeniedException, make sure you run the command in the management account of the AWS Organizations and you have the corresponding permissions.
- If stack sets are not enabled, run the following command:
aws organizations enable-aws-service-access \
--service-principal=member.org.stacksets.cloudformation.amazonaws.comA delegated administrator account must be registered in order to enable ListAccountsForParent AWS Organization API call. If the data science account is already the management account in the AWS Organizations, this step must be skipped.
Run the following CLI command under administrator credentials in the AWS Organization management account:
# STEP 2:
# Register a delegated administrator to enable AWS Organizations API permission for non-management account
#Β Must be run under administrator in the AWS Organizations _management account_
ADMIN_ACCOUNT_ID=<DATA SCIENCE DEVELOPMENT ACCOUNT ID>
aws organizations register-delegated-administrator \
--service-principal=member.org.stacksets.cloudformation.amazonaws.com \
--account-id=$ADMIN_ACCOUNT_IDCheck that the development account is listed as a delegated administrator account:
aws organizations list-delegated-administrators \
--service-principal=member.org.stacksets.cloudformation.amazonaws.comThe solution is designed for multi-region deployment. You can deploy end-to-end stack in any region of the same AWS account. The following limitations and considerations apply:
- The shared IAM roles (
DSAdministratorRole,SageMakerDetectiveControlExecutionRole,SCLaunchRole) are created each time you deploy a new core infrastructure (core-main) or "quickstart" (data-science-environment-quickstart) stack. They created with<StackName>-<RegionName>prefix and designed to be unique within your end-to-end data science environment. For example, if you deploy one stack set (including core infrastructure and team data science environment) in one region and another stack in another region, these two stacks do not share any IAM roles and any users assuming any persona roles have an independent set of permissions per stack set and per region. - The environment IAM roles (
DSTeamAdministratorRole,DataScientistRole,SageMakerExecutionRole,SageMakerPipelineExecutionRole,SCProjectLaunchRole,SageMakerModelExecutionRole) are created with unique names. Each deployment of a new data science environment (via CloudFormation or via AWS Service Catalog) creates a set of unique roles. - SageMaker Studio uses two pre-defined roles
AmazonSageMakerServiceCatalogProductsLaunchRoleandAmazonSageMakerServiceCatalogProductsUseRole. These roles are global for the AWS account and created by the first deployment of core infrastructure. These two roles haveRetaindeletion policy and are not deleted when you delete the stack which has created these roles. These roles are global for the AWS account.
The following three sections describes each deployment type and deployment use case in detail.
This option deploys the end-to-end infrastructure and a data science Environment in one go. You can change only few deployment options. The majority of the options are set to their default values.
This is the recommended options for majority of deployments. Use other options only if you need to test or model a specific use case, for example multi-account model deployment.
π Use this option if you want to provision a completely new set of the infrastructure and do not want to parametrize the deployment.
β With Quickstart deployment type you can uses single-account model deployment only. Do not select YES for MultiAccountDeployment parameter for model deploy SageMaker project template:
The only deployment options you can change are:
CreateSharedServices: defaultNO. Set toYESif you want to provision a shared services VPC with a private PyPI mirror (not implemented at this stage)VPCIDR: default10.0.0.0/16. CIDR block for the new VPC- Private and public subnets CIDR blocks: default
10.0.0.0/19
Make sure you specify the CIDR blocks which do not conflict with your existing network IP ranges.
β You cannot use existing VPC or existing IAM roles to deploy this stack. The stack provisions a new own set of network and IAM resources.
Initiate the stack deployment with the following command. Use the S3 bucket name you used to upload CloudFormation templates in Package CloudFormation templates section.
[[ unset"${S3_BUCKET_NAME}" == "unset" ]] && echo "\e[1;31mERROR: S3_BUCKET_NAME is not set" || echo "\e[1;32mS3_BUCKET_NAME is set to ${S3_BUCKET_NAME}"
STACK_NAME="ds-quickstart"
ENV_NAME="sm-mlops"
aws cloudformation create-stack \
--template-url https://s3.$AWS_DEFAULT_REGION.amazonaws.com/$S3_BUCKET_NAME/sagemaker-mlops/data-science-environment-quickstart.yaml \
--region $AWS_DEFAULT_REGION \
--stack-name $STACK_NAME \
--disable-rollback \
--capabilities CAPABILITY_NAMED_IAM \
--parameters \
ParameterKey=EnvName,ParameterValue=$ENV_NAME \
ParameterKey=EnvType,ParameterValue=dev \
ParameterKey=SeedCodeS3BucketName,ParameterValue=$S3_BUCKET_NAMEThe full end-to-end deployment takes about 25 minutes.
Refer to data science environment clean up section.
With this option you provision a data science environment in two steps, each with its own CloudFormation template. You can control all deployment parameters.
π Use this option if you want to parametrize every aspect of the deployment based on your specific requirements and environment.
β You can select your existing VPC and network resources (subnets, NAT gateways, route tables) and existing IAM resources to be used for stack set deployment. Set the corresponding CloudFormation parameters to names and ARNs or your existing resources.
β You must specify the valid OU IDs for the OrganizationalUnitStagingId/OrganizationalUnitProdId or StagingAccountList/ProductionAccountList parameters for the env-main.yaml template to enable multi-account model deployment.
You can use the provided shell script to run this deployment type or follow the commands below.
β Make sure you package the CloudFormation templates as described in these instructions.
In this step you deploy the shared core infrastructure into your AWS Account. The stack (core-main.yaml) provisions:
- Shared IAM roles for data science personas and services (optional if you bring your own IAM roles)
- A shared services VPC and related networking resources (optional if you bring your own network configuration)
- An AWS Service Catalog portfolio to provide a self-service deployment for the data science administrator user role
- An ECS Fargate cluster to run a private PyPi mirror (not implemented at this stage)
- Security guardrails for your data science environment (detective controls are not implemented at this stage)
The deployment options you can use are:
CreateIAMRoles: defaultYES. Set toNOif you have created the IAM roles outside of the stack (e.g. via a separate process) - such as "Bring Your Own IAM Role (BYOR IAM)" use caseCreateSharedServices: defaultNO. Set toYESif you would like to create a shared services VPC and an ECS Fargate cluster for a private PyPi mirror (not implemented at this stage)CreateSCPortfolio: defaultYES. Set toNOif you don't want to to deploy an AWS Service Catalog portfolio with data science environment productsDSAdministratorRoleArn: required ifCreateIAMRoles=NO, otherwise automatically provisionedSCLaunchRoleArn: required ifCreateIAMRoles=NO, otherwise automatically provisionedSecurityControlExecutionRoleArn: required ifCreateIAMRoles=NO, otherwise automatically provisioned
The following command uses the default values for the deployment options. You can specify parameters via ParameterKey=<ParameterKey>,ParameterValue=<Value> pairs in the aws cloudformation create-stack call:
STACK_NAME="sm-mlops-core"
[[ unset"${S3_BUCKET_NAME}" == "unset" ]] && echo "\e[1;31mERROR: S3_BUCKET_NAME is not set" || echo "\e[1;32mS3_BUCKET_NAME is set to ${S3_BUCKET_NAME}"
aws cloudformation create-stack \
--template-url https://s3.$AWS_DEFAULT_REGION.amazonaws.com/$S3_BUCKET_NAME/sagemaker-mlops/core-main.yaml \
--region $AWS_DEFAULT_REGION \
--stack-name $STACK_NAME \
--disable-rollback \
--capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM \
--parameters \
ParameterKey=StackSetName,ParameterValue=$STACK_NAMEThe previous command launches the stack deployment and returns the StackId. You can track the stack deployment status in AWS CloudFormation console or in your terminal with the following command:
aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--query "Stacks[0].StackStatus"After a successful stack deployment, the status turns to from CREATE_IN_PROGRESS to CREATE_COMPLETE. You can print the stack output:
aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--output table \
--query "Stacks[0].Outputs[*].[OutputKey, OutputValue]"The step 2 CloudFormation template (env-main.yaml) provides two deployment options:
- Deploy Studio into a new VPC: This option provisions a new AWS network infrastructure consisting of:
- VPC
- private subnet in each Availability Zone (AZ)
- optional public subnet in each AZ. The public subnets are provisioned only if you chose to create NAT gateways
- VPC endpoints to access public AWS services and Amazon S3 buckets for data science projects
- security groups
- optional NAT gateway in each AZ
- if you select the NAT gateway option, the template creates an internet gateway and attaches it to the VPC
- routing tables and routes for private and public subnets
You specify up to four AZs and CIDR blocks for VPC and each of the subnets.
After provisioning the network infrastructure, the solution deploys Studio into this VPC.
- Deploy Studio into an existing VPC: This option provisions Studio in your existing AWS network infrastructure. You have several options to choose between existing or create new network resources:
- VPC: you must provide a valid existing VPC Id
- Subnets: you can choose between:
- providing existing subnet CIDR blocks (set
CreatePrivateSubnetstoNO) - in this case no new subnets are provisioned and NAT gateway option is not available. All SageMaker resources are deployed into your existing VPC and private subnets. You use your existing NAT (if available) to access internet from the private subnets - provisioning new private (set
CreatePrivateSubnetstoYES) and optional (only if the NAT gateway option is selected,CreateNATGateways=YES) public subnets. The deployment creates new subnets with specified CIDR blocks inside your existing VPC.
- providing existing subnet CIDR blocks (set
You must specify up to four AZs you would like to deploy the network resources into.
β The CloudFormation template creates a new internet gateway attaches it to the VPC in "existing VPC" scenario if you select the NAT gateway option by setting CreateNATGateways to YES. The stack creation fails if there is an internet gateway already attached to the existing VPC and you select the NAT gateway option.
Example of the existing VPC infrastructure:
The data science environment deployment provisions the following resources in your AWS account:
- environment-specific IAM roles (optional, if
CreateEnvironmentIAMRolesset toYES) - a VPC with all network infrastructure for the environment (optional) - see the considerations above
- VPC endpoints to access the environment's Amazon S3 buckets and AWS public services via private network
- AWS KMS keys for data encryption
- two S3 buckets for environment data and model artifacts
- AWS Service Catalog portfolio with environment-specific products
- Studio domain and default user profile
If you choose the multi-account model deployment option by providing values for OrganizationalUnitStagingId/OrganizationalUnitProdId or StagingAccountList/ProductionAccountList, the deployment provisions the following resources in the target accounts:
- VPC with a private subnet in each of the AZs, no internet connectivity
- Security groups for SageMaker model hosting and VPC endpoints
- Execution roles for stack set operations and SageMaker models
You can change any deployment options via CloudFormation parameters for core-main.yaml and env-main.yaml templates.
Run command providing the deployment options for your environment. The following command uses the minimal set of the options:
# Use default or provide your own names
STACK_NAME="sm-mlops-env"
ENV_NAME="sm-mlops"
[[ unset"${S3_BUCKET_NAME}" == "unset" ]] && echo "\e[1;31mERROR: S3_BUCKET_NAME is not set" || echo "\e[1;32mS3_BUCKET_NAME is set to ${S3_BUCKET_NAME}"
aws cloudformation create-stack \
--template-url https://s3.$AWS_DEFAULT_REGION.amazonaws.com/$S3_BUCKET_NAME/sagemaker-mlops/env-main.yaml \
--region $AWS_DEFAULT_REGION \
--stack-name $STACK_NAME \
--disable-rollback \
--capabilities CAPABILITY_NAMED_IAM \
--parameters \
ParameterKey=EnvName,ParameterValue=$ENV_NAME \
ParameterKey=EnvType,ParameterValue=dev \
ParameterKey=AvailabilityZones,ParameterValue=${AWS_DEFAULT_REGION}a\\,${AWS_DEFAULT_REGION}b \
ParameterKey=NumberOfAZs,ParameterValue=2 \
ParameterKey=SeedCodeS3BucketName,ParameterValue=$S3_BUCKET_NAMEIf you would like to use multi-account model deployment, you must provide the valid values for AWS Organizations OU IDs or account lists and the name for the SetupStackSetExecutionRole from env-iam-setup-stacksest-role.yaml stack output.
STACK_NAME="sm-mlops-env"
ENV_NAME="sm-mlops" # must be the same environment name you used for multi-account bootstrapping for the stack set execution role
STAGING_OU_ID=<OU id>
PROD_OU_ID=<OU id>
[[ unset"${S3_BUCKET_NAME}" == "unset" ]] && echo "\e[1;31mERROR: S3_BUCKET_NAME is not set" || echo "\e[1;32mS3_BUCKET_NAME is set to ${S3_BUCKET_NAME}"
[[ unset"${STAGING_OU_ID}" == "unset" ]] && echo "\e[1;31mERROR: STAGING_OU_ID is not set" || echo "\e[1;32mSTAGING_OU_ID is set to ${STAGING_OU_ID}"
[[ unset"${PROD_OU_ID}" == "unset" ]] && echo "\e[1;31mERROR: PROD_OU_ID is not set" || echo "\e[1;32mPROD_OU_ID is set to ${PROD_OU_ID}"
# provide the actual name of the role if you don't use the default name
SETUP_STACKSET_ROLE_NAME=$ENV_NAME-setup-stackset-execution-role
aws cloudformation create-stack \
--template-url https://s3.$AWS_DEFAULT_REGION.amazonaws.com/$S3_BUCKET_NAME/sagemaker-mlops/env-main.yaml \
--region $AWS_DEFAULT_REGION \
--stack-name $STACK_NAME \
--disable-rollback \
--capabilities CAPABILITY_NAMED_IAM \
--parameters \
ParameterKey=EnvName,ParameterValue=$ENV_NAME \
ParameterKey=EnvType,ParameterValue=dev \
ParameterKey=AvailabilityZones,ParameterValue=${AWS_DEFAULT_REGION}a\\,${AWS_DEFAULT_REGION}b \
ParameterKey=NumberOfAZs,ParameterValue=2 \
ParameterKey=SeedCodeS3BucketName,ParameterValue=$S3_BUCKET_NAME \
ParameterKey=OrganizationalUnitStagingId,ParameterValue=$STAGING_OU_ID \
ParameterKey=OrganizationalUnitProdId,ParameterValue=$PROD_OU_ID \
ParameterKey=SetupStackSetExecutionRoleName,ParameterValue=$SETUP_STACKSET_ROLE_NAMESTACK_NAME="sm-mlops-env"
ENV_NAME="sm-mlops" # must be the same environment name you used for multi-account bootstrapping for the stack set execution role
STAGING_ACCOUNTS=<comma-delimited account list>
PROD_ACCOUNTS=<comma-delimited account list>
[[ unset"${S3_BUCKET_NAME}" == "unset" ]] && echo "\e[1;31mERROR: S3_BUCKET_NAME is not set" || echo "\e[1;32mS3_BUCKET_NAME is set to ${S3_BUCKET_NAME}"
[[ unset"${STAGING_ACCOUNTS}" == "unset" ]] && echo "\e[1;31mERROR: STAGING_ACCOUNTS is not set" || echo "\e[1;32mSTAGING_ACCOUNTS is set to ${STAGING_ACCOUNTS}"
[[ unset"${PROD_ACCOUNTS}" == "unset" ]] && echo "\e[1;31mERROR: PROD_ACCOUNTS is not set" || echo "\e[1;32mPROD_ACCOUNTS is set to ${PROD_ACCOUNTS}"
# provide the actual name of the role if you don't use the default name
SETUP_STACKSET_ROLE_NAME=$ENV_NAME-setup-stackset-execution-role
aws cloudformation create-stack \
--template-url https://s3.$AWS_DEFAULT_REGION.amazonaws.com/$S3_BUCKET_NAME/sagemaker-mlops/env-main.yaml \
--region $AWS_DEFAULT_REGION \
--stack-name $STACK_NAME \
--disable-rollback \
--capabilities CAPABILITY_NAMED_IAM \
--parameters \
ParameterKey=EnvName,ParameterValue=$ENV_NAME \
ParameterKey=EnvType,ParameterValue=dev \
ParameterKey=AvailabilityZones,ParameterValue=${AWS_DEFAULT_REGION}a\\,${AWS_DEFAULT_REGION}b \
ParameterKey=NumberOfAZs,ParameterValue=2 \
ParameterKey=SeedCodeS3BucketName,ParameterValue=$S3_BUCKET_NAME \
ParameterKey=StagingAccountList,ParameterValue=$STAGING_ACCOUNTS \
ParameterKey=ProductionAccountList,ParameterValue=$PROD_ACCOUNTS \
ParameterKey=SetupStackSetExecutionRoleName,ParameterValue=$SETUP_STACKSET_ROLE_NAMEThe previous command launches the stack deployment and returns the StackId. You can track the stack deployment status in AWS CloudFormation console or in your terminal with the following command:
aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--query "Stacks[0].StackStatus"After a successful stack deployment, the status turns to from CREATE_IN_PROGRESS to CREATE_COMPLETE. You can print the stack output:
aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--output table \
--query "Stacks[0].Outputs[*].[OutputKey, OutputValue]"Refer to two step deployment via CloudFormation clean up section.
This deployment option first deploys the core infrastructure including the AWS Service Catalog portfolio of data science products. In the second step, the data science Administrator deploys a data science environment via the AWS Service Catalog.
π Use this option if you want to model the end user experience in provisioning a data science environment via AWS Service Catalog
β You can select your existing VPC and network resources (subnets, NAT gateways, route tables) and existing IAM resources to be used for stack set deployment. Set the corresponding CloudFormation and AWS Service Catalog product parameters to names and ARNs or your existing resources.
Same as Step 1 from Two-step deployment via CloudFormation.
After the base infrastructure is provisioned, data scientists and other users must assume the DS Administrator IAM role (AssumeDSAdministratorRole) via link in the CloudFormation output. In this role, the users can browse the AWS Service Catalog and then provision a secure Studio environment.
First, print the output from the stack deployment in Step 1:
aws cloudformation describe-stacks \
--stack-name $STACK_NAME \
--output table \
--query "Stacks[0].Outputs[*].[OutputKey, OutputValue]"Copy and paste the AssumeDSAdministratorRole link to a web browser and switch role to DS Administrator.
Go to AWS Service Catalog in the AWS console and select Products on the left pane:
You see the list of available products for your user role:
Click on the product name and and then on the Launch product on the product page:
Fill the product parameters with values specific for your environment. Provide the valid values for OU ids OR for the staging and production account lists and the name for the SetupStackSetExecutionRole if you would like to enable multi-account model deployment, otherwise keep these parameters empty.
There are only two required parameters you must provide in the product launch console:
- S3 bucket name with MLOps seed code - use the S3 bucket where you packaged the CloudFormation templates, see the
S3_BUCKET_NAMEshell variable:
- Availability Zones: you need at least two AZs for SageMaker model deployment workflow:

Wait until AWS Service Catalog finishes the provisioning of the data science environment stack and the product status becomes Available. The data science environment provisioning takes about 20 minutes to complete.
Now you provisioned the data science environment and can start working with it.
Refer to two step deployment via CloudFormation and AWS Service Catalog clean up section.
After you successfully deploy all CloudFormation stacks and created the needed infrastructure, you can start Studio and start working with the delivered notebooks.
To launch Studio you must go to SageMaker console, click Control panel and click on the Studio in Launch app button on the Users panel:
To use the provided notebooks you must clone the source code repository into your Studio environment. Open a system terminal in Studio in the Launcher window:
Run the following command in the terminal:
git clone https://github.com/aws-samples/amazon-sagemaker-secure-mlops.gitThis command downloads and saves the code repository in your home directory in Studio.
Now go to the file browser and open 00-setup notebook:
The first start of the notebook kernel on a new KernelGateway app takes about 5 minutes. Continue with the setup instructions in the notebook after Kernel is ready.
β You have to run the whole notebook to setup your SageMaker environment.
Now you setup your data science environment and can start experimenting.
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: MIT-0









