import sagemaker
from sagemaker.mxnet.model import MXNetModel

sess = sagemaker.Session()
s3_model_path = "s3://{}/{}".format(sess.default_bucket(), s3_bucket_name)

sagemaker_model = MXNetModel(model_data=s3_model_path,
                             image="kdd20200813:latest",  # "{}/{}".format(ecr_image_name, "latest")
                             role=sagemaker.get_execution_role(),
                             py_version='py3',
                             framework_version="1.6",
                             entry_point='serve.py',
                             source_dir='.')
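In local mode, deploy() only succeeds once the serving container answers its health check on the serving port (8080 by default). While the container is up, you can probe that endpoint directly from the notebook to see whether it is the health check or the container itself that fails; the helper below is a hypothetical debugging aid (its name and defaults are my assumptions, not SDK API):

```python
# Hypothetical helper: probe the local serving container's health-check
# endpoint (SageMaker local mode waits on GET /ping before going InService).
from urllib.request import urlopen
from urllib.error import URLError


def ping(url="http://localhost:8080/ping", timeout=2):
    """Return the HTTP status code, or None if the container is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except (URLError, OSError):
        return None
```

If `ping()` returns None even though the MMS logs show "Model server started.", the container may be listening on a different port than the one local mode polls.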
Attaching to tmp454nevxt_algo-1-qjr8q_1
algo-1-qjr8q_1 | Warning: Calling MMS with mxnet-model-server. Please move to multi-model-server.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,467 [INFO ] main com.amazonaws.ml.mms.ModelServer -
algo-1-qjr8q_1 | MMS Home: /usr/local/lib/python3.6/site-packages
algo-1-qjr8q_1 | Current directory: /
algo-1-qjr8q_1 | Temp directory: /home/model-server/tmp
algo-1-qjr8q_1 | Number of GPUs: 0
algo-1-qjr8q_1 | Number of CPUs: 8
algo-1-qjr8q_1 | Max heap size: 13646 M
algo-1-qjr8q_1 | Python executable: /usr/local/bin/python3.6
algo-1-qjr8q_1 | Config file: /etc/sagemaker-mms.properties
algo-1-qjr8q_1 | Inference address: http://0.0.0.0:8080
algo-1-qjr8q_1 | Management address: http://0.0.0.0:8080
algo-1-qjr8q_1 | Model Store: /.sagemaker/mms/models
algo-1-qjr8q_1 | Initial Models: ALL
algo-1-qjr8q_1 | Log dir: /logs
algo-1-qjr8q_1 | Metrics dir: /logs
algo-1-qjr8q_1 | Netty threads: 0
algo-1-qjr8q_1 | Netty client threads: 0
algo-1-qjr8q_1 | Default workers per model: 8
algo-1-qjr8q_1 | Blacklist Regex: N/A
algo-1-qjr8q_1 | Maximum Response Size: 6553500
algo-1-qjr8q_1 | Maximum Request Size: 6553500
algo-1-qjr8q_1 | Preload model: false
algo-1-qjr8q_1 | Prefer direct buffer: false
algo-1-qjr8q_1 | 2020-08-13 20:05:51,555 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-9000-model
algo-1-qjr8q_1 | 2020-08-13 20:05:51,705 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - model_service_worker started with args: --sock-type unix --sock-name /home/model-server/tmp/.mms.sock.9000 --handler sagemaker_mxnet_serving_container.handler_service --model-path /.sagemaker/mms/models/model --model-name model --preload-model false --tmp-dir /home/model-server/tmp
algo-1-qjr8q_1 | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - [PID] 71
algo-1-qjr8q_1 | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - MMS worker started.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Python runtime: 3.6.10
algo-1-qjr8q_1 | 2020-08-13 20:05:51,707 [INFO ] main com.amazonaws.ml.mms.wlm.ModelManager - Model model loaded.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,713 [INFO ] main com.amazonaws.ml.mms.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,725 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,725 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1 | 2020-08-13 20:05:51,797 [INFO ] main com.amazonaws.ml.mms.ModelServer - Inference API bind to: http://0.0.0.0:8080
algo-1-qjr8q_1 | Model server started.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,802 [WARN ] pool-2-thread-1 com.amazonaws.ml.mms.metrics.MetricCollector - worker pid is not available yet.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,811 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,813 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,815 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,817 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1 | 2020-08-13 20:05:51,819 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1 | 2020-08-13 20:05:54,764 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000000-17ec808cce3dc3f5-fc4cfd4e
algo-1-qjr8q_1 | 2020-08-13 20:05:54,806 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2901
algo-1-qjr8q_1 | 2020-08-13 20:05:54,809 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-7
algo-1-qjr8q_1 | 2020-08-13 20:05:54,809 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000003-7d6a408cce3dc3f5-98e5582f
algo-1-qjr8q_1 | 2020-08-13 20:05:54,823 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2918
algo-1-qjr8q_1 | 2020-08-13 20:05:54,823 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-1
algo-1-qjr8q_1 | 2020-08-13 20:05:54,845 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000002-fe83808cce3dc3f5-c2242e3f
algo-1-qjr8q_1 | 2020-08-13 20:05:54,846 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2941
algo-1-qjr8q_1 | 2020-08-13 20:05:54,850 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-8
algo-1-qjr8q_1 | 2020-08-13 20:05:54,849 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000001-1dda808cce3dc3f5-8771b606
algo-1-qjr8q_1 | 2020-08-13 20:05:54,850 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2947
algo-1-qjr8q_1 | 2020-08-13 20:05:54,851 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-2
algo-1-qjr8q_1 | 2020-08-13 20:05:54,852 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000007-1d3bc08cce3dc3f5-2c312a2b
algo-1-qjr8q_1 | 2020-08-13 20:05:54,853 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2939
algo-1-qjr8q_1 | 2020-08-13 20:05:54,853 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-4
algo-1-qjr8q_1 | 2020-08-13 20:05:54,906 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000005-7888c08cce3dc3f5-db7c23ef
algo-1-qjr8q_1 | 2020-08-13 20:05:54,906 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 3001
algo-1-qjr8q_1 | 2020-08-13 20:05:54,906 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-3
algo-1-qjr8q_1 | 2020-08-13 20:05:54,909 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000008-5c4fc08cce3dc3f5-26e3f942
algo-1-qjr8q_1 | 2020-08-13 20:05:54,909 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2970
algo-1-qjr8q_1 | 2020-08-13 20:05:54,910 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-5
algo-1-qjr8q_1 | 2020-08-13 20:05:54,937 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000006-d83ac08cce3dc3f5-941d61cb
algo-1-qjr8q_1 | 2020-08-13 20:05:54,937 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 3032
algo-1-qjr8q_1 | 2020-08-13 20:05:54,937 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-6
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-19-fdda3d4ea866> in <module>
----> 1 predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='local') # 'ml.c4.xlarge')
~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, accelerator_type, endpoint_name, update_endpoint, tags, kms_key, wait, data_capture_config)
515 kms_key=kms_key,
516 wait=wait,
--> 517 data_capture_config_dict=data_capture_config_dict,
518 )
519
~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/session.py in endpoint_from_production_variants(self, name, production_variants, tags, kms_key, wait, data_capture_config_dict)
2903
2904 self.sagemaker_client.create_endpoint_config(**config_options)
-> 2905 return self.create_endpoint(endpoint_name=name, config_name=name, tags=tags, wait=wait)
2906
2907 def expand_role(self, role):
~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/session.py in create_endpoint(self, endpoint_name, config_name, tags, wait)
2420
2421 self.sagemaker_client.create_endpoint(
-> 2422 EndpointName=endpoint_name, EndpointConfigName=config_name, Tags=tags
2423 )
2424 if wait:
~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/local_session.py in create_endpoint(self, EndpointName, EndpointConfigName, Tags)
264 endpoint = _LocalEndpoint(EndpointName, EndpointConfigName, Tags, self.sagemaker_session)
265 LocalSagemakerClient._endpoints[EndpointName] = endpoint
--> 266 endpoint.serve()
267
268 def update_endpoint(self, EndpointName, EndpointConfigName): # pylint: disable=unused-argument
~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/entities.py in serve(self)
472
473 serving_port = get_config_value("local.serving_port", self.local_session.config) or 8080
--> 474 _wait_for_serving_container(serving_port)
475 # the container is running and it passed the healthcheck status is now InService
476 self.state = _LocalEndpoint._IN_SERVICE
~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/entities.py in _wait_for_serving_container(serving_port)
507 i += 5
508 if i >= HEALTH_CHECK_TIMEOUT_LIMIT:
--> 509 raise RuntimeError("Giving up, endpoint didn't launch correctly")
510
511 logger.info("Checking if serving container is up, attempt: %s", i)
RuntimeError: Giving up, endpoint didn't launch correctly
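The error is raised by _wait_for_serving_container in sagemaker/local/entities.py, which (per the traceback frames above) retries in 5-second steps and gives up once HEALTH_CHECK_TIMEOUT_LIMIT is reached. A simplified, self-contained sketch of that loop, with the probe and sleep injectable so it can be exercised without a container (the function name, the injectable parameters, and the 120-second default are my assumptions for illustration):

```python
import time

# Assumed value for illustration; the real limit lives in sagemaker.local.entities.
HEALTH_CHECK_TIMEOUT_LIMIT = 120


def wait_for_serving_container(probe, timeout=HEALTH_CHECK_TIMEOUT_LIMIT,
                               interval=5, sleep=time.sleep):
    """Poll probe() (e.g. a GET to the /ping endpoint) until it returns True,
    or raise once `timeout` seconds' worth of attempts have elapsed."""
    elapsed = 0
    while not probe():
        elapsed += interval
        if elapsed >= timeout:
            raise RuntimeError("Giving up, endpoint didn't launch correctly")
        sleep(interval)
```

So even though the MMS logs above show the workers loading, the endpoint is marked failed if nothing answers the health check within the timeout window.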
Describe the bug
Deployment fails when instance_type='local'.

To reproduce
Build the MXNetModel shown above and call deploy(initial_instance_count=1, instance_type='local').

Expected behavior
Deployment works correctly both locally and on a remote GPU instance.

Screenshots or logs
See the container logs and traceback above.

System information
- SageMaker Python SDK version: 1.71.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): MXNet
- Framework version: 1.6
- Python version: 3.6
- CPU or GPU: CPU
- Custom Docker image (Y/N): Y

Additional context
I am new to the SageMaker SDK, so I wonder if I am missing something. Thanks for the help!