MXNetModel deployment error #1831

@goldmermaid

Description

Describe the bug
Deployment fails when instance_type='local'.

To reproduce

import sagemaker
from sagemaker.mxnet.model import MXNetModel

sess = sagemaker.Session()
s3_model_path = "s3://{}/{}".format(sess.default_bucket(), s3_bucket_name)
sagemaker_model = MXNetModel(model_data=s3_model_path,
                             image="kdd20200813:latest",  # custom ECR image
                             role=sagemaker.get_execution_role(),
                             py_version='py3',
                             framework_version="1.6",
                             entry_point='serve.py',
                             source_dir='.')
predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='local')

Expected behavior
Deployment works correctly both locally (`instance_type='local'`) and on a remote GPU instance.

Screenshots or logs

Attaching to tmp454nevxt_algo-1-qjr8q_1
algo-1-qjr8q_1  | Warning: Calling MMS with mxnet-model-server. Please move to multi-model-server.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,467 [INFO ] main com.amazonaws.ml.mms.ModelServer - 
algo-1-qjr8q_1  | MMS Home: /usr/local/lib/python3.6/site-packages
algo-1-qjr8q_1  | Current directory: /
algo-1-qjr8q_1  | Temp directory: /home/model-server/tmp
algo-1-qjr8q_1  | Number of GPUs: 0
algo-1-qjr8q_1  | Number of CPUs: 8
algo-1-qjr8q_1  | Max heap size: 13646 M
algo-1-qjr8q_1  | Python executable: /usr/local/bin/python3.6
algo-1-qjr8q_1  | Config file: /etc/sagemaker-mms.properties
algo-1-qjr8q_1  | Inference address: http://0.0.0.0:8080
algo-1-qjr8q_1  | Management address: http://0.0.0.0:8080
algo-1-qjr8q_1  | Model Store: /.sagemaker/mms/models
algo-1-qjr8q_1  | Initial Models: ALL
algo-1-qjr8q_1  | Log dir: /logs
algo-1-qjr8q_1  | Metrics dir: /logs
algo-1-qjr8q_1  | Netty threads: 0
algo-1-qjr8q_1  | Netty client threads: 0
algo-1-qjr8q_1  | Default workers per model: 8
algo-1-qjr8q_1  | Blacklist Regex: N/A
algo-1-qjr8q_1  | Maximum Response Size: 6553500
algo-1-qjr8q_1  | Maximum Request Size: 6553500
algo-1-qjr8q_1  | Preload model: false
algo-1-qjr8q_1  | Prefer direct buffer: false
algo-1-qjr8q_1  | 2020-08-13 20:05:51,555 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-9000-model
algo-1-qjr8q_1  | 2020-08-13 20:05:51,705 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - model_service_worker started with args: --sock-type unix --sock-name /home/model-server/tmp/.mms.sock.9000 --handler sagemaker_mxnet_serving_container.handler_service --model-path /.sagemaker/mms/models/model --model-name model --preload-model false --tmp-dir /home/model-server/tmp
algo-1-qjr8q_1  | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - [PID] 71
algo-1-qjr8q_1  | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - MMS worker started.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Python runtime: 3.6.10
algo-1-qjr8q_1  | 2020-08-13 20:05:51,707 [INFO ] main com.amazonaws.ml.mms.wlm.ModelManager - Model model loaded.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,713 [INFO ] main com.amazonaws.ml.mms.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,725 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,725 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,797 [INFO ] main com.amazonaws.ml.mms.ModelServer - Inference API bind to: http://0.0.0.0:8080
algo-1-qjr8q_1  | Model server started.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,802 [WARN ] pool-2-thread-1 com.amazonaws.ml.mms.metrics.MetricCollector - worker pid is not available yet.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,811 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,813 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,815 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,817 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,819 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:54,764 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000000-17ec808cce3dc3f5-fc4cfd4e
algo-1-qjr8q_1  | 2020-08-13 20:05:54,806 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2901
algo-1-qjr8q_1  | 2020-08-13 20:05:54,809 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-7
algo-1-qjr8q_1  | 2020-08-13 20:05:54,809 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000003-7d6a408cce3dc3f5-98e5582f
algo-1-qjr8q_1  | 2020-08-13 20:05:54,823 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2918
algo-1-qjr8q_1  | 2020-08-13 20:05:54,823 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-1
algo-1-qjr8q_1  | 2020-08-13 20:05:54,845 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000002-fe83808cce3dc3f5-c2242e3f
algo-1-qjr8q_1  | 2020-08-13 20:05:54,846 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2941
algo-1-qjr8q_1  | 2020-08-13 20:05:54,850 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-8
algo-1-qjr8q_1  | 2020-08-13 20:05:54,849 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000001-1dda808cce3dc3f5-8771b606
algo-1-qjr8q_1  | 2020-08-13 20:05:54,850 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2947
algo-1-qjr8q_1  | 2020-08-13 20:05:54,851 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-2
algo-1-qjr8q_1  | 2020-08-13 20:05:54,852 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000007-1d3bc08cce3dc3f5-2c312a2b
algo-1-qjr8q_1  | 2020-08-13 20:05:54,853 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2939
algo-1-qjr8q_1  | 2020-08-13 20:05:54,853 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-4
algo-1-qjr8q_1  | 2020-08-13 20:05:54,906 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000005-7888c08cce3dc3f5-db7c23ef
algo-1-qjr8q_1  | 2020-08-13 20:05:54,906 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 3001
algo-1-qjr8q_1  | 2020-08-13 20:05:54,906 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-3
algo-1-qjr8q_1  | 2020-08-13 20:05:54,909 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000008-5c4fc08cce3dc3f5-26e3f942
algo-1-qjr8q_1  | 2020-08-13 20:05:54,909 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2970
algo-1-qjr8q_1  | 2020-08-13 20:05:54,910 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-5
algo-1-qjr8q_1  | 2020-08-13 20:05:54,937 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000006-d83ac08cce3dc3f5-941d61cb
algo-1-qjr8q_1  | 2020-08-13 20:05:54,937 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 3032
algo-1-qjr8q_1  | 2020-08-13 20:05:54,937 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-6
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-19-fdda3d4ea866> in <module>
----> 1 predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='local') # 'ml.c4.xlarge')

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, accelerator_type, endpoint_name, update_endpoint, tags, kms_key, wait, data_capture_config)
    515                 kms_key=kms_key,
    516                 wait=wait,
--> 517                 data_capture_config_dict=data_capture_config_dict,
    518             )
    519 

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/session.py in endpoint_from_production_variants(self, name, production_variants, tags, kms_key, wait, data_capture_config_dict)
   2903 
   2904             self.sagemaker_client.create_endpoint_config(**config_options)
-> 2905         return self.create_endpoint(endpoint_name=name, config_name=name, tags=tags, wait=wait)
   2906 
   2907     def expand_role(self, role):

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/session.py in create_endpoint(self, endpoint_name, config_name, tags, wait)
   2420 
   2421         self.sagemaker_client.create_endpoint(
-> 2422             EndpointName=endpoint_name, EndpointConfigName=config_name, Tags=tags
   2423         )
   2424         if wait:

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/local_session.py in create_endpoint(self, EndpointName, EndpointConfigName, Tags)
    264         endpoint = _LocalEndpoint(EndpointName, EndpointConfigName, Tags, self.sagemaker_session)
    265         LocalSagemakerClient._endpoints[EndpointName] = endpoint
--> 266         endpoint.serve()
    267 
    268     def update_endpoint(self, EndpointName, EndpointConfigName):  # pylint: disable=unused-argument

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/entities.py in serve(self)
    472 
    473         serving_port = get_config_value("local.serving_port", self.local_session.config) or 8080
--> 474         _wait_for_serving_container(serving_port)
    475         # the container is running and it passed the healthcheck status is now InService
    476         self.state = _LocalEndpoint._IN_SERVICE

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/entities.py in _wait_for_serving_container(serving_port)
    507         i += 5
    508         if i >= HEALTH_CHECK_TIMEOUT_LIMIT:
--> 509             raise RuntimeError("Giving up, endpoint didn't launch correctly")
    510 
    511         logger.info("Checking if serving container is up, attempt: %s", i)

RuntimeError: Giving up, endpoint didn't launch correctly
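For context, the failure comes from the SDK's local-mode health check: per the traceback, `_wait_for_serving_container` polls the serving port in 5-second steps and raises once `HEALTH_CHECK_TIMEOUT_LIMIT` is reached, even though the MMS logs above show the model server starting. A minimal sketch of that polling logic (the function name, parameters, and defaults here are illustrative, not the SDK's actual signature):

```python
import time

def wait_for_serving_container(is_healthy, timeout=120, interval=5):
    """Poll a health check until it passes or the timeout elapses.

    is_healthy: zero-argument callable returning True once the
    serving container answers its health check.
    """
    elapsed = 0
    while not is_healthy():
        elapsed += interval
        if elapsed >= timeout:
            # Mirrors the error seen in the traceback above.
            raise RuntimeError("Giving up, endpoint didn't launch correctly")
        time.sleep(interval)
```

In local mode the real check is an HTTP request against the serving port (8080 by default, per `local.serving_port` in the traceback). One plausible cause of this symptom, then, is the container's port not being reachable from the host, so the poll times out even though MMS is up inside the container.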

System information

  • SageMaker Python SDK version: sagemaker 1.71.0

  • Framework name (e.g. PyTorch) or algorithm (e.g. KMeans):

  • Framework version:

  • Python version:

  • CPU or GPU:

  • Custom Docker image (Y/N):

Additional context
I am new to the SageMaker SDK, so I wonder if I'm missing something. Thanks for the help!
