Skip to content

Latest commit

 

History

History
162 lines (115 loc) · 10.2 KB

File metadata and controls

162 lines (115 loc) · 10.2 KB

Code sample using AWS IoT Jobs

Solution architecture

The idea is to use Amazon SageMaker to train an AutoEncoder using the Turbine sensors data acquired from the simulated device, export the model to the ONNX format, and deploy it on the edge device (Raspberry Pi). Each edge device has a python application which reads the raw data from the simulated device, and performs prediction using the ONNX model to detect anomalies. For each edge device there is an IoT Thing. This is required by the application that runs on the edge device to send logs to the cloud and for the OTA model update mechanism. The application collects some metrics from the predictions as well as the simulated device readings and sends them to an MQTT topic. Lambda functions process the data from the MQTT topic and ingests it to Amazon Cloudwatch logs. This data can then be visualized using an Amazon Cloudwatch dashboard.

arch.png

  1. A data scientist uses Sagemaker Studio to prepare a ML model
  2. Sensor data preparation
  3. A model is trained (windturbine)
  4. The model is registered in the Sagemaker Model Registry with a new version, awaiting approval
  5. An IIoT engineer validates manually the model version
  6. When a model is approved, an Amazon Eventbridge rule triggers a new deployment
  7. Model is exported to the ONNX format along with an AWS IoT job file, then saved in an Amazon Simple Storage Service (S3) bucket
  8. A notification is triggered once the new object is put into S3. An AWS IoT job is created through an AWS Lambda function and a notification is sent to the edge device
  9. The application is subscribed to an MQTT topic, and receives a JSON document with the model metadata
  10. The application downloads the model from the S3 bucket to a local directory
  11. The model is loaded and a new ONNX runtime inference session is started
  12. The application reads raw data from the simulated device through a local MQTT broker
  13. The application runs a prediction based on the acquired raw data
  14. Application logs are captured and published to an MQTT topic
  15. An IoT rule reads the application logs
  16. A Lambda function parses the application logs
  17. Parsed data are ingested to Amazon Cloudwatch logs
  18. An operator can access the Cloudwatch dashboard and visualize anomalies and other information

Getting started

Pre-requisites

  • An AWS account. We recommend to deploy this solution in a new account
  • AWS CLI: configure your credentials
aws configure --profile [your-profile] 
AWS Access Key ID [None]: xxxxxx
AWS Secret Access Key [None]:yyyyyyyyyy
Default region name [None]: us-east-1 
Default output format [None]: json
npm install -g npm aws-cdk projen
  • jq: jq-1.6
  • Raspberry Pi (tested with a Raspberry Pi 4)

Deploy the solution

This project is built using Cloud Development Kit (CDK) and projen. See Getting Started With the AWS CDK for additional details and prerequisites. When running the commands below, projen will run python and not python3, so make sure your python command runs the correct Python version.

  1. Clone this repository.

    $ git clone https://github.com/aws-samples/ml-edge-getting-started/
  2. Enter the code sample directory.

    $ cd samples/onnx_accelerator_sample1
  3. Verify that the project is configured to deploy the AWS IoT Jobs version. Open the file .projenrc.py and verify that the value for use_greengrass on line 27 is set to False. If not, update the value and then run the follwing command to synthesize the project:

    $ npx projen
  4. Activate virtualenv, install dependencies and synthesize.

    $ npx projen build
  5. Boostrap AWS CDK resources on the AWS account.

    $ npx cdk bootstrap
  6. Deploy the sample in your account

    $ npx cdk deploy

Once the stack is deployed, in the console go to Cloudformation -> Stacks -> onnxacceleratorsampleone-dev -> Outputs

The following outputs will be available to you:

  • cfnoutputdatascientistteamA : The User Arn user for the sagemaker user representing the Data science team
  • cfnoutputIIoTengineeringteam : The User Arn for the sagemaker user representing the IIoT Engineers team
  • CodeBuildInputArtifactsS3BucketName : The S3 bucket containing the input artifacts for codebuild (python script)
  • DashboardOutput : URL of the Cloudwatch dashboard providing visualization of anomalies and raw data
  • DeploymentPackageS3BucketName : The S3 bucket containing the deployment artifacts for edge devices (onnx exported model + job json file)
  • DomainIdSagemaker : The sagemaker domain ID

Note Sagemaker Studio will be provisioned using the default VPC, thus it needs to exist. If you want to use a different VPC, udpate default_vpc_id = ec2.Vpc.from_lookup(self, "DefaultVPC", is_default=True) in main_stack.py

Edge device

After successfully deploying the project, you need to configure your edge device (Raspberry Pi).

Simulated device

In this directory you'll find the Python application that runs on each edge device and streams synthetic raw turbine data, which has been collected from real sensors installed in a 3D printed mini wind turbine. The README in that folder provides instructions on how to configure and run that application.

Warning To keep things simple for this code sample, the simulated device and the detector application are communicating through a local MQTT broker, unauthenticated. Do not use this in production. It is a security best practice to always use an encrypted, secured connection.

Edge Application

In this directory you'll find the Python application that runs on each edge device and performs predictions as well as communication with the cloud. The README in that folder provides instructions on how to configure and run that application.

Notebooks

Once your edge device is configured correctly, you can train and deploy the ML model.

  1. In the AWS console, go to Amazon Sagemaker and select Studio.
  2. In the Get Started right panel, select the datascientist-team-a and click Open Studio.
  3. On the left menu bar, select Git and Clone a Repository
  4. In the drop-down enter https://github.com/aws-samples/ml-edge-getting-started.git
  5. Select the explorer view, select ml-edge-getting-started/samples/onnx_accelerator_sample1/notebooks and open the notebook 01 - Data Preparation.ipynb

If prompted to setup a notebook environment, select the image Data Science 3.0

notebook_environment.png

  1. Execute the cells in the notebook to transform (feature selection, cleaning, denoising, normalizing, etc) the sensors data (raw) into a dataset used to train the model
  2. In the explorer view, select ml-edge-getting-started/samples/onnx_accelerator_sample1/notebooks and open the notebook 02 - Training with Pytorch.ipynb
  3. Execute the cells in the notebook to train the model (Pytorch Autoencoder), and register it in the Amazon Sagemaker Model registry. The model artifacts are stored in the sagemaker default Amazon Simple Storage Service (S3) bucket.
  4. On the left menu bar, select Home -> Models -> Model registry
  5. Double click the modelPackageGroupTurbine model group name
  6. Select the model version you just created, double click on it and update its status in the top right corner from PendingManualApproval to Approved

approved_model.png

Model export and deployment

Approving the model version in the Amazon Sagemaker Model registry will trigger a model deployment at the edge. The Eventbridge rule sends an event to Codebuild with information about the model you just approved. A new build step is then triggered, pulling the model artifact and exporting it to the ONNX format. This step is performed in build_deployment_package.py.

The new model artifact, along with an IoT Job file is pushed to the Amazon S3 deployment bucket. A notification is configured on this bucket to invoke asynchronously an AWS Lambda function (lambda.py) when a file ending in .json is pushed. The Lambda function receives an event that contains details about the object.

The lambda function creates an IoT Job targetting all the devices in the specified thing group. Each device receives a notification that a new model is available, and download it using the pre-signed S3 URL present in the job document. When done, the device reports its status (job succeeded or not). You can visualize these jobs by clicking, in the AWS console, AWS IoT -> Remote actions -> Jobs

Visualization

Access the Amazon Cloudwatch dashboard using the URL output generated by your cloudformation stack. Four widgets are available with sample queries to visualize useful information (input data, anomalies). You can modify those queries in main_stack.py if you want to display different data.

dashboard.png

Clean up

Do not forget to delete the stack to avoid unexpected charges

First make sure to remove all data (model versions) from the model registry. Then:

    $ cdk destroy onnxacceleratorsampleone-dev

Then in the AWS console delete the iot thing, thing group, thing type, device certificates and the S3 buckets