Hi,

In this blog post we’re going to learn how to dockerize your own ML algorithms into a Docker image, which we’ll then publish to Amazon ECR.

We based our approach on the advanced examples in the AWS Labs GitHub repository (https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality), and the main reason was to save costs.

As you might know, AWS SageMaker starts costing you money from the second you open up a Jupyter notebook instance.

Not only that, but you’ll also be charged for the time spent training an ML model, and for the time an endpoint stays alive to make inferences on the features the end user provides.

Intro

Here are the US East (N. Virginia) region pricing details, from the least to the most expensive, for each of the SageMaker subcategories that incur charges.

  • Building – On-Demand ML Notebook Instances

    Lowest Standard Instance: ml.t2.medium at $0.0464 per hour
    Highest Standard Instance: ml.m5.24xlarge at $6.451 per hour
    Lowest Compute Optimized: ml.c4.xlarge at $0.279 per hour
    Highest Compute Optimized: ml.c5d.18xlarge at $4.838 per hour
    Lowest GPU Instance: ml.p2.xlarge at $1.26 per hour
    Highest GPU Instance: ml.p3.16xlarge at $34.272 per hour
  • Model Training – On-Demand ML Training Instances

    Lowest Standard Instance: ml.m5.large at $0.134 per hour
    Highest Standard Instance: ml.m4.16xlarge at $4.48 per hour
    Lowest Compute Optimized: ml.c5.xlarge at $0.238 per hour
    Highest Compute Optimized: ml.c4.8xlarge at $2.227 per hour
    Lowest GPU Instance: ml.p2.xlarge at $1.26 per hour
    Highest GPU Instance: ml.p3.16xlarge at $34.272 per hour
  • Model Deployment – On-Demand ML Hosting Instances for Real-Time Inference

    Lowest Standard Instance: ml.t2.medium at $0.065 per hour
    Highest Standard Instance: ml.m4.16xlarge at $4.48 per hour
    Lowest Compute Optimized: ml.c5.large at $0.119 per hour
    Highest Compute Optimized: ml.c4.8xlarge at $2.227 per hour
    Lowest GPU Instance: ml.p2.xlarge at $1.26 per hour
    Highest GPU Instance: ml.p3.16xlarge at $34.272 per hour

Cost Calculation

So, let’s say we spend an hour training (which also means keeping the Jupyter notebook instance up for that hour) and then leave the endpoint alive for real-time inference 24 hours a day.

Avg cost = 1 x $0.0464 + 1 x $0.134 + 24 x $0.065 = $1.7404 per day, which adds up to more than $50 a month (about $52 over 30 days).

And this is just using the minimums!

That’s why we switched to Docker and started training and deploying our ML models locally.

Once you are satisfied with the results and ready to use your model in production, you just have to publish the Docker image to AWS ECR.

In our case, we wanted to use a pre-made TensorFlow DNNClassifier estimator.

Required Steps

Write a script called train that contains the estimator. It could hold any training spec you want to use, not necessarily TensorFlow; MXNet or any other ML framework works too.

Create this folder structure:
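Based on the AWS Labs advanced examples, the layout we use looks roughly like this (the folder name dnn_classifier is just a placeholder for your own algorithm directory):

    container/
    ├── Dockerfile
    ├── build_and_push.sh
    └── dnn_classifier/           # your algorithm code, copied into the image
        ├── train                 # entry point SageMaker runs for training
        ├── serve                 # entry point SageMaker runs for inference
        └── dnn_classifier.py     # the custom TensorFlow training code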

Define a script file responsible for packing the scripts and for building, tagging, and pushing the image to ECR (a sketch follows below):

  • build_and_push.sh
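Here is a minimal sketch of what build_and_push.sh can look like, adapted from the AWS Labs examples; the default image name dnn-classifier is just an assumption, and it relies on the AWS CLI v1 ecr get-login command that was current at the time:

    #!/usr/bin/env bash
    # Build the Docker image, tag it and push it to Amazon ECR.
    image=${1:-dnn-classifier}   # placeholder image name, pass your own

    account=$(aws sts get-caller-identity --query Account --output text)
    region=$(aws configure get region)
    fullname="${account}.dkr.ecr.${region}.amazonaws.com/${image}:latest"

    # Create the ECR repository if it does not exist yet
    aws ecr describe-repositories --repository-names "${image}" > /dev/null 2>&1 || \
        aws ecr create-repository --repository-name "${image}"

    # Log in to ECR, then build, tag and push the image
    $(aws ecr get-login --region "${region}" --no-include-email)
    docker build -t "${image}" .
    docker tag "${image}" "${fullname}"
    docker push "${fullname}"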

We’ll start by defining a Dockerfile to get the required contents available in our image.

First, we use “tensorflow/tensorflow:1.8.0-py3” as the base image; second, we install nginx and curl, plus the TensorFlow Model Server package that handles and runs predictions against our model.

After that, we copy the directory where our code resides from the local filesystem into the image.
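A minimal Dockerfile sketch along these lines could be the following; the /opt/program path and the dnn_classifier folder name are assumptions, and the apt lines follow the standard TensorFlow Serving installation instructions:

    FROM tensorflow/tensorflow:1.8.0-py3

    # Install nginx and curl, plus the TensorFlow Model Server used to run predictions
    RUN apt-get update && apt-get install -y --no-install-recommends nginx curl && \
        echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" \
            > /etc/apt/sources.list.d/tensorflow-serving.list && \
        curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add - && \
        apt-get update && apt-get install -y tensorflow-model-server

    # Copy our algorithm code into the image and put it on the PATH
    COPY dnn_classifier /opt/program
    WORKDIR /opt/program
    ENV PATH="/opt/program:${PATH}"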

Also define these 2 files:

  • serve

  • train

Pack your custom TensorFlow (or any other ML framework) code into a .py file containing the custom training logic.

This file will then be called by the train script when launching the training job.
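As an illustration, a stripped-down train entry point could look like the sketch below; the script name dnn_classifier.py and its command-line flags are assumptions, while the /opt/ml paths are the standard locations SageMaker provides inside the container:

    #!/usr/bin/env python3
    # Minimal sketch of the train entry point that SageMaker (or local mode) executes.
    import json
    import subprocess
    import sys
    import traceback

    PREFIX = '/opt/ml/'
    HYPERPARAMS = PREFIX + 'input/config/hyperparameters.json'
    TRAINING_DATA = PREFIX + 'input/data/training'
    MODEL_DIR = PREFIX + 'model'
    FAILURE_FILE = PREFIX + 'output/failure'

    def train():
        with open(HYPERPARAMS) as f:
            hyperparameters = json.load(f)
        # Delegate the actual work to the custom TensorFlow script in a subprocess
        subprocess.check_call([
            sys.executable, 'dnn_classifier.py',
            '--data-dir', TRAINING_DATA,
            '--model-dir', MODEL_DIR,
            '--hyperparameters', json.dumps(hyperparameters),
        ])

    if __name__ == '__main__':
        try:
            train()
        except Exception as exc:
            # Writing to this file marks the training job as failed
            with open(FAILURE_FILE, 'w') as f:
                f.write(str(exc) + '\n' + traceback.format_exc())
            sys.exit(255)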

These files will be invoked automatically through docker-compose when calling the 2 main high-level methods of the SageMaker Python API: .fit() and .deploy().

After that, launch a Jupyter Notebook in your local environment (https://anaconda.org/anaconda/anaconda-navigator).

Remember that it is convenient to have previously installed Anaconda Navigator and created an environment.

Here is the code we have in the notebook; a minimal sketch follows the list below.

The basics are:

  1. Define a generic Estimator
  2. Point to an existing role that has access to execute tasks on SageMaker
  3. Use the URI of the image previously published to ECR with the build_and_push.sh script.
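Here is a minimal sketch of that notebook code, written against the SageMaker Python SDK v1 parameter names that were current at the time; the role ARN, image URI, bucket and data path are placeholders:

    from sagemaker.estimator import Estimator

    role = 'arn:aws:iam::123456789012:role/SageMakerExecutionRole'   # existing role
    image = '123456789012.dkr.ecr.us-east-1.amazonaws.com/dnn-classifier:latest'

    estimator = Estimator(
        image_name=image,               # image pushed to ECR by build_and_push.sh
        role=role,
        train_instance_count=1,
        train_instance_type='local',    # run the training container locally
        output_path='s3://my-bucket/output',
        hyperparameters={'epochs': '10'},
    )

    # Runs the container with the train argument (see below)
    estimator.fit({'training': 'file:///path/to/training/data'})

    # Runs the container with the serve argument and returns a predictor
    predictor = estimator.deploy(initial_instance_count=1, instance_type='local')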

When we call the .fit() method, a Docker container is orchestrated with the train argument. As a result, the train script is run.

In addition, our custom TensorFlow Python script will be called within a subprocess.

As instructed previously, the training job is created in SageMaker and the compressed model file is uploaded to an S3 bucket.

On the other hand, when we call the .deploy() method, a Docker container is launched with the serve argument and our serve script is called.

This script could be taken as is.

For most common cases it can be reused, since the only thing it does is use the tensorflow-serving library (as you might have guessed, the installation step is defined in the Docker image).

The instance will pick up the uploaded model and create an endpoint configuration plus the actual endpoint, which will be used for making predictions.
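For reference, a bare-bones serve sketch could look like this; the nginx.conf path, the model name and the internal REST port are assumptions, and nginx is expected to proxy SageMaker's /ping and /invocations routes on port 8080 to the model server:

    #!/usr/bin/env python3
    # Minimal sketch of the serve entry point executed when the endpoint container starts.
    import subprocess

    MODEL_DIR = '/opt/ml/model'   # where the trained model is unpacked
    TF_SERVING_PORT = 8501        # internal REST port, an arbitrary choice

    def start_server():
        # Serve the exported SavedModel with tensorflow_model_server
        tf_serving = subprocess.Popen([
            'tensorflow_model_server',
            '--rest_api_port={}'.format(TF_SERVING_PORT),
            '--model_name=dnn_classifier',            # placeholder model name
            '--model_base_path={}'.format(MODEL_DIR),
        ])
        # nginx listens on port 8080 and forwards requests to the model server
        nginx = subprocess.Popen(['nginx', '-c', '/opt/program/nginx.conf'])
        # Keep the container alive while the model server runs
        tf_serving.wait()
        nginx.terminate()

    if __name__ == '__main__':
        start_server()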

In conclusion, when you are happy with your model’s results and want to train and host your algorithm on Amazon SageMaker, you just have to switch the instance_type from ‘local’ to one of the instance types available on AWS.

That’s all folks! Hope you like it.