Using Chalice to serve SageMaker predictions

Oh, you’re the front-end service, I suppose.

Training and deploying our SageMaker model

I’ve already written a couple of posts (here and here) on training and deploying SageMaker models, so I won’t go into these details again. For our purposes, we’ll use the built-in algorithm for image classification to train a model on the Caltech-256 dataset, as presented in this tutorial.

Invoking a SageMaker model

First, we need to figure out what data format the algorithm expects: as explained in the documentation, we need to POST binary data with the application/x-image content type.

Invoking the endpoint from a Python session returns a byte string holding a JSON array of 257 probabilities, one per Caltech-256 class:

b'[1.4174277566780802e-05, 1.6996695194393396e-05, 0.00011234339035581797, 0.0002156866539735347, 3.110157194896601e-05, 0.00025752029614523053, 2.8299189580138773e-05, 0.00012142073683207855, 9.117752779275179e-05, 4.787529178429395e-05, 3.5472228773869574e-05, 0.00019932845316361636, 5.5317421356448904e-05, 6.963817668292904e-06, 4.422592246555723e-05, 9.264561231248081e-05, 2.4938033675425686e-05, 0.0002587089838925749, 0.00026409601559862494, 1.0121700142917689e-05, 0.0038431661669164896, 2.7548372599994764e-05, 3.41928462148644e-05, 7.225569424917921e-05, 1.1924025784537662e-05, 7.16273789294064e-05, 0.000851281511131674, 3.8102120015537366e-05, 8.411310773226433e-06, [output removed]
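The call itself can be sketched with boto3’s InvokeEndpoint API. This is a minimal sketch: the endpoint name is a placeholder, and parse_probabilities is a small helper I’ve added here to decode the JSON body.

```python
import json

def invoke(image_path, endpoint_name="image-classification-endpoint"):
    """POST raw image bytes to the endpoint with the required content type."""
    import boto3  # deferred import so parse_probabilities() is usable on its own
    runtime = boto3.client("sagemaker-runtime")
    with open(image_path, "rb") as f:
        payload = f.read()
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/x-image",
        Body=payload,
    )
    return parse_probabilities(response["Body"].read())

def parse_probabilities(body):
    """The response body is a JSON array of class probabilities."""
    return json.loads(body)
```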

A quick recap on Chalice

Chalice is an AWS Open Source project that lets developers build Python-based web services with minimal fuss. The programming model is extremely close to Flask, so you should feel right at home. Installation is a breeze (pip install chalice) and the CLI is idiot-proof (*I* can use it).

Writing an image prediction service with Chalice

As we just saw, the endpoint returns a raw prediction, which probably contains more information than we need to send back to the client. Thus, our service will return only the top k classes and probabilities, in descending order of probability. The request will carry:

  • a mandatory base64-encoded image,
  • an optional value for ‘k’: if it’s missing, the service will use the default value of 257, i.e. the number of Caltech-256 classes.

The service will then:

  • decode the base64-encoded image,
  • read the endpoint name from an environment variable,
  • invoke the endpoint using the InvokeEndpoint API in boto3,
  • read the response,
  • sort categories by descending order of probability,
  • return only the top k categories and probabilities.
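The steps above can be sketched as follows. The ‘data’ and ‘k’ request field names and the ENDPOINT_NAME variable name are assumptions; in the real app, predict() would sit inside a Chalice @app.route handler.

```python
import base64
import json
import os

def top_k(probabilities, k):
    """Return the top-k (class_index, probability) pairs, highest first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

def predict(request_body):
    """Decode the image, invoke the endpoint, and rank the classes."""
    import boto3  # deferred import so top_k() is usable without AWS credentials
    image = base64.b64decode(request_body["data"])
    k = int(request_body.get("k", 257))  # 257 = all Caltech-256 classes
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=os.environ["ENDPOINT_NAME"],
        ContentType="application/x-image",
        Body=image,
    )
    probabilities = json.loads(response["Body"].read())
    return top_k(probabilities, k)
```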

Configuring the service

Configuration is pretty simple and is stored in .chalice/config.json:

  • an environment variable to store the endpoint name,
  • a custom IAM policy, allowing the Lambda function to call the InvokeEndpoint API. Once again, we could let Chalice generate it for us, but it’s good to know how to customize your policy :)
Configuration file
IAM policy
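Both files could look something like the sketches below; the endpoint name is a placeholder, and the policy’s Resource could be narrowed to the endpoint’s ARN. Note that with autogen_policy turned off, the policy also needs to grant CloudWatch Logs permissions, which Chalice would otherwise add for us.

.chalice/config.json:

```json
{
  "version": "2.0",
  "app_name": "predict",
  "stages": {
    "dev": {
      "api_gateway_stage": "api",
      "autogen_policy": false,
      "iam_policy_file": "policy.json",
      "environment_variables": {
        "ENDPOINT_NAME": "image-classification-endpoint"
      }
    }
  }
}
```

The policy referenced by iam_policy_file, stored in .chalice/policy.json:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sagemaker:InvokeEndpoint",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
```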

Running the service locally

Let’s test the service locally by running ‘chalice local’ and then invoking it with curl.
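For example, something along these lines: the /predict route and the ‘data’/‘k’ field names are assumptions, and test.jpg stands in for any local image.

```shell
# Start the local development server (listens on port 8000 by default)
chalice local

# In another terminal: base64-encode a test image and POST it
# (-w0 is GNU base64; on macOS, plain base64 doesn't wrap by default)
curl -s -X POST http://localhost:8000/predict \
     -H "Content-Type: application/json" \
     -d "{\"data\": \"$(base64 -w0 test.jpg)\", \"k\": 5}"
```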

Deploying the service

Now it’s time to deploy to AWS by running ‘chalice deploy’, which prints the API Gateway URL of the service. Let’s run the same test against it.

Bonus: an image resizer service with Chalice

Computer vision models require input images to have the same size as training images. I believe the SageMaker built-in algorithm handles image resizing for us, but it’s definitely something we’d have to take care of if we worked with our own custom model.
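The resizing itself could be sketched like this. The 224-pixel shortest side, the aspect-preserving strategy, and the JPEG output are all assumptions; resize_image() requires Pillow.

```python
import io

def target_size(width, height, shortest_side=224):
    """Scale so the shortest side matches the training size, keeping aspect ratio."""
    scale = shortest_side / min(width, height)
    return (round(width * scale), round(height * scale))

def resize_image(image_bytes, shortest_side=224):
    """Resize raw image bytes and return them re-encoded as JPEG."""
    from PIL import Image  # deferred import so target_size() works without Pillow
    image = Image.open(io.BytesIO(image_bytes))
    image = image.resize(target_size(*image.size, shortest_side))
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG")
    return buffer.getvalue()
```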


