Finding optimal settings for inference with AWS Lambda, using SageMaker Hyperparameter Tuning and Locust

I have recently published a step-by-step guide to serverless model deployments with Amazon SageMaker Pipelines, Amazon API Gateway, and AWS Lambda.

With AWS Lambda, you pay only for what you use. Lambda charges based on the number of requests, execution duration, and amount of memory allocated to the function. So how much memory should you allocate to your inference function?
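As a back-of-the-envelope illustration of that trade-off, here is a minimal Python sketch of the Lambda billing model. The rates below are illustrative pay-per-use figures (they vary by region and change over time, so check the AWS pricing page for current numbers):

```python
def lambda_cost(memory_mb: int, duration_s: float, requests: int) -> float:
    """Estimate the AWS Lambda bill for a batch of invocations.

    Assumes illustrative pay-per-use rates: ~$0.0000166667 per
    GB-second of compute and $0.20 per million requests.
    """
    price_per_gb_second = 0.0000166667
    price_per_request = 0.0000002  # $0.20 per 1M requests
    gb_seconds = (memory_mb / 1024) * duration_s * requests
    return gb_seconds * price_per_gb_second + requests * price_per_request

# More memory often shortens execution duration, so the cheapest
# configuration is not always the smallest one -- which is exactly
# why the memory setting is worth tuning.
print(round(lambda_cost(1024, 0.100, 1_000_000), 2))  # ~1.87
```

If doubling the memory cuts the duration by more than half, the larger function is both faster and cheaper.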

In this post, I will show how you can use SageMaker Hyperparameter Tuning (HPO) jobs and Locust, a load-testing tool, to automatically optimize the price/performance ratio of your serverless inference service.


We will reuse the XGBoost model binary and Lambda inference…


Step-by-step guide to serverless model deployments with SageMaker

Deploying some of your ML models into serverless architectures allows you to create scalable inference services, eliminate operational overhead, and move faster to production. I have published examples here and here showing how you can adopt such an architecture in your projects.

In this post, we will go a step further and automate the deployment of such a serverless inference service using Amazon SageMaker Pipelines.

With SageMaker Pipelines, you can accelerate the delivery of end-to-end ML projects. It brings ML workflow orchestration, a model registry, and CI/CD under one umbrella so you can quickly get your models into production.


We will create a…

Step-by-step guide to serverless inference with a DAG on AWS

I have recently published a post explaining core concepts on how to deploy an ML model in a serverless inference service using AWS Lambda, Amazon API Gateway, and the AWS CDK.

For some use cases you and your ML team may need to implement a more complex inference workflow where predictions come from multiple models and are orchestrated with a DAG. On AWS, Step Functions Synchronous Express Workflows allow you to easily build that orchestration layer for your real-time inference services.
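To make the orchestration idea concrete, here is a minimal sketch of an Amazon States Language (ASL) definition for such a DAG, built as a Python dict: two model Lambdas run in parallel, then an aggregation step combines their predictions. The function ARNs and state names are placeholders, not the actual services from the post:

```python
import json

# Hypothetical ASL definition for a two-model inference DAG.
# Function ARNs are placeholders.
definition = {
    "StartAt": "Predict",
    "States": {
        "Predict": {
            "Type": "Parallel",  # fan out to both models at once
            "Branches": [
                {"StartAt": "ModelA",
                 "States": {"ModelA": {
                     "Type": "Task",
                     "Resource": "arn:aws:lambda:...:function:model-a",
                     "End": True}}},
                {"StartAt": "ModelB",
                 "States": {"ModelB": {
                     "Type": "Task",
                     "Resource": "arn:aws:lambda:...:function:model-b",
                     "End": True}}},
            ],
            "Next": "Aggregate",
        },
        # Combine the two predictions into one response
        "Aggregate": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:...:function:aggregate",
            "End": True,
        },
    },
}

print(json.dumps(definition, indent=2)[:60])
```

Deployed as an Express state machine, this definition can then be invoked synchronously (for example via boto3's `start_sync_execution`) so the caller gets the aggregated prediction in the same request/response cycle.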


In this post, I will show how you can create a multi-model serverless inference service with AWS Lambda, Step…

Keep SageMaker Studio cost under control with Amazon EventBridge, AWS Lambda, and Boto3

Amazon SageMaker Studio is a fully integrated IDE unifying the tools needed for ML development. With Studio you can write code, track experiments, visualize data, and perform debugging and monitoring in a JupyterLab-based interface. SageMaker manages the creation of the underlying instances and resources so you can get started quickly on your ML project.

When you create or launch a Notebook, an Interactive Shell, or a Terminal based on a SageMaker Image, the resources run as Apps on Amazon EC2 instances. You incur costs for those instances, and you must shut them down to stop the metering.
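A scheduled cleanup Lambda for this boils down to listing the Apps and deleting the ones that are still running. Here is a minimal sketch of that selection logic; the input mimics the shape returned by boto3's `sagemaker.list_apps()`, and the rule of keeping the default JupyterServer App is an assumption to avoid signing users out of Studio:

```python
def apps_to_shut_down(apps):
    """Select Studio Apps that are still running and incurring cost.

    `apps` is a list of dicts shaped like the entries in
    sagemaker.list_apps()["Apps"]. The JupyterServer App is kept
    (assumption: deleting it logs the user out of Studio).
    """
    return [
        app for app in apps
        if app["Status"] == "InService" and app["AppType"] != "JupyterServer"
    ]

# In the actual scheduled Lambda, each selected App would then be
# deleted with something like:
# sagemaker.delete_app(DomainId=..., UserProfileName=...,
#                      AppType=app["AppType"], AppName=app["AppName"])
```

Triggered nightly by an Amazon EventBridge rule, a function built around this logic keeps forgotten kernels from running up the bill.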

Currently, although you can install…

Step-by-step guide to serverless inference for R models

R is one of the most popular languages used in data science. It is open source and has many packages for statistical analysis, data wrangling, visualization, and machine learning.

After training an R model, you and your ML team might explore ways to deploy it as an inference service. AWS offers many options for this, so you can adapt the deployment scenario to your needs. Among those, adopting a serverless architecture allows you to build a scalable R inference service while freeing your team from infrastructure management.

Lambda + R + CDK = ❤

In this post I will show how you can create a serverless R…

Jupyter Notebooks provide useful environments to interactively explore and experiment during an ML project. However, while helping many teams deliver ML solutions for large enterprises on AWS, I have often noticed a point in a project when data scientists and ML engineers need a full-fledged cloud-based IDE offering better code completion and debugging capabilities for containers running in SageMaker.

In this post, I will show how you can install and run the Theia IDE on a SageMaker Notebook Instance using a Lifecycle Configuration.

SageMaker + Theia = ❤

Amazon SageMaker is a fully managed service bringing together a broad set of capabilities to help…

Sofian Hamiti

ML Specialist Solutions Architect
