Step-by-step guide to SageMaker Studio management with the AWS Service Catalog Factory

In most large enterprises, standardizing, provisioning, and ensuring governance of ML environments are the responsibility of central IT teams.

I have recently published a guide to continuous delivery of custom images that IT teams can use when setting up SageMaker Studio for their end users.

In this post, we will go a step further and use the AWS Service Catalog to enable self-service provisioning of approved SageMaker Studio environments.

Photo by Dimitry Anikin on Unsplash

We will create a Service Catalog portfolio with approved Domain, User Profiles, templated MLOps Projects, and Custom Image pipelines for SageMaker Studio. …
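As a rough sketch of the moving parts, the boto3 calls behind such a portfolio could look like the following; the portfolio name, product name, and template URL are placeholders, not values from this guide.

import boto3

# Rough sketch of the boto3 calls behind such a portfolio. All names and
# the template URL are placeholders, not values from this guide.
servicecatalog = boto3.client("servicecatalog")

portfolio = servicecatalog.create_portfolio(
    DisplayName="sagemaker-studio-portfolio",  # hypothetical portfolio name
    ProviderName="central-it",
)["PortfolioDetail"]

product = servicecatalog.create_product(
    Name="sagemaker-studio-domain",  # hypothetical product name
    Owner="central-it",
    ProductType="CLOUD_FORMATION_TEMPLATE",
    ProvisioningArtifactParameters={
        "Name": "v1",
        "Type": "CLOUD_FORMATION_TEMPLATE",
        # placeholder URL pointing at an approved CloudFormation template
        "Info": {"LoadTemplateFromURL": "https://my-bucket.s3.amazonaws.com/studio-domain.yaml"},
    },
)["ProductViewDetail"]["ProductViewSummary"]

# make the approved product available through the portfolio
servicecatalog.associate_product_with_portfolio(
    ProductId=product["ProductId"],
    PortfolioId=portfolio["Id"],
)

End users can then launch the approved products themselves from the Service Catalog console, without needing direct access to the underlying CloudFormation templates.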


Step-by-step guide to Triton deployment on ECS using CDK

NVIDIA Triton Inference Server is open-source software that ML teams can use to deploy their models. It supports model formats from TensorFlow, PyTorch, ONNX, and other popular frameworks, and provides a set of features to manage them.

Concurrent model execution and dynamic batching are two Triton features you may find particularly interesting, as they allow you to run multiple models on the same GPU resources and increase inference throughput.

You can host Triton in multiple ways on AWS. In this post, I show how you can deploy it on Amazon ECS, using the AWS CDK.
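To make the target setup concrete, here is a minimal sketch of how a client could query Triton once it is running behind the ECS service; the endpoint URL, model name, and tensor names are hypothetical and depend on your model configuration.

import numpy as np
import tritonclient.http as httpclient

# Sketch of a client call once Triton runs behind the ECS load balancer.
# The endpoint, model name, and tensor names are hypothetical and depend
# on your model configuration.
client = httpclient.InferenceServerClient(url="triton.example.com:8000")

image = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy input batch
inputs = [httpclient.InferInput("input__0", list(image.shape), "FP32")]
inputs[0].set_data_from_numpy(image)

response = client.infer(model_name="resnet50", inputs=inputs)
print(response.as_numpy("output__0").shape)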

Image by author

We will deploy 2 image classification models in…


Step-by-step guide to continuous delivery of Custom Images in a Studio domain

Amazon SageMaker Studio is a fully integrated IDE unifying the tools needed for managing your ML projects and collaborating with your team members.

Alongside providing pre-built images for running your notebooks, SageMaker Studio allows you to create containers with your favourite libraries and attach them as custom images to your domain.

Source: Amazon SageMaker

In most large enterprises, ML platform administrators will manage those custom images to ensure only approved libraries are used by Studio users. This can represent operational overhead for admins if done manually.
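For a sense of what such automation has to wrap, these are roughly the boto3 calls involved in attaching a custom image to a domain; every name, ARN, and ID below is a placeholder.

import boto3

# Rough sketch of the calls an automated pipeline would wrap to attach a
# custom image to a Studio domain. All names, ARNs, and IDs are placeholders.
sagemaker = boto3.client("sagemaker")

sagemaker.create_image(
    ImageName="my-custom-image",
    RoleArn="arn:aws:iam::111122223333:role/StudioImageRole",
)
sagemaker.create_image_version(
    ImageName="my-custom-image",
    BaseImage="111122223333.dkr.ecr.us-east-1.amazonaws.com/my-custom-image:latest",
)
sagemaker.create_app_image_config(
    AppImageConfigName="my-custom-image-config",
    KernelGatewayImageConfig={"KernelSpecs": [{"Name": "python3"}]},
)
# attach the image to the domain so users can select it in Studio
sagemaker.update_domain(
    DomainId="d-xxxxxxxxxxxx",
    DefaultUserSettings={
        "KernelGatewayAppSettings": {
            "CustomImages": [
                {
                    "ImageName": "my-custom-image",
                    "AppImageConfigName": "my-custom-image-config",
                }
            ]
        }
    },
)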

In this post I show how you can automate a Studio domain setup by implementing simple…


Finding optimal settings for inference with AWS Lambda, using SageMaker Hyperparameter Tuning and Locust

I have recently published a step-by-step guide to serverless model deployments with Amazon SageMaker Pipelines, Amazon API Gateway, and AWS Lambda.

With AWS Lambda, you pay only for what you use. Lambda charges based on the number of requests, execution duration, and amount of memory allocated to the function. So how much memory should you allocate to your inference function?
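As a back-of-envelope illustration (the rates below are illustrative us-east-1 prices at the time of writing and may change), the monthly bill can be modeled in a few lines:

# Back-of-envelope Lambda cost model. Prices are illustrative us-east-1
# rates at the time of writing and may change.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_REQUEST = 0.0000002  # $0.20 per million requests

def monthly_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * PRICE_PER_GB_SECOND + requests * PRICE_PER_REQUEST

# 1M requests per month, 120 ms average duration at 512 MB
print(f"${monthly_cost(1_000_000, 120, 512):.2f}")  # ≈ $1.20

The catch is that memory size also affects execution duration, so the cheapest setting is not obvious upfront.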

In this post, I will show how you can use SageMaker Hyperparameter Tuning (HPO) jobs and a load-testing tool to automatically optimize the price/performance ratio of your serverless inference service.
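The core idea, sketched below, is to treat the Lambda memory size as a hyperparameter: each tuning job runs a load test against the function at a candidate memory size and reports a price/performance metric for the tuner to minimize. The container image, role ARN, and metric regex here are placeholders, not the exact setup from this post.

from sagemaker.estimator import Estimator
from sagemaker.tuner import CategoricalParameter, HyperparameterTuner

# Sketch of the idea: each "training" job runs a Locust load test against
# the Lambda at a candidate memory size and logs a cost/latency metric the
# tuner minimizes. The image URI, role ARN, and metric regex are placeholders.
load_test = Estimator(
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/locust-runner:latest",
    role="arn:aws:iam::111122223333:role/SageMakerRole",
    instance_count=1,
    instance_type="ml.m5.large",
)

tuner = HyperparameterTuner(
    estimator=load_test,
    objective_metric_name="price_performance",
    objective_type="Minimize",
    hyperparameter_ranges={
        "memory_size": CategoricalParameter([128, 512, 1024, 2048, 3008]),
    },
    metric_definitions=[
        {"Name": "price_performance", "Regex": r"price_performance: ([0-9\.]+)"}
    ],
    max_jobs=10,
    max_parallel_jobs=2,
)
tuner.fit()  # no input channels needed for a load test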

Photo by Pineapple Supply Co. on Unsplash

We will reuse the XGBoost model binary and Lambda inference…


Hands-on Tutorials

Step-by-step guide to serverless model deployments with SageMaker

Deploying some of your ML models into serverless architectures allows you to create scalable inference services, eliminate operational overhead, and move faster to production. I have published examples here and here showing how you can adopt such an architecture in your projects.

In this post, we will go a step further and automate the deployment of such a serverless inference service using Amazon SageMaker Pipelines.

With SageMaker Pipelines, you can accelerate the delivery of end-to-end ML projects. It brings ML workflow orchestration, a model registry, and CI/CD under one umbrella so you can quickly get your models into production.
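As a skeleton of what such a pipeline definition can look like with the SageMaker Python SDK (the S3 path, role ARN, and model package group name are placeholders standing in for your own setup):

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.step_collections import RegisterModel
from sagemaker.workflow.steps import TrainingStep

# Skeleton of such a pipeline. The S3 path, role ARN, and model package
# group name are placeholders standing in for your own setup.
session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerRole"

xgb = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.2-1"),
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
)

train_step = TrainingStep(
    name="TrainModel",
    estimator=xgb,
    inputs={"train": TrainingInput("s3://my-bucket/train.csv", content_type="text/csv")},
)

register_step = RegisterModel(
    name="RegisterModel",
    estimator=xgb,
    model_data=train_step.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.t2.medium"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="serverless-models",
)

pipeline = Pipeline(
    name="serverless-deploy-pipeline",
    steps=[train_step, register_step],
    sagemaker_session=session,
)
pipeline.upsert(role_arn=role)  # create or update the pipeline definition
pipeline.start()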

Photo by SpaceX on Unsplash

We will create a…


Step-by-step guide to serverless inference with a DAG on AWS

I have recently published a post explaining core concepts on how to deploy an ML model in a serverless inference service using AWS Lambda, Amazon API Gateway, and the AWS CDK.

For some use cases you and your ML team may need to implement a more complex inference workflow where predictions come from multiple models and are orchestrated with a DAG. On AWS, Step Functions Synchronous Express Workflows allow you to easily build that orchestration layer for your real-time inference services.
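Callers can then invoke the whole DAG synchronously and receive the predictions in the response. Here is a minimal boto3 sketch, assuming a hypothetical state machine ARN and payload shape:

import json

import boto3

# Minimal sketch of invoking the DAG synchronously. The state machine ARN
# and payload shape are placeholders.
sfn = boto3.client("stepfunctions")

response = sfn.start_sync_execution(
    stateMachineArn="arn:aws:states:us-east-1:111122223333:stateMachine:inference-dag",
    input=json.dumps({"features": [0.1, 0.2, 0.3]}),
)
predictions = json.loads(response["output"])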

Photo by Jon Tyson on Unsplash

In this post, I will show how you can create a multi-model serverless inference service with AWS Lambda, Step…


Keep SageMaker Studio cost under control with Amazon EventBridge, AWS Lambda, and Boto3

Amazon SageMaker Studio is a fully integrated IDE unifying the tools needed for ML development. With Studio you can write code, track experiments, visualize data, and perform debugging and monitoring in a JupyterLab-based interface. SageMaker manages the creation of the underlying instances and resources so you can get started quickly on your ML project.

When creating or launching a Notebook, an Interactive Shell, or a Terminal based on a SageMaker Image, the resources run as Apps on Amazon EC2 instances for which you incur cost, and you must shut them down to stop the metering.
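The automation boils down to a scheduled job that finds and deletes running apps. As a minimal boto3 sketch of that logic (the domain ID is a placeholder, and the JupyterServer app is skipped since it is not what drives the instance cost):

import boto3

# Minimal sketch of the shutdown logic a scheduled Lambda could run.
# The domain ID is a placeholder; the JupyterServer app is skipped since
# it does not run on a billable instance.
sagemaker = boto3.client("sagemaker")
domain_id = "d-xxxxxxxxxxxx"

for app in sagemaker.list_apps(DomainIdEquals=domain_id)["Apps"]:
    if app["Status"] == "InService" and app["AppType"] != "JupyterServer":
        sagemaker.delete_app(
            DomainId=domain_id,
            UserProfileName=app["UserProfileName"],
            AppType=app["AppType"],
            AppName=app["AppName"],
        )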

Currently, although you can install…


Step-by-step guide to serverless inference for R models

R is one of the most popular languages used in data science. It is open source and has many packages for statistical analysis, data wrangling, visualization, and machine learning.

After training an R model, you and your ML team might explore ways to deploy it as an inference service. AWS offers many options for this, so you can adapt the deployment scenario to your needs. Among those, adopting a serverless architecture allows you to build a scalable R inference service while freeing your team from infrastructure management.
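As a glimpse of where this is heading, a Lambda function backed by a container image holding the R runtime can be defined in a few lines of CDK; the local image directory below is an assumption for illustration.

from aws_cdk import App, Duration, Stack
from aws_cdk import aws_lambda as _lambda
from constructs import Construct

# Rough CDK sketch of hosting the R runtime in a Lambda container image.
# The local image directory is an assumption for illustration.
class RInferenceStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        _lambda.DockerImageFunction(
            self,
            "RInferenceFunction",
            code=_lambda.DockerImageCode.from_image_asset("./r-inference-image"),
            memory_size=1024,
            timeout=Duration.seconds(30),
        )

app = App()
RInferenceStack(app, "r-serverless-inference")
app.synth()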

Lambda + R + CDK = ❤

In this post I will show how you can create a serverless R…


Jupyter Notebooks provide useful environments to interactively explore and experiment during an ML project. However, while helping many teams deliver ML solutions for large enterprises on AWS, I often noticed a point in the project when data scientists and ML engineers needed to work with a full-fledged cloud-based IDE offering better code completion and debugging capabilities for containers running in SageMaker.

In this post, I will show how you can install and run the Theia IDE on a SageMaker Notebook Instance using a Lifecycle Configuration.
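As a sketch of the registration step, the Lifecycle Configuration can be created with boto3 as follows; the on-start script content is just a placeholder for the actual Theia install steps.

import base64

import boto3

# Sketch of registering the Lifecycle Configuration with boto3. The
# on-start script below is a placeholder for the actual Theia install steps.
on_start = """#!/bin/bash
# download, install, and launch Theia here
"""

sagemaker = boto3.client("sagemaker")
sagemaker.create_notebook_instance_lifecycle_config(
    NotebookInstanceLifecycleConfigName="install-theia",
    OnStart=[{"Content": base64.b64encode(on_start.encode()).decode()}],
)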

SageMaker + Theia = ❤

Amazon SageMaker is a fully managed service bringing together a broad set of capabilities to help…

Sofian Hamiti

ML Solutions Architect
