What is AWS SageMaker and can it really democratise machine learning in the enterprise?

Amazon Web Services has launched a machine learning platform called SageMaker as it seeks to ease the adoption of AI algorithms for customers.

SageMaker is essentially a platform for authoring, training and deploying machine learning algorithms to business applications without much of the manual heavy lifting generally involved, such as provisioning infrastructure and managing and tuning training models.

As Randall Hunt, senior technical evangelist at AWS, wrote in a blog post: "Amazon SageMaker is a fully managed end-to-end machine learning service that enables data scientists, developers, and machine learning experts to quickly build, train, and host machine learning models at scale.

"This drastically accelerates all of your machine learning efforts and allows you to add machine learning to your production applications quickly."

How does it work?

Under the covers this means hosted Jupyter notebook integrated development environments (IDEs) for data exploration, cleaning, and preprocessing.

Then there is a distributed model building, training, and validation service where users can pick an AWS algorithm off the shelf, import a popular framework like TensorFlow or write and deploy their own algorithm with Docker containers, directly within SageMaker.

For training, you simply specify a location in S3 and the instance type you want to use, and in one click SageMaker spins up an isolated cluster and software-defined network, with autoscaling and data pipelines, to start training. When the job is done, it tears down the cluster.
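As a rough sketch of what "specify a location in S3 and the instance" amounts to when done through the API rather than the console, the snippet below assembles a request in the shape accepted by the `create_training_job` operation of the boto3 SageMaker client. The job name, bucket paths, role ARN and image URI are placeholders invented for illustration, and the actual call is left commented out so the example runs without AWS credentials.

```python
def build_training_request(job_name, image_uri, role_arn, train_s3_uri,
                           output_s3_uri, instance_type="ml.m4.xlarge",
                           instance_count=1):
    """Assemble keyword arguments for SageMaker's CreateTrainingJob call."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,       # Docker image holding the algorithm
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,        # where the training data lives
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,    # the instance the cluster is built from
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# All values below are hypothetical placeholders.
request = build_training_request(
    job_name="example-training-job",
    image_uri="123456789012.dkr.ecr.eu-west-1.amazonaws.com/example-algo:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
)

# With credentials configured, the request could be submitted like this:
# import boto3
# boto3.client("sagemaker").create_training_job(**request)
print(request["ResourceConfig"])
```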

HTTPS endpoints are used for model hosting; they can scale to support traffic and allow you to A/B test multiple models simultaneously. Models can be deployed straight into production on EC2 instances with one click, after which they run with autoscaling across availability zones.
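To make the A/B testing idea concrete, here is a small local simulation, assuming traffic is divided between two hosted models according to fixed weights. It does not call SageMaker and is not a description of how the service routes requests internally; the model names and the 90/10 split are made up for the example.

```python
import random
from collections import Counter


def route_requests(variant_weights, n_requests, seed=0):
    """Assign each simulated request to a model variant, in proportion to its weight."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    names = list(variant_weights)
    weights = [variant_weights[name] for name in names]
    picks = rng.choices(names, weights=weights, k=n_requests)
    return Counter(picks)


# Hypothetical setup: the established model takes 90% of traffic, the challenger 10%.
counts = route_requests({"model-a": 0.9, "model-b": 0.1}, 10_000)
share_b = counts["model-b"] / 10_000
print(counts, round(share_b, 3))
```

In a real test, each model's predictions would then be compared on whatever business metric matters before shifting more traffic to the winner.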

Tuning models is traditionally a trial-and-error exercise, but SageMaker comes with what AWS calls 'hyperparameter optimisation (HPO)'. When a user checks a box, SageMaker spins up multiple copies of the training model and uses machine learning to evaluate each change in parallel and tune the parameters accordingly.
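The general shape of running several tuning trials in parallel can be sketched in a few lines. Note that this example uses plain random search, not the machine-learning-driven search AWS describes, and the "training run" is a toy function standing in for a real model; the parameter names and ranges are assumptions chosen only for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_validation_loss(params):
    """Stand-in for a real training run; lowest at lr=0.1, depth=6."""
    return (params["lr"] - 0.1) ** 2 + 0.01 * (params["depth"] - 6) ** 2


def sample_params(rng):
    """Draw one random hyperparameter combination."""
    return {"lr": rng.uniform(0.001, 0.5), "depth": rng.randint(2, 10)}


def random_search(n_trials=20, max_parallel=4, seed=0):
    """Evaluate n_trials random configurations, max_parallel at a time."""
    rng = random.Random(seed)
    candidates = [sample_params(rng) for _ in range(n_trials)]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        losses = list(pool.map(toy_validation_loss, candidates))
    trials = list(zip(losses, candidates))
    best_loss, best_params = min(trials, key=lambda trial: trial[0])
    return best_loss, best_params, trials


best_loss, best_params, trials = random_search()
print(best_params, best_loss)
```

A smarter tuner would use the results of earlier trials to decide which combinations to try next, rather than sampling blindly as this sketch does.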

Democratising machine learning

The key message for AWS CEO Andy Jassy is democratising machine learning and AI. "If you want to enable most enterprises and companies to be able to use machine learning in an expansive way, we have to solve the problem of accessibility of everyday developers and scientists," he said during his re:Invent keynote.

As a result, SageMaker will be fairly model agnostic, supporting all popular frameworks from TensorFlow and Caffe2 to AWS' own Gluon library.

Jassy said that Google's popular machine learning framework TensorFlow is already being run on AWS more than anywhere else, which will no doubt annoy the people at Google Cloud Platform. However, Jassy said the general principle is "we provide all major solutions so you have the tools you need for the right job."

SageMaker joins a crowded space in machine learning platforms, with Microsoft providing Azure Machine Learning, Google's Cloud Machine Learning Engine, various 'workbench' offerings from major analytics vendors, and pure-play startups like Domino Data Lab and Dataiku.

Where AWS seeks to stand out is the depth of features it can bring to bear and the trusted nature of its underlying infrastructure. Customers can start to leverage the full stack from AWS, starting with its secure and robust infrastructure, all the way up to easy deployment on AWS.

As Jassy said during a press Q&A at re:Invent: "At the bottom layer you won't see anyone that supports the array of frameworks and interfaces we do that make it much easier for expert machine learning practitioners to choose the right framework and algorithm for the right job.

"Then at the middle layer of the stack nobody has anything close to SageMaker, which is totally going to change the accessibility and ease at which everyday developers are able to build machine learning models.

"Then at this top layer we have this diverse set of application services [for example: Lex, Polly, Rekognition]. So looking at that collective set of capabilities you need to be successful, no one has as strong a collection of services as AWS."

UK accounting software maker Sage is already eyeing the capabilities of SageMaker as it looks to deploy more AI-rich features across its range of products for customers.

Speaking to Computerworld UK at re:Invent, Kriti Sharma, VP of AI at Sage, said: "It's very important from a skills and talent perspective to be able to retrain existing engineering staff to use these tools and drive a lot more innovation because the expectations of our users are very high.

"So giving as many people these superpowers of machine learning experts is great," she added.

Amazon SageMaker is available via the AWS console across certain regions, including Ireland in Europe, and can be tried for free using the AWS free tier.

Beyond the free tier, the pricing differs by region but is billed per-second of instance usage, per-GB of storage, and per-GB of data transfer into and out of the service.
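As a back-of-the-envelope illustration of how those three charges add up, the sketch below totals them for a single job. Every rate and quantity in it is a made-up placeholder, not an AWS price; real figures vary by region and instance type.

```python
def estimate_cost(instance_seconds, hourly_rate, storage_gb, storage_rate_per_gb,
                  transfer_gb, transfer_rate_per_gb):
    """Sum per-second instance usage, per-GB storage and per-GB data transfer."""
    instance_cost = instance_seconds * (hourly_rate / 3600)  # hourly rate, billed by the second
    storage_cost = storage_gb * storage_rate_per_gb
    transfer_cost = transfer_gb * transfer_rate_per_gb
    return instance_cost + storage_cost + transfer_cost


# Hypothetical numbers: a 30-minute job, 10 GB stored, 5 GB transferred.
cost = estimate_cost(
    instance_seconds=1800, hourly_rate=0.36,
    storage_gb=10, storage_rate_per_gb=0.10,
    transfer_gb=5, transfer_rate_per_gb=0.02,
)
print(round(cost, 2))
```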

Copyright © 2017 IDG Communications, Inc.
