Review: Amazon puts machine learning in reach

Amazon Machine Learning gives data science newbies easy-to-use solutions for the most common problems

As a physicist, I was originally trained to describe the world in terms of exact equations. Later, as an experimental high-energy particle physicist, I learned to deal with vast amounts of error-prone data and to evaluate competing models to describe that data. Business data, taken in bulk, is often messier and harder to model than the physics data on which I cut my teeth. Simply put, human behavior is complicated, inconsistent, and not well understood, and it's affected by many variables.

If your intention is to predict which previous customers are most likely to subscribe to a new offer, based on historical patterns, you may discover there are nonobvious correlations in addition to obvious ones, as well as quite a bit of randomness. When graphing the data and doing exploratory statistical analyses don’t point you at a model that explains what’s happening, it might be time for machine learning.

Amazon’s approach to a machine learning service is intended to work for analysts who understand the business problem being solved, whether or not they understand data science and machine learning algorithms. As we’ll see, that intention gives rise to different offerings and interfaces than you’ll find in Microsoft Azure Machine Learning, although the results are similar.

With both services, you start with historical data, identify a target for prediction from observables, extract relevant features, feed them into a model, and allow the system to optimize the coefficients of the model. Then you evaluate the model, and if it’s acceptable, you use it to make predictions. For example, a bank may want to build a model to predict whether a new credit card charge is legitimate or fraudulent, and a manufacturer may want to build a model to predict how much a potential customer is likely to spend on its products.
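To make that workflow concrete, here is a minimal sketch in plain Python of the fraud example. It is not Amazon's or Microsoft's API, and nothing in it comes from either service: the data is synthetic, the two features (`amount` and `is_foreign`) are invented for illustration, and the model is an ordinary logistic regression fitted by gradient descent, chosen only because it is the simplest model with coefficients to optimize.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_history(n, seed=0):
    """Generate made-up 'historical' charges as ([amount, is_foreign], is_fraud)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        amount = rng.random()                          # charge size scaled to [0, 1]
        foreign = 1.0 if rng.random() < 0.3 else 0.0   # 1.0 = foreign merchant
        # Hidden rule plus randomness, standing in for messy real behavior.
        p_fraud = sigmoid(6.0 * amount + 2.0 * foreign - 5.0)
        label = 1 if rng.random() < p_fraud else 0
        rows.append(([amount, foreign], label))
    return rows


def predict_proba(weights, bias, features):
    """Return the model's estimated probability that a charge is fraudulent."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(rows, epochs=1000, learning_rate=1.0):
    """Optimize the model coefficients with full-batch gradient descent."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in rows:
            error = predict_proba(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def accuracy(weights, bias, rows, threshold=0.5):
    """Fraction of rows whose thresholded prediction matches the label."""
    hits = sum(
        (predict_proba(weights, bias, features) >= threshold) == bool(label)
        for features, label in rows
    )
    return hits / len(rows)


# Historical data -> train on one part, evaluate on the held-out part, then predict.
history = make_history(400)
train_rows, holdout_rows = history[:300], history[300:]
weights, bias = train(train_rows)
print("holdout accuracy:", accuracy(weights, bias, holdout_rows))
print("P(fraud) for a large foreign charge:", predict_proba(weights, bias, [0.95, 1.0]))
```

Holding back the last 100 rows is the "evaluate the model" step: the accuracy is measured on charges the model never saw during training, which is the only honest way to decide whether it is acceptable before using it for predictions.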

In general, you approach Amazon Machine Learning by first uploading and cleaning up your data; then creating, training, and evaluating an ML model; and finally generating batch or real-time predictions. Each step is iterative, as is the whole process. Machine learning is not a simple, static, magic bullet, even with the algorithm selection left to Amazon.
