Review: The best frameworks for machine learning and deep learning

TensorFlow, Spark MLlib, Scikit-learn, MXNet, Microsoft Cognitive Toolkit, and Caffe do the math

Become An Insider

Sign up now and get FREE access to hundreds of Insider articles, guides, reviews, interviews, blogs, and other premium content. Learn more.
At a Glance

Over the past year I've reviewed half a dozen open source machine learning and/or deep learning frameworks: Caffe, Microsoft Cognitive Toolkit (aka CNTK 2), MXNet, Scikit-learn, Spark MLlib, and TensorFlow. If I had cast my net even wider, I might well have covered a few other popular frameworks, including Theano (a 10-year-old Python deep learning and machine learning framework), Keras (a deep learning front end for Theano and TensorFlow), and DeepLearning4j (deep learning software for Java and Scala on Hadoop and Spark). If you’re interested in working with machine learning and neural networks, you’ve never had a richer array of options.  

There's a difference between a machine learning framework and a deep learning framework. Essentially, a machine learning framework covers a variety of learning methods for classification, regression, clustering, anomaly detection, and data preparation, and it may or may not include neural network methods. A deep learning or deep neural network (DNN) framework covers a variety of neural network topologies with many hidden layers. These layers comprise a multistep process of pattern recognition. The more layers in the network, the more complex the features that can be extracted for clustering and classification.

Caffe, CNTK, DeepLearning4j, Keras, MXNet, and TensorFlow are deep learning frameworks. Scikit-learn and Spark MLlib are machine learning frameworks. Theano straddles both categories.

In general, deep neural network computations run an order of magnitude faster on a GPU (specifically an Nvidia CUDA general-purpose GPU, for most frameworks), rather than on a CPU. In general, simpler machine learning methods don't need the speedup of a GPU.

While you can train DNNs on one or more CPUs, the training tends to be slow, and by slow I'm not talking about seconds or minutes. The more neurons and layers that need to be trained, and the more data available for training, the longer it takes. When the Google Brain team trained its language translation models for the new version of Google Translate in 2016, they ran their training sessions for a week at a time, on multiple GPUs. Without GPUs, each model training experiment would have taken months.

To continue reading this article register now

Download the 2018 Best Places to Work in IT special report
Shop Tech Products at Amazon