Best open source projects from tech company giants

Open source computing projects are collaborative efforts in which contributors work towards a common goal. Some of the most high-profile projects come from the Linux Foundation, a long-standing steward of the open source ecosystem.

However, in recent years, more companies have climbed on board the open source bandwagon, with tech giants like Facebook, Microsoft and Google all launching various open source projects.

Sometimes these take the form of businesses attempting to solve an internal problem, before offering up the solution to the wider community. Owning some of the biggest data centres in the world, these companies are forced to manage data on an unprecedented scale, requiring the development of pioneering techniques.

But why share their intellectual property with the wider world? In return for offering others a glimpse of their internal workings, these internet firms gain access to a vibrant community of developers who can help improve the companies' technology for free. For example, Facebook has claimed that its Open Compute Project has saved it $2 billion in data centre costs.

Here, we run through some of the most innovative and exciting open-source projects from tech giants.

Additional reporting from Laurie Clarke

Google - Kubernetes

Google is among the companies most actively involved in open source technology. On the developer platform GitHub, for example, Google boasts over 900 contributors and over 1,000 repositories.

Containerisation has been one of the biggest buzzwords of recent years. Google reportedly launches around two billion containers a week to manage applications in its data centres, having relied for years on its secretive Borg and Omega technologies to run workloads internally.

These platforms provided the basis for its open source Kubernetes container cluster management platform, which has been publicly available since June 2014. Kubernetes has been picked up by a range of large businesses looking for a lightweight alternative to virtual machines.

Although originally developed by Google, it is now run by the Cloud Native Computing Foundation.

Google - TensorFlow

At the heart of Google’s impressive search capabilities for Google Photos, voice recognition tools and Google Translate sits its AI system TensorFlow.

The machine learning tool was open sourced in November 2015 in order to help accelerate wider developments around the technology.

It now functions as an open source software library for dataflow programming, in which a computation is expressed as a graph of operations.


Facebook - Open Compute Project (OCP)

Facebook has taken an interesting approach to its open source endeavours, focusing on hardware in addition to software.

The social media giant launched the OCP initiative in 2011. This is a collaborative project which aims to redesign hardware technology to ease the growing demand on computing infrastructure.

“The result is that today we have open-sourced every major physical component of our data centre stack — a stack that is powerful enough to connect 1.39 billion people around the world and is efficient enough to have saved us $2 billion in infrastructure costs over the last three years. But we’re not finished — not even close,” the company said.

Initiatives include Yosemite, which Facebook claims is the 'first open source modular chassis for high powered microservers'.

Also involved in the project are companies including Intel, Goldman Sachs and Microsoft.

Facebook – PyTorch

Facebook uses deep learning internally to filter the information shown in Facebook feeds, for example, and has open sourced some of the modules it created as part of the PyTorch deep learning framework, which is hosted on GitHub.

There are a number of organisations developing this framework, including Twitter, Uber and Salesforce, and there are plenty of example projects and tutorials to be found on the site.

Twitter – Aurora, Storm

Twitter is a major open source software user, and has contributed back to the community in a number of ways.

Its Aurora framework was created by former Google engineer Bill Farner, taking its lead from Google’s Borg cluster management system.

Aurora builds on top of Apache Mesos and provides common features that allow any site to run large-scale production applications. It is able to make scheduling decisions, such as moving a service onto a healthy machine in the event of a failure, ensuring greater reliability.
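To make the idea of a scheduling decision concrete, here is a minimal sketch in plain Python. It is not Aurora's code or API: the `Machine` class, the `reschedule` function and the "least-loaded healthy machine" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    # Hypothetical model of a cluster node; not an Aurora or Mesos type.
    name: str
    healthy: bool = True
    services: list = field(default_factory=list)


def reschedule(machines):
    """Move services off unhealthy machines onto the least-loaded healthy one."""
    moves = []
    for failed in [m for m in machines if not m.healthy]:
        while failed.services:
            healthy = [m for m in machines if m.healthy]
            if not healthy:
                raise RuntimeError("no healthy machine available")
            service = failed.services.pop(0)
            # Assumed placement rule: pick the healthy machine running the fewest services.
            target = min(healthy, key=lambda m: len(m.services))
            target.services.append(service)
            moves.append((service, failed.name, target.name))
    return moves


cluster = [
    Machine("node-a", services=["web", "api"]),
    Machine("node-b", services=["cache"]),
    Machine("node-c"),
]
cluster[0].healthy = False  # simulate a machine failure
moves = reschedule(cluster)
for service, source, target in moves:
    print(f"moved {service}: {source} -> {target}")
```

A production scheduler would also weigh factors such as resource quotas and placement constraints; the sketch only shows the basic reaction to a failed machine.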

Other projects include Bootstrap and Storm, which is used to analyse large-scale data streams created by millions of Twitter feeds.

Twitter publishes a full list of its open source projects online.

Netflix – Chaos Monkey

As a major AWS user, Netflix wanted a way to test the resiliency of its applications running in the cloud. Chaos Monkey was born with the aim of artificially creating problems with virtual machines hosted by the public cloud provider. The project was later absorbed into the wider Simian Army suite, which tests that Netflix's systems are able to react to random failures on the network.
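The principle can be illustrated with a toy simulation. This is only a sketch and not Netflix's implementation: the `fleet` dictionary, the instance names and both functions are hypothetical, and a real tool would call the cloud provider's API rather than flip a flag in memory.

```python
import random


def unleash_monkey(instances, rng):
    """Terminate one randomly chosen running instance and return its name."""
    running = sorted(name for name, up in instances.items() if up)
    if not running:
        return None
    victim = rng.choice(running)
    instances[victim] = False  # stand-in for terminating a virtual machine
    return victim


def service_available(instances):
    """Assumption: the service survives as long as at least one instance is up."""
    return any(instances.values())


fleet = {"i-001": True, "i-002": True, "i-003": True}
rng = random.Random(42)  # seeded so that a run can be repeated
victim = unleash_monkey(fleet, rng)
print(f"terminated {victim}; service still available: {service_available(fleet)}")
```

The point of the exercise is the final check: if killing one instance takes the service down, the weakness is found in a controlled test rather than during a real outage.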

Netflix currently contributes to a number of open source projects, such as the big data tools Hadoop, Hive, Pig, Parquet, Presto and Spark. It has also open sourced a number of its Gradle plugins under the Nebula umbrella, and publishes a full list of its open source projects online.

LinkedIn – Kafka

Kafka was created by business networking site LinkedIn for internal use, before being open sourced in 2011.

The team of engineers that created the real-time, distributed messaging system left the company in 2014 to set up Confluent, a new business focused on Kafka.

Kafka counts a number of large tech companies among its users, such as Spotify, Netflix and Uber.

Airbnb - Airflow

Airflow is a data workflow management framework that is available under the Apache licence, supporting authoring, scheduling and monitoring of data pipelines.
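At its core, authoring and scheduling a pipeline means declaring which tasks depend on which and then running them in a valid order. The sketch below shows that idea using only Python's standard library (`graphlib`, available from Python 3.9). It does not use Airflow's own API, and the task names are invented for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical three-step pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "transform": {"extract"},
    "load": {"transform"},
}

# Resolve the dependencies into an order in which the tasks can safely run.
run_order = list(TopologicalSorter(pipeline).static_order())

for task in run_order:
    print(f"running {task}")
```

A workflow manager adds scheduling, retries and monitoring on top of this ordering step.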

Airbnb has also opened up its Aerosolve machine learning tool, which is used internally to support features such as its price recommendations engine for those renting out properties.

Google and Netflix - Kayenta

Google and Netflix partnered early in 2018 on the release of Kayenta, an open source project aiming to allow a wider audience to access the canary analysis tools Netflix developed internally. It's currently integrated into Spinnaker - an open source, multi-cloud continuous delivery platform.

Canary analysis exposes an update to a small share of production traffic and compares its behaviour with that of the existing version, providing early warning of problems before the update is rolled out across a company's infrastructure. Netflix now wants to share the software with the wider community, and benefit from its knowledge too.
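A heavily simplified version of such a check might look like the following. It is a sketch under stated assumptions, not Kayenta's method: the `canary_passes` function, the 10 percent tolerance and the sample error rates are all made up for illustration, and a real system would apply statistical tests across many metrics.

```python
from statistics import mean


def canary_passes(baseline, canary, tolerance=0.10):
    """Pass if the canary's mean error rate is within `tolerance` of the baseline's."""
    threshold = mean(baseline) * (1 + tolerance)
    return mean(canary) <= threshold


# Invented error-rate samples for the existing version and two candidate updates.
baseline_errors = [0.010, 0.012, 0.011, 0.009]
good_canary = [0.010, 0.011, 0.010, 0.011]
bad_canary = [0.020, 0.025, 0.019, 0.022]

print("good canary passes:", canary_passes(baseline_errors, good_canary))
print("bad canary passes:", canary_passes(baseline_errors, bad_canary))
```

An update that fails the check would be held back from the wider rollout.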

Copyright © 2018 IDG Communications, Inc.