CIO - Facebook today "re-open-sourced" the Thrift binary communication protocol with its own internal branch of Thrift, which is designed to provide a new set of core features and crank up performance.
Facebook Software Engineer Dave Watson explains that the company always wants to choose the best tools and implementations for its backend services, regardless of programming language. By choosing programming languages on a case-by-case basis, it can optimize for performance, ease and speed of development, the availability of existing libraries and so on.
"To support this practice, in 2006 we created Thrift, a cross-language framework for handling RPC [remote procedure calls], including serialization/deserialization, protocol transport and server creation," Watson says. "Since then, usage of Thrift at Facebook has continued to grow. Today, it powers more than 100 services used in production, most of which are written in C++, Java, PHP or Python."
After a year of internal use, Facebook released Thrift to the open source community, where development of Apache Thrift continues. But, as Watson notes, while Apache Thrift gained wide use outside Facebook, IT organizations using it ran into performance concerns and issues separating the serialization and transport logic.
Inside Facebook, IT was running into similar issues as it gained experience running Thrift infrastructure. Watson says the team realized that Thrift was missing a core set of features, and that a lot more could be done for performance.
"For example, one issue we ran into was that internal service owners were constantly reinventing the same features again and again, such as transport compression, authentication and counters to track the health of their servers. Engineers were also spending a lot of time trying to eke more performance from their services."
"When Thrift was originally conceived, most services were relatively straightforward in design," Watson adds. "A Web server would make a Thrift request to some backend service, and the service would respond. But as Facebook grew, so did the complexity of the services. Making a Thrift request was no longer so simple. Not only did we have tiers of services (services calling other services), but we also started seeing unique feature demands for each service, such as the various compression or trace/debug needs.
"Over time, it became obvious that Thrift was in need of an upgrade for some of our specific use cases," Watson says. "In particular, we sought to improve performance for asynchronous workloads, and we wanted a better way to support per-request features."
The end result is fbthrift, which Facebook released today on GitHub. Watson says the largest changes are in the new C++ code generator (available as the new target language cpp2), as well as header transport and protocol changes for several languages, including C++, Python and Java. He adds that a number of services that have moved to the new cpp2-generated code have achieved up to a 50 percent decrease in latency and large decreases in memory footprint.
Watson notes that fbthrift doesn't reflect all Apache Thrift changes, but the team did track the upstream changes closely, and he adds that Facebook hopes to work with the Apache Thrift maintainers to incorporate the work on fbthrift.
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.