Massive data volumes making Hadoop hot
Computerworld - Corporate efforts to glean business intelligence from the massive volumes of data generated by Web server logs and social media have led to a surge of interest in open-source Hadoop software.
Hadoop is designed to process terabytes and even petabytes of unstructured and structured data. It breaks large workloads into smaller data blocks that are distributed across a cluster of commodity hardware for faster processing.
The technology -- already used by Web giants such as Facebook, eBay, Amazon and Yahoo -- is increasingly being adopted by banking, advertising, biotech and pharmaceutical companies, said Stephen O'Grady, an analyst at RedMonk.
Tynt Multimedia, a Web analytics firm that collects and analyzes nearly 1TB of data per day, switched to Hadoop about 18 months ago when its MySQL database system began collapsing under the sheer volume of data it was collecting, said Cameron Befus, Tynt's vice president of engineering.
Relational database systems are good at data retrieval and queries but don't accept new data quickly. "Hadoop reverses that. You can put data into Hadoop at ridiculously fast rates," Befus said. But Hadoop requires programming tools such as Pig or Hive to write SQL-like queries to retrieve the data.
This version of this story was originally published in Computerworld's print edition. It was adapted from an article that appeared earlier on Computerworld.com.
Read more about Applications in Computerworld's Applications Topic Center.


- Excel 2010 Cheat Sheet
- Register for this Computerworld Insider Cheat Sheet and gain access to hundreds of premium content articles, guides, product reviews and more.
- Establishing a Strategy for Database Security is No Longer Optional
- The options for securing increasingly valuable databases are very broad and deep, and can be confusing. This research provides an overview of three...
- Driving Secure Enterprise File Sharing and Syncing in the Enterprise
- GroupLogic's new activEcho is the industry's only secure Enterprise File Sharing and Synching solution that balances the need for simplicity for the end...
- The Enterprise File Sharing Option
- Enterprises and IT departments need to address several critical security issues when considering file sharing and syncing products. Many of today's solutions do...
- Activities Streams Base An Integrated Social Layer
- The enterprise social software market is exploding thanks to converging trends of consumerization, cloud, and mobile. In this must-read report, "The Forrester Wave:...
- Converged Infrastructure for Dummies
- As you know, everything is mobile, connected, interactive, and immediate. This is exactly why organizations need a highly agile IT infrastructure in order... All Applications White Papers
- Delivery Management -- Extending Lifecycle Management
- Date: Wednesday, June 20, 2012, 1:00 PM EDT
Siloed organizations continue doing the wrong things and doing things wrong, leading to increased costs,... - Leverage automation today to reduce IT complexity
- Date: Tuesday, June 5, 2012, 2:00 PM EDT
Whether your B2B complexity is caused by multiple technologies due to M&A, business or application specific... - BMC Control-M - Single Point of Control Demo
- With BMC Control-M, you schedule and manage everything - down to the very last platform and application - from one simple interface. It's...
- Operational Analytics - Changing the Competitive Dynamics of the Business
- Date/Time: June 5, 2012, 11:00 a.m., EDT, 4:00 p.m. BST / 3:00 p.m. UTC
Please join us for this webcast, as Dr. Barry... - Oracle Database Appliance Best Practices
- Business users increasingly demand 24x7 availability of their data while IT departments face the challenge of ensuring maximum availability while operating with limited... All Applications Webcasts