Massive data volumes making Hadoop hot
Computerworld - Corporate efforts to glean business intelligence from the massive volumes of data generated by Web server logs and social media have led to a surge of interest in open-source Hadoop software.
Hadoop is designed to process terabytes and even petabytes of unstructured and structured data. It breaks large workloads into smaller data blocks that are distributed across a cluster of commodity hardware for faster processing.
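The split-and-distribute idea can be illustrated with a small Python sketch. This is not Hadoop code: a local thread pool stands in for cluster nodes, and the sample log lines, the block size, and the function names (`split_into_blocks`, `count_pages`, `process`) are all assumptions made up for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(lines, block_size):
    """Break the input into fixed-size blocks of lines."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def count_pages(block):
    """Process one block on its own: count requests per page."""
    # Each hypothetical log line looks like "<METHOD> <PAGE>".
    return Counter(line.split()[1] for line in block)


def process(lines, block_size=2, workers=3):
    """Split the input, process blocks in parallel, then merge the results."""
    blocks = split_into_blocks(lines, block_size)
    # The thread pool is a stand-in for the machines in a cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_pages, blocks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Made-up sample data, not taken from any real server log.
log_lines = [
    "GET /home",
    "GET /about",
    "GET /home",
    "POST /login",
    "GET /home",
]
totals = process(log_lines)
```

Because each block is counted independently, adding more workers lets more blocks be handled at once; only the final merge needs to see every partial result.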
The technology -- already used by Web giants such as Facebook, eBay, Amazon and Yahoo -- is increasingly being adopted by banking, advertising, biotech and pharmaceutical companies, said Stephen O'Grady, an analyst at RedMonk.
Tynt Multimedia, a Web analytics firm that collects and analyzes nearly 1TB of data per day, switched to Hadoop about 18 months ago when its MySQL database system began collapsing under the sheer volume of data it was collecting, said Cameron Befus, Tynt's vice president of engineering.
Relational database systems are good at data retrieval and queries but don't accept new data quickly. "Hadoop reverses that. You can put data into Hadoop at ridiculously fast rates," Befus said. Getting the data back out, however, requires higher-level tools such as Hive, which offers a SQL-like query language, or Pig, which offers a data-flow scripting language.
This version of this story was originally published in Computerworld's print edition. It was adapted from an article that appeared earlier on Computerworld.com.