Look before you leap into Hadoop
Analysts and early users warn that most data centers lack the analytics expertise needed for the open-source big data technology.
Computerworld - Now that Apache.org has listed more than 150 enterprises as Hadoop users -- including JPMorgan Chase, IBM, Google, Booz Allen Hamilton and the New York Times -- it seems likely that the big data management system could soon become all the rage among corporate IT executives.
But analysts and early users warn that companies should move slowly to take advantage of the open-source technology, noting that Hadoop requires extensive training along with analytics expertise not seen in many IT shops today.
Some also noted that the swollen ranks of suppliers of Hadoop technology could soon thin out, leaving some users without vendor support for the complex technology.
To be sure, Hadoop has some technical advantages over traditional database management systems, especially its ability to handle both structured data and unstructured information, such as video, audio and email messages, at the same time. Hadoop systems can also scale with minimal fuss.
Forrester Research analyst James Kobielus pointed out that only about 1% of U.S. enterprises are currently using Hadoop in production environments. That figure should remain small for now, perhaps growing to 2% or 3% over the course of the year, he projected.
Concurrent Computer and eBay may be typical of today's early Hadoop adopters: they use the big data technology for specific applications while maintaining traditional relational database technology for the bulk of their IT operations.
As such IT operations build up expertise, they can figure out more things to do with Hadoop, Kobielus said.
Online auction house eBay stores unstructured data on Hadoop-based clusters running on "thousands" of nodes, while using relational databases for key tasks like transaction processing, said Hugh Williams, vice president of experience, search and platforms.
"We see value in using multiple technologies to work with our data," Williams said. "Hadoop is a terrific choice for certain uses, while other technologies work alongside it for other purposes."
In the long term, he said, the idea is to remain "flexible in what technologies we use; we don't see a world [with] one unifying technology."
Concurrent, a maker of video-streaming systems, uses Hadoop to "do the heavy lifting, such as large-scale data processing," said William Lazzaro, director of engineering.
Concurrent continues to use multiple relational databases, including MySQL, PostgreSQL and Oracle, for other tasks, Lazzaro added.
Kobielus also warned that today's market for Hadoop technology is "turbulent," with a fast-growing community of vendors that continues to "rapidly evolve."
Marcus Collins, an analyst at Gartner, suggested that IT managers take the time needed to seek out hard-to-find Hadoop experts before getting too immersed in the technology. "You need to train your staff and invest in analytics," he said.
"It's not trivial," agreed eBay's Williams. "We've put a lot of training in place, so our engineers know how to use Hadoop and can write code. Don't underestimate that."
Analysts and users also stressed the need to educate corporate executives on the use of an open-source system for mission-critical applications.
Using it for a few under-the-radar kinds of projects is one thing, but using it to develop a massive system for all the world to see is another thing entirely.
Weiss is a freelance technology writer.
This version of this story was originally published in Computerworld's print edition. It was adapted from an article that appeared earlier on Computerworld.com.