Hadoop is ready for the enterprise, IT execs say
Big companies are using Hadoop systems in big projects, despite concerns about issues such as security.
Computerworld - Despite some lingering user concerns about security and other issues, Hadoop is ready for enterprise use, according to IT executives at the Hadoop World conference in New York earlier this month.
Larry Feinsmith, managing director of IT at JPMorgan Chase, told a keynote audience that the financial services firm has been increasingly using the open-source storage and data analysis framework for almost three years.
JPMorgan Chase still relies heavily on relational database systems for transaction processing, but it uses Hadoop technology for a growing number of purposes, including fraud detection, IT risk management and self-service, Feinsmith said.
With over 150 petabytes of data stored online, 30,000 databases and 3.5 billion log-ins to user accounts, data is the lifeblood of JPMorgan Chase, Feinsmith said.
Hadoop's ability to store vast volumes of unstructured data allows the company to collect and store Web logs, transaction data and social media data. "Hadoop allows us to store data that we never stored before," he said.
The data is aggregated into a common platform for use in a range of customer-focused data mining and data analytics tools, Feinsmith said.
Meanwhile, eBay is using Hadoop technology and the HBase database, which supports real-time analysis of Hadoop data, to build a new search engine for its auction site.
Hugh Williams, vice president of experience, search and platforms at eBay, said the new engine, code-named Cassini, will replace technology the company has used since the early 2000s. The update is needed in part to handle surging volumes of data.
He noted that eBay has more than 97 million active buyers and sellers and over 200 million items for sale in 50,000 categories. The site handles close to 2 billion page views, 250 million search queries and tens of billions of database calls daily, he added.
The company has 9 petabytes of data stored on Hadoop and Teradata clusters, and the amount is growing quickly, he said.
Williams said about 100 eBay engineers are working on the Cassini project, making it one of the company's largest development efforts.
The new engine, slated to go live next year, is expected to respond to user queries with results that are context-based and more accurate than those provided by the current system, he said.
Feinsmith warned that IT shops interested in Hadoop should be aware of potential security issues. And he explained that aggregating and storing data from multiple sources can create a slew of problems related to access control and data management, while raising questions about data entitlement and data ownership.
Feinsmith also listed other potential Hadoop drawbacks that users should be aware of before embarking on big projects.
For instance, he said the Hadoop marketplace is "very confusing," featuring an oft-changing slate of vendors, products and standards. In addition, skilled Hadoop engineers are scarce.
And Williams noted that related technologies, such as HBase, are still somewhat immature, which raises questions about system stability.
But Hadoop has plenty of potential. Feinsmith said that IT workers at JPMorgan Chase are debating whether relational database technologies will evolve to meet the bank's emerging big data needs, or whether Hadoop-based systems will become adept at transaction processing.
This version of this story was originally published in Computerworld's print edition. It was adapted from an article that appeared earlier on Computerworld.com.