Hadoop ready for corporate IT, execs say
Despite some concerns, Hadoop has a growing place in the enterprise, say IT execs from JP Morgan Chase, eBay
Computerworld - NEW YORK -- Despite some lingering technology issues, Hadoop is ready for enterprise use, IT executives said Tuesday at the Hadoop World conference here.
Larry Feinsmith, managing director at JP Morgan Chase, told a keynote audience that the financial services firm has been using the open source storage and data analysis framework for close to three years now and is currently leveraging the technology for fraud detection, IT risk management, self service and other applications.
Chase still relies heavily on core relational database technologies for transaction processing, but uses Hadoop-based products for a growing number of tasks, Feinsmith said. Five out of seven Chase business units use Hadoop in some way, he added.
Hadoop's ability to store vast volumes of unstructured data has allowed Chase to collect and store weblogs, transaction data and social media data, Feinsmith said.
The company is aggregating the data into a common platform, and runs a range of customer-focused data mining and data analytics applications to utilize it, he said.
With over 150 petabytes of online storage, 30,000 databases and 3.5 billion logins to Chase user accounts, data is the lifeblood of the company, Feinsmith said.
For the moment at least, relational database technologies appear to be more suited for running transaction applications, he said.
The big debate among technologists at the bank right now is whether incumbent relational database technologies will evolve to meet the bank's emerging big data needs, or whether Hadoop-based technology can become adept at transaction processing, Feinsmith said.
Hugh Williams, vice president of experience, search and platforms at eBay, said that the auction site is revamping its core search engine technology using Hadoop and HBase, a technology that enables real-time analysis of data in Hadoop environments.
The new eBay search engine, code-named Cassini, will replace the Voyager technology that's been used since the early 2000s. The update is needed in part because of the surging volumes of data that need to be managed, Williams said.
Williams said that eBay currently has more than 97 million active buyers and sellers and over 200 million items across 50,000 categories for sale. The auction site handles close to 2 billion page views, 250 million search queries and tens of billions of database calls each day, he said.
The company has 9 petabytes of data stored on Hadoop and Teradata clusters, and the amount of data is growing quickly, he said.
Hadoop and HBase allow eBay to build a far more sophisticated search engine than Voyager. Cassini will deliver more accurate and more context-based results to user search queries, he said.
With more than 100 engineers assigned to Project Cassini full time, the development effort is one of the largest ever at eBay.
