Q&A: Netezza to focus on workload optimized systems, CEO says

One-size-fits-all model doesn't work in data warehouse market, Jim Baum says

With its $1.7 billion purchase of Netezza last year, IBM has acquired a company widely regarded as one of the most disruptive in the data warehouse market.

In this interview with Computerworld, Netezza CEO Jim Baum talks about what the acquisition means for enterprises and discusses the trends that are driving the market and shaping the company's products.

What does IBM's acquisition of Netezza mean for customers? Why should they care? The Netezza acquisition by IBM is very much along the lines of supporting and creating the infrastructure that is required to support business analytics [applications]. Netezza is still Netezza. Our team is together, our engineering is together, our field support mechanisms are together. So, from a customer perspective, they are dealing by and large with the Netezza they have been dealing with forever. That said, we are growing the business substantially. We have more resources available in the field and in more geographies, including places where we haven't been before. In general, our customers are seeing more scale, they are seeing more resources behind us, and they are seeing opportunities for us to gain leverage from the rest of IBM.

Netezza has made quite an impact in the data warehouse market with its appliance approach. Why has it worked so well? What really created the opportunity here is that many, if not most, of the early data warehousing environments were very complex. A customer would have to go and procure storage and compute capability in the form of whatever server they wanted to use. They would have to go and buy software, and then they would have to go find a service provider. And then all of that stuff would get integrated in the customer's environment to create a data warehouse. That data warehouse would then serve the various analytics and reporting needs of the business. The other driver, of course, is the number of actual end users accessing the information in these warehouses. Many of the installations we have dealt with over the years have been plagued by the very high cost of scaling. The appliance model has given us an opportunity to dramatically improve the performance of those environments with a very easy-to-deploy, easy-to-maintain, very fast time-to-value solution.

How much better are your systems really? A lot of people in the industry talk about the cost per terabyte of building these environments. The cost per terabyte is typically not the issue. These are mission-critical applications that people are using to drive near-real-time business decisions. So the real driver here becomes performance. How much data, how fast can you access that data, how many users can access that data? One of our customers is a company called MediaMath in New York. They are in the business of pricing [advertising] real estate in near real time. Sort of microsecond response times to actually set a price for a piece of Internet advertising real estate and then have an auction run against that price to actually sell that piece of advertising real estate. This is a business that in fact can't exist without the ability to run complex analytics on very large data. For them, the issue is not about cost per terabyte. It is about price performance. If you go in and make a customer's performance two times faster, that's interesting. But when you can make it an order of magnitude or two or three greater then it actually changes their business.

What's driving demand for your kind of products? I think there are very real and undeniable trends on a couple of fronts here. Businesses have recognized the ability, and the need, to extract usable business decision-making information from the data that defines their business and the ecosystem in which it operates. If you look at just about any company in the world today, they are producing extraordinary volumes of data. They are producing data in their supplier systems, in their customer systems, in their customer relationship management systems, in their ERP systems. [Companies are] going beyond traditional BI into what I would call business analytics and business optimization. Customers are now looking at the data and seeing that they actually are able to make predictions based on that data. Trend No. 2 is the amount of data we have to deal with here. It wasn't long ago that we all thought that a terabyte of data was a lot. Now we are starting to talk in terms of hundreds of terabytes and petabytes, and soon we will be talking in terms of exabytes. Our whole idea is that if we can make the data more manageable, then we can make it more accessible to a broader group of end users.

How do you see these trends affecting Netezza's product strategy over the next few years? In the Netezza world, you will see products that are purpose-built to support the task of archiving very large volumes of data. It is one of the things that I think makes us fit with IBM extremely well: this idea of workload-optimized environments. If you look at one of the latest workload-optimized computers that IBM is talking about, it's the Watson computer. That's a system that's purpose-built to play Jeopardy. It is, frankly, a breakthrough in natural language processing and text analytics, being able to provide responses to these Jeopardy questions in under three seconds. It is a very good example of a workload-optimized system. So that's the theme here.

Do emerging BI applications even need a data warehouse anymore? Yes, absolutely. I think one of the things you hear a lot about in the market right now, especially on this topic, comes from Oracle. They are out there positioning Exadata as all things to all people. It will do OLTP [online transaction processing], it will do ODS [operational data store workloads], it will do data warehousing, it will fold your shirts. It's a very interesting positioning of theirs. Because what we found at Netezza, and certainly what IBM has found over many decades, is that there really is no one size fits all here. This idea of a workload-optimized system for the task at hand is really quite important. I think when you take a product like Exadata and you say to the market that it does it all, what have you done? What you have really done is create a system that is a very complex integration of hardware and software that is designed to meet the needs of everyone and therefore probably doesn't fully meet the needs of anyone. We still see a very clear need for purpose-built, workload-optimized solutions like Netezza for warehousing, like Watson for winning Jeopardy games, and like other solutions that IBM has for OLTP and other workloads.

Jaikumar Vijayan covers data security and privacy issues, financial services security and e-voting for Computerworld. Follow Jaikumar on Twitter at @jaivijayan, or subscribe to Jaikumar's RSS feed. His e-mail address is jvijayan@computerworld.com.

Copyright © 2011 IDG Communications, Inc.