Data Warehouse Boost on a Budget
Start-ups are challenging established data warehouse vendors with products that increase performance for ad hoc queries but cost less.
Computerworld - When Premier Inc.'s medical databases began bogging down last year, the San Diego-based provider of clinical data put its data warehouse in a box, literally.
Premier sells access to clinical data it gathers from 400 hospitals to pharmaceutical manufacturers. Last year, the company's IBM Red Brick data warehouse had grown to 3TB, and one table included 3 billion entries. "When you go through 3 billion rows of data, you get long runtimes," says Chris Stewart, director of data warehouse architecture.
The problem wasn't just the size of the database, however, but how clients used the data. "Our users want to access all of the data from top to bottom," says Stewart, and the complex, multipass queries created by Premier's 4,000 users each week were slowing performance. Some wouldn't run at all.
Instead of adding to its 24-processor Solaris server infrastructure or making further attempts to optimize the database, Stewart brought in an all-inclusive data warehouse appliance from Netezza Corp. in Framingham, Mass. Some calculations that took one or two days now finish in six to eight minutes on the appliance's 108 processors. Premier still uses Red Brick for most queries, but the NPS 8150 appliance handles the "really, really ugly questions" that weren't possible to process before, he says. "We couldn't offer the product offerings we do today" without the appliance, Stewart says.
As data warehouses continue to grow, more users are demanding access to business intelligence (BI) tools to conduct data-mining exercises across large data sets. "We're talking about using every single call-detail record generated in the last three years," says Claudia Imhoff, president of Intelligent Solutions Inc., a consulting firm in Boulder, Colo. It's hard for database administrators (DBAs) to create aggregations of data, such as summarizations, that can facilitate the processing of these complex queries because users often don't know in advance what they're looking for. "These unplanned questions are the ones that knock the stuffing out of databases," she says.
But such queries are increasingly seen as business-critical, says William Fellows, an analyst at The 451 Group in New York. "The problem of querying data sets that are growing at over 100% a year has led to what might be called a data warehouse capability gap," he says. While market leaders like Teradata, a division of NCR Corp. in Dayton, Ohio, offer integrated systems to address this for high-end applications, Netezza and others are jumping in with moderately priced systems that don't require the same high-end hardware and software investments as those from IBM, Oracle Corp. and Teradata.
It's an interesting trend but still a small part of the $16 billion market for data warehouse hardware and software, says Dan Vesset, an analyst at IDC.


