
Auto market researcher revs up Oracle grid for massive data warehouse

R.L. Polk plans to bring all 2.5 petabytes of data into Oracle 10g

October 19, 2006 12:00 PM ET

Computerworld - Like a muscle car driving 55 mph on the freeway, R.L. Polk & Co.'s new grid-based data warehouse boasts gobs of untapped power under the hood, according to Kevin Vasconi, the company's CIO.

In May, the Southfield, Mich.-based automotive industry market research company finished moving its main 4TB customer-facing data warehouse to an Oracle 10g grid composed of Dell PowerEdge servers running Linux.

The move has helped R.L. Polk save money and improve data redundancy, availability and access time. It also supports Polk's new service-oriented architecture, which is improving customer service, Vasconi said.

"We are getting more bang for our buck," he said. The data warehouse is doing 10 million transactions a day "without any issues."

Encouraged by the experience so far, R.L. Polk is bringing onto the grid other databases, both domestic and overseas, that total 2.5 petabytes of actively managed data. It's a process that will take at least 18 months, Vasconi said. And the amount of data is expected to grow 30% per year for the foreseeable future.
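To put the growth figure in perspective, the short sketch below compounds the quoted 2.5 petabytes at 30% per year. It assumes simple annual compounding from the 2.5PB starting point; the function name and the three-year horizon are illustrative choices, not figures from R.L. Polk.

```python
def project_growth(start_pb, annual_rate, years):
    """Return the projected size in PB at the end of each year.

    Assumes the rate compounds once per year (an assumption for
    illustration; the article only states "30% per year").
    """
    sizes = []
    size = start_pb
    for _ in range(years):
        size *= 1 + annual_rate
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    # 2.5 PB and 30% are the figures quoted in the article.
    for year, size in enumerate(project_growth(2.5, 0.30, 3), start=1):
        print(f"Year {year}: {size:.2f} PB")
```

Under those assumptions the managed data would more than double within three years, which is consistent with the emphasis on headroom in the grid.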

Founded in 1870 -- the same year the automobile's predecessor, a motorized handcart, was invented in Germany -- R.L. Polk started as a publisher of business directories. It became a car information supplier in 1921 and began using computer punch cards in 1951. The company is best known to consumers for its Carfax database of car histories.

Before its recent move to Oracle grid technology, R.L. Polk stored most of its data in Oracle 9 or 10 databases running on Sun Solaris servers, connected to EMC gear in storage-area networks.

Now, R.L. Polk's grid consists of 100 two- and four-way servers, all running Red Hat Enterprise Linux. It also serves up applications and powers the rule-processing engine. It can "easily double" to 200 servers, providing room for growth.

Only a tiny portion of the grid -- four four-way servers -- is apportioned now to the data warehouse. Much of the rest is devoted to running R.L. Polk's new Web-based applications, which both import data into the data warehouse from 260 discrete sources, such as car dealers or state licensing boards, and stream it out to paying customers, such as carmakers, car dealers and parts suppliers.

The data warehouse serves as R.L. Polk's "single source of truth" on a massive database that includes 500 million individual cars, or almost 85% of all cars in the world as of 2002. It also includes data on 250 million households and 3 billion transactions.

R.L. Polk cleanses the names and addresses of all incoming records, adds location data such as latitude and longitude, and, in the case of the 17-character vehicle identification numbers unique to every car, extrapolates each car's individual features and styling. It's a complicated process, but as his team continues to tweak the Oracle grid engine, Vasconi expects to be able to shorten the importation time to less than 24 hours.
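As a rough illustration of the first step in working with a vehicle identification number, the sketch below splits a VIN into its three standard sections and verifies the check digit in position 9 that North American VINs carry. It is not R.L. Polk's method, which the article does not describe; the function names are hypothetical, and mapping the sections to a car's actual features would additionally require manufacturer-specific lookup tables that are not shown here.

```python
# Letter values for the North American check-digit calculation.
# The letters I, O and Q are not allowed in a VIN, so they are absent.
TRANSLIT = {
    **{str(d): d for d in range(10)},
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8,
    "J": 1, "K": 2, "L": 3, "M": 4, "N": 5, "P": 7, "R": 9,
    "S": 2, "T": 3, "U": 4, "V": 5, "W": 6, "X": 7, "Y": 8, "Z": 9,
}

# Positional weights; position 9 (the check digit itself) has weight 0.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def is_well_formed(vin):
    """True if the string has 17 characters, all of them permitted."""
    return len(vin) == 17 and all(ch in TRANSLIT for ch in vin)


def split_vin(vin):
    """Split a VIN into manufacturer, descriptor and identifier sections."""
    vin = vin.strip().upper()
    if not is_well_formed(vin):
        raise ValueError("not a well-formed 17-character VIN")
    return {
        "wmi": vin[:3],   # world manufacturer identifier
        "vds": vin[3:9],  # vehicle descriptor section (includes check digit)
        "vis": vin[9:],   # vehicle identifier section
    }


def check_digit_ok(vin):
    """Verify the position-9 check digit (weighted sum modulo 11)."""
    vin = vin.strip().upper()
    if not is_well_formed(vin):
        return False
    total = sum(TRANSLIT[ch] * w for ch, w in zip(vin, WEIGHTS))
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected
```

A cleansing pipeline of the kind described could plausibly reject or flag records that fail a check like this before attempting any feature lookup, though VINs issued outside North America do not necessarily use the check digit.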

Looking forward, Vasconi said data already stored on vehicles' on-board computers -- such as engine-trouble history, GPS-based location history, average speeds and so on -- will soon be imported into the data warehouse, too, if privacy issues can be resolved.

"The car is a gold mine of consumer information," Vasconi said.
