
Piecing Together the Data Picture

Data quality translates into companies having the right information at the right time to make decisions.
 


August 11, 2003 (Computerworld) -- Poor data quality can confuse your customers, undermine your applications or even put you out of business, but there is a great deal you can do about it. A data quality initiative goes beyond simple data cleansing, such as correcting a misspelled name or changing "Avenue" to "Street," to address more complex and subtle problems.


For example, one New York bank that had a 3% to 5% bad-debt ratio on its credit card operation acquired another bank, says Aaron Zornes, a San Francisco-based analyst at Meta Group Inc. "It turns out that the acquired bank had a 15% bad-debt ratio. The New York bank took over, and the bad debt nearly put them out of business," he says.


If the acquiring bank had had a data quality initiative to run large database-comparison jobs off-line, the problem could have been averted, says Zornes. Bank managers could have predicted the loan default rate by comparing the outstanding debt, incomes and even partial ZIP codes of the acquired bank's credit card customers against a historical database of similar customer profiles.
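The comparison Zornes describes can be sketched roughly as follows. This is an illustration only: the income and debt cutoffs, the three-digit ZIP prefix and the historical rates are hypothetical placeholders, and the function names `profile` and `expected_default_rate` are invented for this example rather than taken from any bank's system.

```python
def profile(customer):
    """Reduce a customer to a coarse profile: income band, debt band, ZIP prefix."""
    # Cutoffs below are arbitrary assumptions chosen for illustration.
    income_band = "low" if customer["income"] < 40_000 else "high"
    debt_band = "low" if customer["debt"] < 5_000 else "high"
    return (income_band, debt_band, customer["zip"][:3])


def expected_default_rate(customers, historical_rates, fallback):
    """Average the historical default rate found for each customer's profile.

    Profiles missing from the historical table use the fallback rate.
    """
    rates = [historical_rates.get(profile(c), fallback) for c in customers]
    return sum(rates) / len(rates)
```

Run against the acquired bank's full customer file, a result well above the acquirer's own 3% to 5% bad-debt ratio would be a warning sign before the deal closes.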


"They would have been able to tell that this company wasn't a good buy," Zornes says. "Enterprises cannot afford to wait on data quality efforts."


Data quality initiatives are critical to enterprise applications such as CRM and ERP systems, Zornes notes. And according to The Data Warehousing Institute in Seattle, data quality problems cost U.S. businesses more than $600 billion per year.


"The basis of any CRM system is the integrity of the data," says Steve Deeb, vice president for CRM at Monster Worldwide Inc. in Maynard, Mass. "Any and all processes are driven by that data."


In addition to business needs, there are now regulatory pressures to maintain better data, Zornes says. "If someone has bought a large amount of ammonia-based fertilizer, then rents a car," the U.S. Department of Homeland Security wants to know about it, he says. "And this isn't information you can wait months or even a week to find out."


The tools to improve data quality exist, says Zornes, but although "businesses give lip service to the need for data quality, too often they don't do anything about it."


James Eardley, a managing director of CRM at FleetBoston Financial Corp., agrees. "Data quality gets short shrift too often. It's not important until you need it," he says.


Although in dissimilar industries, FleetBoston and Monster both use CRM software from Siebel Systems Inc. in San Mateo, Calif., and faced similar data quality problems. Duplicate records in customer and contact databases meant one department didn't know what another was doing.


"What we were missing was a total picture of the customer relationship. We have multiple business sales forces following a single customer. It's hard enough to get one business unit's data clean. We now have 24," Eardley says.


"There's no consistency with how users enter customer and contact records," he continues. "Some people use upper- and lowercase; others use all uppercase." Today FleetBoston's system standardizes the data elements and does ZIP code lookups.
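A minimal sketch of this kind of standardization, in Python, might look like the following. It is not FleetBoston's or FirstLogic's actual logic: the field names and the small `ZIP_TABLE` lookup are assumptions standing in for a real postal reference file.

```python
# Hypothetical ZIP reference table; a real system would load a postal data file.
ZIP_TABLE = {
    "02110": ("Boston", "MA"),
    "54601": ("La Crosse", "WI"),
}


def standardize(record):
    """Return a copy of record with consistent casing and ZIP-derived city/state."""
    clean = dict(record)
    # Collapse repeated whitespace and normalize case so that
    # "JOHN   SMITH" and "john smith" end up identical.
    clean["name"] = " ".join(record["name"].split()).title()
    # Keep only the five-digit ZIP, dropping any ZIP+4 suffix.
    zip5 = record["zip"].strip()[:5]
    clean["zip"] = zip5
    # Overwrite city and state only when the ZIP is known to the table.
    if zip5 in ZIP_TABLE:
        clean["city"], clean["state"] = ZIP_TABLE[zip5]
    return clean
```

The built-in `str.title()` is a naive choice that mishandles names such as "McDonald"; commercial tools rely on much richer name and address rules.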


The company opted for data quality software from FirstLogic Inc. in La Crosse, Wis. Those tools, coupled with the Siebel software, "seemed to do exactly what we needed," Eardley says.


To prevent duplicate entries, when a user enters a record, the FirstLogic system generates a token, which it compares to others to see if the database has similar tokens. If it finds any, it shows them to the user to determine whether the record is a duplicate.
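The article does not say how FirstLogic builds or compares its tokens. As a rough illustration of the general idea, the sketch below assumes a token made from the normalized name plus the five-digit ZIP and uses Python's standard `difflib.SequenceMatcher` to score similarity. The function names and the 0.85 threshold are hypothetical.

```python
import re
from difflib import SequenceMatcher


def make_token(name, zip_code):
    """Build a crude match token: letters and digits of the name plus 5-digit ZIP."""
    key = re.sub(r"[^A-Z0-9]", "", name.upper())
    return f"{key}|{zip_code.strip()[:5]}"


def find_candidates(new_record, existing, threshold=0.85):
    """Return existing records whose token is similar to the new record's token.

    The threshold is an arbitrary assumption; tuning it is the kind of
    adjustment a team would make to get tokens "to their liking."
    """
    new_token = make_token(new_record["name"], new_record["zip"])
    candidates = []
    for record in existing:
        token = make_token(record["name"], record["zip"])
        if SequenceMatcher(None, new_token, token).ratio() >= threshold:
            candidates.append(record)
    return candidates
```

The candidates would then be shown to the user, who decides whether the new entry really is a duplicate.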


"We had to work a little bit to get the tokens to our liking, and then it worked fine," Eardley says. "We also run batch jobs monthly to identify and fix any duplicates." Any records that the system can't resolve go to the business side for review.


Monster Problem


Similar data inconsistencies undermined confidence in Monster's system, says Deeb. Duplicates and unidentified accounts in the Siebel system made it difficult to know which database to use for ordering or invoicing, he says. And the sales staff wasn't getting the support it needed.


Initially, Deeb says, "we didn't see a product that mapped directly into what we were doing." But after building its own address-matching application, the company found that it needed a more strategic tool and more sophisticated analysis than its in-house application could offer.


About a year and a half ago, Monster took another look at the field and chose the Trillium Siebel connector from Trillium Software, a division of Harte-Hanks Inc. in Billerica, Mass.


"When we were looking at the ROI, the ease with which the Trillium product could be integrated into our systems was attractive," Deeb says. "We leveraged the strength of the Trillium core product—such as the way name and address databases from around the world can be plugged in—and integrated it into our processes in a way that made sense to the way we do business."


Now, when a record is entered, the system evaluates in real time whether it's new or a modification of an existing record. The company also runs data quality checks in batches to ensure that duplicates aren't introduced when it incorporates a new mailing list into its existing database. The batch checks are also run at regular intervals to minimize data degradation. In addition to the IT resources dedicated to maintaining data quality, business staffers are assigned to monitor the system and resolve anomalies.
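A batch merge of a mailing list along these lines could be sketched as below. This is not Trillium's method: the matching rule (normalized name plus five-digit ZIP) and the names `match_key` and `merge_mailing_list` are assumptions made for illustration.

```python
def match_key(record):
    """Normalize name and ZIP into a comparison key (a deliberately simple rule)."""
    name = "".join(ch for ch in record["name"].upper() if ch.isalnum())
    return (name, record["zip"].strip()[:5])


def merge_mailing_list(master, incoming):
    """Merge incoming records into master in place; return (added, matched)."""
    index = {match_key(r): r for r in master}
    added = matched = 0
    for rec in incoming:
        key = match_key(rec)
        existing = index.get(key)
        if existing is None:
            master.append(rec)
            # Indexing the new record also catches duplicates within the list itself.
            index[key] = rec
            added += 1
        else:
            # Treat the entry as a modification: fill in only the fields the
            # existing record lacks, and never overwrite what is already there.
            for field, value in rec.items():
                existing.setdefault(field, value)
            matched += 1
    return added, matched
```

Records that match only loosely would not be merged automatically in practice; they would go to the business staffers who resolve anomalies.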


It's the essence of analytical CRM, Deeb says. "Real-time analysis to determine the right offer to the right customer at the right time in a predictable manner is driven by the quality of customer data supporting that analysis," he says.


But most companies believe that their data is cleaner and more accurate than it is, says Wayne Eckerson, The Data Warehousing Institute's education and research director. He cites as one example an insurance company that each month gets 2 million claims, each with 377 data elements. At an error rate of 0.1% for all claims data, that's 754,000 errors monthly, or about 9.05 million errors annually. If 10% of data elements are critical to its business decisions, the company each year must correct nearly 1 million errors that could damage its ability to conduct business. Estimating the risk cost at $10 per error, poor data quality costs the company close to $10 million annually in erroneous payouts.
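Worked through step by step, the example's inputs give 754,000 errors a month, 9,048,000 a year, 904,800 critical errors a year and $9,048,000 in annual cost, figures the example rounds to 1 million errors and $10 million. The short calculation below reproduces the arithmetic; the function and parameter names are invented for this sketch.

```python
def annual_error_cost(claims_per_month=2_000_000, elements_per_claim=377,
                      errors_per_thousand=1, critical_percent=10,
                      cost_per_error=10):
    """Work through the insurance example with integer arithmetic.

    errors_per_thousand=1 corresponds to the 0.1% error rate in the example.
    """
    monthly_errors = (claims_per_month * elements_per_claim
                      * errors_per_thousand // 1000)
    annual_errors = monthly_errors * 12
    critical_errors = annual_errors * critical_percent // 100
    return {
        "monthly_errors": monthly_errors,
        "annual_errors": annual_errors,
        "critical_errors": critical_errors,
        "annual_cost_dollars": critical_errors * cost_per_error,
    }
```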


"It's bewildering," says Eckerson, "but almost half of all companies have no plan for managing data quality." Responsibility for data quality often rests with IT staffers, who make their decisions based on the tools available.


Data Quality Means Business


"First and foremost, data quality is a business issue," says Ted Friedman, an analyst at Gartner Inc. in Stamford, Conn. "But the solution is the proverbial three-legged stool: people, process and technology."


The first step in a data quality initiative is to analyze what the data is and how it's used, Friedman says.


GMAC Mortgage Corp. in Horsham, Pa., followed this measured course in its data quality initiative. When interest rates went into free-fall a year and a half ago, the first thing the company's CEO wanted employees to do "was cope with a 300% to 400% increase in daily business of people refinancing mortgages," says David Adams, GMAC's enterprise data access manager.


Tuning the Oracle database that supported application processing improved performance, he says, "but it also opened our eyes to the need to go further and address the quality of the data itself." And with GMAC beginning a major overhaul of its data warehouse—"actually, it was more a large tank of data than a data warehouse," says Adams—the timing was right to launch a data quality initiative.


"To compete on the other side of the refinancing boom, we were going to have to have better, cleaner data to get the accurate analyses that the CEO wanted and that we needed to make the most of our operation," he says.


Adams brought in a data quality consultant to explain to the executive council what the project would entail. Adams and his team researched the data quality tools, ran two pilots and then selected software from Ascential Software Corp. in Westboro, Mass. The Ascential product was more expensive and took more work to get going than some less sophisticated tools, he says. But Adams was sold on the software's heuristic logic, which let it adapt to GMAC's operation.


"The ETL [extract, transform and load] technology is pretty mature, and it works well," says Adams. "But it's the data quality and metadata stuff that's going to give you the great advances."


Physically merging databases would have required that every division agree on a single definition for each data element, which was "probably impossible," Adams says.


Instead, metadata resides in Ascential DataStage and links divisional databases at the logical level, with "pointers" indicating the source of the data. Each division's database remains inviolate.


Each division can decide what data can be shared and with whom, which is important for adhering to government regulations. Other tools couldn't deliver that granularity of control, says Adams.
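The idea of linking databases logically through metadata pointers, with each division controlling who may see its data, can be illustrated with a toy example. Nothing here reflects how Ascential DataStage is actually built or configured: the division names, field names, catalog layout and `resolve` function are all hypothetical.

```python
# Each division keeps its own store and its own field names (all hypothetical).
DIVISION_DATA = {
    "retail": {"cust_42": {"loan_amt": 250_000}},
    "wholesale": {"cust_42": {"principal": 1_200_000}},
}

# Metadata: one logical element name mapped to pointers showing where
# each division holds that element. The divisional data is never copied.
CATALOG = {
    "loan_principal": {"retail": "loan_amt", "wholesale": "principal"},
}

# Each division decides which requesters may read its data.
SHARING = {
    "retail": {"reporting", "risk"},
    "wholesale": {"reporting"},
}


def resolve(element, customer_id, requester):
    """Follow the catalog's pointers; return {division: value} for each
    division that shares its data with the requester."""
    result = {}
    for division, field in CATALOG[element].items():
        if requester not in SHARING[division]:
            continue
        record = DIVISION_DATA[division].get(customer_id)
        if record is not None and field in record:
            result[division] = record[field]
    return result
```

Because the divisions never have to agree on a single physical definition, only the catalog changes when a division renames a field or revises its sharing rules.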


The team installed the software in January and, working with the data warehousing team, went live in May with a relatively small application for new credit policy reporting. The first large data mart, to support all reporting for GMAC's wholesale operations, will go live Aug. 15.


"Information is a critical asset," says Meta's Zornes. "We need to change the way we think about it. It may sound like science fiction now, but in the future, companies will certify information the way we certify works of art and financial instruments, i.e., by assigning that information asset's value and origination."


Lais is a Computerworld contributing writer in Takoma Park, Md.

FOUR WAYS TO BUILD A MASTER FILE

1. Synchronized master. Use middleware to synchronize data in its native store and create a logical master in real time. Best for companies with low data velocity.

2. Application-specific master. Pick one operational application, such as CRM, to be the master. Best for companies with data primarily in one application.

3. Customer master overlay. Use a third-party, application-agnostic overlay, a common choice of big banks and insurance companies. Best for vertical industries, such as banking, insurance and travel.

4. Data-warehouse-based master. Create a data-store-like structure to straddle operational and analytical environments. The store holds recent, transaction-level data; the warehouse holds summaries and data analyses. Best for companies with low operational data latency needs.



Source: Meta Group Inc., Stamford, Conn.

FOUR ENEMIES OF DATA QUALITY

Denial. IT managers assume that old data will serve new uses without being re-engineered.


Deception. They assume that their new ERP or CRM software will solve the problem.


Deflection. They shift responsibility for data quality to someone else—users, IT, those doing data entry or the systems integrator implementing the new system.


Deferral. IT managers think they can put off fixing data quality until after the new system is implemented.



Source: Stephen Brown, Ascential Software Corp., Westboro, Mass.




