Data Stewards Seek Data Conformity

They go by a variety of titles, but these analysts work with IT and business groups to improve data quality and standardization.

By Mary Brandel
March 15, 2004 12:00 PM ET

Computerworld - A customer is a customer is a customer, right? Actually, it's not that simple. Just ask Emerson Process Management, an Emerson Electric Co. unit in Austin that supplies process automation products. Four years ago, the company attempted to build a data warehouse to store customer information from over 85 countries. The effort failed in large part because the structure of the warehouse couldn't accommodate the many variations on customers' names.
For instance, different users in different parts of the world might identify Exxon as Exxon, Mobil, Esso or ExxonMobil, to name a few variations. The warehouse would see them as separate customers, and that would lead to inaccurate results when business users performed queries.
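The name-variation problem can be illustrated with a small sketch. The variant spellings below come from the example above; the alias table, the choice of "ExxonMobil" as the canonical name and the function names are illustrative assumptions, not a description of Emerson's actual system.

```python
# Hypothetical alias table: the variants are taken from the article's example,
# but the table and the canonical name chosen here are assumptions.
CUSTOMER_ALIASES = {
    "exxon": "ExxonMobil",
    "mobil": "ExxonMobil",
    "esso": "ExxonMobil",
    "exxonmobil": "ExxonMobil",
}


def canonical_customer(name: str) -> str:
    """Map a raw customer name to its canonical form, if an alias is known."""
    # Remove all whitespace and ignore case before looking up the alias.
    key = "".join(name.split()).lower()
    # Unknown names pass through unchanged apart from surrounding whitespace.
    return CUSTOMER_ALIASES.get(key, name.strip())


def count_customers(names) -> int:
    """Count distinct customers after standardizing their names."""
    return len({canonical_customer(n) for n in names})


raw = ["Exxon", "Mobil", "Esso", "ExxonMobil", " exxon mobil "]
print(len(set(raw)))         # distinct names as entered
print(count_customers(raw))  # distinct customers after standardization
```

Without the lookup, each spelling counts as a separate customer, which is the kind of inflated result the first warehouse would have returned to a query.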
That's when the company hired Nancy Rybeck as data administrator. Rybeck is now leading a renewed data warehouse project that ensures not only the standardization of customer names but also the quality and accuracy of customer data, including postal addresses, shipping addresses and province codes.
To accomplish this, Emerson has done something unusual: It has started to build a department with six to 10 full-time "data stewards" dedicated to establishing and maintaining the quality of data entered into the operational systems that feed the data warehouse.
The practice of having formal data stewards is uncommon. Most companies recognize the importance of data quality, but many treat it as a "find-and-fix" effort, to be conducted at the end of a project by someone in IT. Others casually assign the job to the business users who deal with the data head-on. Still others may throw resources at improving data only when a major problem occurs.
"It's usually a seesaw effect," says Chris Enger, formerly manager of information management at Philip Morris USA Inc. "When something goes wrong, they put someone in charge of data quality, and when things get better, they pull those resources away."
Creating a data quality team requires gathering people with an unusual mix of business, technology and diplomatic skills. It's even difficult to agree on a job title. In Rybeck's department, they're called "data analysts," but titles at other companies include "data quality control supervisor," "data coordinator" or "data quality manager."
"When you say you want a data analyst, they'll come back with a DBA [database administrator]. But it's not the same at all," Rybeck says. "It's not the data structure, it's the content."
At Emerson, data analysts in each business unit review data and correct errors before it's put into the operational systems. They also research customer relationships, locations and corporate hierarchies; train overseas workers to fix data in their native languages; and serve as the main contact with the data administrator and database architect for new requirements and bug fixes.
As the leader of the group, Rybeck plays a role that includes establishing and communicating data standards, ensuring data integrity is maintained during database conversions and doing the logical design for the data warehouse tables. She'll also oversee implementation of the Group 1 Software Inc. data cleansing system and work with The Dun & Bradstreet Corp., whose database is used for company-name standardization and hierarchies.
The analysts have their work cut out for them. Bringing together customer records from the 75 business units yielded a 75% duplication rate, along with misspellings and fields with incorrect or missing data.
"Most of the divisions would have sworn they had great processes and standards and place," Rybeck says. "But when you show them they entered the customer name 17 different ways, or someone had entered, 'Loading dock open 8:00-4:00' into the address field, they realize it's not as clean as they thought."
Multitalented
Although the data steward may report to IT—as is the case at Emerson and at pharmaceuticals company Sanofi-Synthelabo Inc.—it's not a job for someone steeped in technical knowledge. Yet it's not right for a business person who's a technophobe, either.
What you need is someone who's familiar with both disciplines, like Seth Cohen. Cohen is the first data quality control supervisor at Sanofi in New York. He was hired a year ago to help design automated processes to ensure the data quality of the customer knowledge base that Sanofi was beginning to build.
Cohen has enough technical skills to be able to spec out a data-cleansing system and then work with a developer to make sure that the system is written correctly.
But having worked in the pharmaceuticals field for three and a half years, he also knows the industry's specific business rules and understands the most important data concerns that must be addressed during the requirements-gathering stage.
Data stewards should have business knowledge because they need to make frequent judgment calls, Cohen says. With Sanofi's data warehouse, for instance, if the system expects to get numbers in a field but gets a string of letters instead, Cohen must decide what's wrong and how to correct it.
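A check of that kind might look like the sketch below. It only flags the problem; deciding what went wrong and how to fix it stays with the steward. The field name `units`, the record layout and the function names are hypothetical and are not taken from Sanofi's system.

```python
def check_numeric_field(value: str):
    """Return (True, number) for a clean value, else (False, reason)."""
    text = value.strip()
    # Accept only plain ASCII digits so that int() cannot fail.
    if text.isascii() and text.isdigit():
        return True, int(text)
    if not text:
        return False, "missing"
    return False, "non-numeric"


def partition(rows, field):
    """Split rows into accepted rows and rows needing a steward's review."""
    accepted, review = [], []
    for row in rows:
        ok, result = check_numeric_field(row.get(field, ""))
        if ok:
            # Store the parsed number in place of the raw string.
            accepted.append({**row, field: result})
        else:
            # Keep the original row together with the reason it was flagged.
            review.append((row, result))
    return accepted, review


rows = [
    {"id": "1", "units": "42"},
    {"id": "2", "units": "abc"},
    {"id": "3", "units": ""},
]
accepted, review = partition(rows, "units")
print(accepted)
print(review)
```

Routing bad rows to a review queue rather than dropping them keeps the judgment call with a person who knows the business rules.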
Mary Pickett is another data steward who has a mix of skills. When she joined Winston & Strawn LLP, a law firm in Chicago, she considered herself a database specialist. Today, however, her title is "marketing applications specialist," and one of her primary duties is to ensure the quality of Winston & Strawn's contact database.
"Especially in this economy with people moving around, it's a highly charged, dynamic database that keeps changing," Pickett says. "If it sits for a month, it's dirty again."
Pickett prefers to train business users from within the company to keep the data clean. Likely candidates include paralegals or secretaries who manage contact lists for their practice groups. Still, she says, it takes a solid year for data clerks like them to gain the necessary experience to move up to data coordinator.
The reason: They need to learn not only how to sort through duplicate company names, make sure contact names are associated with companies and use the database's cleansing tools, but also how to prioritize which clients are the most important to work on. "We want to keep our top clients as clean as possible," Pickett says.
Perfection Unattainable
Indeed, judgment is a big part of the data steward's job—including the ability to determine where you don't need 100% perfection.
At OneSource Information Services Inc., a provider of business information products in Concord, Mass., orientation sessions include a speech on the inevitable dirtiness of data. But at the same time, says Beth Jacaruso, director of content management at OneSource, the company "lives and dies by data quality." So where do you draw the line? That's where data stewards come in: deciding what's "clean enough."
Cohen says that task is one of the biggest challenges of the job. "100% accuracy is just not achievable," he says. "Some things you're just going to have to let go or you'd have a data warehouse with [only] 15 to 20 records."
A good example is when Sanofi purchases data on doctors that includes their birth dates, Cohen says. If a birth date is given as Feb. 31 or the number of the month is listed as 13 but the rest of the data is good, do you throw out all of the data or just figure the birth date isn't all that important?
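One way to act on the "let it go" answer is to keep the record and blank out only the impossible date. The sketch below assumes a hypothetical record layout in which `birth_date` is a (year, month, day) tuple; it is an illustration, not Sanofi's process.

```python
from datetime import date


def parse_birth_date(year: int, month: int, day: int):
    """Return a date, or None when the calendar date cannot exist."""
    try:
        return date(year, month, day)
    except ValueError:
        # Raised for impossible dates such as Feb. 31 or month 13.
        return None


def clean_record(record: dict) -> dict:
    """Keep the record, replacing an impossible birth date with None."""
    cleaned = dict(record)
    cleaned["birth_date"] = parse_birth_date(*record["birth_date"])
    return cleaned


# Hypothetical purchased record with an impossible birth date.
doctor = {"name": "Dr. Example", "birth_date": (1960, 2, 31)}
print(clean_record(doctor))
```

The rest of the record survives, and the missing date remains visible as a gap rather than as a plausible-looking wrong value.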
It comes down to knowing how much it costs to fix the data vs. the payback. "You can pay millions of dollars a year to get it perfect, but if the returns are in the hundreds of thousands, is it worth it?" asks Chuck Kelley, senior advisory consultant at Navigator Systems Inc., a corporate performance management consultancy in Addison, Texas.
Good Diplomats
Data stewards also need to be politically astute, diplomatic and good at conflict resolution—in part because the environment isn't always friendly. When Cohen joined Sanofi, some questioned why he was there. In particular, IT didn't see why he was "causing them so many headaches and adding several extra steps to the process," he says.
There are many political traps, as well. Take the issue of defining "customer address." If data comes from a variety of sources, you're likely to get different types of coding schemes, some of which overlap. "Everyone thinks theirs is the best approach, and you need someone to facilitate," says Robert Seiner, president and principal of KIK Consulting & Educational Services in Pittsburgh.
People may also argue about how data should be produced, he says. Should field representatives enter it from their laptops? Or should it first be independently checked for quality? Should it be uploaded hourly or weekly? If you have to deal with issues like that and "you're argumentative and confrontational, that would indicate you're not an appropriate steward," Seiner says.
Most of all, data stewards need to understand that data quality is a journey, not a destination. "It's not a one-shot deal—it's ongoing," Rybeck says. "You can't quit after the first task."

Brandel is a freelance writer in Grand Rapids, Mich. Contact her at mary.brandel@comcast.net.