
Data Stewards Seek Data Conformity

They have a variety of different titles, but these analysts work with the IT and business groups to improve data quality and standardization.
 


March 15, 2004 (Computerworld) -- A customer is a customer is a customer, right? Actually, it's not that simple. Just ask Emerson Process Management, an Emerson Electric Co. unit in Austin that supplies process automation products. Four years ago, the company attempted to build a data warehouse to store customer information from over 85 countries. The effort failed in large part because the structure of the warehouse couldn't accommodate the many variations on customers' names.
For instance, different users in different parts of the world might identify Exxon as Exxon, Mobil, Esso or ExxonMobil, to name a few variations. The warehouse would see them as separate customers, and that would lead to inaccurate results when business users performed queries.
That's when the company hired Nancy Rybeck as data administrator. Rybeck is now leading a renewed data warehouse project that ensures not only the standardization of customer names but also the quality and accuracy of customer data, including postal addresses, shipping addresses and province codes.
To accomplish this, Emerson has done something unusual: It has started to build a department with six to 10 full-time "data stewards" dedicated to establishing and maintaining the quality of data entered into the operational systems that feed the data warehouse.
The practice of having formal data stewards is uncommon. Most companies recognize the importance of data quality, but many treat it as a "find-and-fix" effort, to be conducted at the end of a project by someone in IT. Others casually assign the job to the business users who deal with the data head-on. Still others may throw resources at improving data only when a major problem occurs.
"It's usually a seesaw effect," says Chris Enger, formerly manager of information management at Philip Morris USA Inc. "When something goes wrong, they put someone in charge of data quality, and when things get better, they pull those resources away."
Creating a data quality team requires gathering people with an unusual mix of business, technology and diplomatic skills. It's even difficult to agree on a job title. In Rybeck's department, they're called "data analysts," but titles at other companies include "data quality control supervisor," "data coordinator" or "data quality manager."
"When you say you want a data analyst, they'll come back with a DBA [database administrator]. But it's not the same at all," Rybeck says. "It's not the data structure, it's the content."
At Emerson, data analysts in each business unit review data and correct errors before it's put into the operational systems. They also research customer relationships, locations and corporate hierarchies; train overseas workers to fix data in their native languages; and serve as the main contact with the data administrator and database architect for new requirements and bug fixes.
As the leader of the group, Rybeck plays a role that includes establishing and communicating data standards, ensuring data integrity is maintained during database conversions and doing the logical design for the data warehouse tables. She'll also oversee implementation of the Group 1 Software Inc. data cleansing system and work with The Dun & Bradstreet Corp., whose database is used for company-name standardization and hierarchies.
The analysts have their work cut out for them. Bringing together customer records from the 75 business units yielded a 75% duplication rate, misspellings and fields with incorrect or missing data.
"Most of the divisions would have sworn they had great processes and standards in place," Rybeck says. "But when you show them they entered the customer name 17 different ways, or someone had entered, 'Loading dock open 8:00-4:00' into the address field, they realize it's not as clean as they thought."
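To make the name-variation problem concrete, here is a minimal Python sketch of how variants such as Exxon, Mobil and Esso could be mapped to one canonical customer before duplicates are counted. It is not Emerson's method or the Group 1 or Dun & Bradstreet product; the alias table, the function names and the sample records are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical alias table. A real one would be maintained by the data
# stewards or drawn from a reference database; these entries are assumptions.
ALIASES = {
    "exxon": "ExxonMobil",
    "mobil": "ExxonMobil",
    "esso": "ExxonMobil",
    "exxonmobil": "ExxonMobil",
}


def canonical_name(raw):
    """Map a raw customer name to its canonical form, if one is known."""
    # Compare on lowercase letters and digits only, so that spacing,
    # case and punctuation differences do not create new "customers".
    key = "".join(ch for ch in raw.lower() if ch.isalnum())
    return ALIASES.get(key, raw.strip())


def duplication_rate(names):
    """Fraction of records that repeat an already-seen canonical customer."""
    if not names:
        return 0.0
    counts = Counter(canonical_name(n) for n in names)
    duplicates = sum(c - 1 for c in counts.values())
    return duplicates / len(names)


# Made-up sample records, not data from the article.
records = ["Exxon", "Mobil", "Esso", "Exxon Mobil", "Acme Corp"]
rate = duplication_rate(records)
```

Without the alias step, a warehouse would count five customers in this sample; with it, four of the five records collapse into one. A lookup table like this only handles variants someone has already identified, which is part of why the work falls to people rather than to a one-time script.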
Multitalented
Although the data steward may report to IT—as is the case at Emerson and at pharmaceuticals company Sanofi-Synthelabo Inc.—it's not a job for someone steeped in technical knowledge. Yet it's not right for a business person who's a technophobe, either.
What you need is someone who's familiar with both disciplines, like Seth Cohen. Cohen is the first data quality control supervisor at Sanofi in New York. He was hired a year ago to help design automated processes to ensure the data quality of the customer knowledge base that Sanofi was beginning to build.
Cohen has enough technical skills to be able to spec out a data-cleansing system and then work with a developer to make sure that the system is written correctly.
But having worked in the pharmaceuticals field for three and a half years, he also knows the industry's specific business rules and understands the most important data concerns that must be addressed during the requirements-gathering stage.
Data stewards should have business knowledge because they need to make frequent judgment calls, Cohen says. With Sanofi's data warehouse, for instance, if the system expects to get numbers in a field but gets a string of letters instead, Cohen must decide what's wrong and how to correct it.
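The numbers-versus-letters case Cohen describes can be sketched as a check that flags a bad value for review instead of silently rejecting or "fixing" it. This is a hypothetical Python illustration, not Sanofi's system; the function name, the accepted number format and the sample values are assumptions.

```python
import re

# Assumed format: optional minus sign, digits, optional decimal part.
_NUMBER = re.compile(r"-?\d+(\.\d+)?")


def check_numeric_field(value):
    """Return (number, issue); exactly one of the two is None."""
    # Thousands separators are tolerated and stripped before matching.
    text = value.strip().replace(",", "")
    if not text:
        return None, "missing"
    if _NUMBER.fullmatch(text):
        return float(text), None
    return None, "not a number: " + repr(text)


# Made-up incoming values for a field that is expected to hold numbers.
incoming = ["1200", "1,200.50", "N/A", ""]

# Anything with an issue goes to a person for a judgment call.
review_queue = []
for raw in incoming:
    number, issue = check_numeric_field(raw)
    if issue is not None:
        review_queue.append((raw, issue))
```

The code only detects the problem. Deciding whether "N/A" means zero, unknown or a data-entry mistake is the judgment call that requires business knowledge.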
Mary Pickett is another data steward who has a mix of skills. When she joined Winston & Strawn LLP, a law firm in Chicago, she considered herself a database specialist. Today, however, her title is "marketing applications specialist," and one of her primary duties is to ensure the quality of Winston & Strawn's contact database.
"Especially in this economy with people moving around, it's a highly charged, dynamic database that keeps changing," Pickett says. "If it sits for a month, it's dirty again."
Pickett prefers to train business users from within the company to keep the data clean. Likely candidates include paralegals or secretaries who manage contact lists for their practice groups. Still, she says, it takes a solid year for data clerks like them to gain the necessary experience to move up to data coordinator.
The reason: They need to learn not only how to sort through duplicate company names, make sure contact names are associated with companies and use the database's cleansing tools, but also how to prioritize which clients are the most important to work on. "We want to keep our top clients as clean as possible," Pickett says.
Perfection Unattainable
Indeed, judgment is a big part of the data steward's job—including the ability to determine where you don't need 100% perfection.
At OneSource Information Services Inc., a provider of business information products in Concord, Mass., orientation sessions include a speech on the inevitable dirtiness of data. But at the same time, says Beth Jacaruso, director of content management at OneSource, the company "lives and dies by data quality." So where do you draw the line? That's where data stewards come in: deciding what's "clean enough."
Cohen says that task is one of the biggest challenges of the job. "100% accuracy is just not achievable," he says. "Some things you're just going to have to let go or you'd have a data warehouse with [only] 15 to 20 records."
A good example is when Sanofi purchases data on doctors that includes their birth dates, Cohen says. If a birth date is given as Feb. 31 or the number of the month is listed as 13 but the rest of the data is good, do you throw out all of the data or just figure the birth date isn't all that important?
It comes down to knowing how much it costs to fix the data vs. the payback. "You can pay millions of dollars a year to get it perfect, but if the returns are in the hundreds of thousands, is it worth it?" asks Chuck Kelley, senior advisory consultant at Navigator Systems Inc., a corporate performance management consultancy in Addison, Texas.
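Cohen's birth-date example suggests one possible "clean enough" policy: keep the record, but blank out the field that cannot be valid. The Python sketch below assumes that policy; the record layout and function names are hypothetical and not taken from Sanofi.

```python
from datetime import date


def clean_birth_date(year, month, day):
    """Return a date if the parts form a real calendar date, else None."""
    try:
        return date(year, month, day)
    except ValueError:
        # Raised for impossible dates such as Feb. 31 or month 13.
        return None


def clean_record(record):
    """Keep the record, replacing an impossible birth date with None."""
    cleaned = dict(record)
    cleaned["birth_date"] = clean_birth_date(*record["birth_date"])
    return cleaned


# Made-up records; birth_date is stored as a (year, month, day) tuple.
doctors = [
    {"name": "Dr. A", "birth_date": (1960, 2, 31)},
    {"name": "Dr. B", "birth_date": (1955, 13, 1)},
    {"name": "Dr. C", "birth_date": (1970, 6, 15)},
]
cleaned = [clean_record(d) for d in doctors]
```

All three records survive; only the two impossible dates are dropped. Whether that trade-off is acceptable depends on how much the field matters to the business, which is the cost-versus-payback question Kelley raises.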
Good Diplomats
Data stewards also need to be politically astute, diplomatic and good at conflict resolution—in part because the environment isn't always friendly. When Cohen joined Sanofi, some questioned why he was there. In particular, IT didn't see why he was "causing them so many headaches and adding several extra steps to the process," he says.
There are many political traps, as well. Take the issue of defining "customer address." If data comes from a variety of sources, you're likely to get different types of coding schemes, some of which overlap. "Everyone thinks theirs is the best approach, and you need someone to facilitate," says Robert Seiner, president and principal of KIK Consulting & Educational Services in Pittsburgh.
People may also argue about how data should be produced, he says. Should field representatives enter it from their laptops? Or should it first be independently checked for quality? Should it be uploaded hourly or weekly? If you have to deal with issues like that and "you're argumentative and confrontational, that would indicate you're not an appropriate steward," Seiner says.
Most of all, data stewards need to understand that data quality is a journey, not a destination. "It's not a one-shot deal; it's ongoing," Rybeck says. "You can't quit after the first task."

Brandel is a freelance writer in Grand Rapids, Mich. Contact her at mary.brandel@comcast.net.




