Computerworld

QuickStudy: Extract, Transform and Load (ETL)

 


February 2, 2004 (Computerworld) -- ETL stands for extract, transform and load, the processes that enable companies to move data from multiple sources, reformat and cleanse it, and load it into another database, a data mart or a data warehouse for analysis, or onto another operational system to support a business process.




Companies know they have valuable data scattered throughout their networks that needs to be moved from one place to another, for example from one business application to another or into a data warehouse for analysis.

The only problem is that the data lies in all sorts of heterogeneous systems, and therefore in all sorts of formats. For instance, a CRM system may define a customer in one way, while a back-end accounting system may define the same customer differently.
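As a rough illustration of that mismatch, the sketch below maps two differently shaped customer records onto one shared layout. Every field name and value here is invented for the example; real CRM and accounting systems will define customers in their own ways.

```python
# Hypothetical records: the field names and values are invented for illustration.
crm_customer = {"full_name": "Jane Q. Doe", "email": "JANE.DOE@EXAMPLE.COM"}
accounting_customer = {"last": "Doe", "first": "Jane", "acct_no": "000417"}


def from_crm(rec):
    """Map a CRM-style record onto the shared target layout."""
    parts = rec["full_name"].split()
    return {
        "first_name": parts[0],
        "last_name": parts[-1],
        "email": rec["email"].lower(),
        "account_id": None,  # this CRM record carries no account number
    }


def from_accounting(rec):
    """Map an accounting-style record onto the same target layout."""
    return {
        "first_name": rec["first"],
        "last_name": rec["last"],
        "email": None,  # this accounting record carries no e-mail address
        "account_id": int(rec["acct_no"]),
    }


unified = [from_crm(crm_customer), from_accounting(accounting_customer)]
```

Once both records share one layout, later steps can compare and merge them field by field.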




To solve the problem, companies use extract, transform and load (ETL) software, which reads data from its source, cleans it up and formats it uniformly, and then writes it to the target repository, where it can be put to use.
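The three stages can be sketched in a few lines of Python. This is a minimal illustration, not a description of any commercial ETL product: the source is a small CSV string standing in for a flat file, the target is an in-memory SQLite database, and the table and column names are assumptions.

```python
import csv
import io
import sqlite3

# Stand-in for a source flat file; the columns are hypothetical.
SOURCE = "id,name,amount\n1, alice ,10.50\n2,BOB,3\n"


def extract(text):
    """Read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Clean each row and put it in a uniform format."""
    return [
        (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(rows, conn):
    """Write the cleaned rows to the target repository."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE)), conn)
result = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```

Keeping each stage in its own function means a new source or target changes only `extract` or `load`, while the cleansing rules in `transform` stay put.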

The data used in ETL processes can come from any source: a mainframe application, an ERP application, a CRM tool, a flat file, an Excel spreadsheet—even a message queue.

Pulling the Data

Extraction can be done via Java Database Connectivity, Microsoft Corp.'s Open Database Connectivity technology, proprietary code or by creating flat files, says Mike Schiff, an analyst at Current Analysis Inc., a Sterling, Va.-based consultancy.
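A database-driven extract that produces a flat file might look like the following sketch. Python's built-in `sqlite3` module stands in for the source system here; an ODBC or JDBC connection would follow a similar connect, query and fetch pattern. The `customers` table and its columns are hypothetical.

```python
import csv
import io
import sqlite3

# In-memory SQLite stands in for the source system; the table is hypothetical.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")]
)


def extract_to_flat_file(conn, out):
    """Query the source and write the rows out as a CSV flat file."""
    cursor = conn.execute("SELECT id, name FROM customers ORDER BY id")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])  # header row
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


buffer = io.StringIO()  # a real job would open a file on disk instead
rows_written = extract_to_flat_file(source, buffer)
flat_file = buffer.getvalue()
```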

After extraction, the data is transformed, or modified, depending on the specific business logic involved so that it can be sent to the target repository.

There are a variety of ways to perform the transformation, and the work involved varies. The data may require reformatting only, but most ETL operations also involve cleansing the data to remove duplicates and enforce consistency. Part of what the software does is examine individual data fields and apply rules to consistently convert the contents to the form required by the target repository or application, says Schiff.
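One simple form of cleansing is duplicate removal. The sketch below assumes that two records are duplicates when their key fields match after trimming whitespace and ignoring case; real tools typically apply far more elaborate matching rules. The sample records are invented.

```python
def dedupe(records, key_fields):
    """Keep the first record for each normalized key; drop later duplicates."""
    seen = set()
    unique = []
    for rec in records:
        # Normalize the key so trivially different spellings compare equal.
        key = tuple(str(rec[f]).strip().lower() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical customer rows; the second repeats the first with different casing.
customers = [
    {"name": "Jane Doe", "email": "jane@example.com"},
    {"name": "JANE DOE ", "email": "Jane@Example.com"},
    {"name": "John Roe", "email": "john@example.com"},
]
unique_customers = dedupe(customers, ["name", "email"])
```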

For example, the category "male" might be represented in three different systems as M, as male and as a 0/1 flag. The ETL software would recognize that these entries mean the same thing and convert them to the target format.
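A rule of this kind can be expressed as one lookup table per source. The sketch below is illustrative only: the system names are invented, and the choice that 1 means male in the numeric system is an assumption, since the encoding is not specified.

```python
# Per-source lookup tables. The system names are hypothetical, and treating
# 1 as "male" in the numeric system is an assumption made for illustration.
GENDER_RULES = {
    "system_a": {"M": "male", "F": "female"},
    "system_b": {"male": "male", "female": "female"},
    "system_c": {1: "male", 0: "female"},
}


def normalize_gender(source, value):
    """Convert a source-specific code to the target format."""
    if isinstance(value, str):
        value = value.strip()
        # system_a uses upper-case letters; system_b uses lower-case words.
        value = value.upper() if source == "system_a" else value.lower()
    try:
        return GENDER_RULES[source][value]
    except KeyError:
        raise ValueError(f"unmapped gender code {value!r} from {source}") from None


converted = [
    normalize_gender("system_a", "m"),
    normalize_gender("system_b", "Male"),
    normalize_gender("system_c", 1),
]
```

Raising an error on an unmapped code, rather than guessing, keeps bad values out of the target repository.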

In addition, the ETL process could involve standardizing name and address fields, verifying telephone numbers or expanding records with additional fields containing demographic information or data from other systems.
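These steps might be sketched as follows. The phone check only confirms that a value has the shape of a 10-digit U.S. number and does not prove the number is in service. The demographic lookup table, its key and its values are invented.

```python
import re

# Hypothetical demographic lookup keyed by ZIP code; the values are invented.
DEMOGRAPHICS = {"12345": {"age_band": "35-44"}}


def standardize_name(name):
    """Collapse repeated whitespace and apply title case."""
    return " ".join(name.split()).title()


def verify_phone(phone):
    """Return a 10-digit U.S. number as NNN-NNN-NNNN, or None if malformed."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def enrich(record):
    """Return a cleaned copy of the record with demographic fields added."""
    cleaned = dict(record)
    cleaned["name"] = standardize_name(record["name"])
    cleaned["phone"] = verify_phone(record["phone"])
    cleaned.update(DEMOGRAPHICS.get(record["zip"], {}))
    return cleaned


row = enrich({"name": "  jane   DOE ", "phone": "(703) 555-0142", "zip": "12345"})
```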

Harriet Fryman, group director of product marketing at data warehousing vendor Informatica Corp. in Redwood City, Calif., offers an example. Say, for instance, that a customer runs Oracle financials, PeopleSoft human resources software and SAP manufacturing applications and needs to access the data in each of these systems to complete an order-to-cash process. This will require the company's ETL software to extract data from the originating systems, which in some instances isn't as easy as it sounds. For example, pulling data from the SAP manufacturing application would require the generation of SAP proprietary ABAP code to extract the shipping and open purchase-order information.

The transformation occurs when the data from each source is mapped, cleansed and reconciled so it all can be tied together, with receivables tied to invoices and so on.
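Tying receivables to invoices amounts to joining two extracts on a shared key after that key has been cleansed. The sketch below assumes the invoice number is the key. All field names and sample rows are hypothetical.

```python
# Hypothetical extracts from two source systems; all field names are invented.
invoices = [
    {"invoice_no": "INV-001", "customer": "ACME", "amount": 500.0},
    {"invoice_no": "INV-002", "customer": "Globex", "amount": 120.0},
]
receivables = [
    {"ref": "inv-001 ", "open_balance": 200.0},
    {"ref": "INV-003", "open_balance": 75.0},
]


def reconcile(invoices, receivables):
    """Tie each receivable to its invoice; return (matched, unmatched)."""
    by_number = {inv["invoice_no"]: inv for inv in invoices}
    matched, unmatched = [], []
    for rec in receivables:
        key = rec["ref"].strip().upper()  # cleanse the key before matching
        inv = by_number.get(key)
        if inv is None:
            unmatched.append(rec)
        else:
            matched.append({**inv, "open_balance": rec["open_balance"]})
    return matched, unmatched


matched, unmatched = reconcile(invoices, receivables)
```

Returning the unmatched receivables separately leaves a list of records that need manual review instead of silently dropping them.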



