Computerworld - ETL stands for extract, transform and load, the processes that enable companies to move data from multiple sources, reformat and cleanse it, and load it into another database, a data mart or a data warehouse for analysis, or on another operational system to support a business process.
Companies know they have valuable data lying around throughout their networks that needs to be moved from one place to anothersuch as from one business application to another or to a data warehouse for analysis.
The only problem is that the data lies in all sorts of heterogeneous systems, and therefore in all sorts of formats. For instance, a CRM system may define a customer in one way, while a back-end accounting system may define the same customer differently.
The data used in ETL processes can come from any source: a mainframe application, an ERP application, a CRM tool, a flat file, an Excel spreadsheeteven a message queue.
Pulling the Data
Extraction can be done via Java Database Connectivity, Microsoft Corp.'s Open Database Connectivity technology, proprietary code or by creating flat files, says Mike Schiff, an analyst at Current Analysis Inc., a Sterling, Va.-based consultancy.
After extraction, the data is transformed, or modified, depending on the specific business logic involved so that it can be sent to the target repository.
There are a variety of ways to perform the transformation, and the work involved varies. The data may require reformatting only, but most ETL operations also involve cleansing the data to remove duplicates and enforce consistency. Part of what the software does is examine individual data fields and apply rules to consistently convert the contents to the form required by the target repository or application, says Schiff.
For example, the category "male" might be represented in three different systems as M, male and 0/1. The ETL software would recognize that these entries mean the same thing and convert them to the target format.
In addition, the ETL process could involve standardizing name and address fields, verifying telephone numbers or expanding records with additional fields containing demographic information or data from other systems.
Harriet Fryman, group director of product marketing at data warehousing vendor Informatica Corp. in Redwood City, Calif., offers an example. Say, for instance, that a customer runs Oracle financials, PeopleSoft human resources software and SAP manufacturing applications and needs to access the data in each of these systems to complete an order-to-cash process. This will require the company's ETL software to extract data from the originating systems, which isn't as easy as it sounds in some instancesfor example, pulling data from the SAP manufacturing application would require the generation of SAP proprietary ABAP code to extract the shipping and open purchase-order information.
The transformation occurs when the data from each source is mapped, cleansed and reconciled so it all can be tied together, with receivables tied to invoices and so on.
- Best iPhone, iPad Business Apps for 2014
- 14 Tech Conventions You Should Attend in 2014
- 10 Desktop Apps to Power Your Windows PC
- How to Add New Job Skills Without Going Back to School
- Slideshow: 7 security mistakes people make with their mobile device
- iOS vs. Android: Which is more secure?
- 11 sure signs you've been hacked
- The value of smarter oil and gas fields With global energy requirements continuing to rise, the exploration, development and production of new oil and gas resources are shifting to increasingly challenging...
- Smarter Environmental Analytics Solutions: Offshore Oil and Gas Installations Example This IBM Redbooks® Solution Guide describes a solution for implementing smarter environmental monitoring and analytics for oil and gas industries. The solution implements...
- Piecing Together the Business Intelligence Puzzle Business intelligence (BI) technology collects and analyzes company data, delivering relevant information to corporate decision-makers in an effort to produce favorable outcomes.
- Harness IT -- An Introduction to Business Intelligence Solutions Learn the key selection criteria required to provide your organization with the capability to address structured data, unstructured data and mobile demands so...
- Live Webcast Best Practices for the Hyperconverged Enterprise Network To the Age of Constant Connectivity and Information overload
- Live Webcast On-demand webinar: "Mobility Mayhem: Balancing BYOD with Enterprise Security" Check out this on-demand webinar to hear Sophos senior security expert John Shier deep dive into how BYOD impacts your enterprise security strategy...
- Live Webcast Endpoint Backup & Restore: Protect Everyone, Everywhere Arek Sokol from the bleeding-edge IT team at Genentech/Roche explains how he leverages cross-platform enterprise endpoint backup in the public cloud as part...
- The Software-Defined Data Center: Is your ADC ready? Data center transformation is accelerating beyond virtualization to next-generation cloud architectures and software-defined data centers, bringing new challenges for application performance, scalability and...
- Application Acceleration: Optimize the End-User Experience Watch this on-demand webcast and learn how you can optimize your web content, accelerate performance across any device and browser combination, and offload...