
Too Much ETL Signals Poor Data Management

By Ken Karacsony
September 5, 2005 12:00 PM ET

Computerworld - To put it bluntly, performing extensive extract, transform and load (ETL) processes is a symptom of poorly managed data and a fundamental lack of a cogently developed data strategy. When data is managed correctly as an enterprise asset, then ETL is significantly reduced and in many cases completely eradicated. Now, I realize that this is a provocative statement, but in my estimation, ETL is overused within the IT community, leading to inefficiency and unnecessary expense.
ETL gained popularity as companies began to outgrow antiquated systems. As functionality was moved from legacy systems to open systems architectures, ETL played an indispensable role in moving the data. Unfortunately, many companies failed to completely retire their outdated systems; rather than performing ETL as a one-time initial load event, ETL evolved into a part of daily operations.
This problem was further exacerbated as companies developed systems within functional silos. The application-specific approach, in which the database is designed to accommodate the needs of an individual group or department, took root. According to this methodology, every new system requires its own database. As a result, data is copied from system to system. Hence, ETL is now firmly ensconced in nearly every company and is an integral part of IT operations.
Consider a simplified example of typical ETL activities, in which data is propagated from the product system into warranty, finance, purchasing and sales systems, and eventually into the data warehouse. Not only is the data extracted and loaded, but it must also be transformed because the data structures between systems are completely disparate.
The problem is compounded when data is propagated back to the source system to resynchronize copies that have drifted apart precisely because they were copied. The inevitable result is poor data quality and high maintenance costs.
If the product database in this example changes -- for instance, if a new field or table is added -- it will be necessary to change all of the maps that move data from the source to a target. One minor structural change in the source can create a maintenance nightmare in the ETL maps and target databases -- a lot of IT expense with no value added.
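To make the maintenance burden concrete, here is a minimal sketch of what such maps might look like. Everything in it is hypothetical: the column names, the `FIELD_MAPS` table and the helpers `transform` and `maps_missing` are invented for illustration and do not describe any particular ETL tool.

```python
# Hypothetical field maps: source column -> target column, one map per
# downstream system. All names here are invented for illustration.
FIELD_MAPS = {
    "warranty": {"product_id": "prod_no", "name": "description"},
    "finance": {"product_id": "item_code", "price": "unit_cost"},
    "purchasing": {"product_id": "sku", "name": "item_name"},
    "sales": {"product_id": "product_key", "name": "title", "price": "list_price"},
}


def transform(record, field_map):
    """Rename mapped source fields to the target's names; drop the rest."""
    return {target: record[source] for source, target in field_map.items()}


def maps_missing(field):
    """List the downstream systems whose map does not carry the given field."""
    return [system for system, field_map in FIELD_MAPS.items()
            if field not in field_map]


product = {"product_id": 17, "name": "Widget", "price": 9.5}

# A "minor" structural change in the source: one new field.
product_v2 = {**product, "color": "red"}

print(transform(product, FIELD_MAPS["finance"]))
print(maps_missing("color"))
```

In this sketch the new `color` field is silently dropped by every existing map, and `maps_missing("color")` names all four downstream systems. Each of those maps, and the target table behind it, would have to be edited before the field could flow through.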
The only legitimate ETL activity in this example is the data warehouse interface. All others are unnecessary and incur a tremendous cost. According to Larry English, president of Information Impact International and a leading expert in information quality, "The IS staff is busy maintaining, on average, a tenfold [increase in] redundant databases and the redundant applications or interface programs


