Bridging Data Islands
Integrating data from disparate sources provides companies with powerful management tools, but the process can be difficult, costly and error-prone.
October 14, 2002 12:00 PM ET
Computerworld - After just two months, a new software tool enabled Aventis Pharmaceuticals Inc. to discover a promising candidate for a new drug to treat asthma, arthritis or even perhaps cancer; it's a chemical compound that might well have been overlooked using traditional IT tools.
Aventis is using DiscoveryLink, a feature of IBM's DB2 database management system that can propel a single SQL query out to multiple, heterogeneous data sources and bring information back to the user in one coherent view.
"Using this integrated framework, scientists were able to pull data from many different sources around the world, visualize it in a new way that they could never do before," says Peter Loupos, vice president for drug innovation and approval information systems at the Bridgewater, N.J.-based company.
IBM calls the Aventis approach to information integration "database federation." To get at federated data, DB2 uses IBM "wrapper" software called DataJoiner and Relational Connect. There's one wrapper for each type of data source (whether it's from Oracle Corp. or Sybase Inc. systems, Microsoft Corp. SQL Server or flat files), with each one mapping the source data model to the DB2 data model. A single Aventis query may be sent against heterogeneous relational databases, unstructured documents and in-house expertise culled from e-mail and other sources.
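The wrapper idea can be illustrated with a small sketch. This is not DiscoveryLink, DataJoiner or Relational Connect code; it is a minimal Python illustration that assumes two toy sources (an in-memory SQLite table and a CSV flat file) and hypothetical class, table and column names. Each wrapper maps its source's columns onto one common model, and a single join then runs across both.

```python
import csv
import io
import sqlite3
from typing import Dict, Iterator, List

Row = Dict[str, str]  # a row in the common data model


class RelationalWrapper:
    """Hypothetical wrapper for a relational source (SQLite stands in here)."""

    def __init__(self, conn: sqlite3.Connection, table: str, column_map: Dict[str, str]):
        self.conn = conn
        self.table = table            # trusted identifier, not user input
        self.column_map = column_map  # source column -> common-model column

    def rows(self) -> Iterator[Row]:
        source_cols = list(self.column_map)
        sql = f"SELECT {', '.join(source_cols)} FROM {self.table}"
        for record in self.conn.execute(sql):
            yield {self.column_map[c]: v for c, v in zip(source_cols, record)}


class FlatFileWrapper:
    """Hypothetical wrapper for a CSV flat file held in memory as text."""

    def __init__(self, text: str, column_map: Dict[str, str]):
        self.text = text
        self.column_map = column_map

    def rows(self) -> Iterator[Row]:
        for record in csv.DictReader(io.StringIO(self.text)):
            yield {common: record[src] for src, common in self.column_map.items()}


def federated_join(left, right, key: str) -> List[Row]:
    """Join rows from two wrapped sources on a shared common-model key."""
    index: Dict[str, List[Row]] = {}
    for r in right.rows():
        index.setdefault(r[key], []).append(r)
    return [{**r, **l} for l in left.rows() for r in index.get(l[key], [])]


def demo() -> List[Row]:
    # Toy relational source with its own column names.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE compounds (cmpd_id TEXT, cmpd_name TEXT)")
    conn.executemany(
        "INSERT INTO compounds VALUES (?, ?)",
        [("C-001", "alpha"), ("C-002", "beta")],
    )
    # Toy flat file with different column names for the same key.
    assay_file = "id,result\nC-001,active\nC-003,inactive\n"

    relational = RelationalWrapper(
        conn, "compounds", {"cmpd_id": "compound_id", "cmpd_name": "name"}
    )
    flat = FlatFileWrapper(assay_file, {"id": "compound_id", "result": "assay_result"})
    return federated_join(relational, flat, key="compound_id")


if __name__ == "__main__":
    print(demo())
```

In this sketch the caller never sees `cmpd_id` or `id`, only the common `compound_id` key, which is the role the article describes for each wrapper. A real federated engine would also push filtering down to each source rather than pulling every row.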
Sending a SQL query against remote, heterogeneous databases is just one of several ways to integrate data. Others include the following:
- Custom, hard-wired interfaces that pass information from one application to another. These can be made to work exactly as users demand, but they can be costly to set up and maintain.
- Replication, in which a commercial product regularly or continuously copies databases or parts of databases from one place to another. Replication is simple but limited in its ability to do anything to data beyond copying it.
- Extract, Transform and Load (ETL), a process often used to create data warehouses and data marts. ETL software moves data from one place to another, applying rules or table lookups to combine or transform data in some way. ETL is powerful but can be very complex.
- Web services. Enabled by Internet protocols including the XML standard for exchanging data between disparate systems, Web services allow SQL-based relational data to be accessed as XML, or native XML to be accessed through SQL. Web services are ideal when applications are loosely coupled and difficult to integrate in other ways.
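Of the approaches listed above, ETL is the one whose steps are easiest to show in a few lines. The following is a minimal sketch, not any vendor's ETL product: the source rows, the `REGION_LOOKUP` table, the `sales` table and all function names are hypothetical, and SQLite stands in for the warehouse. It extracts raw rows, transforms them with a rule and a table lookup, and loads the result.

```python
import sqlite3
from decimal import Decimal
from typing import Dict, List, Tuple

# Hypothetical lookup table used during the transform step.
REGION_LOOKUP = {"NE": "Northeast", "SW": "Southwest"}


def extract() -> List[Dict[str, str]]:
    """Stand-in for reading raw records from an operational system."""
    return [
        {"order_id": "1", "region": "NE", "amount": "19.99"},
        {"order_id": "2", "region": "SW", "amount": "5.00"},
        {"order_id": "3", "region": "??", "amount": "7.50"},
    ]


def transform(rows: List[Dict[str, str]]) -> List[Tuple[int, str, int]]:
    """Apply a lookup (region code -> name) and a rule (dollars -> cents)."""
    out = []
    for row in rows:
        region_name = REGION_LOOKUP.get(row["region"], "Unknown")
        amount_cents = int(Decimal(row["amount"]) * 100)
        out.append((int(row["order_id"]), region_name, amount_cents))
    return out


def load(conn: sqlite3.Connection, rows: List[Tuple[int, str, int]]) -> None:
    """Write transformed rows into the (toy) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER, region_name TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    return conn


if __name__ == "__main__":
    warehouse = run_etl()
    for record in warehouse.execute("SELECT * FROM sales ORDER BY order_id"):
        print(record)
```

Plain replication would correspond to `load(conn, extract())` with no `transform` step, which is why the article calls it simple but limited. The complexity of ETL tends to grow in the transform rules, such as the choice here to map unrecognized region codes to "Unknown" rather than reject the row.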
Regardless of the approach taken, data integration can be difficult, expensive and error-prone. In particular, great care must be taken to build interfaces between applications and databases that ensure the accuracy and timeliness of information and that meet the needs of disparate communities of end users. Below, we look at how two organizations tackled their data integration problems, and you can read two more case studies online.