Talk for a while with anyone dealing with enterprise data management issues, and the conversation will soon turn to two challenges at the top of every database manager's list: scalability and data integration. That's what happened when Computerworld interviewed a systems integrator and an IT executive:
- Jay M. Desai, a co-founder of Chicago-based Knightsbridge Solutions LLC, a consulting and systems integration firm that helps corporate clients solve data problems; and
- Timothy Wright, chief technology officer at Terra Lycos SA, a global Internet portal based in Barcelona, Spain, and Waltham, Mass.
Q: What are the top issues in data management?
Desai: The issue we see over and over again is large-scale data integration. Large companies have huge quantities of data, whether it's internal data or they're taking external data from a syndicator. The processing requirements are high, and the timetables to get things done are pretty tight. So most IT departments are wrestling with implementing scalable, high-performance, robust systems for decision support.
Scalability and performance can't be done as an afterthought. You need to think about a scalable foundation from the very onset [of building a system]. And understand that it doesn't have an end state -- it's an environment that is constantly growing.
Wright: We have 52 million registered users in the U.S. For authentication purposes, it's important that when somebody hits the site, they don't wait for 10 seconds to be authenticated against that database. So when you go to Lycos.com, if we've seen you before, we've already found you in the authentication database before the page appears. When you hit the [personalized] MyLycos page where you've registered your profile, that page has reached out to the database, it's found who you are, it's found your city and the stock quotes you like to get ... and it has built the page according to those specifications in under three seconds.
So we have significant scalability challenges. We have over 10 million e-mail messages on a daily basis. These applications run on a massive scale that very few organizations ever face. My Web logs for the U.S. are 150GB a day. And I've got to parse all of that data and generate a report by the following morning.
Q: How do you make all that work?
Wright: Panic is a strategy that's high on my list.
We have built applications which are massively scalable. We have in excess of 50TB of storage on the network. Half of it we outsource to StorageNetworks and they do a phenomenal job. The rest of it is managed in-house.
Q: How do you deal with integrating the data from business acquisitions?
Wright: It's on a case-by-case basis. Every acquired business has different skills built into their organization. I've done 42 acquisitions in the last seven years, and the best thing you can do is cherry-pick the strongest technology base out of each acquisition that you do.
We've consciously built the tools necessary to make these integrations go as smoothly as possible.
Sometimes it isn't clear there's a benefit to fully integrating databases. Merging those databases has to return a business value ... and sometimes it isn't cost-justified. You don't do integration for its own sake. You do it when the sum of the components is worth more than the individual components on their own.
Q: What are the special challenges when the databases have an international flavor?
Wright: As a global Internet company, we've built applications that are very effective at intercommunication [with systems in different countries]. It's easy for us to load new products or subscription services into our network. We've built these applications to be largely agnostic to the provisioning systems in different countries. Let me tell you, I've never seen a more complex tax code than the one in Brazil. But all I have to do is to build the interface between [our network] and the provisioning system in Brazil. And when I build that interface once, I'm not building it again; I'm just going to plug into that interface. If nothing else, we're experts at building effective [application programming interfaces] that allow us to cross-link products effectively.
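The "build the interface once, then plug into it" approach Wright describes resembles what is commonly called an adapter pattern. The following is only an illustrative sketch of that idea, not Terra Lycos' actual code: every name in it (ProvisioningAdapter, BrazilAdapter, Network, Subscription) is hypothetical, and the tax rate is a made-up placeholder rather than a real Brazilian figure.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    """A product a user signs up for (hypothetical shape)."""
    product_id: str
    net_price: float  # price before tax, in local currency


class ProvisioningAdapter(ABC):
    """Hypothetical contract every country-specific provisioning system must meet."""

    @abstractmethod
    def provision(self, user_id: str, sub: Subscription) -> dict:
        """Provision the subscription and return a country-neutral record."""


class BrazilAdapter(ProvisioningAdapter):
    # Placeholder rate for illustration only; real tax rules are far more complex.
    TAX_RATE = 0.25

    def provision(self, user_id: str, sub: Subscription) -> dict:
        # All country-specific logic (here, tax) stays inside the adapter.
        total = round(sub.net_price * (1 + self.TAX_RATE), 2)
        return {
            "country": "BR",
            "user": user_id,
            "product": sub.product_id,
            "total": total,
        }


class Network:
    """Core network code: it knows only the adapter contract, never the country rules."""

    def __init__(self) -> None:
        self._adapters: dict[str, ProvisioningAdapter] = {}

    def register(self, country: str, adapter: ProvisioningAdapter) -> None:
        # Written once per country, then reused for every product.
        self._adapters[country] = adapter

    def subscribe(self, country: str, user_id: str, sub: Subscription) -> dict:
        # Raises KeyError if no adapter has been registered for the country.
        return self._adapters[country].provision(user_id, sub)


if __name__ == "__main__":
    network = Network()
    network.register("BR", BrazilAdapter())
    print(network.subscribe("BR", "user-1", Subscription("mail-plus", 10.0)))
```

In a design like this, adding a new product requires no change to any country adapter, and adding a new country means writing one more adapter class and registering it.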
Taming Data Chaos
Stories in this report:
- Taming Data Chaos
- The Story So Far
- Merging Data Silos
- Beware of Data Overload from External Data
- Learn to Manage Data, Not Crises
- Data's Tower of Babel
- Extracting Dollars From Data
- Why ROI is so Elusive
- Collections of Data: Bases, Marts, Warehouses
- The Power of Location
- Seeding for Data Growth
- The Search is On
- The Data Designers
- Demise of the Disk Era
- Dawn of a New Database
- Keeping CFOs Happy
- Case Studies in Data Management
- Hot Issues: Scalability and Data Integration