Hot issues: Scalability and data integration

Talk for a while with anyone dealing with enterprise data management issues, and the conversation will soon turn to two challenges at the top of every database manager's list: scalability and data integration. That's what happened when Computerworld interviewed a systems integrator and an IT executive:

  • Jay M. Desai, a co-founder of Chicago-based Knightsbridge Solutions LLC, a consulting and systems integration firm that helps corporate clients solve data problems; and
  • Timothy Wright, chief technology officer at Terra Lycos SA, a global Internet portal based in Barcelona, Spain, and Waltham, Mass.

Q: What are the top issues in data management?
Desai: The issue we see over and over again is large-scale data integration. Large companies have huge quantities of data, whether it's internal data or they're taking external data from a syndicator. The processing requirements are high, and the timetables to get things done are pretty tight. So most IT departments are wrestling with implementing scalable, high-performance, robust systems for decision support.
Scalability and performance can't be done as an afterthought. You need to think about a scalable foundation from the very onset [of building a system]. And understand that it doesn't have an end state -- it's an environment that is constantly growing.
Wright: We have 52 million registered users in the U.S. For authentication purposes, it's important that when somebody hits the site, they don't wait for 10 seconds to be authenticated against that database. So when you go to [the site], if we've seen you before, we've already found you in the authentication database before the page appears. When you hit the [personalized] MyLycos page where you've registered your profile, that page has reached out to the database, it's found who you are, it's found your city and the stock quotes you like to get ... and it has built the page according to those specifications in under three seconds.
So we have significant scalability challenges. We have over 10 million e-mail messages on a daily basis. These applications run on a massive scale that very few organizations ever face. My Web logs for the U.S. are 150GB a day. And I've got to parse all of that data and generate a report by the following morning.
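To put the log-parsing figure in perspective, the sustained throughput it implies can be estimated with simple arithmetic. The sketch below is illustrative only: the eight-hour overnight processing window is an assumption (the interview says only "by the following morning"), a gigabyte is taken as 10^9 bytes, and the function name is hypothetical.

```python
def required_throughput_mb_per_s(daily_log_gb: float, window_hours: float) -> float:
    """Average rate (MB/s) needed to read one day's logs within the window.

    Assumes 1 GB = 10**9 bytes and 1 MB = 10**6 bytes.
    """
    total_bytes = daily_log_gb * 10**9
    window_seconds = window_hours * 3600
    return total_bytes / window_seconds / 10**6


if __name__ == "__main__":
    # 150 GB/day is from the interview; the 8-hour window is an assumption.
    rate = required_throughput_mb_per_s(150, 8)
    print(f"{rate:.1f} MB/s sustained")
```

Under those assumptions the logs must be read at roughly 5 MB/s on average for the whole window, before any time is spent on parsing, aggregation or report generation.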

Q: How do you make all that work?
Panic is a strategy that's high on my list.
We have built applications which are massively scalable. We have in excess of 50TB of storage on the network. Half of it we outsource to StorageNetworks and they do a phenomenal job. The rest of it is managed in-house.

Q: How do you deal with integrating the data from business acquisitions?
It's on a case-by-case basis. Every acquired business has different skills built into their organization. I've done 42 acquisitions in the last seven years, and the best thing you can do is cherry-pick the strongest technology base out of each acquisition that you do.
We've consciously built the tools necessary to make these integrations go as smoothly as possible.
Sometimes it isn't clear there's a benefit to fully integrating databases. Merging those databases has to return a business value ... and sometimes it isn't cost-justified. You don't do integration for its own sake. You do it when the sum of the components is worth more than the individual components on their own.

Q: What are the special challenges when the databases have an international flavor?
As a global Internet company, we've built applications that are very effective at intercommunication [with systems in different countries]. It's easy for us to load new products or subscription services into our network. We've built these applications to be largely agnostic to the provisioning systems in different countries. Let me tell you, I've never seen a more complex tax code than the one in Brazil. But all I have to do is to build the interface between [our network] and the provisioning system in Brazil. And when I build that interface once, I'm not building it again; I'm just going to plug into that interface. If nothing else, we're experts at building effective [application programming interfaces] that allow us to cross-link products effectively.
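The "build the interface once, then plug into it" approach Wright describes resembles a common adapter pattern. The following is a minimal sketch of that idea, not Terra Lycos' actual design: every class, function and field name is hypothetical, and the tax rate is a made-up placeholder rather than a real Brazilian figure.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    user_id: str
    product: str
    net_price: float


class ProvisioningAdapter(ABC):
    """Hypothetical per-country interface, written once per country."""

    @abstractmethod
    def provision(self, sub: Subscription) -> dict:
        ...


class BrazilAdapter(ProvisioningAdapter):
    # Placeholder rate for illustration only; real tax rules are far more complex.
    EXAMPLE_TAX_RATE = 0.25

    def provision(self, sub: Subscription) -> dict:
        charged = round(sub.net_price * (1 + self.EXAMPLE_TAX_RATE), 2)
        return {"country": "BR", "user": sub.user_id,
                "product": sub.product, "charged": charged}


class UnitedStatesAdapter(ProvisioningAdapter):
    def provision(self, sub: Subscription) -> dict:
        # Assumption for the sketch: no extra charge is added here.
        return {"country": "US", "user": sub.user_id,
                "product": sub.product, "charged": sub.net_price}


# Core code looks up an adapter and never sees country-specific rules.
ADAPTERS = {"BR": BrazilAdapter(), "US": UnitedStatesAdapter()}


def launch_product(country: str, sub: Subscription) -> dict:
    return ADAPTERS[country].provision(sub)


if __name__ == "__main__":
    sub = Subscription(user_id="u1", product="mail-plus", net_price=10.0)
    print(launch_product("BR", sub))
    print(launch_product("US", sub))
```

In a design like this, launching a new product in an already supported country requires no new integration work, and supporting a new country means writing one more adapter rather than changing the core code.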

Special Report


Taming Data Chaos

Copyright © 2002 IDG Communications, Inc.
