Juggling Content

When a user types a URL into the address window of a browser or clicks a link, routers on the Internet need only a second or two to take the user to where that information is stored. But ferreting out the specific documents and graphics - the content - and retrieving it from a busy Web site for display on the user's PC can take more time than some users want to spend.


Cache for the Cause

Load balancing can spread user demand over servers and data centers. It can even connect the user to the closest available server.

But traffic flow to end users can still be clobbered by congestion on the public Internet.

That's particularly true during Web rush hours when users log on to view or participate in a special event.

An example was Euro2000, Europe's championship soccer tournament held this spring in Brussels. Sportal Ltd. in London designed the infrastructure for the event and hosted the official site at www.euro2000.org.

To support the 1.4 billion hits between late March and mid-June - a level that put the site in the Guinness Book of Records as the most highly trafficked site in Internet history - Nial Pearson, Sportal's chief technology officer, designed a caching scheme.

Caches are special servers that hold the data that users request most from a given site.

Pearson says his firm set up "cache farms" in major cities across Europe with servers from CacheFlow Inc. in Sunnyvale, Calif. Sportal deployed Cisco Web switches at its main data center in Los Angeles and at a backup center in London and configured the servers at both data centers to redirect traffic to the caches.

That, Pearson says, made it possible to honor requests very quickly. Moreover, Pearson set his Web servers to automatically refresh the caches anytime data on the server changed, so content was always fresh. That's important because caches generally are set to fetch the most frequently requested Web pages at predetermined intervals, resulting in a mixture of old and new data in a cache.
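The difference between refreshing on a timer and refreshing on every change can be illustrated with a minimal Python sketch. It is not Sportal's or CacheFlow's actual software; the class names EdgeCache and Origin and their methods are hypothetical, chosen only to show a server pushing new content to its caches the moment that content changes.

```python
class EdgeCache:
    """Toy cache holding copies of origin pages (hypothetical, for illustration)."""

    def __init__(self):
        self.pages = {}

    def get(self, url, origin):
        # Serve from the cache when possible; go to the origin only on a miss.
        if url not in self.pages:
            self.pages[url] = origin.read(url)
        return self.pages[url]

    def refresh(self, url, body):
        # Called by the origin whenever a page changes.
        self.pages[url] = body


class Origin:
    """Toy origin server that pushes every change to its caches."""

    def __init__(self, caches):
        self.content = {}
        self.caches = caches
        self.reads = 0  # counts how often a cache had to come back to the origin

    def read(self, url):
        self.reads += 1
        return self.content[url]

    def publish(self, url, body):
        # Update the origin copy, then push the new body to every cache.
        self.content[url] = body
        for cache in self.caches:
            cache.refresh(url, body)
```

In this sketch, a page published and then updated at the origin is always current in the cache, and the cache never has to read from the origin, which mirrors Pearson's point about low load on the data center servers.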

And the fact that Euro2000 content was always served from the cache meant "there was almost no load on the servers in the data centers," according to Pearson.


Thousands of users could be looking for the same information or performing the same transactions at the same time and place. To ward off gridlock and delays, companies and service providers have turned to load balancing.

Load balancing, especially load balancing that uses Web switches, is "the glue that holds e-commerce together," says Ron Westfall, an analyst at Current Analysis Inc. in Sterling, Va.

Sometimes called Layer 7 switching or content switching, load balancing is a way of identifying what users coming to a Web site want and spreading those requests across multiple servers or data centers. Layer 7 refers to the routing of data packets based on information in the application layer, the highest layer of the Open Systems Interconnection network model.

The simplest form of load balancing employs a special-purpose computer that sits between a site's Web servers and the router that connects the site to the Internet. Software in the load balancer detects when one server is too busy to accommodate the incoming requests and switches the request to another server at the site. The process is seamless to the user.
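The behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's algorithm: the Server and SimpleBalancer classes are hypothetical, and "too busy" is modeled simply as a fixed limit on active requests.

```python
class Server:
    """A Web server with an assumed fixed limit on concurrent requests."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.active = 0

    def is_busy(self):
        return self.active >= self.capacity


class SimpleBalancer:
    """Sends each request to the first server that is not too busy."""

    def __init__(self, servers):
        self.servers = servers

    def dispatch(self):
        # Try servers in order and skip any that has reached its limit.
        for server in self.servers:
            if not server.is_busy():
                server.active += 1
                return server.name
        raise RuntimeError("all servers are busy")
```

With two servers of capacity 2 and 1, the first two requests go to the first server and the third is switched to the second. Real load balancers use richer measures of load, such as response time or connection counts reported by the servers.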

As the amount of content housed on Web sites increases and the variety of applications used to create that content becomes more diverse, simple load balancing is no longer adequate. This is especially true for large name-brand sites such as Britannica.com Inc. in Chicago, which hosts a graphics-rich online encyclopedia and supports sophisticated search capabilities.

When Britannica deployed the site two years ago, CIO Doug Shuck says, the company was under no illusion it could control the performance of the Internet after content left its data centers and headed to users' desktops.

But Shuck says he wanted to minimize delays. "When we went online, besides the recognition of quality, we wanted to preserve the brand," says Shuck. "High availability was part of the design criteria for the site from the beginning."

"We started [building the infrastructure] with three points of distribution," says Shuck. "One was at our corporate facility in Chicago. Others were co-located at hosting centers in Sunnyvale, Calif. and Herndon, Va."

Servers at these centers are now set up in three tiers consisting of Web servers, application servers and database servers, Shuck says. This makes managing servers easier because like applications are located on the same machines. It also increases the processing power available for handling bursts of traffic.

Dynamic Balance

Shuck says he balances traffic across servers at each location using Big-IP, a brand of load balancer from F5 Networks Inc. in Seattle. He says the F5 equipment dynamically balances requests for content over the appropriate servers within each site. It detects server problems and shifts the load to "one or more machine sets [clusters of servers] per location," Shuck says.

To further enhance the user experience, Shuck set up a means to route user requests to the data center that will respond the quickest.

He placed 3-DNS Controllers, also from F5, at each data center. Shuck says the 3-DNS units look at where the user comes from and compare the response time for that interaction to the response times of servers at the other Britannica data centers.

Then, they automatically select the route and data center that are predicted to give the best performance. Next, F5's Big-IP load balancers take over to spread traffic across the servers at the selected site. Alteon WebSystems Inc. and Cisco Systems Inc. have similar technology, but their products are built into load balancers, not sold as separate units.
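The selection step Shuck describes, choosing the data center predicted to respond fastest, can be approximated as follows. This is a simplified sketch, not F5's 3-DNS logic; the function name pick_data_center and the response-time figures in the example are assumptions made for illustration.

```python
def pick_data_center(response_ms):
    """Return the name of the data center with the lowest response time.

    response_ms maps a data center name to a measured response time in
    milliseconds, or to None if that data center did not respond.
    """
    reachable = {name: ms for name, ms in response_ms.items() if ms is not None}
    if not reachable:
        raise RuntimeError("no data center is reachable")
    return min(reachable, key=reachable.get)
```

For example, given hypothetical measurements of 80 ms for Chicago, 35 ms for Sunnyvale and no response from Herndon, the function returns "Sunnyvale".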

Balancing loads across applications that are native to the Internet, such as those that handle e-mail and serve up HTML documents, is one thing. Balancing loads across different applications, however, requires a new breed of equipment called a Web switch.

Web switches look into the packets at Layer 7, which contains information on applications that interface with the network, says Stan Schatt, an analyst at Giga Information Group Inc. in Cambridge, Mass.

Mike Shoupe, a network engineer at PSINet Inc., a global network outsourcer and Web hosting company in Ashburn, Va., uses Web switches made by San Jose-based Alteon WebSystems to balance loads across applications. Alteon agreed late last month to be acquired by Nortel Networks Corp. in Brampton, Ontario, for $7.8 billion.

Other companies, including Sportal Ltd., a sporting events hosting company in London, and Digex Inc., a Web hosting and application provider outsourcer in Beltsville, Md., use switches from San Jose-based Cisco.

Although Shoupe uses F5's Big-IP in some applications, he says F5's products were designed primarily to identify requests for Web content, such as HTML pages, and send those requests to the server that holds that content and is most available to provide it.

Shoupe says the more robust Web switches from Alteon and Cisco can read the Internet data packets contained in user requests to identify the application that processes that data and point the request to the servers on which the application resides.
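A rough sketch of that idea in Python: inspect the request, decide which application it belongs to, and hand it to the servers that run that application. The function select_pool and the URL prefixes and server names in the example are hypothetical; real Web switches examine far more than the URL path.

```python
def select_pool(path, pools, default_pool):
    """Pick the server pool for a request based on its URL path.

    pools maps a URL prefix to the list of servers running that
    application. The longest matching prefix wins; requests that match
    no prefix go to default_pool.
    """
    matches = [prefix for prefix in pools if path.startswith(prefix)]
    if not matches:
        return default_pool
    return pools[max(matches, key=len)]
```

With pools defined for "/erp", "/erp/payroll" and "/search", a request for "/erp/payroll/run" goes to the payroll servers, while "/index.html" falls through to the default Web servers.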




That added intelligence of Web switches ensures a persistent session: a connection that stays intact until information has been retrieved or a transaction has been completed, says Charles Boyle, director of research and development at Digex, which hosts Web sites for large companies such as St. Louis-based Trans World Airlines Inc. and J. P. Morgan & Co. in New York.

Changing an IP address in the middle of a transaction, which Boyle says service providers such as America Online Inc. routinely do, can break the session, preventing the user from completing his purchase.

By reading cookies (small files on the user's PC) that uniquely identify the user and determining from data packets which applications are required to complete a transaction, the Web switch can track the user session and keep the user connected and the transaction process intact until it has been completed.
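Cookie-based persistence can be sketched as a lookup table keyed by the cookie rather than by the user's IP address. The StickyBalancer class below is a hypothetical illustration, not the behavior of any particular Web switch; it assigns new sessions round-robin and assumes the cookie value uniquely identifies the user.

```python
class StickyBalancer:
    """Keeps each session on one server, keyed by a session cookie."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.sessions = {}   # cookie value -> assigned server
        self.next_index = 0  # round-robin position for new sessions

    def route(self, session_cookie):
        # A new cookie gets the next server in rotation; a known cookie
        # returns to its original server even if the user's IP address
        # has changed in the meantime.
        if session_cookie not in self.sessions:
            server = self.servers[self.next_index % len(self.servers)]
            self.next_index += 1
            self.sessions[session_cookie] = server
        return self.sessions[session_cookie]
```

Because the IP address never enters the routing decision, a mid-transaction address change of the kind Boyle describes would not move the user to a different server in this sketch.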

High-level switches become important when users need to balance traffic based on applications that haven't typically been used in Web site environments, such as enterprise resource planning software from Germany's SAP AG or Pleasanton, Calif.-based PeopleSoft Inc.

Depending on configuration, prices for Web switches can run from $10,000 to more than $100,000.

The new Web switches can balance loads during e-commerce transactions that use Secure Sockets Layer connections to maintain privacy. Westfall says they can also be set up to manage load across application servers dedicated to mobile and wireless protocols, such as those that handle Web traffic through Wireless Application Protocol.

Copyright © 2000 IDG Communications, Inc.
