Preserving Web history
Computerworld -
With 60 Web sites, 20,000 Web pages and approximately 100 page changes per month to manage, you would think that Chris Strout wouldn't dwell on the past. But Strout, Web site manager at Chicago-based insurance brokerage Aon Corp., says that preserving historical Web site information is critical to meeting his company's regulatory obligations.
"We've had some compliance issues with the SEC where they've said, 'Information we're looking for is not on the site. Where is it? Has it been on the site in the past?' " he says. Using TeamSite, a content management tool from Sunnyvale, Calif.-based Interwoven Inc., Strout says, he can show where the requested content appeared at a given time - and how users navigated to it.
Regulatory compliance is just one reason to maintain access to historical Web site information, corporate archivists say. Information in Web archives can also provide critical evidence to protect a company in legal matters, or allow the marketing department to look back at previous online marketing efforts to see how the company presented itself and its products over time.
Unfortunately, in many organizations, Web site content - including the original context, look and feel - is disappearing into oblivion.
"This could be a period that is relatively undocumented, given the amount of information that's out there," laments Bruce Bruemmer, corporate archivist at Cargill Inc. in Wayzata, Minn. "We're going to just lose a lot of information."
Part of the problem is complexity: How do you archive continually changing Web sites with thousands of pages that include active content and dynamically generated page elements? Many organizations avoid that question and just try to get the basics. A simple tool like Adobe Systems Inc.'s Acrobat can encapsulate static Web page content and maintain active hyperlinks within searchable Portable Document Format (PDF) files. On the high end, content management tools from companies such as Pleasanton, Calif.-based Documentum Inc. can provide more detailed snapshots of previous Web site content. Interwoven's TeamSite virtualization engine can even re-create historical application servers, JavaServer Pages, Extensible Stylesheet Language style sheets and other code. But IT must weigh the cost of such systems, which can easily run into six figures.
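To make "getting the basics" concrete, here is a minimal sketch of a dated page capture, written in Python with only the standard library. It is not how Acrobat, Documentum or TeamSite work; the function names, the file naming scheme and the manifest.jsonl file are all assumptions made for illustration. It saves the raw bytes of a page under a timestamped name and logs a SHA-256 hash, so an archivist could later show what a URL served at a given time. Like the simple tools described above, it captures static content only, not dynamically generated elements.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.request import urlopen


def store_snapshot(url, content, archive_dir, captured_at=None):
    """Save one page capture (bytes) and append a record to the manifest.

    Hypothetical helper: captured_at is assumed to be a UTC datetime.
    """
    captured_at = captured_at or datetime.now(timezone.utc)
    stamp = captured_at.strftime("%Y%m%dT%H%M%SZ")
    digest = hashlib.sha256(content).hexdigest()

    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # File name combines the capture time and a short content hash.
    path = archive_dir / f"{stamp}_{digest[:12]}.html"
    path.write_bytes(content)

    record = {"url": url, "captured_at": stamp, "sha256": digest, "file": path.name}
    # One JSON object per line, appended so earlier captures are never rewritten.
    with open(archive_dir / "manifest.jsonl", "a", encoding="utf-8") as manifest:
        manifest.write(json.dumps(record) + "\n")
    return record


def capture(url, archive_dir):
    """Fetch a page over HTTP and store it (static content only)."""
    with urlopen(url) as response:
        return store_snapshot(url, response.read(), archive_dir)
```

Calling capture() with a page URL and an archive directory on a schedule would build up a dated record of that page. Keeping the hash in the manifest makes it possible to check later that a stored file has not been altered.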
Getting the Basics
Bruemmer takes a piecemeal approach to archiving. "Right now, I'm in the sticks and stones era," he says. "If there's Web content I want to capture, I'll capture it and move it off-line on CD-ROMs as PDFs."
He recommends establishing policies for determining what content should be archived and for how long. But applying such policies is