Catastrophes and contingencies

For years, companies have prepared for the worst. And that was before Sept. 11, 2001. In the wake of that tragedy, perceptions of how to safeguard corporate data have changed. The lessons learned from the World Trade Center catastrophe have formed the blueprints that life science companies must adopt to ensure survival after a disaster.

The key shift in disaster-recovery planning since Sept. 11 is termed business continuance planning, which builds upon the time-honored, back-up-everything-on-tape disaster-recovery method.

In the event of a disaster, business continuance planning looks for ways to quickly restore day-to-day operations and prevent the loss of critical company data. Business continuance planning also strives to improve the efficiency of normal day-to-day operations, providing methods to smooth out small glitches that can lead to loss of data, productivity, or business.

While backing up data to tapes that are stored securely offsite remains essential, business continuance planning relies on additional methods to facilitate disaster recovery.

Data mirroring—where data are replicated to a second live data center—is a crucial component of a thorough business continuance plan. The idea is to have a copy of all vital data online at a second site. This does not preclude backing up the data to tapes—indeed, the combination of mirroring and tape backup inspires confidence that no data will be lost in a major disaster.

"Backing data up to tapes on a daily basis is still our safety net, but we were concerned that the time required to retrieve data from offsite storage after a major disaster would be significant," says David O'Neill, network administrator at a specialty drug manufacturer, which O'Neill did not want named. "That's one reason we looked at replicating data to multiple sites. It ensures that data are always available to our staff."

But data mirroring is not cheap. "Tapes are still relatively inexpensive compared to disk drives," O'Neill says. "Data mirroring requires almost double the disk space as is required to store the original data."

One approach to data mirroring is real-time data replication to offsite storage devices, which has the advantage of virtually no data loss if the primary site is out of commission. The downside, however, is that moving large files—medical images, for example—throughout the business day could clog WAN connections and retard applications that access the data. Alternatively, data can be copied to offsite storage systems during down times, when network usage is not as high.
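The trade-off between real-time and off-hours replication comes down to simple arithmetic on data volume and link capacity. The sketch below is illustrative only: the function names, the 50% default utilization ceiling, and the example figures are assumptions, not numbers from any company quoted here.

```python
def transfer_hours(data_gb: float, link_mbps: float, utilization: float = 0.5) -> float:
    """Estimate the hours needed to copy data_gb gigabytes over a WAN link.

    link_mbps is the nominal link speed in megabits per second.
    utilization is the fraction of the link replication may consume
    (an assumed policy knob; 0.5 leaves half the link for applications).
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("data_gb must be >= 0, link_mbps > 0, 0 < utilization <= 1")
    bits_to_send = data_gb * 8 * 1000**3          # decimal gigabytes to bits
    usable_bits_per_sec = link_mbps * 1_000_000 * utilization
    return bits_to_send / usable_bits_per_sec / 3600


def fits_in_window(data_gb: float, link_mbps: float, window_hours: float,
                   utilization: float = 0.5) -> bool:
    """True if the copy would finish inside an off-hours window."""
    return transfer_hours(data_gb, link_mbps, utilization) <= window_hours


if __name__ == "__main__":
    # Hypothetical example: 45 GB of new images, 100 Mbps link, 6-hour night window.
    hours = transfer_hours(45, 100)
    print(f"Estimated transfer time: {hours:.1f} hours")
    print("Fits in night window:", fits_in_window(45, 100, 6))
```

An estimate like this can help decide whether a nightly copy is realistic or whether the daily volume of new data calls for continuous replication or a faster link.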

In either case, technology can help expedite the data transfer process. Data caching and compression are commonly applied to files before offsite transfers. Products from companies such as Peribit Networks Inc. reduce the amount of data that needs to be sent between two locations. Peribit's tools use algorithms that search data for patterns and build tables that store identifying labels. Any time the same patterns are detected after the labeling process takes place, only the labels (and not the actual data) are transmitted to the secondary site.
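The pattern-and-label idea can be illustrated with a short sketch. This is not Peribit's algorithm, which is proprietary and finds patterns in ways not described here. It is a simplified stand-in that cuts data into fixed-size chunks and uses a truncated hash as the label. The class names, chunk size, and label length are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 256   # bytes per chunk (assumed; real products find variable patterns)
LABEL_SIZE = 16    # bytes per label (assumed; truncated SHA-256 digest)


def make_label(chunk: bytes) -> bytes:
    """Derive a short label for a chunk. Collisions are ignored in this sketch."""
    return hashlib.sha256(chunk).digest()[:LABEL_SIZE]


class Sender:
    """Primary site: sends a chunk once, then only its label."""

    def __init__(self) -> None:
        self.seen = set()

    def encode(self, data: bytes) -> list:
        messages = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            label = make_label(chunk)
            if label in self.seen:
                messages.append(("label", label))   # pattern already transmitted
            else:
                self.seen.add(label)
                messages.append(("data", chunk))    # first sighting: send the bytes
        return messages


class Receiver:
    """Secondary site: rebuilds the data from chunks and labels."""

    def __init__(self) -> None:
        self.table = {}

    def decode(self, messages: list) -> bytes:
        out = bytearray()
        for kind, payload in messages:
            if kind == "data":
                self.table[make_label(payload)] = payload
                out += payload
            else:
                out += self.table[payload]          # look up the stored pattern
        return bytes(out)


def wire_bytes(messages: list) -> int:
    """Total payload bytes that would cross the WAN."""
    return sum(len(payload) for _, payload in messages)


if __name__ == "__main__":
    sender, receiver = Sender(), Receiver()
    data = bytes(range(256)) * 10                   # highly repetitive sample data
    messages = sender.encode(data)
    assert receiver.decode(messages) == data
    print(f"{len(data)} bytes of data sent as {wire_bytes(messages)} bytes on the wire")
```

The savings depend entirely on how repetitive the data are. Files that are already compressed, as many medical image formats are, would likely benefit far less than the sample data used here.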

Procedures count

One lesson learned after Sept. 11 was that business continuance requires more than brute force technical solutions. For example, some large companies affected by the World Trade Center collapse that also had secondary data centers in other cities discovered that the existence of those centers was not enough to restore normal operations.

"There was always the assumption that if a major data center was lost, you could always put IT staff on an airplane and get them out to a backup data center, so they could get vital business systems up and running," says Raymond Lopez, a consultant at Rosewall and Associates, an IT consulting firm. "This was not possible for several days after September 11, since domestic travel was completely shut down." Of course, this assumes that IT staff would be available to travel. Several companies lost many or all of their IT personnel on Sept. 11.

But correctly following backup procedures does not ensure that all vital data will be preserved, either. "Firms were more vulnerable than expected to the amount of data still stored on paper or on users' desktops," says Nicholas Parks, an analyst at TowerGroup, a global services research firm.

For these reasons, Parks says, business continuance planning "now considers human and environmental factors in a disaster, with the goal of continuous availability and performance of the business, rather than just the restoration of operations in hours or days following a catastrophe."

Lopez agrees: "Disaster recovery is as much about developing policies and practices and putting those things into place, as it is installing hardware and software systems."

Life science companies would do well to follow the procedures implemented by the financial industry post-Sept. 11. For instance, to minimize the loss of crucial business information stemming from damage to papers or desktop computers, many companies are scanning paper documents as soon as they are created or received. Additionally, many companies are limiting the amount of data an employee can store locally on his or her desktop, transferring more data to networked storage systems that are regularly backed up.

Planning costs

"Preparation is the key to ensure that businesses can quickly rebound after a disaster," says Tony Adams, principal analyst at Gartner's IT services group. "Businesses now more widely understand that they must prepare in advance to meet the challenges of a disaster."

But many companies are finding they don't have the money to get started. In a recent Gartner survey of 205 IT managers, 24% of the respondents said that lack of funds was preventing implementation of a disaster-recovery plan. One in three companies even admitted they would lose critical data or operational capability if a disaster occurred. And 37% indicated they needed additional funding to carry out their disaster-recovery plan.

The irony here is that if companies don't spend money in advance on disaster-recovery planning, they will spend far more after a disaster. Knowing this, some life science managers are justifying the costs of disaster-recovery planning and implementation by showing upper management tangible savings that such systems and procedures bring in other areas. "I try to show my CIO that any investment in disaster recovery saves us money in our day-to-day operations," says Charles Mitchell, director of IT at a New Jersey pharmaceutical company research center. "The same systems and procedures that would give us access to data in a disaster can be used to improve normal systems availability."

Mitchell notes that disaster-recovery data-mirroring techniques can also be used to access data during routine maintenance or backup of storage systems. "The selling point isn't, 'Let's spend all this money in case of catastrophe,' it's more, 'Here's what we need to keep everyone happy during normal conditions, and, by the way, for no additional money we cover our [behinds] in case of a major disaster.'"

Mitchell is not alone in employing this strategy. "Developing a [return on investment] proposal for security or disaster-recovery systems is difficult," Lopez says. "But showing management that investments in IT systems will improve the resiliency of existing systems and reduce day-to-day operational costs makes a much stronger case."

Most companies will thankfully never experience a disaster akin to Sept. 11. But virtually all companies experience short-term disruptions of their systems and consequently are developing plans that deal with mundane problems to reduce downtime.

Life science IT managers need to ensure the availability and resiliency of their systems. Virtually all companies have systems in place to keep the business running when minor glitches, such as power disruptions, play havoc with operations.

Question of survival

In a survey of 163 business managers by the consultancy Contingency Planning Research and Contingency Planning & Management magazine, 7% of companies said their survival would be at risk if systems went down for one hour. An additional 17% would be at risk if the outage lasted one business day.

"We could probably survive an outage for several hours or a day or so," says Jon Benson, network systems administrator at Neurome, which studies gene-expression patterns in brain function and diseases. But it would be highly disruptive. "We do data acquisition at night where the data is written to [storage] by robotic microarray systems, and we do data analysis during the day," Benson says. With company systems running 24 hours a day, "lost time means lost opportunity. It would also hold up research for our customers." Benson uses APC's InfraStruXure system for power protection.

Common problems can provide useful ways to justify the cost of business continuance planning. Take, for example, overheating. When data centers housed primarily mainframes, it was assumed that cooling requirements were uniform throughout the center. Today, most data centers are built around racks of high-performance equipment, which instead create "heat islands."

Surprisingly, however, few companies are searching for more efficient ways to cool their data centers, according to Hewlett-Packard Co. Many apparently believe that cooling will draw more attention as higher-performance, high-density systems are deployed. Hewlett-Packard Laboratories is working on more intelligent cooling and believes it can cut data center energy consumption by 25%, perhaps saving a company with an average data center about $1 million annually.

Improving the normal operations of a data center can yield quantifiable savings in operational costs. This in turn can help an IT manager make a case for investing in systems that will save a company in the event of the unthinkable.

Hidden Costs of Downtime

Downed systems cost companies money. Here are some points to consider when trying to cost-justify a disaster-recovery system.

Lost productivity (hard dollars)
Multiply the average hourly wage of the employees idled by the outage by the number of those employees and by the length of the outage in hours.

Lost work (hard dollars)
If the data from a completed experiment are lost, multiply the hours required to redo the experiment by the number of employees who conducted the experiment and by the average hourly wage for these employees.

Restoration costs (hard dollars)
Multiply the average hourly wage of the IT staff restoring the systems by the number of staff members performing the restoration and by the hours needed to complete it.

Lost revenue (hard dollars)
If your company manufactures drugs and you cannot fulfill orders, multiply the revenue each drug brings in per day by the number of days of the outage.

Customer/investor confidence (soft dollars)
Hard to calculate, but take into account customer and investor reaction if the company cannot quickly recover from a disaster.

Contractual commitments (possible hard dollars)
Check contracts with business partners to see if there are any penalties for failing to meet contractual commitments.

Source: Bio·IT World
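The hard-dollar items in the sidebar above lend themselves to a quick calculation. The sketch below is one possible way to lay it out. The field names and every figure in the example are hypothetical, and the soft-dollar and contractual items are left out because they cannot be reduced to a simple multiplication.

```python
from dataclasses import dataclass


@dataclass
class OutageScenario:
    """Inputs for a rough downtime estimate (all values are assumptions)."""
    outage_hours: float        # length of the outage
    idled_employees: int       # employees unable to work during the outage
    avg_hourly_wage: float     # average wage of the idled employees
    redo_hours: float          # hours needed to repeat lost experiments
    redo_employees: int        # employees who must repeat the work
    redo_hourly_wage: float    # their average hourly wage
    it_staff: int              # IT staff performing the restoration
    it_hourly_wage: float      # their average hourly wage
    restore_hours: float       # hours needed to restore systems
    daily_revenue: float       # revenue lost per day of unfilled orders
    outage_days: float         # days during which orders cannot be filled


def hard_dollar_costs(s: OutageScenario) -> dict:
    """Apply the sidebar's multiplications and return each cost plus a total."""
    costs = {
        "lost_productivity": s.avg_hourly_wage * s.idled_employees * s.outage_hours,
        "lost_work": s.redo_hours * s.redo_employees * s.redo_hourly_wage,
        "restoration": s.it_hourly_wage * s.it_staff * s.restore_hours,
        "lost_revenue": s.daily_revenue * s.outage_days,
    }
    costs["total"] = sum(costs.values())
    return costs


if __name__ == "__main__":
    # Hypothetical one-day outage at a small research site.
    scenario = OutageScenario(
        outage_hours=8, idled_employees=50, avg_hourly_wage=40.0,
        redo_hours=20, redo_employees=3, redo_hourly_wage=50.0,
        it_staff=4, it_hourly_wage=60.0, restore_hours=10,
        daily_revenue=100_000.0, outage_days=1,
    )
    for item, dollars in hard_dollar_costs(scenario).items():
        print(f"{item:>18}: ${dollars:,.0f}")
```

Running several scenarios with different outage lengths gives a range rather than a single number, which may be easier to defend when presenting the estimate to upper management.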

Copyright © 2003 IDG Communications, Inc.
