
A business continuity checklist

April 19, 2004 12:00 PM ET

Computerworld - It's spring -- time we turn the clocks ahead. For us, it's also a reminder that it's time for companies to plan for business continuity.

Why? One client whose IT environment comprised a mix of operating systems had not established a single time source. Systems ran just fine. But when a failure took place, the effects -- though apparent throughout the infrastructure -- couldn't be correlated, because the system clocks weren't synchronized.
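To illustrate why unsynchronized clocks block correlation, here is a minimal sketch that puts events from several hosts onto one timeline. It assumes each host's clock offset from a reference source has already been measured; the host names, offsets and log messages are hypothetical and do not come from the client described above.

```python
from datetime import datetime, timedelta


def normalize_events(events, clock_offsets):
    """Return events sorted on a common timeline.

    events: list of (host, local_time, message) tuples.
    clock_offsets: host -> seconds that host's clock runs AHEAD of the
    reference clock (negative if it runs behind). Hosts with no known
    offset are assumed to be in sync, which is itself an assumption.
    """
    corrected = []
    for host, local_time, message in events:
        offset = timedelta(seconds=clock_offsets.get(host, 0.0))
        corrected.append((local_time - offset, host, message))
    corrected.sort(key=lambda item: item[0])
    return corrected


# Hypothetical data: web01's clock runs 10 seconds fast.
events = [
    ("db01", datetime(2004, 4, 19, 12, 0, 0), "connection pool exhausted"),
    ("web01", datetime(2004, 4, 19, 12, 0, 5), "upstream timeout"),
]
clock_offsets = {"web01": 10.0, "db01": 0.0}

timeline = normalize_events(events, clock_offsets)
for when, host, message in timeline:
    print(when.isoformat(), host, message)
```

Read from the raw logs, the database error appears to come first. After correction, the web server's timeout precedes it, which changes where the investigation should start. A single time source removes the need for this correction altogether.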

As IT staffs continue to patch and modify heterogeneous operating systems, the variety and complexity of error conditions that business continuity must handle multiply. Disaster recovery -- restoring data and returning to a functional state -- is well understood. However, we've found that many companies may not completely understand the complexity of business continuity.

Before the alarm bell rings
The good news is that business continuity and disaster recovery share the same infrastructure. But many companies don't plan or execute effectively because they don't follow operational best practices. The IT Infrastructure Library offers a definitive approach to disaster recovery and business continuity. We frequently bring the following elements to our clients' attention:

1. Establish a service-level agreement.
All disaster recovery and continuity work begins with agreement on what matters most to the business. For example, if access to a trading-floor application is lost for 15 minutes, the financial effect can be tremendous. This agreement forms the basis for service-level agreements (SLAs) about IT performance.

SLAs should be more than availability definitions. The familiar target measured in number of "9s" of availability is often chosen without thought for much more than uptime. In addition to performance and service outages, SLAs must include application updates, release-schedule guarantees, even patch management activity -- all of which factor into systems' continuous operation.

Don't oversimplify this aspect of planning. Consider how the components of the application infrastructure fit into the availability scheme: An application instance is supported by the presentation-layer, application-layer and data-layer servers.

In practice, an SLA of 100% -- improbable as it is -- doesn't mean that every component in the environment requires 100% uptime. Instead, a service-level objective should be defined for each component, relative to other components, so that overall environmental performance delivers the agreed-on service level. Start with the element most crucial to the SLA -- the database, for example -- and factor in other components' performance.
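One way to relate component objectives to the overall service level is simple availability arithmetic. The sketch below assumes the three layers are chained in series (the application is up only when all of them are up) and that they fail independently; real environments with redundancy or shared failure causes will behave differently. The availability figures are illustrative, not taken from any client.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the availabilities multiply.
    """
    return math.prod(component_availabilities)


def annual_downtime_minutes(availability):
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def required_remaining_availability(target, fixed_availabilities):
    """Availability the remaining components must jointly deliver.

    Start from the components whose objectives are already fixed
    (for example, the database) and solve for what is left over.
    """
    fixed = math.prod(fixed_availabilities)
    if fixed < target:
        raise ValueError("fixed components alone already miss the target")
    return target / fixed


# Illustrative objectives for presentation, application and data layers.
layers = [0.999, 0.999, 0.9995]
overall = serial_availability(layers)
print(f"overall availability: {overall:.6f}")
print(f"expected downtime: {annual_downtime_minutes(overall):.0f} minutes/year")

# If the database is fixed at 99.95% and the target is 99.9% overall:
rest = required_remaining_availability(0.999, [0.9995])
print(f"other layers must jointly reach: {rest:.6f}")
```

The point of the exercise is that three "three nines" components in series do not add up to a "three nines" service, so each component's objective has to be set with the others in mind.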

2. Identify potential problems with achieving the SLA.
Develop scenarios that outline exactly what could go wrong and what it would take to mitigate it. Then rank these scenarios for probability and cost. Next, prioritize them for executive sign-off. Agreement on projected losses gives a realistic idea of the resources required for continuity.

For organizations just beginning an implementation, the definition of failure scenarios is a chance to set options for creating an application environment to eliminate specific vulnerabilities. Buy-in from executive leadership will lead to a road map for deployment.
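The ranking described in this step can be sketched as an expected-loss calculation. This is one common way to combine probability and cost, not necessarily the method the authors use with clients, and every scenario name and figure below is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    annual_probability: float  # chance of occurring in a given year, 0..1
    loss_if_occurs: float      # projected business loss
    mitigation_cost: float     # cost to mitigate

    @property
    def expected_annual_loss(self):
        return self.annual_probability * self.loss_if_occurs


def rank_scenarios(scenarios):
    """Order scenarios from highest to lowest expected annual loss."""
    return sorted(scenarios, key=lambda s: s.expected_annual_loss, reverse=True)


# Hypothetical scenarios for executive review.
scenarios = [
    Scenario("Database server failure", 0.20, 500_000, 60_000),
    Scenario("Site-wide power loss", 0.02, 4_000_000, 250_000),
    Scenario("Failed application patch", 0.50, 100_000, 15_000),
]

ranked = rank_scenarios(scenarios)
for s in ranked:
    print(f"{s.name}: expected loss {s.expected_annual_loss:,.0f}, "
          f"mitigation {s.mitigation_cost:,.0f}")
```

Listing the mitigation cost next to the expected loss gives executives the comparison they need to sign off on priorities.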

3. Perform data classification.
Many clients haven't evaluated what data an application requires and the sensitivity of that data. Data classification reflects data availability requirements and in turn determines storage infrastructure for business continuity. Skipping this detailed but crucial step makes it hard to define costs, easy to overengineer or overbuild application infrastructure -- and easy to overspend.

4. Understand the risk thresholds for different areas of the business.
This insight enables the service desk to make intelligent decisions when, for instance, a server has failed. If the recovery time objective is 30 minutes and it will take 15 minutes to identify a problem, only 15 minutes remain to decide on and carry out the recovery, so it's important to know when your "go/no go" decision must be made.
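The timing in that example can be worked out explicitly. The sketch below uses the article's 30-minute recovery time objective and 15-minute diagnosis time; the 10-minute fail-over duration is an assumed figure added for illustration.

```python
from datetime import timedelta


def go_no_go_deadline(rto, failover_duration):
    """Latest point after the failure at which fail-over can start
    and still finish within the recovery time objective."""
    deadline = rto - failover_duration
    if deadline < timedelta(0):
        raise ValueError("fail-over takes longer than the RTO allows")
    return deadline


def slack_after_diagnosis(rto, diagnosis_time, failover_duration):
    """Time left to decide once the problem has been identified.

    A negative result means the decision has to be made before
    diagnosis is complete.
    """
    return go_no_go_deadline(rto, failover_duration) - diagnosis_time


rto = timedelta(minutes=30)        # from the example above
diagnosis = timedelta(minutes=15)  # from the example above
failover = timedelta(minutes=10)   # assumed for illustration

deadline = go_no_go_deadline(rto, failover)
slack = slack_after_diagnosis(rto, diagnosis, failover)
print(f"decide within {deadline} of the failure; "
      f"{slack} remains after diagnosis")
```

Under these assumptions the decision must be made within 20 minutes of the failure, which leaves five minutes once the problem has been identified.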

5. Develop detailed procedures for each scenario approved.
The failure scenarios selected are the basis for disaster recovery and business continuity planning. They need to be communicated to all architects and developers to ensure consistent approaches to application development and infrastructure. Failure scenarios shine a light on the risks so that everyone is engaged in mitigation.

6. Test, test, test.
While a "minor" change to the IT environment might not affect recovery, it could have ramifications for successful fail-over. Untested, minor changes have an unknown effect on site fail-overs.

A good time to test is with a new release of an application, especially one with business continuity requirements. Testing may reveal new options or the elimination of certain failure scenarios that should be factored into the final release.

The clock is ticking
In essence, disaster recovery and business continuity come down to planning and preparation. With proper planning, business continuity allows people to make smart decisions in compressed time frames, with little information. With adequate preparation, any event can be swiftly dealt with using tried and tested methods. Without either, your company stands to suffer losses that will extend far beyond the measured cost.

Christopher Burry is a technology infrastructure practice director and fellow at Avanade Inc., a Seattle-based integrator for Microsoft Corp. technology that's a joint venture between Accenture Ltd. and Microsoft. David Mancusi is technology infrastructure practice director of Avanade's Eastern Region. Comments or questions can be sent to Christopher.Burry@avanade.com.



