Calculating the cost of downtime

Most executives think of a disaster as a hurricane, tornado, flood or earthquake. But a disaster is any event that prevents your business from accessing the data and systems it needs to operate, including regional power outages, virus outbreaks, employee sabotage and terrorist attacks. Virtually every company faces the risk of IT interruptions that can grind business to a halt.

Financially justifying requests to management for funding of disaster recovery planning and testing can be difficult. No business is "the average," so statistics that quantify the cost of downtime for "average" businesses aren't helpful. A completed business impact analysis will reveal the true costs of downtime for a business. However, such an analysis can be expensive, and many executives are reluctant to invest without some way to measure the value or return.

Using our own business (Iron Mountain Off-site Data Protection) for illustration, we will walk through a series of steps to help you build the case to justify disaster recovery planning and testing for your own organization.

Step 1: Identify the business-continuity components that you will focus on. There are four components of a comprehensive business-continuity plan: people, property, systems and data. When we conducted this project several years ago, we decided to begin with the systems and data.

Step 2: Define what you're protecting. Identify the core competency of your business and define what supporting IT elements must be protected to support it. This is the heart of what your business does, your competitive advantage in the marketplace. Local service is our core competency, so our systems must always ensure local service levels for all of our customers.

Step 3: Prioritize business functions. Next, prioritize the functions that support the core competency and the systems that are needed to support them. You'll typically spend 80% of your available resources to restore the 20% of your systems, applications and data that these functions depend upon. In our example, three systems were identified:

    Vault management: The company's primary operational system that manages off-site tape inventories and movement.

    Customer-facing applications: Hardware and software used by customers to retrieve information or to communicate with Iron Mountain about their accounts.

    E-mail: Critical for customer service and communication.

Step 4: Classify outage types, frequencies and duration. At the time of the project, this Iron Mountain division had three data centers spread across the country connected by a private, high-speed Asynchronous Transfer Mode network. There were about 60 branch locations, each connected to two data centers in the WAN to provide redundancy. Our IT help desk provided records of past branch outages. Their log of trouble tickets supplied much of the information that was needed. We identified four types of outages:

    Branch outage: One of the 60 branches goes down. These were typically caused by faulty routers, malfunctioning LANs, or a loss of electrical power; only rarely were they caused by a natural disaster. They affected an average of eight branch offices each month, and each outage lasted from one to four hours.

    Regional outage: This affects multiple branches within a single geographic region. It is most often caused by failures from the telecommunications company.

    Data center outage: This occurs when one of Iron Mountain's three data centers goes off-line. An application failure, not a fire, flood or other catastrophe, is the biggest reason for a data center outage.

    National outage: This is exceedingly rare. The only recent example is "Black Monday" in 1999, when AT&T's telecom network failed nationwide.
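The tallying described in Step 4 can be sketched in a few lines of Python. This is only an illustration: the `Ticket` record, its field names and the sample entries are hypothetical, not taken from Iron Mountain's help desk log. It groups trouble tickets by outage type and reports how many occurred and how long they lasted on average.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Ticket:
    """One help desk trouble ticket (hypothetical record layout)."""
    outage_type: str  # e.g. "branch", "regional", "data_center", "national"
    hours: float      # how long the outage lasted


def summarize(tickets):
    """Return {outage_type: (number_of_outages, average_duration_in_hours)}."""
    durations = defaultdict(list)
    for ticket in tickets:
        durations[ticket.outage_type].append(ticket.hours)
    return {kind: (len(hrs), sum(hrs) / len(hrs)) for kind, hrs in durations.items()}


# Made-up sample data, for illustration only.
tickets = [
    Ticket("branch", 1.0),
    Ticket("branch", 2.0),
    Ticket("branch", 1.5),
    Ticket("regional", 4.0),
    Ticket("data_center", 3.0),
]

summary = summarize(tickets)
for kind, (count, average) in summary.items():
    print(f"{kind}: {count} outages, {average:.1f} hours on average")
```

The counts and averages produced this way are the frequency and duration figures that the cost calculation in Step 5 needs.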

Step 5: Calculate the cost of downtime factors. These include potential lost revenue, reductions in worker productivity, and damaged reputation with customers and in the marketplace. Financial analysts and accountants at your company can help you come up with the factors for your business. The bulk of our outage expenses were the labor charges for a team of technologists who had to resolve the outages. We also included the cost of manually recording the 727,000 transactions that are recorded by our systems each day, plus the cost of entering those transactions into the system once it comes back online.

Frequency x Duration x Hourly Cost = Lost Profits

As a sample calculation for branch outages, assume 90 branch outages in an average year, an average duration of 1.5 hours per outage, and a cost of $300 per hour (the daily cost divided by the number of hours in a day). The cost of branch outages for a year would then be 90 x 1.5 x $300 = $40,500.
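The same arithmetic as a minimal Python sketch. The function name and parameter names are ours, chosen for illustration; the inputs are the sample figures from the branch outage example.

```python
def annual_outage_cost(outages_per_year, average_hours, cost_per_hour):
    """Frequency x Duration x Hourly Cost, as in the formula above."""
    return outages_per_year * average_hours * cost_per_hour


# Sample figures from the branch outage example.
branch_cost = annual_outage_cost(outages_per_year=90, average_hours=1.5, cost_per_hour=300)
print(f"Annual branch outage cost: ${branch_cost:,.0f}")
```

Running the function once per outage type, with your own frequencies, durations and hourly costs, gives the per-type figures you need for a chart of your own.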

Outage Type    Minimum Impact    Maximum Impact
Branch         1X                5X
Data Center    2X                10X
Regional       0.2X              1X
National       1.5X              1.5X
Total          4.7X              17.5X



How much disaster preparedness can you afford?
The chart above is for demonstration purposes only. The "X" represents "orders of magnitude" instead of specific dollar amounts. You can create a chart for the calculation of the cost of downtime specific to your own organization.

We can use the sample cost of branch downtime that we calculated at $40,500 to be the "1X," "Minimum Impact" cost for this chart. Rarely, but episodically, there will be an event for which branch outage costs will be unusually high. It may occur only every six or seven years, but you want to include it since it represents real costs.
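To see what the chart implies in dollars, here is a rough Python sketch that treats each "X" as a multiple of the $40,500 sample figure. The multiples are copied from the demonstration chart. Applying the same base to every row is our reading of the chart, and the constant and function names are illustrative only.

```python
BASE_COST = 40_500  # the "1X" value: sample annual cost of branch outages

# (minimum impact, maximum impact) in multiples of X, from the demonstration chart.
IMPACT_MULTIPLES = {
    "Branch": (1.0, 5.0),
    "Data Center": (2.0, 10.0),
    "Regional": (0.2, 1.0),
    "National": (1.5, 1.5),
}


def to_dollars(multiples, base_cost):
    """Convert each (min, max) multiple of X into (min, max) dollar amounts."""
    return {kind: (low * base_cost, high * base_cost)
            for kind, (low, high) in multiples.items()}


dollars = to_dollars(IMPACT_MULTIPLES, BASE_COST)
total_min = sum(low for low, _ in dollars.values())
total_max = sum(high for _, high in dollars.values())

for kind, (low, high) in dollars.items():
    print(f"{kind:<12} ${low:>10,.0f} ${high:>10,.0f}")
print(f"{'Total':<12} ${total_min:>10,.0f} ${total_max:>10,.0f}")
```

With these inputs the totals work out to 4.7 and 17.5 times the base, matching the chart's Total row. Replace the base cost and the multiples with figures from your own analysis.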

After you calculate the financial impact of outages, you can calculate a payback period for any investment by using guidelines provided by your accountants, such as a three-year payback period. With a three-year payback period, you can demonstrate solid payback value for any investment of up to three times the annual downtime cost you have calculated.
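The payback rule can be sketched in Python as follows. This is a simplification that assumes the investment eliminates the full annual downtime cost. The function names are ours, and the $40,500 input is the sample branch figure from earlier in the article.

```python
def max_justified_investment(annual_downtime_cost, payback_years=3):
    """Largest investment that still pays back within the given period."""
    return annual_downtime_cost * payback_years


def payback_period_years(investment, annual_downtime_cost):
    """Years needed for avoided downtime costs to cover the investment."""
    return investment / annual_downtime_cost


annual_cost = 40_500  # sample annual cost of branch outages
ceiling = max_justified_investment(annual_cost)
print(f"Investment ceiling at a three-year payback: ${ceiling:,.0f}")
print(f"Payback on a $100,000 investment: {payback_period_years(100_000, annual_cost):.1f} years")
```

If a proposed measure would prevent only part of the downtime, use the portion of the annual cost it is expected to avoid rather than the full figure.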

Conclusion
Disaster preparedness and recovery planning is an iterative process, not a one-time event. You need to continually revisit disaster-recovery plans to ensure they remain aligned with current business realities and goals and test those plans regularly to ensure that they perform as planned.

Kevin Roden has been executive vice president and CIO at Iron Mountain Inc. since 1999. Before then, he was CIO at FleetBoston Financial Corp. and held numerous technology and management positions in a 20-year career at BankBoston, most recently as executive director of U.S. technology.


