When to shred: Purging data saves money, cuts legal risk

E-discovery costs range from $1 million to $3 million per terabyte of data

September 18, 2008 12:00 PM ET


Computerworld - A funny thing happened on East Carolina University's journey to creating a data-retention strategy. As part of a compliance project launched one and a half years ago, Brent Zimmer, systems specialist at the university, was working with attorneys and archivists to determine which data was most important to keep and for how long. But it soon became clear that it was just as important to identify which data should be thrown away.

Zimmer was aware of the importance of being able to quickly produce required information during litigation, "but the thing we never thought about was keeping data too long," he says. The risk lies in keeping data that you would not otherwise be required to produce: as long as it exists and is discoverable, it could be used as evidence against you.

Like many organizations, East Carolina had its share of data to purge. "We never made anyone throw away anything unless they ran out of space on their quota," Zimmer says. Some users, he says, had e-mail dating back to 1996.

East Carolina is not unusual; many organizations hang on to more data than they need, for much longer than they should, according to John Merryman, services director at GlassHouse Technologies Inc., a storage services provider in Framingham, Mass. One reason is fear. "Companies are really sensitive because there's a perceived underhandedness to purging data," he says. "People might wonder, 'Why aren't you keeping all your records?'"

Another is the low cost of storage. Organizations have historically preferred to buy more disks rather than spend time and resources sorting through what they do and don't need. "Many people would prefer to throw technology at the problem than address it at a business level by making changes in policies and processes," says Kevin Beaver, founder of Principle Logic LLC in Acworth, Ga.

But thanks to e-discovery risk and burgeoning data volumes -- 20% to 50% compound annual growth rate for some companies -- the tide is starting to turn, according to Merryman. The average cost companies incur for electronic data discovery ranges from $1 million to $3 million per terabyte of data, according to GlassHouse. While you need to pay attention to retaining data, at the same time, "all indications are that you need to be keeping less," Merryman says.

A recent report from Gartner Inc. concurs. It states that the current explosion of data is outpacing the decline in storage prices, even before the resource costs for maintaining data are taken into account. Estimating that the average employee might generate 10GB per year, at a cost of $5 per gigabyte to back it up, Gartner says a 5,000-worker company would face annual costs of $1.25 million for five years of storage.
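The arithmetic behind that figure can be sketched as follows. This is a minimal illustration, assuming the $1.25 million is the yearly backup bill once five years of data have accumulated and that per-employee output stays flat at 10GB per year; the function name and parameters are hypothetical and do not come from the Gartner report.

```python
def annual_backup_cost(employees, gb_per_employee_per_year, cost_per_gb, years_retained):
    """Estimate the yearly backup bill once `years_retained` years of data are on hand.

    Assumption (not stated in the Gartner figures): data volume per employee
    is constant from year to year, so retained data grows linearly.
    """
    total_gb = employees * gb_per_employee_per_year * years_retained
    return total_gb * cost_per_gb


# Figures quoted above: 5,000 workers, 10GB each per year, $5 per gigabyte, five years kept.
cost = annual_backup_cost(5_000, 10, 5, 5)
print(f"${cost:,.0f}")  # $1,250,000
```

Under the same assumptions, keeping only three years of data would cut the retained volume from 250,000GB to 150,000GB, which is why shorter retention periods show up directly in the backup bill.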

And considering that many companies maintain multiple copies of data, thanks to test data, operational data and disaster recovery copies, not to mention backups, "there's an explosion of data in most companies," Merryman says.

Aside from the costs, keeping all those records indefinitely is a gold mine for attorneys looking for evidence, he adds.

Getting policy straight

The "2007 Litigation Trends Survey Findings" report by Fulbright & Jaworski LLP, which surveyed 253 U.S. and 50 U.K. corporate counsels, reported the following findings:

  • The number of lawsuits filed against companies appears to be down from last year, returning to levels similar to 2005. However, suits with $20 million or more at stake are on the rise. All of the respondents from small and midsize companies reported at least one lawsuit of that magnitude in the past year. Twenty percent of the largest companies surveyed had 21 to 50 lawsuits of that size.
  • Almost 40% of the largest companies surveyed spent $5 million or more annually on litigation, excluding settlements and awards.
  • In the records-retention area, 31% of all the companies in the survey now log or retain instant messages, and 40% retain voice mail.

One way to address this problem is to set retention policies that reduce exposure to legal problems. But don't try to boil the ocean, Merryman advises. Instead, create policies from the application or business level down, rather than looking across the whole data landscape and letting policy bubble up. Also, create black-and-white rules that are easy to deal with.

For instance, roll all data types -- such as e-mail, application and file data -- into 10 to 30 categories of big-picture policies rather than hundreds of granular ones. "You need broader rules like 'Accounting data needs to be retained six years,' not 'This annual report needs to be retained [for] five years,'" he says.

According to research from Enterprise Strategy Group Inc. in Milford, Mass., the average required retention period for files, e-mails and databases is on the rise. Most companies retain data for four to 10 years, says Brian Babineau, a senior analyst at ESG.

East Carolina University started with the low-hanging fruit, setting retention and purging policies for e-mail, medical records and security video. It archived that data on a new system based on Symantec Corp.'s Enterprise Vault storage management software and EMC Corp.'s Centera content-addressed storage (CAS) array. E-mails from the chancellor or dean are saved for seven years, Zimmer says, while faculty and staff e-mail gets purged after three years.

Meanwhile, security video is archived for 30 days -- a good thing, since university police collect a terabyte per day. Patient records from the medical school need to be kept for 20 years after the patient is deceased, but East Carolina now uses EMC Rainfinity to take that data off primary storage and archive it to the Centera device so it's out of the backup environment.
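The handful of broad rules East Carolina started with could be written down as a small lookup table. The sketch below is only an illustration of the "few black-and-white categories" idea: the category names, the `RetentionRule` type and the helper functions are hypothetical, a year is approximated as 365 days, and nothing here reflects how Enterprise Vault or Centera is actually configured.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

DAYS_PER_YEAR = 365  # approximation; ignores leap days


@dataclass(frozen=True)
class RetentionRule:
    keep_days: int
    clock_starts: str  # "created" or "patient_death"


# Retention periods as quoted in the article; the keys are made-up labels.
RULES = {
    "email_chancellor_dean": RetentionRule(7 * DAYS_PER_YEAR, "created"),
    "email_faculty_staff": RetentionRule(3 * DAYS_PER_YEAR, "created"),
    "security_video": RetentionRule(30, "created"),
    "patient_record": RetentionRule(20 * DAYS_PER_YEAR, "patient_death"),
}


def purge_date(category: str, created: date,
               patient_death: Optional[date] = None) -> Optional[date]:
    """Return the date an item becomes purgeable, or None if the clock has not started."""
    rule = RULES[category]
    if rule.clock_starts == "patient_death":
        if patient_death is None:
            return None  # patient still living: retention clock not running yet
        start = patient_death
    else:
        start = created
    return start + timedelta(days=rule.keep_days)


def is_purgeable(category: str, created: date, today: date,
                 patient_death: Optional[date] = None) -> bool:
    """True once the retention period for the item's category has fully elapsed."""
    due = purge_date(category, created, patient_death)
    return due is not None and today >= due
```

For example, `is_purgeable("security_video", date(2008, 8, 1), date(2008, 9, 18))` is true because the 30-day window has passed, while a faculty e-mail created on January 1, 2006 is not yet purgeable on the same date. Keeping the table this small is the point: each new data type gets mapped to an existing category rather than earning its own rule.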

Beyond that, the job will get more difficult, Zimmer acknowledges. "There's a lot of other stuff that we don't know the retention [requirements] for, so that will be more tricky," he says.

The key to reducing data volumes, Gartner says, is a process called "content valuation," which involves examining factors such as authorship authority, usage patterns, nature of content and business purpose. According to Gartner, there are many ways to approach content valuation, including electronic records management, content management, enterprise search to identify what's a record and what's not, legal preservation software and policy management.


