When to shred: Purging data saves money, cuts legal risk

E-discovery costs range from $1 million to $3 million per terabyte of data

September 18, 2008 12:00 PM ET

Computerworld - A funny thing happened on East Carolina University's journey to creating a data-retention strategy. As part of a compliance project launched one and a half years ago, Brent Zimmer, systems specialist at the university, was working with attorneys and archivists to determine which data was most important to keep and for how long. But it soon became clear that it was just as important to identify which data should be thrown away.

Zimmer was aware of the importance of being able to quickly produce required information during litigation, "but the thing we never thought about was keeping data too long," he says. The risk lies in keeping data that you would not otherwise be required to produce: As long as it exists, it is discoverable and could be used as evidence against you.

Like many organizations, East Carolina had its share of data to purge. "We never made anyone throw away anything unless they ran out of space on their quota," Zimmer says. Some users, he says, had e-mail dating back to 1996.

East Carolina is not unusual; many organizations hang on to more data than they need, for much longer than they should, according to John Merryman, services director at GlassHouse Technologies Inc., a storage services provider in Framingham, Mass. One reason is fear. "Companies are really sensitive because there's a perceived underhandedness to purging data," he says. "People might wonder, 'Why aren't you keeping all your records?'"

Another is the low cost of storage. Organizations have historically preferred to buy more disks rather than spend time and resources sorting through what they do and don't need. "Many people would prefer to throw technology at the problem than address it at a business level by making changes in policies and processes," says Kevin Beaver, founder of Principle Logic LLC in Acworth, Ga.

But thanks to e-discovery risk and burgeoning data volumes -- 20% to 50% compound annual growth rate for some companies -- the tide is starting to turn, according to Merryman. The average cost companies incur for electronic data discovery ranges from $1 million to $3 million per terabyte of data, according to GlassHouse. While you need to pay attention to retaining data, at the same time, "all indications are that you need to be keeping less," Merryman says.
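To get a feel for what a 20% to 50% compound annual growth rate means, the short Python sketch below projects a data store forward in time. The 1TB starting volume and the five-year horizon are illustrative assumptions, not figures from Merryman or GlassHouse.

```python
def projected_volume(initial_tb, annual_growth_rate, years):
    """Compound growth: volume after `years` at a fixed annual growth rate."""
    return initial_tb * (1 + annual_growth_rate) ** years


# Hypothetical inputs: 1TB today, projected five years out.
START_TB = 1.0
YEARS = 5

# The two ends of the growth range quoted in the article.
for rate in (0.20, 0.50):
    volume = projected_volume(START_TB, rate, YEARS)
    print(f"{rate:.0%} growth: {volume:.2f}TB after {YEARS} years")
```

Under those assumptions, the same store would hold roughly 2.5TB at the low end of the range and about 7.6TB at the high end.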

A recent report from Gartner Inc. concurs. It states that the current explosion of data is outpacing the decline in storage prices, even before the resource costs for maintaining data are taken into account. Estimating that the average employee might generate 10GB per year, and that it costs $5 per gigabyte to back it up, Gartner says a 5,000-worker company holding five years' worth of that data would face annual costs of $1.25 million.
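The Gartner estimate can be reproduced with simple arithmetic. The sketch below assumes that the $5 per gigabyte is an annual backup cost and that the $1.25 million is what the company pays each year once five years of data have accumulated. That reading is an assumption, but it is the one that matches the quoted figure.

```python
def annual_backup_cost(employees, gb_per_employee_per_year, years_retained, cost_per_gb):
    """Yearly cost, in dollars, of backing up everything still retained."""
    retained_gb = employees * gb_per_employee_per_year * years_retained
    return retained_gb * cost_per_gb


# Figures quoted from the Gartner report.
cost = annual_backup_cost(
    employees=5_000,
    gb_per_employee_per_year=10,
    years_retained=5,
    cost_per_gb=5,  # assumed to be a per-year rate
)
print(f"${cost:,}")  # $1,250,000
```

On the same assumptions, holding only three years of data would bring the annual figure down to $750,000.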

And considering that many companies maintain multiple copies of data, thanks to test data, operational data and disaster recovery copies, not to mention backups, "there's an explosion of data in most companies," Merryman says.

Aside from the costs, all those records, kept indefinitely, are a gold mine for attorneys looking for evidence, he adds.

Getting policy straight

The "2007 Litigation Trends Survey Findings" report by Fulbright & Jaworski LLP, which surveyed 253 U.S. and 50 U.K. corporate counsel, described the following findings:

  • The number of lawsuits filed against companies appears to be down from last year, returning to levels similar to 2005. However, suits with $20 million or more at stake are on the rise. All of the respondents from small and midsize companies reported at least one lawsuit of that magnitude in the past year. Twenty percent of the largest companies surveyed had 21 to 50 lawsuits of that size.
  • Almost 40% of the largest companies surveyed spent $5 million or more annually on litigation, excluding settlements and awards.
  • In the records-retention area, 31% of all the companies in the survey now log or retain instant messages, and 40% retain voice mail.

One way to address this problem is to set retention policies that reduce exposure to legal problems. But don't try to boil the ocean, Merryman advises. Instead, create policies from the application or business level down, rather than looking across the whole data landscape and letting policy bubble up. Also, create black-and-white rules that are easy to deal with.

For instance, roll all data types -- such as e-mail, application and file data -- into 10 to 30 categories of big-picture policies rather than hundreds of granular ones. "You need broader rules like 'Accounting data needs to be retained six years,' not 'This annual report needs to be retained [for] five years,'" he says.

According to research from Enterprise Strategy Group Inc. in Milford, Mass., the average required retention period for files, e-mails and databases is on the rise. Most companies retain data for four to 10 years, says Brian Babineau, a senior analyst at ESG.

East Carolina University started with the low-hanging fruit, setting retention and purging policies for e-mail, medical records and security video. It archived that data on a new system based on Symantec Corp.'s Enterprise Vault storage management software and EMC Corp.'s Centera content-addressed storage (CAS) array. E-mails from the chancellor or dean are saved for seven years, Zimmer says, while faculty and staff e-mail gets purged after three years.

Meanwhile, security video is archived for 30 days -- a good thing, since university police collect a terabyte per day. Patient records from the medical school need to be kept for 20 years after the patient is deceased, but East Carolina now uses EMC Rainfinity to take that data off primary storage and archive it to the Centera device so it's out of the backup environment.
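The retention periods Zimmer describes are the kind of black-and-white rules Merryman recommends, and they fit in a small lookup table. The following Python sketch is purely illustrative: The category names and the `is_purgeable` helper are hypothetical, years are approximated as 365 days, and nothing here reflects how Enterprise Vault or Centera is actually configured.

```python
from datetime import date, timedelta

# Retention periods in days, taken from the figures in the article.
# Category names are hypothetical; a year is approximated as 365 days.
RETENTION_DAYS = {
    "email_chancellor_dean": 7 * 365,
    "email_faculty_staff": 3 * 365,
    "security_video": 30,
    "patient_record": 20 * 365,  # clock starts at the patient's death
}


def is_purgeable(category, clock_start, today):
    """True once the retention period, counted from clock_start, has elapsed.

    clock_start is the creation date for most data, or the date of death
    for patient records. None means the clock has not started yet.
    """
    if clock_start is None:
        return False
    return today - clock_start > timedelta(days=RETENTION_DAYS[category])


today = date(2008, 9, 18)
print(is_purgeable("security_video", date(2008, 8, 1), today))       # True
print(is_purgeable("email_faculty_staff", date(2006, 9, 18), today))  # False
print(is_purgeable("patient_record", None, today))                    # False
```

Keeping the table this small is the point: a handful of broad categories is easier to apply and audit than hundreds of per-document rules.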

Beyond that, the job will get more difficult, Zimmer acknowledges. "There's a lot of other stuff that we don't know the retention [requirements] for, so that will be more tricky," he says.

The key to reducing data volumes, Gartner says, is a process called "content valuation," which involves examining factors such as authorship authority, usage patterns, nature of content and business purpose. According to Gartner, there are many ways to approach content valuation, including electronic records management, content management, enterprise search to identify what's a record and what's not, legal preservation software and policy management.


