E-discovery in the cloud? Not so easy

Your company is embroiled in a lawsuit, and your general counsel has come to IT for help in conducting e-discovery on a batch of data. You easily gather some of the information from storage in your data center, but some of it is sitting in the cloud. Easy enough, you think, to get that data as well.

You may be in for a rude awakening.

Many lawyers and IT staff "just assume if they put data in the cloud it's going to be at their fingertips, that it's inherently discoverable," says Barry Murphy, co-founder and principal analyst at eDJ Group Inc., a consulting firm specializing in e-discovery. "That's not necessarily the case."

The cloud has dramatically expanded the number of places where electronically stored information (ESI) can live. Under the Federal Rules of Civil Procedure, a party to litigation is expected to preserve and be able to produce ESI that is in its "possession, custody or control."

With the cloud, those duties are split -- the ESI may not technically be in your possession anymore, and yet it's presumably under your control, says James M. Kunick, principal and chair of the intellectual property and technology practice at law firm Much Shelist P.C.

Because this area is so new, the legal ramifications of storing data in the cloud are still murky. Among the few instances of case law is Gordon Partners v. Blumenthal, which found that if a company has "access to documents to conduct business, [then] it has possession, custody and control of the documents for purpose of discovery," according to Murphy.

That could potentially pose a significant problem. Depending upon the relationship they have with their cloud vendor, companies may not know exactly where their data is stored. Even if they do, information in the cloud can be difficult to access in the right format and in a timely manner.

And there is a danger that companies can lose control over access to that data -- opposing attorneys, for example, might subpoena not only your company, but also your cloud provider. "You need to make sure your contract with the provider allows you to control what happens if they get a subpoena," Kunick warns.

What we mean by 'e-discovery in the cloud'

"E-discovery in the cloud" can be just as confusing as all other things cloud-related. It means different things to different people. This article focuses on e-discovery on data that has been stored in the cloud for general purposes. (Category three, below.) But vendors sell a variety of tools related to e-discovery and the cloud. Here's how Christine Taylor, an analyst at Taneja Group, delineates the market:

E-discovery SaaS: Using the cloud to deliver e-discovery application software. These SaaS packages typically cover one of several e-discovery processes, such as collection, preservation or review.

Cloud-based e-discovery: Using a hosting provider to run e-discovery processes on data archived to the cloud. This comes in two forms. First, a customer can archive data at a hosting provider with the specific understanding that the service provider can and will do e-discovery on that data if the need arises. Second, a customer keeps its archives in-house, but in the case of legal trouble it collects the relevant data and sends it to a cloud provider specifically for the purpose of e-discovery services.

E-discovery on any data stored in the cloud: Using a cloud provider to store data in the cloud, with no special provisions or considerations about e-discovery. This is by far the riskiest option of the three, says Taylor. "Even where the cloud provider is trusted, such as Google or Amazon, service level guarantees for the enterprise are notoriously poor," she wrote in a recent report. "And these services also have few mechanisms in place to report on physical data locations to their customers, which can be a serious defensibility issue."

Know the potential problems

And yet most companies are blissfully unaware of the potential problems with e-discovery, says Murphy. In a recent survey of legal and IT professionals using cloud services, Murphy found that fewer than 16% of 172 respondents had put an e-discovery plan in place before moving data to the cloud. Even more alarming, he says, nearly 60% of respondents didn't know whether they had an e-discovery plan in place or not.

In another survey, conducted last year by Clearwell Systems Inc. (an e-discovery vendor acquired last year by Symantec) and consulting firm Enterprise Strategy Group (ESG) Inc., nearly 60% of more than 100 Fortune 2000 enterprises and government agencies said they expected to have to consider their cloud-based applications "in scope" for e-discovery.

In the same survey, however, only 26% considered themselves somewhat or very prepared for such e-discovery requests. In other words, notes Katey Wood, an ESG analyst, "they said yes, they think they'll have litigation, but no, they are not prepared for it."

Murphy thinks lawyers may even start to target cloud-based sources of information hoping to catch opponents unprepared. A wily opposing attorney could, for example, request discovery of data in Salesforce.com, knowing that most companies are inexperienced with doing collections from that particular data source.

"Until we have a successful anecdote in which someone gets sued and they run their search successfully on data in the cloud, until they've actually done it at speed and to scale, we won't really know" how prepared companies are for e-discovery in the cloud, he says.

Tom Conophy, CIO of InterContinental Hotels Group, is one executive who believes he's got his bases covered. Among the many cloud initiatives of the $18 billion hospitality company is a project to move its global reservations system, now on a mainframe, to the cloud.

IHG is in the process of choosing a cloud provider, and in its contracts the company is "very careful about making sure that our intellectual property and our content is ours, and that at any given time we have the ability to access it, export it, turn it off -- whatever we need to do with it," says Conophy. "It's no different than if it was running in our own [data center]."

Be mindful of email, social media

Potential e-discovery problems vary depending on the type of cloud provider and the contract, observers say. Because email has been subject to e-discovery for a while, many email hosting providers have this covered in their contracts. And large cloud vendors that typically serve Fortune 500 companies are likely to pay more attention to the discoverability of data.

With other cloud providers, the area can be murky. "A lot is negotiated on a vendor-by-vendor basis at this point," says Wood. (For guidance, see E-discovery questions to ask your cloud vendor and 20 steps to an iron-clad SaaS contract.)

Some SaaS providers make it easier than others to get data out of their systems. Salesforce.com, for example, "is not an easily searchable system -- because it's not a content management system per se -- and yet people are storing information there," says Murphy.

Social media represents one of the biggest challenges. Sites like Facebook, LinkedIn and Twitter rely on standard terms of service contracts with users, including companies that use the services for marketing and connecting with customers. But what if a company needs to discover what a former employee posted on Facebook? Since it is the former employee's account, the company has no rights to access that information.

That means companies have to consider constantly collecting the content of all employees' posts as a safeguard. Certain regulated companies in financial services -- such as broker/dealers -- already do this, notes Murphy.

Companies that don't may find the going tough should they need to retrieve social data. Although social media companies say that anyone can write to their open APIs to get the data they need, "accessibility changes on a regular basis as the APIs of the vendors change," Murphy points out.

In addition, how long would it take to download all the data? Murphy points out that most sites "throttle their APIs," which could slow downloads or search results. Some e-discovery service providers, he notes, have started to target this problem. "They pay a lot of money to be in these API programs," he says. "They are essentially buying less throttling."
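To get a rough sense of what throttling could mean for a collection deadline, a back-of-the-envelope estimate can help. The Python sketch below assumes a hypothetical API that returns a fixed number of items per request and enforces a fixed hourly request quota; the function name and all of the figures are illustrative assumptions, not any vendor's actual limits.

```python
import math


def estimate_collection_hours(total_items: int,
                              items_per_request: int,
                              requests_per_hour: int) -> float:
    """Estimate the hours needed to pull total_items through a rate-limited API.

    All three parameters are hypothetical inputs; real limits vary by
    vendor and change over time.
    """
    if items_per_request <= 0 or requests_per_hour <= 0:
        raise ValueError("items_per_request and requests_per_hour must be positive")
    # Round up: a partially filled final request still counts against the quota.
    requests_needed = math.ceil(total_items / items_per_request)
    return requests_needed / requests_per_hour


# Illustrative figures only: 2 million posts, 100 per request,
# 500 requests allowed per hour.
hours = estimate_collection_hours(2_000_000, 100, 500)
print(f"{hours:.1f} hours (about {hours / 24:.1f} days)")
```

Running the same arithmetic with a vendor's published limits, where they exist, gives legal counsel a more defensible basis for any delivery date promised to the court.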

Know your data's location

Whatever type of cloud you're dealing with, it's important to know exactly where your data resides. In some cases, a cloud vendor may be storing it in a data center in a different country, where different data privacy and e-discovery rules apply.

And even if you've contracted with one cloud provider, do you know whether that company is using subcontractors? That's frequently the case, says Kunick. "It's more than likely that your data will reside in several locations." Even if you have an iron-clad contract with your cloud provider, can that provider get at the data in a prompt and defensible way from its subcontractors?

Before there was a cloud, companies would contract with large managed service providers and would spell out most of these provisions in long, detailed contracts, Kunick says. But most cloud provider contracts don't cover such details. "With cloud service providers, the contracts are seldom longer than 10 pages."

Beware renegade business units

Even if contracts cover every detail, shadow IT activities within corporations can be the source of other e-discovery problems. Charles Skamser, president and CEO of consulting firm eDiscovery Solutions Group, spent the last several months interviewing some 60 cloud service providers. Most told him that their clients are not asking about e-discovery. In fact, "some of the [cloud service providers] even said, 'What's e-discovery?'" says Skamser.

More telling, perhaps, Skamser's research indicates that a high percentage of the client base of these providers are "renegade business units" of large corporations seeking to do an end-run around what they perceive as unresponsive internal IT organizations.

This could be a recipe for disaster. When a large corporation is sued and presented with an e-discovery request, the general counsel would likely go to the IT department and ask for help. The general counsel may not ask a particular business unit manager, and even if they do, the manager won't know how to comply and probably has no e-discovery provisions in the contract with the cloud vendor, Skamser explains.

Develop a comprehensive plan

The most important thing is for a corporation to have a comprehensive information governance and discovery plan that covers all sources of data, including the cloud, says Murphy.

The plan needs to provide not only for how to conduct e-discovery on data stored in the cloud, but also how to review that data alongside data residing elsewhere. "People will store information in all sorts of places," he notes. "You need to apply the same discipline on all data sources."

It's not too late to prepare for e-discovery on cloud-based data, says Murphy. "Best practices are just beginning to emerge, and the good news is that companies have the opportunity to get ahead of the curve," he wrote in a recent report.

"The key is to treat cloud-based sources of data like any other data source," Murphy concludes. "Include it in data maps, have a plan for collecting and preserving it, know how to manage the chain of custody, and understand when to dispose of the data so that it poses no e-discovery risk."
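One common way to support a chain of custody, though not one the article's sources prescribe, is to record a cryptographic hash of each item at the moment it is collected. The Python sketch below assumes that approach; the field names, the source label and the collector label are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def custody_record(content: bytes, source: str, collected_by: str) -> dict:
    """Build one chain-of-custody entry for a collected item (hypothetical schema)."""
    return {
        "source": source,
        "collected_by": collected_by,
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "size_bytes": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def is_unaltered(content: bytes, record: dict) -> bool:
    """Check that content still matches the digest recorded at collection."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


# Illustrative data only: a made-up export from a made-up cloud source.
exported = b"Example exported CRM note"
record = custody_record(exported, "hypothetical-crm-export", "it-collections")
print(json.dumps(record, indent=2))
print(is_unaltered(exported, record))         # same bytes as collected
print(is_unaltered(exported + b"!", record))  # content changed after collection
```

A record like this can sit alongside the corresponding entry in a data map, so that anyone reviewing the collection later can confirm the item has not changed since it left the cloud provider.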

Questions to ask your cloud vendor

The key to setting successful e-discovery policies for cloud computing is knowing exactly what your cloud vendor will and will not do in the event of e-discovery. More than 70% of the respondents to eDJ Group's survey did not know their cloud vendor's policy in terms of responding to e-discovery needs.

Here are some questions that analysts recommend asking:

How would information be placed on legal hold?

How can the information be accessed by various parties?

How would the e-discovery functions of review and analysis be executed? Can you look at the data without having to download it?

What are the vendor's systems, data and backup procedures? Can it ensure that information is protected and redundant?

Exactly how is information stored? Is tenancy shared or do you get your own dedicated storage?

Where is the physical location of the stored data? Different countries have different regulations and laws regarding data privacy and e-discovery.

Who bears the responsibility and cost of information collection and preservation? Who would be held liable for a failure to collect and preserve the information?

Will the cloud vendor agree to identify an employee to testify regarding preservation and collection? "Doing so goes a long way toward successfully managing the chain of custody of information," writes eDJ co-founder Barry Murphy.

Exactly how can the data be searched and collected or locked down? Some cloud vendors may not have the tools to do this.

How long would a large collection of data take? In what format would the cloud provider deliver the data? Unless these details are pinned down, your legal counsel might promise an unrealistic deadline for delivering data or, worse yet, not be able to meet the court's deadline.

How will you get the information back if the vendor goes out of business?

How long will the vendor retain your data? If the vendor discards it too soon, then it can "look to the plaintiff as if the defendant has found a way to shred their documents," says Much Shelist P.C. principal James Kunick.

Can you test the vendor's system to make sure you can access, search and/or download data promptly and properly?

Copyright © 2012 IDG Communications, Inc.
