Analysis: TSA document release shows pitfalls of electronic redaction

TSA woes serve as reminder to companies that still make basic mistakes, analysts say

The inadvertent exposure of a sensitive Transportation Security Administration (TSA) security manual earlier this week serves as a sobering reminder to enterprises that often overlook the pitfalls of electronic document redaction, analysts said.

The lapse occurred when a contract employee posted an improperly redacted TSA Standard Operations Procedure manual on the publicly accessible Federal Business Opportunities Web site. The document was posted as part of a TSA contract solicitation and contained detailed information on the screening procedures and protocols used by TSA officials at 450 U.S. airports.

The manual was discovered on Sunday by people at The Wandering Aramean blog, which recovered the redacted portions and sent the document to anti-secrecy site Cryptome.org. Though the TSA has insisted that the document was outdated, the incident has stirred widespread concern among lawmakers, with some calling the gaffe "shocking" and "reckless."

In a letter sent to DHS Secretary Janet Napolitano, several members of the House Committee on Homeland Security demanded details on the DHS' procedures and guidelines for redacting sensitive documents. The lawmakers also wanted the DHS to verify the security of any other redacted documents containing sensitive information that might be available online.

The TSA incident is the second time this week that an organization found itself in trouble over problems stemming from improper redaction. The other incident involved HSBC Bank, which blamed a bug in its imaging software for the inadvertent exposure of sensitive data on some of its customers going through bankruptcy proceedings. The bank claimed that information it had redacted from electronically filed Chapter 13 bankruptcy proof-of-claim forms had become viewable as a result of the undisclosed bug.

Such incidents highlight the havoc that can result from improper electronic redaction, analysts said. Earlier this year, Facebook found itself in the middle of an embarrassing situation after an Associated Press reporter reversed redacted court testimony showing Facebook's estimates of its market value to be substantially lower than what it was claiming publicly.

In May 2005, the Pentagon posted a report on its Web site that contained the name of a U.S. soldier who shot and killed an Italian secret service agent in Iraq, information the Pentagon believed it had blacked out.

The lapses often result from a very simple misunderstanding of how electronic redaction works, said Barry Murphy, an analyst with Murphy Insights, a Boston-based consultancy specializing in e-discovery, records management, and content archiving. "If I put a lot of black magic marker on paper I am actually covering the data so that it is redacted," Murphy said. "In the digital world that is not true."

Obscuring portions of text in a word processor by placing black boxes over them, for instance, does nothing to redact them, Murphy said. The text may not be viewable, but it can still be indexed, which leaves it searchable and easily retrieved by copying and pasting the blacked-out portion into another document, he said.
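Murphy's point can be illustrated with a toy model. The sketch below is not how any real word processor or PDF viewer is implemented; the `TextRun`, `BlackBox` and `Page` classes are hypothetical names invented for this example. It assumes only that a page keeps its text in one layer and its drawings in another, so a black box changes what is displayed without changing what can be extracted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextRun:
    """A piece of text positioned on the page (hypothetical model)."""
    x: float
    y: float
    text: str


@dataclass
class BlackBox:
    """A filled rectangle drawn on top of the page (hypothetical model)."""
    x: float
    y: float
    width: float
    height: float

    def covers(self, run: TextRun) -> bool:
        # True if the run's anchor point falls inside the rectangle.
        return (self.x <= run.x <= self.x + self.width
                and self.y <= run.y <= self.y + self.height)


@dataclass
class Page:
    runs: List[TextRun]
    boxes: List[BlackBox]

    def _is_covered(self, run: TextRun) -> bool:
        return any(box.covers(run) for box in self.boxes)

    def visible_text(self) -> str:
        """What a reader sees: runs under a box are hidden from view."""
        return " ".join(r.text for r in self.runs if not self._is_covered(r))

    def extract_text(self) -> str:
        """What copy-and-paste or an indexer sees: the whole text layer."""
        return " ".join(r.text for r in self.runs)

    def redact(self) -> "Page":
        """Actual redaction: drop covered runs from the text layer itself."""
        kept = [r for r in self.runs if not self._is_covered(r)]
        return Page(runs=kept, boxes=list(self.boxes))


if __name__ == "__main__":
    page = Page(
        runs=[TextRun(10, 10, "Screening"),
              TextRun(60, 10, "threshold:"),
              TextRun(120, 10, "SECRET-VALUE")],
        boxes=[BlackBox(115, 5, 80, 12)],
    )
    print(page.visible_text())           # the covered run is not shown
    print(page.extract_text())           # the covered run is still there
    print(page.redact().extract_text())  # the covered run is gone
```

In this model, only `redact()` changes the underlying data; drawing the box merely changes the view.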

Another common mistake that companies tend to make is to overlook the metadata or the hidden information and revision histories that are often automatically embedded in Microsoft Word documents and in PDF files, he said. Blacking out text or even cutting it out does not get rid of this metadata. The only way to ensure that sensitive data is not simply visually hidden or made illegible is to actually remove it using redaction tools, he said.
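As a rough illustration of how much can survive in a file, the sketch below inspects a Word document in the newer .docx format, which is a ZIP package of XML parts. It assumes that format only (not the older binary .doc), reads the author fields from `docProps/core.xml` and collects tracked deletions stored as `w:delText` elements in `word/document.xml`. The function names `hidden_content` and `build_sample_docx` are made up for this example, and the sample package is a stripped-down stand-in, not a complete Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the .docx (Office Open XML) package parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DC_NS = "http://purl.org/dc/elements/1.1/"
CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"


def hidden_content(docx_file):
    """Return author metadata and tracked deletions found in a .docx file.

    docx_file may be a filesystem path or a binary file-like object.
    """
    found = {"creator": None, "last_modified_by": None, "deleted_text": []}
    with zipfile.ZipFile(docx_file) as package:
        names = set(package.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            found["creator"] = core.findtext(f"{{{DC_NS}}}creator")
            found["last_modified_by"] = core.findtext(f"{{{CP_NS}}}lastModifiedBy")
        if "word/document.xml" in names:
            body = ET.fromstring(package.read("word/document.xml"))
            # Text removed with change tracking on is kept in w:delText.
            for node in body.iter(f"{{{W_NS}}}delText"):
                found["deleted_text"].append(node.text or "")
    return found


def build_sample_docx():
    """Build a minimal in-memory package for demonstration only.

    It holds just the two parts inspected above and is not a valid
    document that Word itself would open.
    """
    core_xml = (
        f'<cp:coreProperties xmlns:cp="{CP_NS}" xmlns:dc="{DC_NS}">'
        "<dc:creator>J. Analyst</dc:creator>"
        "<cp:lastModifiedBy>Legal Dept</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
        "<w:r><w:t>Study results were favorable.</w:t></w:r>"
        '<w:del w:id="1" w:author="Editor"><w:r>'
        "<w:delText> Risk data omitted here.</w:delText>"
        "</w:r></w:del>"
        "</w:p></w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core_xml)
        package.writestr("word/document.xml", document_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(hidden_content(build_sample_docx()))
```

A check like this can only flag the kinds of residue it knows to look for; it is not a substitute for a purpose-built redaction tool.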

One company that got burned this way in 2005 was drug giant Merck & Co., which had deleted information linking the drug Vioxx to an increased risk of heart disease from a document it submitted to a publisher. The deleted information was included in metadata embedded in the document and was later recovered.

"The major, major thing is do not use your word processing programs for redaction," said John Pescatore, an analyst with Gartner Inc. "Certainly don't just use the black-out features," he said. Even with products such as Adobe's Professional Edition, which now comes with explicit redaction capabilities, companies might be better off keeping document creation and redaction separate, Pescatore said. There are "very strong, usable software tools that can be used for electronic redaction," he added.

Some examples of automated redaction tools include Redact-IT by Informative Graphics Corp., RapidRedact by RapidRedact and ID Shield by Extract Systems. Christine Musil, director of marketing for Informative Graphics, said that companies often need such tools to comply with e-discovery and other legal requirements, to share certain documents with business partners and suppliers, and to protect IP and other sensitive data in shared documents. The company also is seeing demand from county and local governments looking to redact personal data from public records, she said.

With Redact-IT, customers are able to do simple text searches for data they want redacted from documents, and then have it published as a PDF or TIF document, she said. "It actually removes every instance of whatever it is that you want redacted" from the document, including any underlying metadata, she said.
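The general idea of search-based removal, as opposed to covering text up, can be sketched in a few lines. This is not Redact-IT's implementation, which the article does not describe; `redact_terms`, `leaked_terms` and the `[REDACTED]` marker are hypothetical names chosen for this example, and the sketch handles plain text only, with no treatment of metadata or file formats.

```python
import re

# Fixed-width marker, so the replacement does not reveal the length
# of the text that was removed.
MARKER = "[REDACTED]"


def redact_terms(text, terms):
    """Replace every instance of each term, ignoring case.

    Returns a (redacted_text, replacement_count) tuple.
    """
    cleaned = [t for t in terms if t]
    if not cleaned:
        return text, 0
    # Try longer terms first so a short term cannot split a longer one.
    cleaned.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in cleaned), re.IGNORECASE)
    return pattern.subn(MARKER, text)


def leaked_terms(text, terms):
    """Return any terms still present in the text (a post-redaction check)."""
    lowered = text.lower()
    return [t for t in terms if t and t.lower() in lowered]


if __name__ == "__main__":
    source = "Claim filed by Jane Roe. Account 4417-1234 belongs to JANE ROE."
    sensitive = ["Jane Roe", "4417-1234"]
    redacted, count = redact_terms(source, sensitive)
    print(redacted)
    print(count, "replacements;", "leaks:", leaked_terms(redacted, sensitive))
```

The follow-up check in `leaked_terms` reflects the same point the analysts make: redaction should be verified against the data itself, not against what the page looks like.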
