P2P networks rife with sensitive health care data, researcher warns

Data leaks could be significant threat to patients, providers, Dartmouth study finds

Eric Johnson didn't have to break into a computer to gain access to a 1,718-page document containing Social Security numbers, dates of birth, insurance information, treatment codes and other health care data belonging to about 9,000 patients at a medical testing laboratory.

Nor did he need to ransack a health care facility to lay his hands on more than 350MB of sensitive patient data for a group of anesthesiologists or to get a spreadsheet with 82 fields of information on more than 20,000 patients belonging to a health system.

In all instances, Johnson was able to find and freely download the sensitive data from a peer-to-peer file-sharing network using some basic search terms.

Johnson, a professor of operations management at the Dartmouth College Tuck School of Business, did the searches last year as part of a study looking at the inadvertent hemorrhaging of sensitive health care data on Internet file-sharing networks.

The results of that study, which are scheduled to be published in the next few days, show that data leaks over P2P networks involving the health care sector pose a significant threat to patients, providers and payers, Johnson said.

"When you start thinking about the nature of these disclosures, it's far more worrisome" than compromises such as those involving payment card data, he said.

"Here you are leaking not just detailed personally identifiable information but also very personal medical information related to patients," Johnson said. Such data can be readily used by hospital employees, the uninsured, organized crime rings, illegal aliens and drug abusers for medical identity theft, and to fraudulently obtain costly medical services and prescription drugs, he said. And while such fraud can cost millions, the health care industry monitors for it less than the financial sector does.

P2P networks allow Internet users to share music, video and data files with others on the network. Normally, popular P2P clients -- such as Kazaa, LimeWire, BearShare and Morpheus -- let users download files and share items from a particular folder. But if proper care isn't taken to control the access that these clients have on a system, it is easy to expose far more data than intended.

For example, Dartmouth conducted a similar study about 18 months ago and found volumes of sensitive financial data on P2P networks as a result of inadvertent data leakage. At a congressional hearing in July 2007, security experts testified that millions of documents, including sensitive military and government documents, were being leaked on P2P networks. Even pharmaceutical giant Pfizer Inc. became a victim when an employee illegally installed a P2P client on a company computer and exposed personal data belonging to 17,000 employees.

Hospitals and other health care providers need to be aware of the dangers posed by inadvertent data leakage and implement better controls to monitor, detect and stop such leaks, Johnson said. Stricter access control measures also need to be adopted across the health care industry to minimize access to sensitive patient data, he said, adding that this is especially critical because of the growing portability of such data.

Dartmouth began its search for medical data on P2P networks in January 2008. Over a two-week period, researchers examined health care data disclosures and health-related searches on file-sharing networks such as Gnutella, FastTrack, Ares and eDonkey.

The search focused on finding information related to the top 10 publicly traded health care organizations in the country, representing nearly $70 billion in spending. Dartmouth researchers used search terms related to each of the companies, such as the names of affiliated hospitals, clinics and brands, to see what information they could find on P2P networks.

The university conducted the searches with the help of a firm called Tiversa Inc., which sells P2P network-monitoring services to government agencies and private companies. The searches yielded an astonishing range of information floating over P2P networks, Johnson said.

The information found on the networks originated from health care companies, suppliers and patients, and included sensitive patient health and identity information, insurance and billing data, and business documents. The data found on P2P networks included medical diagnoses and psychiatric evaluations, and even blank, signed prescription forms that anyone could have easily copied and filled out.

What was probably the most interesting finding "was the sheer amount of unstructured data that is floating around," Johnson said. The range of health care information floating on P2P networks and the variety of sources from which it is being leaked highlight the disorganized and decentralized manner in which health care data is being collected, stored, used and shared, he said.
