Securing business intelligence

Searching for the elusive competitive advantage increasingly means parsing, correlating and analyzing mountains of data into refined molehills of business intelligence, which raises the risk of exposing private information.

Better tools and more sophisticated techniques are enabling more granular analysis of operational and transactional data, according to Keith Gile, a business intelligence analyst at Forrester Research Inc. in Cambridge, Mass. Granularity is good for identifying trends to segment markets or to spot suspicious behavior, but it can also identify an individual.

"The tendency in 2004 is to drill deeper into aggregate data to get additional information, bringing us closer to data that implies a privacy breach. All the BI vendors are adding increasing analytical functions, and it is clear that the more granular the data, the more valuable it is," Gile said.

Most business intelligence, until now, focused on aggregated data, Gile said, which poses little risk of uncovering personally identifiable information subject to the protections of privacy regulations such as HIPAA or the Gramm-Leach-Bliley Act.

But the trend in business intelligence is to integrate data from transactional systems containing customer or patient information, such as ERP or CRM, with business intelligence applications, Gile said.

"Greater granularity is valuable. However, it increases the risk of exposing too much information," he said, resulting in situations where crucial, identifiable information "must be stripped off" business intelligence analysis performed by a user community beyond the scrutiny of privacy officers. "IT must be prepared to deal with BI and privacy and security because there is an issue with privacy in terms of data being released through BI technologies," Gile said.

Deleting personally identifiable information from a business intelligence system isn't an option in health care, said Jonathan Rothman, director of data management at Livingston, N.J.-based Emergency Medical Associates.

EMA developed a business intelligence system that's the patient information repository for more than 1,500 physicians, nurses, pharmacists, billing clerks and other health care professionals. It includes both operational and transactional patient data that's integrated into a data warehouse that further collects data from other sources. The result is a unique blend of protected and nonprotected health care information.

"All of our patient information is private," said Debbie Clark, EMA's director of client services. That means the entire business intelligence application and all the information in the data warehouse were designed with policies, procedures and safeguards for HIPAA compliance.

"Anytime a user can see patient information, we have to know who they are," Clark said. EMA has implemented password authentication, role-based authorization and auditing from its business intelligence vendor, Business Objects SA in San Jose.

"Anytime we report information out of the data warehouse, there is no reason for me to provide any PHI [protected health information] unless it's aggregated and made nonpublic," Rothman said. The business intelligence analysis, he said, such as "how many heart ailments seen by a physician or how many stomach ailments on a given day," stays in the business intelligence application, which is accessible in standard browser sessions.

"We bring together on average 2,200 emergency room visits per day, 365 days per year, 24 hours per day. There's tremendous security around our data warehouse," which includes virtual private networks to connect the 26 hospitals that use the data warehouse.

EMA is "just starting to use wireless access," Clark said. "The hospitals are very security-aware, and they're installing secure, HIPAA-compliant wireless networks now."

U.S. government is big BI user

The largest potential user of business intelligence that reveals personally identifiable information is the U.S. government. According to a General Accounting Office survey released May 27, there are 52 federal systems analyzing 131 operational data warehouses. An additional 68 data warehouses are being created, the GAO report said, and government officials plan to include personally identifying information in at least 122 of the 199 current and planned data warehouses.

The federal business intelligence systems will use private-sector data, such as credit card transactions, phone records or airline flight data, in 54 data warehouses, the GAO said. Seventy-seven data warehouses contain information from multiple government sources. The Defense Department makes the greatest use of business intelligence, with 47 data warehouse projects reported.

All the federal data gathering and analysis has raised the hackles of the privacy community. Congress should pass laws protecting the privacy of citizens from government information analysis, according to a recommendation sent to Defense Secretary Donald Rumsfeld in May by a DOD advisory committee.

The DOD panel said federal agencies should generally restrict their data analysis to aggregated and anonymous data. If personally identifiable information is needed, it should require a court order, the panel recommended.

The Pentagon's Defense Advanced Research Projects Agency has funded business intelligence projects to combat terrorism. One project is looking into software agents that can automatically burrow into databases seeking suspicious patterns in travel reservations or questionable data in e-mails.

Privacy also is a top priority as the Department of Homeland Security ramps up its business intelligence systems. Peter Sand, the DHS director of privacy technology, said in a speech on May 21 to the Electronic Privacy Information Center's Freedom 2.0 conference that privacy is designed into new data analysis systems in the DHS.

The DHS is subject to the provisions of the Freedom of Information Act, the Privacy Act of 1974 and the E-Government Act of 2002, he said, and the DHS privacy office works as an independent watchdog inside the DHS.

Forrester's Gile said census data may also be vulnerable to use in making personally identifiable associations, followed by local election records.

"The Democrats and Republicans have spent a lot of money capturing highly atomic data on individual voter patterns and tendencies," he said, and are busy correlating voting data with other large consumer databases such as BuyRite, Nielsen and Information Resources Inc.

And the lawyers and courts are waiting to rule on all privacy breaches and interpret regulations. "Somebody will ultimately be taken to court on just such privacy issues," Gile said.

Special Report

The Future of BI

Copyright © 2004 IDG Communications, Inc.