Despite lingering privacy concerns, the U.S. Office of Personnel Management (OPM) is plowing ahead with plans to build a massive centralized database containing detailed healthcare claims information on millions of federal employees and their families.
The agency on Wednesday released two formal notices in the Federal Register detailing plans for the new Health Claims Data Warehouse. One of the notices describes how the OPM will use the database; the other describes how the OPM Inspector General's office will use it.
Work on the database begins July 15.
The notices -- known in government parlance as systems-of-records notices -- are aimed at addressing some of the concerns raised by several privacy groups when the OPM first detailed its plans last October. The outcry prompted the OPM to push back its original deadline.
Wednesday's notice, for instance, substantially limits the scope of the database, narrows the circumstances under which information from it will be used and clarifies that only de-identified data will be released outside of OPM.
The revised plans go a long way toward addressing some of the original concerns, said Harley Geiger, policy counsel at the Center for Democracy and Technology (CDT), which has been vigorously arguing for more privacy controls.
Even so, several other fundamental issues, including database architecture and data anonymity, remain unaddressed, Geiger said.
According to the OPM, the data warehouse is designed to help the agency better manage federal health claims programs. Under the effort, the agency will collect and analyze health services data from the Federal Employees Health Benefits Program (FEHBP). Members of the FEHBP include federal and postal employees, uniformed service members and retirees.
As part of its plan, OPM will establish a direct data feed with the FEHBP to continuously collect, manage and analyze health services data. The information collected includes individuals' names, addresses, Social Security Numbers and dates of birth, plus the names of their spouses and other information about dependents, as well as information about their healthcare coverage, medical conditions, procedures and diagnoses.
The OPM will use identifiable data to create 'longitudinal' long-term health records for each individual in the database. However, OPM analysts, who access the data for analyses, will only have access to de-identified records.
"OPM will analyze the data in order to evaluate: The cost of care; utilization of services; and quality of care for specific population groups, geographic areas, health plans, health care providers, disease conditions, and other relevant categories," the OPM notices said.
The inspector general's office, meanwhile, will use the claims database for audit and investigative purposes to detect fraud and waste, according to one notice.
The OPM's revised plan differs substantially from the one it proposed last year in some key areas.
Originally, for instance, the OPM data warehouse would have pulled in data from two other sources as well: the National Pre-Existing Condition Insurance Program and the Multi-State Option Plan.
As it stands, the system will now use only data from the FEHBP, although OPM left the door open for including the Multi-State Option Plan, Geiger said.
The narrower scope is important because a significantly smaller number of people will now be part of the database, he said.
Another change involves greater transparency about privacy protections, Geiger said. The revised notice offers more detail on the privacy laws OPM will follow, and spells out who has access to fully identifiable data and who has access only to anonymized data.
"They have also made it clear that some of the information sharing they had reserved for their use under the previous SORN is no longer in place," he said. And when the OPM does share information externally, it will share only fully anonymized data.
The main issue is about the need for such a centralized data repository in the first place, Geiger said. The OPM could achieve its goals just as well simply by accessing the FEHBP data where it is now, rather than pulling it into a separate database.
The need for the OPM to have fully identifiable data is also unclear, he said, arguing that the analyses OPM envisions can be achieved by using hashed, anonymized data.
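The notices do not describe how hashing would work in practice, so the following is only a minimal sketch of the general idea Geiger raises, not OPM's or the CDT's method. It assumes a keyed hash (HMAC-SHA-256) applied to a Social Security number, with a hypothetical key held by whoever holds the original data, and hypothetical field names such as `ssn` and `date_of_birth`. The same person always maps to the same token, so claims can still be linked over time without the analyst seeing names or numbers. Strictly speaking this is pseudonymization; the fields left in a record could still allow re-identification.

```python
import hashlib
import hmac

# Hypothetical secret key, assumed to stay with the data custodian
# and never to be shared with analysts.
SECRET_KEY = b"example-key-held-by-the-data-custodian"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable token for an identifier using a keyed hash."""
    # Normalize so "123-45-6789" and "123456789" produce the same token.
    normalized = identifier.strip().replace("-", "")
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify_claim(claim: dict, key: bytes = SECRET_KEY) -> dict:
    """Drop direct identifiers and keep only a token plus coarse fields."""
    return {
        "patient_token": pseudonymize(claim["ssn"], key),
        # Keep only the year from an ISO date such as "1970-03-14".
        "birth_year": claim["date_of_birth"][:4],
        "diagnosis": claim["diagnosis"],
        "procedure": claim["procedure"],
    }


# Illustrative record with made-up values.
example_claim = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "date_of_birth": "1970-03-14",
    "diagnosis": "E11.9",
    "procedure": "99213",
}
record = deidentify_claim(example_claim)
```

An unkeyed hash would be weaker here: there are few enough possible Social Security numbers that anyone could hash all of them and reverse the tokens, which is why the sketch assumes a secret key.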
In most cases, enrollees in the federal employee health benefits program won't know that their claims information is going to OPM, and they won't know what information is used and why, he said.
Officials at the OPM could not be reached immediately for comment.
Jaikumar Vijayan covers data security and privacy issues, financial services security and e-voting for Computerworld.