Consortium for a Cure

The power of grid computing propels cancer research to a new level.

COMPUTERWORLD HONORS SHOWCASE

National Cancer Institute's Cancer Biomedical Informatics Grid

ORGANIZATION: The National Cancer Institute (NCI), part of the National Institutes of Health, is the U.S. government's principal agency for cancer research and training.

PROJECT CHAMPIONS: Dr. John Niederhuber, former chairman of the National Cancer Advisory Board and now director of the NCI; former director Dr. Andrew von Eschenbach; and deputy director Anna Barker

STAFF: About 180 NCI IT staffers, along with about 620 people from various institutions outside the NCI, work on caBIG.

ROI: The NCI has invested $20 million annually over the past three years. Participants share tools and data, which reduces duplicative efforts and speeds up research breakthroughs.

Scientists and doctors fighting cancer have a powerful new weapon. It’s not an innovative drug or breakthrough gene therapy. Rather, it’s an expansive technology initiative that links them together in their efforts to find a cure. The Cancer Biomedical Informatics Grid, or caBIG, is a voluntary, open-source, open-access network that allows institutions, teams and individual researchers in the U.S. to share tools, standards, data, applications and technologies.

The National Cancer Institute developed caBIG with one goal in mind: to speed progress on cancer research and care. “I really see it as the next generation of research,” says David Fenstermacher, director of biomedical informatics at the Abramson Cancer Center of the University of Pennsylvania.

Computerworld named the NCI a 2006 Honors Program recipient in the science category for its development of caBIG.

“The community was really very ready for it,” says Joel Saltz, professor and chairman of biomedical informatics at the Arthur G. James Cancer Hospital and Richard J. Solove Research Institute, part of Ohio State University.

The research community has been developing grid technologies and supercomputer consortia to support its work, Saltz says. And as grid technology has matured, so has researchers’ willingness to share data and tools, particularly through the Human Genome Project.

The Need

Despite the evolution in those areas, researchers were still fairly isolated within their institutions when the NCI announced the caBIG initiative in 2003, project leaders say.

“We were generating this large knowledge base, and the only way we were getting that knowledge out was writing papers,” says Rakesh Nagarajan, an assistant professor in the department of pathology and immunology at Washington University in St. Louis.

And even though researchers and institutions at the time were sharing some data sets, the process was cumbersome. Researchers would e-mail files or burn data onto CDs or DVDs, which would then be mailed. Moreover, the data’s format varied from institution to institution, which required recipients to convert information into formats compatible with their own systems.

“The idea [with caBIG] was that you should be able to re-analyze data in new and novel ways and integrate other data,” Nagarajan says.

This was the backdrop as the NCI launched caBIG in February 2004, working with some 50 NCI-designated cancer centers and more than 30 other organizations to build this infrastructure.

The Challenge

While the initiative enjoyed an enthusiastic response from the research community, the program — like all technology projects — had to contend with both systems and cultural challenges.

First, caBIG had to link institutions spread across 24 states, along with individuals in different locales working on various systems, including legacy systems. Those systems generated data in different formats and used varying semantics for the same abstract concepts.

“What we saw there was tremendous [diversity] in need and systems. And we were covering the entire landscape, from discovery research to clinical trials to the intermediary steps,” says Ken Buetow, the NCI’s associate director for bioinformatics and IT, and director of the NCI Center for Bioinformatics, which directs caBIG.

Program leaders cooperatively developed the architecture, software infrastructure and standards for sharing data. They created interoperability that allows for machine-to-machine exchanges, the compatibility of tools and the accurate exchange of complex concepts through the use of standardized definitions and semantics.
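To make the idea of standardized definitions concrete, here is a minimal sketch in Python of how two sites that record the same concept differently could be translated into one shared form. Everything in it is hypothetical and for illustration only: the site names, field names, codes and the `to_common` helper are assumptions, not caBIG's actual data elements or software.

```python
from typing import Dict

# Hypothetical shared vocabulary: one agreed name and one agreed set of
# permitted values for a concept. These are NOT real caBIG data elements.
COMMON_ELEMENTS = {
    "patient_sex": {"male", "female", "unknown"},
}

# Hypothetical per-site mappings from each site's local field name and
# local codes to the shared element above.
SITE_MAPPINGS = {
    "site_a": {"field": "gender", "values": {"M": "male", "F": "female"}},
    "site_b": {"field": "sex_code", "values": {"1": "male", "2": "female", "9": "unknown"}},
}


def to_common(site: str, record: Dict[str, str]) -> Dict[str, str]:
    """Translate one site's local record into the shared representation."""
    mapping = SITE_MAPPINGS[site]
    local_value = record[mapping["field"]]
    try:
        common_value = mapping["values"][local_value]
    except KeyError:
        raise ValueError(f"{site}: no mapping for local value {local_value!r}")
    # Guard against a site mapping that points outside the shared vocabulary.
    if common_value not in COMMON_ELEMENTS["patient_sex"]:
        raise ValueError(f"{site}: {common_value!r} is not a permitted value")
    return {"patient_sex": common_value}


if __name__ == "__main__":
    # Two differently coded records that describe the same thing.
    print(to_common("site_a", {"gender": "F"}))
    print(to_common("site_b", {"sex_code": "2"}))
```

Once both records are expressed in the shared form, a program can compare or pool them without a person converting formats by hand, which is the kind of machine-to-machine exchange described above.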

The Technology

CaBIG leverages existing software standards and open platforms, such as the Globus Toolkit, wherever possible in the development of this suite of interoperable biomedical informatics tools and shared vocabularies and data elements.

The program also uses specific tools such as caDSR (Cancer Data Standards Repository), EVS (Enterprise Vocabulary Services) and caGrid to allow geographically dispersed researchers and informaticians to query data using desktop tools.

Today, more than 800 people contribute to caBIG by developing applications, infrastructure, standards, policy documents and related resources. CaBIG also links cancer centers in 32 states. Moreover, NCI has developed a close collaboration with the U.K.’s National Cancer Research Institute, which plans to hire someone to work exclusively with caBIG to ensure coordination and technical compatibility.

The NCI has spent $20 million annually for the past three years to develop caBIG, with much of this federal money going into labor costs and software-related services, according to Buetow and Peter A. Covitz, chief operating officer at the NCI Center for Bioinformatics. Participating institutions also contribute, usually through in-kind resources or sweat equity.

Moving forward, Buetow and Covitz say the goal is to expand not only the tools and data available on caBIG, but also the number of people using it. Program leaders say they also want to make caBIG easier to use and to submit to, and to deliver more user interfaces in the coming year.

“The belief is that caBIG will tie together resources and, if nothing else, will accelerate breakthroughs in cancer research,” Covitz says.

Pratt is a Computerworld contributing writer in Waltham, Mass. Contact her at marykpratt@verizon.net.

Copyright © 2006 IDG Communications, Inc.
