
CERN readies world's biggest science grid

The computing network now encompasses more than 100 sites in 31 countries

By James Niccolai
March 21, 2005 12:00 PM ET

IDG News Service - If the Large Hadron Collider (LHC) at CERN is to yield miraculous discoveries in particle physics, it may also require a small miracle in grid computing.
Undaunted by a lack of suitable tools from commercial vendors, engineers at the famed Geneva laboratory are hard at work building a giant grid to store and process the vast amounts of data the collider is expected to produce when it begins operations in mid-2007. They announced last week that the computing network now encompasses more than 100 sites in 31 countries, making it what they believe is the world's largest international scientific grid.
Inside the collider, proton beams traveling in opposite directions will be accelerated to near the speed of light and steered into one another using powerful magnets. Scientists hope to analyze data from the collisions to uncover new elementary particles, solve riddles such as why elementary particles have mass, and get closer to understanding how the universe works.
The proton collisions will produce an estimated 15PB (petabytes) of data each year, or more than 15 million gigabytes. The role of the grid is to link a vast network of computing and storage systems and provide the scientists with access to the data and processing power when they need it.
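For a sense of scale, 15 million gigabytes corresponds to 15 petabytes in decimal units. The short sketch below checks that conversion and estimates the average rate the grid would have to absorb. It assumes decimal (SI) units and data spread evenly over a 365-day year, neither of which the article states, and the function names are illustrative only.

```python
# Rough scale check for the annual data volume quoted above.
# Assumptions (not from the article): decimal SI units, and data
# arriving evenly across a 365-day year.

GB_PER_PB = 10**6   # 1 petabyte = 1,000,000 gigabytes (decimal)
MB_PER_GB = 10**3   # 1 gigabyte = 1,000 megabytes (decimal)


def gigabytes_to_petabytes(gigabytes: float) -> float:
    """Convert a volume in gigabytes to petabytes."""
    return gigabytes / GB_PER_PB


def average_rate_mb_per_s(gigabytes_per_year: float, days: int = 365) -> float:
    """Average sustained rate in MB/s for a given yearly volume."""
    seconds_per_year = days * 24 * 60 * 60
    return gigabytes_per_year * MB_PER_GB / seconds_per_year


annual_gb = 15_000_000  # 15 million gigabytes per year

print(gigabytes_to_petabytes(annual_gb))        # 15.0
print(round(average_rate_mb_per_s(annual_gb)))  # 476
```

Under those assumptions the yearly volume averages out to a little under 500 megabytes per second, sustained around the clock. Real collider output would be far burstier than this even spread suggests.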
The grid sites involved are mostly universities and research labs as far afield as Japan and Canada, as well as two Hewlett-Packard Co. data centers. The sites are contributing computational power from more than 10,000 processors in total, and hundreds of millions of gigabytes in tape and disk storage.
For all the talk about grids from big IT vendors, virtually no suitable commercial tools were available to build the grid's infrastructure, according to project leader Les Robertson. Much of the data will be stored in Oracle Corp. databases, and a few of the sites use commercial storage systems, but the hardest part -- building the middleware to operate the grid -- was left largely to Robertson and his peers.
"It's surprised me a bit that there haven't been more commercial tools available to us. What we're building is not very specialized; we're just creating a virtual clustered system, but on a large scale and with a very large amount of data," he said.

Instead, CERN based its grid on the Globus Toolkit from the Globus Alliance, adding scheduling software from the University of Wisconsin's Condor project and tools developed in Italy under the European Union's DataGrid project. "This stuff comes from a lot of different places; it's very much a component-based approach," Robertson said.
The middleware serves two main functions. One

Reprinted with permission from Story copyright 2014 International Data Group. All rights reserved.