Supercomputer race: It's a tricky task to boost (and measure) system speed

The Top500 list is always climbing to new heights. Can we believe the hype?

September 22, 2008 12:00 PM ET

Computerworld - Every June and November, with fanfare lacking only in actual drum rolls and trumpet blasts, a new list of the world's fastest supercomputers is revealed. Vendors brag, and the media reach for analogies such as "It would take a patient person with a handheld calculator x number of years (think millennia) to do what this hunk of hardware can spit out in one second."

The latest Top500 list, released in June, was seen as especially noteworthy because it marked the scaling of computing's then-current Mount Everest -- the petaflops barrier. Dubbed "Roadrunner" by its users, a computer built by IBM for Los Alamos National Laboratory in New Mexico topped the list of the 500 fastest computers, burning up the bytes at 1.026 petaflops, or more than 1,000 trillion arithmetic operations per second.

A computer to die for if you are a supercomputer user for whom no machine ever seems fast enough? Maybe not.

Richard Loft, NCAR

Richard Loft, director of supercomputing research at the National Center for Atmospheric Research in Boulder, Colo., says he doubts Roadrunner would operate at more than 2% of its peak rated performance on NCAR's ocean and climate models. That would bring it in at 20 to 30 teraflops -- no slouch, to be sure, but so far short of that petaflops goal as to seem more worthy of the nickname "Roadwalker."
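The arithmetic behind that estimate is easy to check. The sketch below is only an illustration: it treats the 1.026-petaflops figure from the list as the rated peak, and the 3% upper bound is an assumption chosen to match the quoted 20-to-30-teraflops range, not a number Loft gave.

```python
# Illustrative arithmetic only; the 3% figure is an assumption, not from Loft.
RATED_PETAFLOPS = 1.026  # Roadrunner's figure on the June list


def sustained_teraflops(rated_petaflops: float, fraction_of_peak: float) -> float:
    """Convert a rated speed in petaflops to sustained teraflops."""
    return rated_petaflops * 1000.0 * fraction_of_peak


low = sustained_teraflops(RATED_PETAFLOPS, 0.02)
high = sustained_teraflops(RATED_PETAFLOPS, 0.03)
print(f"{low:.1f} to {high:.1f} teraflops")
```

At 2% the machine would sustain roughly 20 teraflops, which is where the low end of the range comes from.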

"The Top500 list is only useful in telling you the absolute upper bound of the capabilities of the computers," Loft says. "It's not useful in terms of telling you their utility in real scientific calculations."

The problem, he says, is that placement on the Top500 list is determined by performance on a decades-old benchmark called Linpack, which is Fortran code that measures the speed of processors on floating-point math operations -- for example, multiplying two long decimal numbers. It's not meant to rate the overall performance of an application, especially one that does a lot of interprocessor communication or memory access.

Test bench

If the Top500 list of supercomputers is based on such a narrow criterion -- floating-point performance -- why isn't a better benchmark used?

"I believe that you could come out with a measure that's more useful for what we do," says Richard Loft, director of research and development for supercomputing at NCAR, which models the Earth's oceans and atmosphere.

Such a measure, he says, might already exist in something called the HPC Challenge Benchmark, a suite of tests sponsored by the Defense Advanced Research Projects Agency and developed at the University of Tennessee. The tests consist of the Linpack floating-point benchmark plus six others that measure things such as integer math, memory updates, sustainable memory bandwidth and interprocessor communications.

"The good news -- or the bad news -- about the Linpack number is it's a single number," says University of Tennessee professor Jack Dongarra, who chose the benchmark years ago to rank computers for his list of "fastest" computers.

"If I knew the user's application, I might be able to say that you need to weight various metrics in a certain way to compare systems," he says. "But that reduction is hard to do, and I couldn't do it in the abstract for the Top500 list."

Moreover, users and vendors seeking fame high on the list go to elaborate lengths to tweak their systems to run Linpack as fast as possible -- a tactic permitted by the people who compile the list.

The computer models at NCAR simulate the flow of fluids over time by dividing a big space -- the Pacific Ocean, say -- into huge grids and assigning each cell or group of cells in the grid to a specific processor in a supercomputer.
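A minimal sketch of that kind of grid-to-processor assignment follows. It is not NCAR's code; the function name, the 8-by-8 grid and the 2-by-2 processor layout are all hypothetical, chosen only to show how a simple block decomposition hands each processor one rectangular patch of cells.

```python
from collections import Counter


def owner_of_cell(row, col, grid_rows, grid_cols, proc_rows, proc_cols):
    """Return the rank of the processor that owns the block containing a cell."""
    block_row = row * proc_rows // grid_rows  # which band of rows the cell is in
    block_col = col * proc_cols // grid_cols  # which band of columns
    return block_row * proc_cols + block_col


# Hypothetical sizes: an 8x8 grid spread over a 2x2 arrangement of processors.
GRID_ROWS, GRID_COLS = 8, 8
PROC_ROWS, PROC_COLS = 2, 2

cells_per_processor = Counter(
    owner_of_cell(r, c, GRID_ROWS, GRID_COLS, PROC_ROWS, PROC_COLS)
    for r in range(GRID_ROWS)
    for c in range(GRID_COLS)
)
print(dict(cells_per_processor))
```

With these sizes, each of the four processors ends up owning a 4-by-4 patch of 16 cells. Cells on the edge of a patch need values from a neighboring processor's patch at every time step, and that is where the message passing comes in.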

It's nice to have that processor run very fast, of course, but getting to the end of a 100-year climate simulation requires an enormous number of memory accesses by a processor, something that typically happens much more slowly. In addition, some applications require passing many messages from one processor to another, which can also be relatively slow.

So, for many applications, the bandwidth of the communications network inside the box is far more important than the floating-point performance of its processors. That's even more true for business applications, such as online search or transaction processing.
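One rough way to see why data movement can dominate is a back-of-the-envelope model in which each step takes as long as the slower of two things: doing the arithmetic or moving the data. The sketch below is an assumption-laden illustration, not a measurement. The function names are invented here, every number is hypothetical, and the model assumes computation and data movement overlap perfectly.

```python
def step_time_seconds(flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s):
    """Time for one step if compute and data movement overlap perfectly."""
    compute_time = flops / peak_flops_per_s
    transfer_time = bytes_moved / bandwidth_bytes_per_s
    return max(compute_time, transfer_time)


def fraction_of_peak(flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s):
    """Achieved floating-point rate as a fraction of the processor's peak."""
    elapsed = step_time_seconds(flops, bytes_moved, peak_flops_per_s,
                                bandwidth_bytes_per_s)
    return (flops / elapsed) / peak_flops_per_s


# Hypothetical workload: 8 bytes moved per floating-point operation, on a
# processor rated at 1 teraflops with 100 GB/s of usable bandwidth.
share = fraction_of_peak(flops=1e9, bytes_moved=8e9,
                         peak_flops_per_s=1e12, bandwidth_bytes_per_s=1e11)
print(f"{share:.2%} of peak")
```

Under these made-up numbers the processor reaches only about 1% of its rated speed, because it spends nearly all of its time waiting for data. A faster floating-point unit would not change the result; more bandwidth would.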

An even greater bottleneck can crop up in programs that can't easily be broken into uniform, parallel streams of instructions. If a processor gets more than its fair share of work, all the others may wait for it, reducing the overall performance of the machine as seen by the user. Linpack operates on the cells of matrices, and by making the matrices just the right size, users can keep every processor uniformly busy and thereby chalk up impressive performance ratings for the system overall.
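The cost of an uneven workload can be sketched the same way. In the hypothetical example below, a step finishes only when the busiest processor finishes, so efficiency is the average workload divided by the largest one. The workloads are invented for illustration.

```python
def parallel_efficiency(work_per_processor):
    """Average work divided by the largest share; 1.0 means perfectly balanced."""
    slowest = max(work_per_processor)  # everyone waits for this processor
    average = sum(work_per_processor) / len(work_per_processor)
    return average / slowest


balanced = parallel_efficiency([10, 10, 10, 10])    # Linpack-style even split
unbalanced = parallel_efficiency([10, 10, 10, 30])  # one overloaded processor
print(balanced, unbalanced)
```

With four processors and one of them carrying three times the load of the others, half of the machine's capacity is spent waiting.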

IBM's Roadrunner supercomputer broke the petaflops barrier in June.

