SEATTLE -- Intel has produced a new chip that can operate at a sustained speed of one teraflop -- the type of supercomputing speed the U.S. government paid $55 million for 15 years ago. A teraflop is one trillion calculations per second.
This chip, called Knights Corner, was shown for the first time at the SC11 supercomputing conference here.
Intel isn't yet releasing all of the specs on the processor, including the amount of power it uses or its exact number of cores (it's more than 50). But the chip already has one large customer and a delivery date to make next year.
Rajeeb Hazra, general manager of Technical Computing at Intel, introduced the physical chip with a bit of flourish in the basement of a steak house here, holding it up. "It's not a PowerPoint, it's a real chip," he said.
A live feed showed the chip's performance as it ran a Linux workload in a "secure location," a hotel room that attendees were later shown.
Intel didn't specify exactly when Knights Corner will be commercially available, although it has at least one customer, the Texas Advanced Computing Center. The Austin-based facility will begin installing a system that uses the chip next year, with full operation expected in 2013. That system will initially run at 10 petaflops.
One way to understand the performance of Intel's Knights Corner is to measure it through time.
In 1997, ASCI Red at Sandia National Laboratories broke the teraflop barrier, using almost 10,000 Pentium chips to reach one trillion calculations per second. Total development cost was $55 million.
In 2008, IBM's Roadrunner system at Los Alamos National Laboratory achieved petaflop speeds for the first time. That's 1,000 trillion (one quadrillion) sustained floating point operations per second.
Computer makers need to find new ways to deliver more computing performance while using less electricity if they are going to reach exascale computing speeds in the next decade. An exascale system would be 1,000 times more powerful than a petaflop system.
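The jumps between these scales are easy to lose track of, so here is a minimal Python sketch of the arithmetic. The constant names and the chips_needed helper are illustrative assumptions, not anything from Intel. The last calculation assumes each chip delivers exactly one teraflop with perfect scaling, which is an idealization and not a description of how the Texas system will actually be built.

```python
# Floating point operations per second at each scale named in the article.
TERAFLOP = 10**12   # one trillion
PETAFLOP = 10**15   # one quadrillion
EXAFLOP = 10**18    # 1,000 petaflops


def chips_needed(target_flops, per_chip_flops=TERAFLOP):
    """Hypothetical helper: chips required to hit a target speed.

    Assumes perfect scaling and ignores host CPUs, interconnect
    and every other real-world overhead.
    """
    # Ceiling division so a partial chip counts as a whole one.
    return -(-target_flops // per_chip_flops)


print(PETAFLOP // TERAFLOP)         # teraflops in one petaflop
print(EXAFLOP // PETAFLOP)          # petaflops in one exaflop
print(chips_needed(10 * PETAFLOP))  # idealized chip count for 10 petaflops
```

Under those assumptions, a 10-petaflop system would take about as many one-teraflop chips as ASCI Red needed Pentium processors to reach a single teraflop.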
The new Intel chip is based on Intel's MIC (Many Integrated Core) architecture, and, similar to GPUs, is a 64-bit co-processor designed to handle highly parallel applications.
"The MIC processor ought to be easier to program because they use the same instruction set architecture as Intel x86 processors," said Steve Conway, an analyst at IDC.
"As [high-performance computing] approaches the exascale era, more and more systems will exploit a mix of x86 processors and accelerators," said Conway.
While Intel has previously produced chips that can break one teraflop, it has not made one for production.
A major competitor to Intel's approach is coming from Nvidia GPUs. That company is exploring ways to integrate ARM CPUs, widely used in cell phones, with GPUs.
Robert Harrison, director of the Joint Institute for Computational Sciences at Oak Ridge National Laboratory, has been using an earlier version of the MIC processor, and said its advantage is in its programming. It uses the same software stack and compilers as x86 systems.
"You can focus on tuning and optimizing rather than having the daunting task of porting everything manually from scratch into a new environment," said Harrison.
Patrick Thibodeau covers cloud computing and enterprise applications, outsourcing, government IT policies, data centers and IT workforce issues for Computerworld.