U.S. sets plan to build two exascale supercomputers

Both systems, using different architectures, will be developed simultaneously in 2019 -- if the Trump administration goes along with the plan

The U.S. believes it will be ready to seek vendor proposals to build two exascale supercomputers -- costing roughly $200 million to $300 million each -- by 2019.

The two systems will be built at the same time and will be ready for use by 2023, although it's possible one of the systems could be ready a year earlier, according to U.S. Department of Energy officials.

But the scientists and vendors developing exascale systems do not yet know whether President-elect Donald Trump's administration will change direction. The incoming administration is a wild card. Supercomputing wasn't a topic during the campaign, and Trump's dismissal of climate change as a hoax, in particular, has researchers nervous that science funding may suffer.

At the annual supercomputing conference SC16 last week in Salt Lake City, a panel of government scientists outlined the exascale strategy developed by President Barack Obama's administration. When the session was opened to questions, the first two were about Trump. One attendee quipped that "pointed-head geeks are not going to be well appreciated."

Another person in the audience, John Sopka, a high-performance computing software consultant, asked how the science community will defend itself from claims that "you are taking the money from the people and spending it on dreams," referring to exascale systems.

Paul Messina, a computer scientist and distinguished fellow at Argonne National Laboratory who heads the Exascale Computing Project, appeared sanguine. "We believe that an important goal of the exascale computing project is to help economic competitiveness and economic security," said Messina. "I could imagine that the administration would think that those are important things."

Politically, there ought to be a lot in HPC's favor. A broad array of industries rely on government supercomputers to conduct scientific research, improve products, attack disease, create new energy systems and understand climate, among many other uses. Defense and intelligence agencies also rely on large systems.

The ongoing exascale research funding (the U.S. budget is $150 million this year) will help with advances in software, memory, processors and other technologies that ultimately filter out to the broader commercial market.

This is very much a global race, which is something the Trump administration will have to be mindful of. China, Europe and Japan are all developing exascale systems.

China plans to have an exascale system ready by 2020. These nations see exascale -- and the computing advances required to achieve it -- as a pathway to challenging America's tech dominance.

"I'm not losing sleep over it yet," said Messina, of the possibility that the incoming Trump administration may have different supercomputing priorities. "Maybe I will in January."

The U.S. will award the exascale contracts to vendors with two different architectures. This is not a new approach and is intended to help keep competition at the highest end of the market. Recent supercomputer procurements include systems built on the IBM Power architecture, Nvidia's Volta GPU and Cray-built systems using Intel chips.

The timing of these exascale systems -- ready for 2023 -- is also designed to take advantage of the upgrade cycles at the national labs. The large systems that will be installed in the next several years will be ready for replacement by the time exascale systems arrive.

The last big performance milestone in supercomputing occurred in 2008 with the development of a petaflop system. An exaflop system is 1,000 times faster than a petaflop system, and building one is challenging because of the limits of Moore's Law, a 1960s-era observation that the number of transistors on a chip doubles about every two years.

"Now we're at the point where Moore's Law is just about to end," said Messina in an interview. That means the key to building something faster "is by having much more parallelism, and many more pieces. That's how you get the extra speed."

An exascale system will solve a problem 50 times faster than the 20-petaflop systems in use in government labs today.
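
The 50-fold figure follows from the units: one exaflop is 1,000 petaflops, and 1,000 divided by 20 is 50. A minimal sketch of that arithmetic, assuming time-to-solution scales directly with peak floating-point rate (a simplification, since real applications rarely run at peak); the function and constant names are illustrative:

```python
# Peak floating-point rates, in operations per second.
PETAFLOP = 10**15
EXAFLOP = 10**18  # 1,000 petaflops


def peak_speedup(new_flops, old_flops):
    """Ratio of peak rates; assumes runtime scales inversely with peak rate."""
    return new_flops / old_flops


# One exaflop compared with a 20-petaflop system.
speedup = peak_speedup(EXAFLOP, 20 * PETAFLOP)
print(speedup)  # 50.0
```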

Development work has begun on the systems and applications that can utilize hundreds of millions of simultaneous parallel events. "How do you manage it -- how do you get it all to work smoothly?" said Messina.

Another major problem is energy consumption. An exascale machine can be built today using current technology, but such a system would likely need its own power plant. The U.S. wants an exascale system that can operate on 20 megawatts and certainly no more than 30 megawatts.
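
To put the power target in perspective, the budget can be restated as an efficiency figure. The sketch below assumes a machine that sustains exactly one exaflop while drawing its full power budget, which is an assumption made for illustration rather than a DOE specification; the names are hypothetical:

```python
EXAFLOP = 10**18   # floating-point operations per second
MEGAWATT = 10**6   # watts
GIGAFLOP = 10**9   # floating-point operations per second


def gigaflops_per_watt(flops, watts):
    """Energy efficiency implied by a performance level and a power budget."""
    return flops / watts / GIGAFLOP


# Efficiency needed at the 20-megawatt goal and at the 30-megawatt ceiling.
target = gigaflops_per_watt(EXAFLOP, 20 * MEGAWATT)
ceiling = gigaflops_per_watt(EXAFLOP, 30 * MEGAWATT)
print(f"{target:.1f} {ceiling:.1f}")  # 50.0 33.3
```

Under these assumptions, the 20-megawatt goal works out to about 50 gigaflops per watt, and the 30-megawatt ceiling to about 33.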

Scientists will have to come up with a way "to vastly reduce the amount of energy it takes to do a calculation," said Messina. The applications and software development are critical because most of the energy is used to move data. And new algorithms will be needed.

About 500 people are working at universities and national labs on the DOE's coordinated effort to develop the software and other technologies exascale will need.

Aside from the cost of building the systems, the U.S. will spend millions funding the preliminary work. Vendors want to maintain the intellectual property of what they develop. If it costs, for instance, $50 million to develop a certain aspect of a system, the U.S. may ask the vendor to pay 40% of that cost if they want to keep the intellectual property.

A key goal of the U.S. research funding is to avoid creation of one-off technologies that can only be used in these particular exascale systems.

"We have to be careful," Terri Quinn, a deputy associate director for HPC at Lawrence Livermore National Laboratory, said at the SC16 panel session. "We don't want them (vendors) to give us capabilities that are not sustainable in a business market."

The work under way will help ensure that the technology research is far enough along to enable the vendors to respond to the 2019 request for proposals.

Supercomputers can deliver advances in modeling and simulation. Instead of building physical prototypes of something, a supercomputer can allow modeling virtually. This can speed the time it takes something to get to market, whether a new drug or car engine. Increasingly, HPC is used in big data and is helping improve cybersecurity through rapid analysis; artificial intelligence and robotics are other fields with strong HPC demand.

China will likely beat the U.S. in developing an exascale system, but the real test will be how useful such systems prove to be.

Messina said the U.S. approach is to develop an exascale ecosystem involving vendors, universities and the government. The hope is that the exascale systems will not only have a wide range of applications ready for them, but applications that are relatively easy to program. Messina wants to see these systems put to broad use quickly.

"Economic competitiveness does matter to a lot of people," said Messina.
