University of Queensland’s new supercomputer takes a different IT approach

The University of Queensland has invested in a high-performance computer to replace three existing HPCs that are reaching end of life, and has taken a different approach to keeping the new machine competitive and up to date.


The University of Queensland (UQ) has acquired a new high-performance computer (HPC) after going to market in October 2021.

The HPC, called Bunya after the native South-East Queensland tree, has two unusual facets. One is that, as a new and more powerful machine, it will replace UQ’s three existing HPCs.

Another is the way upgrades to Bunya will be handled. In previous deals, UQ would have signed a contract to purchase any further components from the vendor that supplied the machine, but it is now free to look elsewhere if that offers more value for its users. This means UQ can upgrade by technology layer: when one layer starts to underperform and a better technology is available, it can upgrade just that layer.

Purchased from Dell, Bunya has:

  • ~6000 AMD EPYC Milan series cores, 96 physical cores per node.
  • 2TB of memory per node on a standard compute node.
  • 4TB of memory per node on each of the three high-memory capability nodes.
  • a blocking topology Infiniband HDR cluster interconnect, running at a native 200Gbps per port, per node.
  • three servers’ worth of early-access exploratory/cutting-edge systems containing AMD Instinct MI2xx series GPU accelerators.
  • a current generation RHEL-type Linux distribution.

No storage was acquired in this instance, as the university already has high-performance parallel file systems in place.

According to Forrester’s 2021 business technographics infrastructure survey, when asked which services would run in the private cloud, Australian decision makers ranked HPC services equal first alongside databases, both at 35%. HPC ranked fourth among services expected to run in the public cloud, at 29%.

How Bunya is set up to replace three HPCs

Bunya will replace three HPCs, Awoonga, FlashLite, and Tinaroo, all of which have been operational for seven years.

Awoonga was decommissioned in March 2022, FlashLite is expected to be decommissioned by August 2022, and Tinaroo by 2023. The machines will be phased out as Bunya is phased into full operation.

UQ says it has no plans to build another HPC to replace Awoonga, FlashLite, and Tinaroo, because Bunya will integrate the specialised aspects of its predecessors in one system able to support a wide range of research domains, from the sciences to the humanities.

The first phase of bringing Bunya online is expected in July 2022, when early-access and test users will use the machine to help calibrate workflows, layouts, schedules and technologies. This is not a light test, Jake Carroll, CTO at the University of Queensland Research Computing Centre (RCC), tells Computerworld Australia: these are aggressive users who give quality feedback and help prepare the HPC to meet the high expectations of all users once it is fully functional, which is expected by the end of 2022.

Replacing three HPCs with one was possible because of how the technology for scheduling different workloads has evolved. Previously it was more cost-efficient to run different machines for different workloads and direct researchers to the appropriate one; handling mixed workloads on a single machine is now much easier, says RCC director David Abramson.

Bunya is expected to be much faster. “You could transfer an entire 23 gigabyte Blu-Ray movie into a node on FlashLite or Tinaroo in around 3.28 seconds; Bunya can achieve the same transfer in 0.92 seconds,” Carroll said in a statement.
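The figures in Carroll’s comparison can be checked with simple arithmetic. The sketch below is illustrative only: it assumes decimal units (1 gigabyte = 8 gigabits) and ignores protocol overhead, and the function and variable names are hypothetical. Under those assumptions, the Bunya figure works out to the 200Gbps per-port rate quoted for its interconnect, and the older machines’ figure to roughly 56Gbps.

```python
def throughput_gbps(size_gigabytes: float, seconds: float) -> float:
    """Effective transfer rate in gigabits per second (decimal units, no overhead)."""
    return size_gigabytes * 8 / seconds


MOVIE_GB = 23.0          # size of the Blu-Ray movie in the quote
OLD_SECONDS = 3.28       # quoted transfer time on FlashLite or Tinaroo
BUNYA_SECONDS = 0.92     # quoted transfer time on Bunya

old_rate = throughput_gbps(MOVIE_GB, OLD_SECONDS)
bunya_rate = throughput_gbps(MOVIE_GB, BUNYA_SECONDS)
speedup = OLD_SECONDS / BUNYA_SECONDS

print(f"FlashLite/Tinaroo: {old_rate:.1f} Gbps")   # 56.1 Gbps
print(f"Bunya:             {bunya_rate:.1f} Gbps")  # 200.0 Gbps
print(f"Speed-up:          {speedup:.2f}x")         # 3.57x
```

In other words, the quoted times imply that a node on Bunya can ingest data about 3.6 times faster than a node on FlashLite or Tinaroo.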

Bunya is expected to have a similar life span to its predecessors of between five and seven years.

As for the HPCs being decommissioned, UQ is investigating whether it is commercially viable to reuse any components of FlashLite.

A different approach for a new high-performance computer

Rather than buying the biggest machine available, the University of Queensland RCC, which provides coordinated management and support of UQ’s sustained and substantial investment in e-research, opted for a lightweight and fast-moving product that also lets it keep up with market capability.

Part of this approach is to keep Bunya up to date with new HPC technologies, which benefits researchers, and to reduce its environmental impact by being more power efficient than its predecessors. To that end, UQ is opting for a layered refresh every year for at least the first three years.

In the past, the university would have been locked into a contract with its chosen vendor. UQ intends to work with Dell for multiple years, provided the continuation delivers value to UQ, but no contract obliges the university to procure further from Dell.

Another reason for this approach is that it lets UQ stay at the leading edge of the silicon market.

“So instead of having situations where a researcher will come to us a year later and say, ‘well, this is great what you provided me, but now we’re running at a deficit of capability’, or ‘we don’t have enough cores’ or ‘we don’t have the right type of cores or accelerator technologies’. And we are experiencing a silicon renaissance at the moment. As far as all these new kinds of processing units are concerned, this gives us that upper hand and that ability to move quickly,” Carroll tells Computerworld Australia.

This is not the first HPC that Dell has provided to UQ. In 2017, UQ’s department of molecular biology, neuroscience, and translational research started to build Wiener with Dell.

Another difference is that, once Bunya is up and running, UQ itself will look after everything else, including the software stack, package management and security, which gives UQ control, understanding and flexibility.

Funding an HPC and user access

Bunya was funded mostly by UQ, with further contributions from the Institute for Molecular Bioscience (IMB) and the Australian Institute for Bioengineering and Nanotechnology (AIBN), both based at UQ, and from the Queensland Cyber Infrastructure Foundation (QCIF), which currently has 10 members.

For this reason, Bunya will be accessible to all these investing parties, though mostly to UQ, says Abramson.

Researchers at the university have easy access to the machine, unlike those who need to use what researchers refer to as a tier 1 facility, such as Pawsey or the National Computational Infrastructure (NCI). For those facilities, a researcher usually has to apply through a peer-review process, which takes longer than for researchers within UQ, who can get access to Bunya almost right away. However, QCIF membership means researchers at all ten QCIF member institutions have access to QCIF’s HPC services, which, according to UQ, include a share of time on NCI’s Gadi.

Copyright © 2022 IDG Communications, Inc.
