Need to crunch 150 teraflops? Meet first-of-a-kind supercomputer Catalyst

Although we can’t “see” unstructured data or the treasures hidden in it, in ten years we could be drowning in it: big data experts predict the global data volume will exceed 35 trillion gigabytes. How is anyone supposed to make the most of it? A massive and mind-blowing amount of crunching power from Catalyst, a “first-of-a-kind” supercomputer in California, is now available to American industry and academia for collaborative research. At Lawrence Livermore National Laboratory, or LLNL, “The 150 teraflop/s (trillion floating-point operations per second) Catalyst cluster has 324 nodes, 7,776 cores and employs the latest-generation 12-core Intel Xeon E5-2695v2 processors.”
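The published numbers hang together. As a quick sanity check, the sketch below recomputes the core count and peak speed; the dual-socket node layout, the E5-2695v2's 2.4 GHz base clock, and the 8 double-precision flops per core per cycle (Ivy Bridge AVX) are assumptions not stated in the article.

```python
# Sanity check of Catalyst's published figures.
# Assumed (not in the article): dual-socket nodes, 2.4 GHz base clock,
# and 8 double-precision flops per core per cycle (Ivy Bridge AVX).

nodes = 324
sockets_per_node = 2           # assumption
cores_per_socket = 12          # Intel Xeon E5-2695v2 is a 12-core part
clock_hz = 2.4e9               # assumed base clock
flops_per_cycle = 8            # assumed AVX double-precision rate

cores = nodes * sockets_per_node * cores_per_socket
peak_tflops = cores * clock_hz * flops_per_cycle / 1e12

print(cores)                   # 7776, matching the article
print(round(peak_tflops))      # 149, close to the quoted 150 teraflop/s
```

Under those assumptions the theoretical peak lands within a teraflop of the quoted figure, which suggests the 150 number is the usual marketing round-up of peak double-precision throughput.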

You hear about big data every day; now IT is being urged to assume responsibility for “wearables” so the data from those devices can be better exploited. Even the White House issued an 85-page report, urging agencies to find a way to harness big data without harming privacy and civil liberties in "a world where data collection will be increasingly ubiquitous, multidimensional, and permanent."

"Over the next decade, global data volume is forecasted to reach more than 35 zettabytes," (a zettabyte is a trillion gigabytes) explained Fred Streitz, director of the HPC Innovation Center (HPCIC). "That enormous amount of unstructured data provides an opportunity. But how do we extract value and inform better decisions out of that wealth of raw information?" LLNL believes the answer is the sexy beast Catalyst.


"The Catalyst supercomputer at Lawrence Livermore employs a Cray CS300 architecture modified specifically for data-intensive computing. The system is now available for collaborative research with industry and academia." Credit: LLNL

Catalyst features include 128 gigabytes (GB) of dynamic random access memory (DRAM) per node, 800 GB of non-volatile memory (NVRAM) per compute node, 3.2 terabytes (TB) of NVRAM per Lustre router node, and improved cluster networking with dual rail Quad Data Rate (QDR-80) Intel TrueScale fabrics. The addition of an expanded node-local NVRAM storage tier based on PCIe high-bandwidth Intel Solid State Drives (SSD) allows for the exploration of new approaches to application check-pointing, in-situ visualization, out-of-core algorithms and big data analytics. NVRAM is familiar to anyone who uses a USB stick or an MP3 player; it is simply memory that is persistent, retaining its contents even when the power is off, hence "non-volatile."
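The check-pointing idea is that each node periodically dumps its state to its own fast local SSD instead of hammering the shared parallel filesystem. A minimal sketch of the pattern, in Python, is below; the file names and state layout are hypothetical illustrations, not part of Catalyst's actual software stack.

```python
# Minimal node-local checkpointing sketch: write application state to a
# fast local path (on Catalyst, the per-node NVRAM tier) so a restart can
# resume without recomputing. Paths and state here are hypothetical.
import os
import pickle
import tempfile

def checkpoint(state, path):
    """Write state atomically: temp file first, then rename, so a crash
    mid-write never leaves a half-written checkpoint behind."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # atomic rename on POSIX filesystems

def restore(path):
    with open(path, "rb") as f:
        return pickle.load(f)

state = {"step": 42, "temperature": 1.5e6}
checkpoint(state, "ckpt.pkl")   # on Catalyst this would target local NVRAM
assert restore("ckpt.pkl") == state
```

The atomic-rename detail matters at scale: with hundreds of nodes checkpointing concurrently, a node that dies mid-write must still find an intact previous checkpoint on restart.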

What would anyone do with that kind of power? Sorry, bitcoin miners. Instead of thinking along the lines of hijacking a supercomputing cluster to mine cryptocurrency, as happened at Harvard, think in the realm of “science and technology in the national interest,” as LLNL is a federally funded research and development center. Its strategic missions include bio-security, counterterrorism, intelligence, defense, energy, weapons and complex integration, and science, technology and engineering.

"YouTube claims that 100 hours of video are uploaded to its website every minute," and analyzing video – “the fastest-growing type of content on the Internet” – is something Catalyst can do. Doug Poland, computational engineer working on video analytics, noted that “consumer-produced videos are a wealth of information about the world that's essentially untapped."

Catalyst “will serve to host very large models for video analytics and machine learning.” Poland pointed out that “current tools are unable to search through the richness of video elements such as visual, audio and motion, and associated metadata like semantic tags and geo-coordinates.” His team is “looking to build more complex models that consider the sum of those features, and that can be recognized in real-time for user-specific search needs. Catalyst allows us to explore entirely new deep learning architectures that could have a huge impact on video analytics as well as broader application to big data analytics."

Other suggested collaborations utilizing Catalyst’s crunching power are “bioinformatics, big data analysis, graph networks, machine learning and natural language processing, or for exploring new approaches to application checkpointing, in-situ visualization, out-of-core algorithms and data analytics.” Companies interested in access to Catalyst should check out the Notice of Opportunity posted on Federal Business Opportunities.

Catalyst, developed by Intel, Cray and Lawrence Livermore, was announced in November 2013, but it is hardly the lab's first: LLNL has fielded more than 30 supercomputers and has a history of “acquiring and exploiting the fastest and most capable supercomputers in the world.”

Superheavy Element 117


LLNL is also in the news due to a superheavy element first discovered by Lawrence Livermore scientists in collaboration with researchers from the Joint Institute for Nuclear Research in Russia. Element 117 has now been reproduced by an international consortium for heavy ion research. In the German experiments that confirmed its existence, “scientists bombarded a berkelium target with calcium ions until they collided and formed element 117. Element 117 then decayed into elements 115 and 113.” If that zipped over your head like it did mine, it basically means element 117 is one step closer to being named.
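The decay sequence is less mysterious than it sounds: each alpha decay emits a helium nucleus, removing two protons, so the atomic number drops by two at every step. A tiny sketch of the bookkeeping:

```python
# Each alpha decay removes two protons (and two neutrons), dropping the
# atomic number by 2: that is why element 117 yields 115, then 113.
def alpha_decay_chain(atomic_number, steps):
    chain = [atomic_number]
    for _ in range(steps):
        atomic_number -= 2
        chain.append(atomic_number)
    return chain

print(alpha_decay_chain(117, 2))  # [117, 115, 113]
```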
