Cray brings Hadoop to supercomputing
Cray has released a package designed to allow XC30 users to easily deploy Hadoop
IDG News Service - Helping scientific supercomputing take advantage of emerging big-data technologies, high-performance computing manufacturer Cray is releasing a set of packages promising to optimize the process of running Hadoop on the company's XC30 machines.
The Cray Framework for Hadoop, along with the Cray Performance Pack for Hadoop, provides a set of tools and best practices for configuring and optimizing an XC30 to run Hadoop for scientific big-data-style projects, according to the company.
Hadoop's Java-based MapReduce model of data analysis could bring a number of benefits to supercomputing, but it has not yet found widespread acceptance in that community, even though Hadoop and supercomputing both rely on parallel processing and work with extremely large data sets.
Cray has seen some interest in Hadoop from its users, though the open-source data processing platform was not set up to meet most scientific supercomputing use cases, said Bill Blake, chief technical architect of Cray, in a statement.
Hadoop's approach of bringing the computation to the data differs from the traditional supercomputing approach of moving the data to the processors.
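As a rough illustration of the MapReduce model, the sketch below counts words across partitions of data in plain Python. It is a toy example, not Hadoop's API or anything from Cray's packages: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample data are hypothetical. The map step runs once per partition, standing in for computation that is sent to wherever the data lives, and only the small intermediate results are gathered afterward.

```python
from collections import defaultdict

# Toy, single-process illustration of the MapReduce model.
# All names here are hypothetical; this is not Hadoop's API.

def map_phase(partition):
    """Runs against one partition of data and emits (word, 1) pairs."""
    return [(word.lower(), 1) for line in partition for word in line.split()]

def shuffle(mapped_outputs):
    """Groups the intermediate values from every partition by key."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Combines the grouped values for each key into one total."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(partitions):
    # In a real cluster each map task would run on the node holding its
    # partition; here the partitions are simply processed one by one.
    mapped = [map_phase(partition) for partition in partitions]
    return reduce_phase(shuffle(mapped))

# Each inner list stands in for a block of data stored on a different node.
partitions = [
    ["big data meets supercomputing"],
    ["big machines", "big data"],
]
counts = word_count(partitions)
print(counts)
```

In the traditional supercomputing pattern the article contrasts this with, the raw partitions themselves would be moved to the processors before any work began.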
Traditional scientific number-crunching on supercomputers tends to rely on large hierarchical file formats and on libraries for boosting rates of I/O (input/output), neither of which Hadoop was designed to handle well. Scientific computing also relies on parallel file systems and fast interconnects, which are typically not found in Hadoop deployments.
Scientific workloads tend to have more complex workflows as well, incorporating both scientific compute and analytics workloads. In scientific computing, data models are commingled with math models, which is not the norm for Hadoop.
The Cray Framework for Hadoop and the Cray Performance Pack for Hadoop will address these issues, allowing users to get the most computational power out of the XC30s for Hadoop jobs, according to the company.
An update to the performance pack, to be made available in early 2014, will also include additional system code to optimize the XC30's use of the Lustre file system library and the Aries system interconnect used on Cray machines.
The XC30 is Cray's premier supercomputer, featuring integrated servers and switches, the Lustre parallel file system, Aries high-speed interconnects, an innovative cooling system, and the Dragonfly network topology for minimizing locality constraints.
Cray announced the packages at the SC2013 supercomputing conference, being held this week in Denver.
Cray also announced that it is upgrading the University of Stuttgart's XC30, nicknamed "Hornet," so it will offer more than seven petaflops (quadrillion mathematical calculations per second) of processing power.