IBM to develop telescope data analysis system
IBM will develop advanced technologies for what will be the world's largest radio telescope, operational in 2024
IDG News Service - IBM is developing new data management and analysis technologies for what will be the world's largest radio telescope. The Square Kilometre Array (SKA), due to become operational in 2024, will produce so much data that even tomorrow's off-the-shelf computers will have difficulty processing all of it, the company predicted.
"This is a research project to find out how to build a computer system," to handle exabytes' worth of data each day, said Ton Engbersen, an IBM researcher on the project.
The Netherlands has granted IBM and the Netherlands Institute for Radio Astronomy (ASTRON) a five-year grant of a!32.9 million ($43.6 million) to design a system, with novel technologies, that can ingest the massive amounts of data that SKA will produce.
Funded by a consortium of 20 government agencies, SKA will be the world's largest and most sensitive radio telescope, able to give scientists a better idea of how the Big Bang unfolded 13 billion years ago. SKA will actually be comprised of 3,000 small antennas, each providing a continual stream of data.
Once operational, the telescope will produce more than an exabyte of data a day (an exabyte is 1 billion gigabytes). By way of comparison, an exabyte is two times the entire daily traffic on the World Wide Web, IBM estimated. The data will have to be downloaded from the telescope, which will either be in Australia or South Africa, and then summarized and shipped to researchers worldwide. Data processing will consist of assembling individual streams from each antenna into a larger picture of how the universe first came about.
Even factoring in how much faster computers will be in 2024, IBM still will need advanced technologies to process all that data, Engbersen said. Such a computer might use stacked chips for high volumes of processing, photonic interconnects for speedy connections with the chips, advanced tape systems for data storage, and phase-change memory technologies for holding data to be processed.
"We have to push the envelope on system design," Engbersen said. The researchers have made no decisions yet about whether it should be in one data center or spread out across multiple locations.
Because the system will be so large, the researchers must figure out how to make maximum use of all the hardware components to use as little energy as possible. They also must customize the data-processing algorithms to work with this specific hardware configuration.
After processing, the resulting dataset is expected to produce between 300 and 1,500 petabytes each year. This volume will dwarf the amount of data produced by what is now by far the largest generator of scientific data, CERN's Large Hadron Collider, which churns out about 15 petabytes of data each year.
Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
- Google I/O 2013's Coolest Products and Services
- 10 Star Trek Technologies That are Almost Here
- 19 Generations of Computer Programmers
- 25 Must-Have Technologies for SMBs
- A walking tour: 33 questions to ask about your company's security
- 15 social media scams
- The 7 elements of a successful security awareness program
- IT Certification Study Tips
- Register for this Computerworld Insider Study Tip guide and gain access to hundreds of premium content articles, cheat sheets, product reviews and more.
- How Storage Resource Management Suite Meets Today's Storage Management Challenges This white paper outlines the common use cases Storage Resource Management Suite addresses including comprehensive monitoring, reporting, and analysis for heterogeneous block, file,...
- When Application Performance is Better, Business Works Better Poor application performance can cost more than you think. In fact, Enterprise Management Associates reports that it can exceed $1 million per hour...
- The Practitioner's Guide to Data Profiling This paper considers the techniques used by data profiling tools, including reverse engineering, assessment for potential anomalies and validation of metadata and data...
- How to Effectively Realize Data Visualization Data visualization enables decision makers to understand what data really means. SAS Visual Analytics is a high-performance, in-memory solution for exploring massive amounts...
- Live Webcast
Webinar: Create Competitive Advantage, Featuring Synchology - View Now!
- Webinar: Create Competitive Advantage, Featuring Synchology View Now!
- Software Asset Management - Program Considerations to Help Reduce Risk and Lower Costs SAM: A must have IT tool to help reduce costs and minimize business and legal risks. All Business Intelligence/Analytics White Papers | Webcasts