Python gets a big data boost from DARPA
Continuum Analytics will extend the widely used NumPy library for distributed systems
IDG News Service - DARPA (the U.S. Defense Advanced Research Projects Agency) has awarded $3 million to software provider Continuum Analytics to help fund the development of Python's data processing and visualization capabilities for big data jobs.
The money will go toward developing new techniques for data analysis and for visually portraying large, multi-dimensional data sets. The work aims to extend beyond the capabilities offered by the NumPy and SciPy Python libraries, which are widely used by programmers for mathematical and scientific calculations, respectively.
More mathematically centered languages such as the R Statistical language might seem better suited for big-data number crunching, but Python offers an advantage of being easy to learn.
"Python is a very easy language to learn for non-programmers," said Peter Wang, president of Continuum Analytics. That's important because most big data analysts will probably not be programmers. If they can learn an easy language, they won't have to rely on an external software development group to complete their analysis, Wang said.
The work is part of DARPA's XData research program, a four-year, $100 million effort to give the Defense Department and other U.S. government agencies tools to work with large amounts of sensor data and other forms of big data.
For the XData project, DARPA awarded funding to about two dozen companies, including the University of Southern California, Stanford University and Lawrence Berkeley National Laboratory. The organizations are encouraged to use each other's technologies to further extend what can be done in big data, Wang said.
DARPA encouraged the funding recipients to release products based on their work and to release their code as open source, so the innovations can be widely used and supported outside of the military. The Defense Department is trying to avoid commissioning software that gets used only by the military, which may then become prohibitively time-consuming and expensive to update.
"With big data systems, you find new things you want to look at every week. You can't wait for that process any more," Wang said.
Headquartered in Austin, Continuum Analytics offers add-on products and services that help organizations use Python for data analysis. The company will use the DARPA money to continue development of a number of add-on technologies it has been working on, including Blaze, Numba and Bokeh, all of which provide advanced features not offered in Python itself.
At the PyData 2012 conference in New York last November, Continuum engineer Stephen Diehl discussed how Blaze would operate, describing the library as a potential successor to NumPy.
NumPy has limitations that Blaze seeks to correct, Diehl said. Most notably, NumPy only offers the ability to store a series of numbers as one continuous string of data. "It is a single buffer, a continuous block of memory. That may be OK for some uses, but the real world is more heterogenous," he said in a presentation.
- 15 Non-Certified IT Skills Growing in Demand
- How 19 Tech Titans Target Healthcare
- Twitter Suffering From Growing Pains (and Facebook Comparisons)
- Agile Comes to Data Integration
- Slideshow: 7 security mistakes people make with their mobile device
- iOS vs. Android: Which is more secure?
- 11 sure signs you've been hacked
If you use ‘password,’ one the worst passwords, as your password, fail to keep antivirus protection updated and don’t bother to deploy security patches to close critical vulnerabilities, then maybe you should consider working for the cybersecurity-clueless federal government; you’d fit right in, according to Senator Tom Coburn's cybersecurity and critical infrastructure report.
- IT Certification Study Tips
- Register for this Computerworld Insider Study Tip guide and gain access to hundreds of premium content articles, cheat sheets, product reviews and more.
- Changing the Way Government Works: Four Technology Trends that Drive Down Costs and Increase Productivity
- This paper discusses four technology-based approaches to improving processes and increasing
productivity while driving down department and agency costs.
- HP HAVEn: See the big picture in Big Data
- HP HAVEn is the industry's first comprehensive, scalable, open, and secure platform for Big Data. Enterprises are drowning in a sea of data...
- What Datapipe customers need to know about the new PCI DSS 3.0 compliance standard
- This handy quick reference outlines what PCI DSS 3.0 is, who needs to be compliant and how Alert Logic solutions address the new...
- The 12 PCI DSS 3.0 requirements addressed by Peer 1 Hosting
- This handy quick reference outlines the 12 PCI DSS 3.0 requirements, who needs to be compliant and how Alert Logic solutions address the...
- Defense Throughout the Vulnerability Life Cycle
- This whitepaper provides insight into how to leverage threat and log management technologies to protect your IT assets throughout their vulnerability life cycle. All Government IT White Papers
- Meg Whitman presents Unlocking IT with Big Data During this Web Event you will hear Meg Whitman, President and CEO, HP discuss HAVEn - the #1 Big Data platform, as well...
- The New Way to Work Knowledge Vault This Knowledge Vault focuses on how, in today's increasingly virtual world, it's more important than ever to engage deeply with employees, suppliers, partners,...
- Getting Ready for BlackBerry Enterprise Service 10.2 Find out how BlackBerry® Enterprise Service 10 helps organizations address the full spectrum of EMM challenges, while balancing the needs of both the...
- Containerization Options: How to Choose the Best DLP Solution for Your Organization This webcast outlines a framework for making the right choice when it comes to containerization approaches, along with the pros and cons of...
- Mobile Apps and Devices Slash Customer Cycle Time Consolidated Engineering Laboratories' field employees used to collect data on triplicate forms that were sometimes hard to read and difficult to manage. After...
- All Government IT Webcasts