Python gets a big data boost from DARPA
Continuum Analytics will extend the widely used NumPy library for distributed systems
IDG News Service - DARPA (the U.S. Defense Advanced Research Projects Agency) has awarded $3 million to software provider Continuum Analytics to help fund the development of Python's data processing and visualization capabilities for big data jobs.
The money will go toward developing new techniques for data analysis and for visually portraying large, multi-dimensional data sets. The work aims to extend beyond the capabilities offered by the NumPy and SciPy Python libraries, which are widely used by programmers for mathematical and scientific calculations, respectively.
More mathematically centered languages such as the R statistical language might seem better suited to big-data number crunching, but Python has the advantage of being easy to learn.
"Python is a very easy language to learn for non-programmers," said Peter Wang, president of Continuum Analytics. That's important because most big data analysts will probably not be programmers. If they can learn an easy language, they won't have to rely on an external software development group to complete their analysis, Wang said.
The work is part of DARPA's XData research program, a four-year, $100 million effort to give the Defense Department and other U.S. government agencies tools to work with large amounts of sensor data and other forms of big data.
For the XData project, DARPA awarded funding to about two dozen organizations, including the University of Southern California, Stanford University and Lawrence Berkeley National Laboratory. The recipients are encouraged to use each other's technologies to further extend what can be done in big data, Wang said.
DARPA encouraged the funding recipients to release products based on their work and to release their code as open source, so the innovations can be widely used and supported outside of the military. The Defense Department is trying to avoid commissioning software that gets used only by the military, which may then become prohibitively time-consuming and expensive to update.
"With big data systems, you find new things you want to look at every week. You can't wait for that process any more," Wang said.
Headquartered in Austin, Continuum Analytics offers add-on products and services that help organizations use Python for data analysis. The company will use the DARPA money to continue development of a number of add-on technologies it has been working on, including Blaze, Numba and Bokeh, all of which provide advanced features not offered in Python itself.
At the PyData 2012 conference in New York last November, Continuum engineer Stephen Diehl discussed how Blaze would operate, describing the library as a potential successor to NumPy.
NumPy has limitations that Blaze seeks to correct, Diehl said. Most notably, NumPy stores an array's numbers in one contiguous block of memory. "It is a single buffer, a continuous block of memory. That may be OK for some uses, but the real world is more heterogeneous," he said in a presentation.
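The single-buffer layout Diehl describes can be seen with a few lines of NumPy. The sketch below is an illustration only, not code from Continuum or Blaze; the helper name `describe_buffer` and the sample arrays are hypothetical.

```python
import numpy as np


def describe_buffer(arr):
    """Summarize how a NumPy array is laid out in memory (hypothetical helper)."""
    return {
        "contiguous": bool(arr.flags["C_CONTIGUOUS"]),  # one unbroken block?
        "itemsize": arr.itemsize,   # bytes per element
        "nbytes": arr.nbytes,       # total bytes in the buffer
        "strides": arr.strides,     # bytes to step per axis
    }


# A regular 3x4 grid of 64-bit integers lives in one 96-byte buffer.
grid = np.arange(12, dtype=np.int64).reshape(3, 4)
info = describe_buffer(grid)
print(info)

# Rows of different lengths do not fit that model. NumPy can only hold
# them as generic Python objects, giving up the packed numeric layout.
ragged = np.empty(2, dtype=object)
ragged[0] = [1, 2, 3]
ragged[1] = [4, 5]
print(ragged.dtype)
```

The regular grid is addressed purely by byte offsets into one block, which is what makes NumPy fast. The ragged case falls back to an array of object references, the kind of heterogeneous data the quote refers to.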