The recipient of the funding, Continuum Analytics, produces Anaconda, which the company describes as “a collection of premium features for Python that enables large-scale data management, analysis, and visualization for Business Intelligence, Scientific Analysis, Engineering, Machine Learning, and more.” Users can purchase Anaconda for $249, try it out free for 30 days, or download Anaconda Community Edition (CE), the completely free version of the product.
Derrick Harris, writing for GigaOm, explained that Anaconda supports NumPy and SciPy, two popular scientific Python libraries. He also noted that Continuum Analytics offers Wakari, a browser-based analytics environment that the company describes as “WordPress, Github, and Youtube for science, engineering, and business data analytics.”
Both Anaconda and Wakari sound like the kinds of products that might interest DARPA, but the government organization is likely also intrigued by Continuum Analytics’s other open source efforts. The start-up is backing Blaze, a project that aims to extend NumPy “to handle out-of-core computations on large data that exceed the system memory capacity, as well as distributed and streaming dataset.” Continuum Analytics is also behind Bokeh, a Python data visualization library designed to support interactive visualization, statistical plotting, and multidimensional datasets, and to serve the needs of non-programmers.
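For readers unfamiliar with the term, “out-of-core” simply means working through data in pieces small enough to fit in memory, rather than loading everything at once. The sketch below illustrates that general idea using only Python’s standard library. It is not Blaze’s or NumPy’s API, and the function names `chunked_mean` and `write_demo_file`, along with the file layout (raw 64-bit floats), are assumptions made purely for illustration.

```python
import os
import tempfile
from array import array


def chunked_mean(path, chunk_items=1024):
    """Mean of a binary file of float64 values, read one chunk at a time.

    Only `chunk_items` values are held in memory at any moment, so the
    file itself can be far larger than available RAM.
    """
    item_size = array("d").itemsize
    total = 0.0
    count = 0
    with open(path, "rb") as f:
        while True:
            raw = f.read(chunk_items * item_size)
            if not raw:
                break  # end of file
            chunk = array("d")
            chunk.frombytes(raw)
            total += sum(chunk)
            count += len(chunk)
    if count == 0:
        raise ValueError("file contains no values")
    return total / count


def write_demo_file(n_items):
    """Write 0.0, 1.0, ..., n_items - 1 as float64 to a temporary file.

    Demo data only: this helper builds the array in memory, which is
    fine for a small example but is not itself out-of-core.
    """
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        array("d", (float(i) for i in range(n_items))).tofile(f)
    return path


if __name__ == "__main__":
    demo_path = write_demo_file(10_000)
    try:
        print(chunked_mean(demo_path, chunk_items=256))
    finally:
        os.remove(demo_path)
```

A project like Blaze would presumably hide this bookkeeping behind array-style operations; the point here is only that the running total and count are all that must stay in memory between chunks.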
DARPA’s investment fits in with President Obama’s announcement just under a year ago of plans to invest $200 million in “big data” research and development. Six federal departments and agencies, including DARPA, pledged their commitment to these plans. DARPA’s piece of the plan, the XDATA program, involves investing $25 million a year for four years “to develop computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g., tabular, relational, categorical, meta-data) and unstructured (e.g., text documents, message traffic).”
It’s particularly interesting – and appropriate – that DARPA is funding a company focused on commercializing Python’s big data capabilities and solutions. Lately, because of the money to be made from mining personal information, the private sector has been leading the charge in developing the newest techniques. As a result, students often leave college right after (or even before) finishing their thesis projects to work for Google, Facebook, and other companies focused on the profits to be gained from big data. This situation leaves universities and even government projects scrambling for brain power.
Continuum Analytics will use DARPA’s investment to “work on a new dynamic visualization system for interactive visual exploration of large, complex data sets.” The idea behind this project seems to be enabling non-programmers to look at a large set of data using visual models – perhaps in some way similar to the various kinds of charts with which we’re all familiar – to spot the unusual data points that jump out and scream for attention, or indicate possible cause-and-effect relationships that might otherwise be missed.
The investment from DARPA is yet another indication, as if it were needed, of Python’s widespread acceptance and usage. Not bad for a language that was conceived in the late 1980s and first released in 1991, with the philosophy of keeping the code simple and readable. Indeed, there’s something almost poetic in the application of such a language to solving the problems of “big data” – where too much information complicates analysis. Who knows? With DARPA putting its weight behind this, we could end up with analytical tools that are as useful to our activities as the Internet.