8 cool tools for data analysis, visualization and presentation
Last year, we looked at 22 data analysis tools. This year, we add 8 more to the mix.
Mr. Data Converter
What it does: How often do you have data in one format -- while your application needs it in another? New York Times interactive graphics editor Shan Carter ran into this situation often enough that he coded a tool that converts comma- or tab-delimited data into nine different formats. It's available as either a service on the Web or an open source tool.
What's cool: Mr. Data Converter can generate XML, JSON, ASP/VBScript or basic HTML table formatting as well as arrays in PHP, Python (as a dictionary) and Ruby. It will even generate MySQL code to create a table (guessing at field formats based on the data) and insert your data. If your data is in an Excel spreadsheet, you don't need to save it as a CSV or TSV; you can just copy and paste it into the tool.
Drawbacks: Input is limited to CSV or TSV data, plus data copied and pasted from Excel.
Skill level: Beginner
Runs on: JavaScript-enabled Web browsers
Learn more: You can follow Mr. Data Converter on Twitter at @mrdataconverter.
Related tools: is a Web-based tool that reformats data to your specifications.
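To get a feel for the kind of conversion described above, here is a minimal Python sketch. It is not Mr. Data Converter's own code (the tool itself runs as JavaScript in the browser); the function names, the sample data and the three-type guessing rule (INT, FLOAT, otherwise VARCHAR(255)) are all assumptions made for illustration.

```python
import csv
import io
import json


def parse_delimited(text):
    """Parse comma- or tab-delimited text into a list of row dictionaries."""
    # Assume tab-delimited input if the header line contains a tab, which is
    # what a paste from a spreadsheet typically produces; otherwise use commas.
    first_line = text.splitlines()[0]
    delimiter = "\t" if "\t" in first_line else ","
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def guess_sql_type(values):
    """Guess a column type from its values (hypothetical, simplified rule)."""
    def all_parse(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "INT"
    if all_parse(float):
        return "FLOAT"
    return "VARCHAR(255)"


def create_table_sql(rows, table_name):
    """Build a CREATE TABLE statement, guessing each column's type."""
    columns = list(rows[0].keys())
    defs = [f"{col} {guess_sql_type([r[col] for r in rows])}" for col in columns]
    return f"CREATE TABLE {table_name} ({', '.join(defs)});"


# Made-up sample data, as it might look when pasted from a spreadsheet.
sample = "name\tage\tscore\nAnn\t34\t9.5\nBob\t29\t7\n"
rows = parse_delimited(sample)
as_json = json.dumps(rows, indent=2)     # JSON output
ddl = create_table_sql(rows, "people")   # SQL table definition
```

Every parsed value starts out as a string, so the type guess is only as good as the sample rows it sees; a real converter would also need to handle empty cells, dates and quoting of column names.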
Panda Project
What it does: Panda is less about analyzing or presenting data than finding it amidst the pile of standalone spreadsheets scattered around an organization. It was specifically designed for newsrooms, but could be used by any organization where individuals collect information on their desktops that would be worth sharing. Billed as a "newsroom appliance," Panda lets users upload CSV or Excel files and then search across all available data sets or within a single file.
What's cool: Panda makes it simple to give others access to information that's been sitting on individuals' hard drives in different standalone spreadsheets. Even non-technical users can easily upload and search data. Search is extremely fast, using Apache Solr.
Drawbacks: Queries are basic -- you can't specify a particular column/field to search, so a search for "Washington" would bring back items containing either the place or a person's name. The required hosting platform is quite specific, requiring Ubuntu 11.10. (Panda's developers have created an Amazon Community Image with the required server setup for hosting on Amazon Web Services EC2.)
Skill level: Beginner (Advanced Beginner for administration)
Runs on: Must be hosted on Amazon EC2 or a server running Ubuntu 11.10. Clients can use any Web browser.
Learn more: Panda documentation, still in the works, gives basics on setup, configuration and use. Nieman Journalism Lab has some background on the project, which was funded by a $150,000 Knight News Challenge grant.
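The search limitation noted under Drawbacks is easy to see in a small Python sketch. This is not how Panda works internally (Panda relies on Apache Solr for indexing and search); it is a naive scan over made-up data sets, with hypothetical function names, that simply shows why a search that ignores columns returns both kinds of "Washington."

```python
import csv
import io


def load_dataset(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def search_all(datasets, term):
    """Return (dataset name, row) for every row where any field contains term."""
    needle = term.lower()
    hits = []
    for name, rows in datasets.items():
        for row in rows:
            # Every column is checked, so the match is not tied to a field.
            if any(needle in (value or "").lower() for value in row.values()):
                hits.append((name, row))
    return hits


# Two invented data sets standing in for uploaded spreadsheets.
datasets = {
    "cities": load_dataset("city,state\nWashington,DC\nAustin,TX\n"),
    "donors": load_dataset("donor,amount\nMary Washington,500\nAnn Lee,250\n"),
}
hits = search_all(datasets, "Washington")
```

Here the search returns one row from the cities file and one from the donors file, because nothing in the query says which column the term should match.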