22 free tools for data visualization and analysis
Got data? These useful tools can turn it into informative, engaging graphics.
Computerworld - You may not think you've got much in common with an investigative journalist or an academic medical researcher. But if you're trying to extract useful information from an ever-increasing inflow of data, you'll likely find visualization useful -- whether it's to show patterns or trends with graphics instead of mountains of text, or to try to explain complex issues to a nontechnical audience.
There are many tools around to help turn data into graphics, but they can carry hefty price tags. The cost can make sense for professionals whose primary job is to find meaning in mountains of information, but you might not be able to justify such an expense if you or your users only need a graphics application from time to time, or if your budget for new tools is somewhat limited. If one of the higher-priced options is out of your reach, there are a surprising number of highly robust tools for data visualization and analysis that are available at no charge.
Here's a rundown of some of the better-known options, many of which were demonstrated at the Computer-Assisted Reporting (CAR) conference last month. Others are not as well known but show great promise. They range from easy enough for a beginner (i.e., anyone who can do rudimentary spreadsheet data entry) to expert (requiring hands-on coding). But they all share one important characteristic: They're free. Your only investment: time.
Data cleaning
Before you can analyze and visualize data, it often needs to be "cleaned." What does that mean? Perhaps some entries list "New York City" while others say "New York, NY" and you need to standardize them before you can see patterns. There might be some records with misspellings or numerical data-entry errors. The following two tools are designed to help get your data in tip-top shape to be analyzed.
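To make the standardization step concrete, here is a minimal Python sketch of what reconciling "New York City" with "New York, NY" can look like when done by hand. The lookup table, the function name and the sample records are all hypothetical and chosen only for illustration; this is not how either of the tools reviewed below works internally.

```python
# Hypothetical lookup table: known variants (lowercased) -> one canonical label.
CANONICAL = {
    "new york city": "New York, NY",
    "new york, ny": "New York, NY",
    "nyc": "New York, NY",
}

def standardize_city(raw):
    """Return the canonical label for a city entry, or the trimmed input if unknown."""
    key = " ".join(raw.split()).lower()  # collapse stray whitespace, ignore case
    return CANONICAL.get(key, raw.strip())

# Made-up sample entries with inconsistent spelling, case and spacing.
records = ["New York City", "new york, NY ", "NYC", "Boston"]
cleaned = [standardize_city(r) for r in records]
print(cleaned)
```

A lookup table like this only catches variants you already know about; misspellings and numerical data-entry errors usually still need a manual review pass.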
DataWrangler
What it does: This Web-based service from Stanford University's Visualization Group is designed for cleaning and rearranging data so it's in a form that other tools such as a spreadsheet app can use.
Click on a row or column, and DataWrangler will suggest changes. For example, if you click on a blank row, several suggestions pop up such as "delete row" or "delete empty rows."
There's also a history list that allows for easy undo -- a feature that's also available in Google Refine (reviewed next).
What's cool: Text editing is especially easy. For example, when I selected "Alabama" in one row of sample data headlined "Reported crime in Alabama" and then selected "Alaska" in the next group of data, it led to a suggestion to extract every state name. Hover your mouse over a suggestion, and you can see affected rows highlighted in red.
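For readers who prefer to see the idea in code, the following Python sketch performs roughly the same two clean-up steps by hand: dropping empty rows and pulling the state name out of each "Reported crime in ..." heading so it can be attached to the data rows beneath it. It is an illustration under assumptions, not DataWrangler's own logic; the sample rows, their numbers and the function name are invented.

```python
import re

# Made-up sample rows: a heading line per state, followed by
# "year,value" data lines, with blank lines in between.
rows = [
    "Reported crime in Alabama",
    "2004,4029.3",
    "2005,3900.0",
    "",
    "Reported crime in Alaska",
    "2004,3370.9",
    "",
]

# Matches a heading line and captures whatever follows "in " as the state.
HEADING = re.compile(r"^Reported crime in (?P<state>.+)$")

def extract_states(lines):
    """Drop blank lines and tag each data line with the state from the heading above it."""
    state = None
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # equivalent of "delete empty rows"
        match = HEADING.match(line)
        if match:
            state = match.group("state")  # remember the state for the rows that follow
            continue
        year, value = line.split(",")
        out.append((state, int(year), float(value)))
    return out

table = extract_states(rows)
for record in table:
    print(record)
```

The result is one flat table with a state column, which is the kind of layout a spreadsheet or charting tool expects.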
Drawbacks: I found that unexpected changes occurred as I attempted to explore DataWrangler's options; I constantly had to click "clear" to reset. And not all suggestions are useful ("promote row to header" seemed an odd suggestion when the row was blank) or easy to understand ("fold split 1 using 2 as key").
And while the fact that DataWrangler is a Web-based service makes it convenient to use, don't forget that it sends your data off to an external site -- which means it isn't an option for sensitive internal information. However, there are plans for a future release of a stand-alone desktop version. Another important thing to keep in mind is that DataWrangler is currently alpha code, and its creators say it's "still a work in progress."
Skill level: Advanced beginner.
Runs on: Any Web browser.
Learn more: There's a screencast on the DataWrangler home page. Also, see this post on using DataWrangler to format data (from Tableau Public's blog).


