10/28/2008

Web-based data visualization tools!

Great references:
40 Essential Tools and Resources to Visualize Data

Quote:

PHP

PHP was the first scripting language I learned that was well-suited for the Web, so I'm pretty comfortable with it. I oftentimes use PHP to get CSV files into some XML format. The function fgetcsv() does just fine. It's also a good hook into a MySQL database or for calling API methods.
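
For comparison, here is a rough sketch of the same CSV-to-XML idea in Python rather than PHP, using only the standard library. The function name `csv_to_xml`, the `rows`/`row` tag names, and the sample data are all made up for illustration, and it assumes every CSV header is already a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="rows", row_tag="row"):
    """Turn CSV text with a header row into an XML string.

    Hypothetical helper: tag names are illustrative defaults.
    """
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            # Assumes each header is already a valid XML element name.
            ET.SubElement(row, column).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, just to show the shape of the output.
sample = "name,value\nalpha,1\nbeta,2\n"
xml_text = csv_to_xml(sample)
print(xml_text)
```

Each CSV row becomes one `<row>` element with a child element per column, which is usually enough structure for a charting tool to consume.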

Python

Most computer science types - at least the ones I've worked with - scoff at PHP and opt for Python mostly because Python code is often better structured (as a requirement) and has cooler server-side functions. My favorite Python toy is Beautiful Soup, which is an HTML/XML parser. What does that mean? Beautiful Soup is excellent for screen scraping.
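
To give a feel for what screen scraping means, here is a minimal sketch that pulls links out of a snippet of HTML. Note that it does not use Beautiful Soup: it swaps in the standard library's `html.parser` so it runs without installing anything. The `LinkCollector` class and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from anchor tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Made-up page fragment standing in for a real download.
page = '<ul><li><a href="/a">First</a></li><li><a href="/b">Second <b>item</b></a></li></ul>'
collector = LinkCollector()
collector.feed(page)
collector.close()
print(collector.links)
```

A dedicated parser such as Beautiful Soup is far more forgiving of messy real-world markup; this only shows the basic idea of walking tags and keeping the parts you want.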

MySQL

When I have a lot of data - like on the magnitude of the tens to hundreds of thousands - I use PHP or Python to stick it in a MySQL database. MySQL lets me subset the data in pretty much any way I please.
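
Here is a small sketch of that load-then-subset workflow from Python. To keep it runnable anywhere it uses the built-in `sqlite3` module as a stand-in for MySQL, so treat the connection setup as an assumption; the `readings` table, its columns, and the rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database standing in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (station TEXT, year INTEGER, value REAL)")

# Made-up rows; in practice these would come from a CSV file or an API.
rows = [
    ("north", 2007, 1.5),
    ("north", 2008, 2.5),
    ("south", 2007, 3.0),
    ("south", 2008, 4.0),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)


def subset(conn, station, min_year):
    """Return (year, value) pairs for one station from min_year onward."""
    cur = conn.execute(
        "SELECT year, value FROM readings "
        "WHERE station = ? AND year >= ? ORDER BY year",
        (station, min_year),
    )
    return cur.fetchall()


north_recent = subset(conn, "north", 2008)
print(north_recent)
```

The `WHERE` clause is where the subsetting happens, and the placeholders keep the query safe if the filter values come from user input.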

R

Ah, good old R. It's what statisticians use, and pretty much nobody else. Everyone else has it installed on their computer, but hasn't gotten around to learning it. I use R for analysis. Sometimes though, I use it to extract useful subsets from a dataset if the conditions are more complex than those I'd use with MySQL and then export them as CSV files.

Microsoft Excel

We all know this one. I use Excel from time to time when my dataset is small or if I'm in a point-and-click mood.

Charts and Graphs


Alright, the data are processed, formatted, and ready to go. Now it's time to visualize. The software I use for static charts and graphs depends on the task at hand, so I try not to limit myself to any one piece of software. For example, R is good for quick results, but no good for a Web application.

Adobe Illustrator

I use Adobe Illustrator for publication-level graphics. I learned how to use it when I was at The Times out of necessity and have been enjoying it since. You can manipulate every element of a graph with a simple click and a drag - which can be a blessing and a curse.

R

If you have a particular type of (non-animated, non-interactive) statistical visualization in mind, R has probably got it. R is free with countless libraries available. If you can't find a library to suit your needs, you can always script it yourself. One cool thing about R is that you can save your graphics as PDF and then polish them in Adobe Illustrator.

PHP Graphics Library

I've only had limited experience with the PHP GD library. There are several PHP graphing packages available, but I haven't found one that I liked a whole lot, so I'm usually more satisfied drawing my own graphs with the GD library. The Sparklines PHP graphing library isn't half bad either.

HTML + CSS + Javascript

You can do a surprising amount with some simple HTML and CSS. You can make graphs and of course tables as well as control colors and sizes. For example, a lot of the tag clouds you see on the Web are just HTML and CSS. Throw Javascript into the mix and you've got yourself a party, i.e., interaction capabilities.
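
As a quick illustration of the tag cloud point, here is a sketch in Python that generates nothing but HTML with inline CSS, scaling each tag's font size by its count. The `tag_cloud` function, the pixel range, and the counts are all hypothetical; real tag clouds often use a log scale or CSS classes instead.

```python
from html import escape


def tag_cloud(counts, min_px=12, max_px=32):
    """Return an HTML fragment where font size scales linearly with count."""
    low, high = min(counts.values()), max(counts.values())
    span = (high - low) or 1  # avoid dividing by zero when all counts match
    parts = []
    for tag in sorted(counts):
        # Linear interpolation between min_px and max_px, rounded down.
        size = min_px + (counts[tag] - low) * (max_px - min_px) // span
        parts.append(f'<span style="font-size:{size}px">{escape(tag)}</span>')
    return " ".join(parts)


# Made-up tag counts.
cloud = tag_cloud({"maps": 10, "charts": 5, "r": 0})
print(cloud)
```

Paste the output into any HTML page and the browser does the rest; no images or plugins are involved.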

Flash/Actionscript

Flash and Actionscript are better known for animating and moving data, but they can be used for static stuff too. They're pretty good if you want to add interaction to your visualization like highlighting or filtering. I've done some stuff from scratch and also played around with Flare, the Actionscript visualization toolkit.

Microsoft Excel

It's pretty rare that I use Excel for graphics. If I need something really quick though and the data are already in an Excel spreadsheet, I'll click that graph button.

Animating the Data

There are several options for creating animated and interactive data visualizations, but these are the only ones I use (and, for the most part, they dominate what you see on the Web).

Processing

Yeah, it's called Processing. I've seen mostly designers use it, but there's no reason it can't be used elsewhere. Processing uses a canvas metaphor where you draw and make sketches and then get a Java applet out of it. Processing was created to make programmatic goodness available to non-programmers.

Flash/Actionscript

Flash and Actionscript have been my point of interest lately – mostly because the Java applet is dead as far as the Web is concerned. The interactive/animated visualizations you see from places like The New York Times, Stamen Design, and web applications are usually implemented with Flash and Actionscript. Not sure if it's Flash? The telltale sign is a simple right click on whatever you're looking at. Take a look at my previous post on How to Learn Actionscript for Data Visualization for more details.

10/07/2008

Perl Tip: Converting DOS to UNIX file format

Converting a file to UNIX format is as easy as replacing every \r\n sequence with \n. So, command line wise, this is all you have to do:
perl -pi -e 's/\r\n/\n/;' *.txt
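
If Perl isn't handy, the same replacement can be sketched in Python. The helper names `dos_to_unix` and `convert_file` are my own, and the demo writes to a throwaway temp file rather than touching real data. Files are opened in binary mode so Python doesn't translate the line endings itself.

```python
import os
import tempfile


def dos_to_unix(data: bytes) -> bytes:
    """Replace every CRLF pair with a bare LF; lone CRs are left alone."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Rewrite the file in place; return True if anything changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    converted = dos_to_unix(original)
    if converted != original:
        with open(path, "wb") as handle:
            handle.write(converted)
    return converted != original


# Demo on a temporary file with made-up contents.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "sample.txt")
    with open(sample_path, "wb") as handle:
        handle.write(b"one\r\ntwo\r\n")
    changed = convert_file(sample_path)
    with open(sample_path, "rb") as handle:
        result = handle.read()

print(changed, result)
```

To mimic the `*.txt` part of the one-liner, you could loop over `glob.glob("*.txt")` and call `convert_file` on each path.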