Using Data – Tools & Resources

A number of tools can help when working with (open) data. Some are very basic; others are advanced and have a steep learning curve.

Usability

usability.gov: One might not expect a government site on usability, given how hard many government websites are to use. But this U.S. site offers a concise overview of how to present information in a useful way.

Web Accessibility Initiative: Resources, tools and methods to evaluate the accessibility of a site for people with disabilities.

Visualization

Datawrapper: An easy way to display your data (via copy & paste or an XLS/CSV file) in a number of visualization types, such as bar charts and pie charts.

d3.js – Data Driven Documents: This library has more or less become the standard for building interactive visualizations. There is an extensive gallery of showcases, and D3 can also be used for map applications. For this purpose its creator, Mike Bostock, built the TopoJSON library, which allows geo data to be stored in very small files.
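
One reason TopoJSON files stay small is that coordinates are quantized to integers and each point on a line is stored as an offset from the previous point, so most numbers are short. The Python sketch below illustrates only that idea. The function names `delta_encode` and `delta_decode` and the sample coordinates are made up for illustration and are not part of the TopoJSON library, which additionally shares boundary lines between neighboring shapes.

```python
def delta_encode(points):
    """Keep the first point as-is and store every later point as an offset."""
    encoded = []
    prev_x, prev_y = 0, 0
    for x, y in points:
        encoded.append((x - prev_x, y - prev_y))
        prev_x, prev_y = x, y
    return encoded


def delta_decode(deltas):
    """Rebuild the absolute coordinates by summing up the offsets."""
    points = []
    x, y = 0, 0
    for dx, dy in deltas:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


# Hypothetical, already quantized (integer) coordinates of one boundary line.
arc = [(4000, 2500), (4002, 2501), (4005, 2501), (4004, 2498)]
deltas = delta_encode(arc)
print(deltas)                       # small offsets after the first point
print(delta_decode(deltas) == arc)  # the round trip is lossless
```

After the first point, the stored values are small numbers such as 2 or -3 instead of four-digit coordinates, which is where the saving comes from.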

driven-by-data.net: If you are interested in good visualizations and in helpful thoughts on making visual sense of data, Gregor Aisch is highly recommended. He also blogs regularly about these topics at vis4.net.

Making Data Visualizations – A Survival Guide: Gregor Aisch shows the do’s and don’ts of infographics and visualization in this presentation at the Perugia International Journalism Festival.

improving-visualisation.org: A project that focuses on helping the public sector improve how it visualizes data. It includes good-practice examples, case studies, and practical step-by-step guides.

infovis-wiki.net: This website gathers information on how to visualize data and also includes a helpful glossary.

visualisingdata.com: Website run by the specialist Andy Kirk covering different aspects of data visualization.

Mapping

Kartograph: Build interactive map applications without a mapping service, using a simple Python library for generating SVG maps and a JavaScript library for making them interactive.

Leaflet: Open source JavaScript library for a range of interactive map applications (such as heatmaps).

Mapbox: Custom map and marker designs, based on OpenStreetMap.

Processing data

OpenRefine: If you have a "dirty" dataset and need to split fields, find duplicate names, or clean large numbers of cells, this is your application. Formerly known as Google Refine.
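
To give an idea of what finding duplicate names involves, here is a minimal Python sketch of a "fingerprint" approach: each name is reduced to a normalized key, and names that share a key are treated as likely duplicates. It is loosely modeled on the key-collision idea used by clustering tools and is not the tool's actual implementation. The function names and the sample names are invented for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name):
    """Reduce a name to a normalized key so that spelling variants collide."""
    # Strip accents (e.g. "ü" becomes "u") by dropping non-ASCII marks.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase and remove punctuation.
    cleaned = re.sub(r"[^\w\s]", "", ascii_name.lower())
    # Sort the unique tokens so that word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def find_duplicates(names):
    """Group names by fingerprint and return groups with more than one member."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample data.
names = ["Müller, Anna", "anna muller", "Anna  Müller", "Peter Schmidt"]
print(find_duplicates(names))
```

A simple key like this only catches variants in case, accents, punctuation, spacing, and word order. Misspellings such as "Mueller" versus "Muller" need fuzzier matching and a manual check.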

The R Project: This tool, usually just called "R", is very powerful, but it may take you some time to get into it. Once you are comfortable with it, you can use scripts and a vast number of libraries. It is also the standard tool for scientists working with data.

Gephi: Open source tool for analyzing networks.

School of Data: Online tutorials on different aspects of working with data – from basic to geeky.

Liberating Data

ABBYY FineReader: This OCR software costs some money (€ 130) but is worth it: in contrast to some web services, you have full control over the table structure it extracts, and the data you get out of the PDF tends to be instantly usable.

Tabula: An open source tool, developed with support from Mozilla OpenNews, that takes uploaded PDFs and returns the tabular data as CSV. It is not an online service; you have to install an instance yourself, so it is better suited to bigger projects.

Cometdocs: A simple online converter for PDFs, which works pretty well on less complex PDF tables.