How to convert Shapefiles to GeoJSON maps for use on GitHub (and why you should)

With GitHub natively supporting mapping and embeds, I recently wanted to put some of the free, publicly available government data published on data.dc.gov to use. They have all sorts of great information, from bus routes to polling places to the location of every liquor license in DC. The only problem was that the data was stored in a proprietary and complex format known as a Shapefile, which arose in an age when the desktop ruled supreme and which requires a costly software subscription for many common uses.

Luckily, a strangely named piece of open source software known as ogr2ogr can convert the data into the more modern, more open GeoJSON format that GitHub supports, and the resulting map can be automatically rendered, not to mention more easily diffed.

If you’ve got a Mac, it only takes a few seconds:

  1. If you don’t already have it, install Homebrew by opening Terminal and running: $ ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
  2. Install gdal, the library that includes ogr2ogr, with the command: $ brew install gdal
  3. Grab a Shapefile (distributed as a .zip file) from the DC Data Catalog or your favorite data source and unzip it someplace convenient
  4. cd into the directory with your shiny new unzipped Shapefile
  5. Run (replacing [name] with the name of your downloaded Shapefile): $ ogr2ogr -f GeoJSON -t_srs crs:84 [name].geojson [name].shp
  6. Grab the resulting GeoJSON file and commit it to GitHub
  7. Navigate to the GeoJSON file on GitHub.com to browse your map
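Steps 4 and 5 above can be wrapped in a small shell function so the output name is derived from the input automatically. This is only a minimal sketch: shp_to_geojson is a hypothetical helper name (not part of GDAL), and it assumes ogr2ogr is already on your PATH.

```shell
#!/bin/sh
# Convert one Shapefile to GeoJSON, reprojected to CRS:84.
# shp_to_geojson is a hypothetical helper name, not a GDAL command.
shp_to_geojson() {
  shp="$1"

  # Fail early with a readable message if the input is missing.
  if [ ! -f "$shp" ]; then
    echo "No such Shapefile: $shp" >&2
    return 1
  fi

  # Strip the .shp suffix and append .geojson for the output name.
  out="${shp%.shp}.geojson"

  # Same flags as step 5: GeoJSON output, reprojected to CRS:84.
  ogr2ogr -f GeoJSON -t_srs crs:84 "$out" "$shp"
}
```

For example, running shp_to_geojson bus_routes.shp (a made-up file name) would write bus_routes.geojson next to the original, ready to commit.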

In addition to converting the Shapefile over to GeoJSON, the other flag in that command, -t_srs crs:84, reprojects the data into the CRS:84 (WGS 84 longitude/latitude) coordinate system, so that by the time the data hits GitHub, it’s encoded with the right projection and can be mapped properly. Need to convert multiple Shapefiles in bulk? Just use this bulk Shapefile to GeoJSON conversion script. Windows users? You’re covered too.
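To give a rough idea of what a bulk conversion involves, here is a minimal sketch of a loop over every .shp file in a directory. It is not the script linked above; convert_all is a hypothetical name, and the sketch assumes ogr2ogr is on your PATH.

```shell
#!/bin/sh
# Convert every .shp file in a directory to GeoJSON.
# Defaults to the current directory when no argument is given.
# convert_all is a hypothetical helper name, not a GDAL command.
convert_all() {
  dir="${1:-.}"
  for shp in "$dir"/*.shp; do
    # When nothing matches, the glob stays literal; skip that case.
    [ -e "$shp" ] || continue

    out="${shp%.shp}.geojson"
    echo "Converting $shp -> $out"

    # Same flags as the single-file command: GeoJSON output in CRS:84.
    ogr2ogr -f GeoJSON -t_srs crs:84 "$out" "$shp"
  done
}
```

Running convert_all with no arguments from inside the unzipped folder would produce one .geojson file per Shapefile, each named after its source.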

Note: The same process should work for KML files as well, replacing [name].shp with [name].kml.

So why’s this important?

For one, you’re liberating public geodata that would otherwise be inaccessible to the average citizen and making it available in a dumb-simple point, click, zoom interface that anyone can use. For another, by putting the information on GitHub in an open, text-based format, civic hackers and subject-matter experts can begin treating that data like open source code — forking, merging, diffing, tracking changes over time — and all of a sudden we’ve opened up not just the data, but the entire collaborative ecosystem that now surrounds it.

The result
