How to convert Shapefiles to GeoJSON maps for use on GitHub (and why you should)

TL;DR: An automated process for converting ESRI Shapefiles to .geojson map files so that they can be more easily used with GitHub.com

With GitHub natively supporting mapping and embeds, I recently wanted to put some of the free, publicly available government data published on data.dc.gov to use. They have all sorts of great information, from bus routes to polling places, to the location of every liquor license in DC. The only problem was that the data was stored in a proprietary and complex format known as a Shapefile which arose in an age when the desktop ruled supreme and requires a costly software subscription for many common uses.

Luckily, a strangely named piece of open source software known as ogr2ogr can convert the data into the more modern, more open GeoJSON format that GitHub supports, and the resulting map can be automatically rendered, not to mention more easily diffed.

If you’ve got a Mac, it only takes a few seconds

  1. If you don’t already have it, install Homebrew by opening up terminal and running: $ ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
  2. Install gdal with the command: $ brew install gdal
  3. Grab a Shapefile (distributed as a .zip file) from the DC Data Catalog or your favorite data source and unzip it someplace convenient
  4. cd into the directory with your shiny new unzipped Shapefile
  5. Run (replacing [name] with the name of your downloaded Shapefile): $ ogr2ogr -f GeoJSON -t_srs crs:84 [name].geojson [name].shp
  6. Grab the resulting GeoJSON file and commit it to GitHub
  7. Navigate to the GeoJSON file on GitHub.com to browse your map

In addition to converting the Shapefile over to GeoJSON, the other step in there, -t_srs crs:84, ensures that by the time the data hits GitHub, it’s encoded with the right projection so it can be mapped properly. Need to convert multiple Shapefiles in bulk? Just use this bulk Shapefile to GeoJSON conversion script. Windows users? You’re covered too.

Note: The same process should work for KML files as well, replacing [name].shp with [name].kml.

Why’s this is important

For one, you’re liberating public geodata that would otherwise be inaccessible to the average citizen and making it available in a dumb-simple point, click, zoom interface that anyone can use. For another, by putting the information on GitHub in an open, text-based format, civic hackers and subject-matter experts can begin treating that data like open source code — forking, merging, diffing, tracking changes over time — and all of a sudden we’ve opened up not just the data, but the entire collaborative ecosystem that now surrounds it.

The result

Originally published June 26, 2013 | View revision history

If you enjoyed this post, you might also enjoy:

benbalter

Ben Balter is the Director of Engineering Operations and Culture at GitHub, the world’s largest software development platform. Previously, as Chief of Staff for Security, he managed the office of the Chief Security Officer, improving overall business effectiveness of the Security organization through portfolio management, strategy, planning, culture, and values. As a Staff Technical Program manager for Enterprise and Compliance, Ben managed GitHub’s on-premises and SaaS enterprise offerings, and as the Senior Product Manager overseeing the platform’s Trust and Safety efforts, Ben shipped more than 500 features in support of community management, privacy, compliance, content moderation, product security, platform health, and open source workflows to ensure the GitHub community and platform remained safe, secure, and welcoming for all software developers. Before joining GitHub’s Product team, Ben served as GitHub’s Government Evangelist, leading the efforts to encourage more than 2,000 government organizations across 75 countries to adopt open source philosophies for code, data, and policy development. More about the author →

This page is open source. Please help improve it.

Edit