With GitHub now natively rendering maps and supporting embeds, I recently wanted to put some of the free, publicly available government data published on data.dc.gov to use. The site has all sorts of great information, from bus routes to polling places to the location of every liquor license in DC. The only problem was that the data was stored in a complex, proprietary format known as a Shapefile, which arose in an age when the desktop reigned supreme and which requires a costly software subscription for many common uses.
Luckily, a strangely named piece of open source software known as ogr2ogr can convert the data into the more modern, more open GeoJSON format that GitHub supports. GitHub renders the resulting file as a map automatically, and because GeoJSON is plain text, changes to it are far easier to diff.
If you’ve got a Mac, it only takes a few seconds:
- If you don’t already have it, install Homebrew by opening Terminal and running:
$ ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
- Install gdal with the command:
$ brew install gdal
- Grab a Shapefile (distributed as a .zip file) from the DC Data Catalog or your favorite data source and unzip it someplace convenient
- cd into the directory with your shiny new unzipped Shapefile
- Run (replacing [name] with the name of your downloaded Shapefile):
$ ogr2ogr -f GeoJSON -t_srs crs:84 [name].geojson [name].shp
- Grab the resulting GeoJSON file and commit it to GitHub
- Navigate to the GeoJSON file on GitHub.com to browse your map
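If you expect to repeat the conversion step, it can be wrapped in a small shell function. The following is a minimal sketch, not part of GDAL: the function name shp_to_geojson and the OGR2OGR override variable are made up for illustration, and the skip-if-exists check assumes you would rather not overwrite an earlier conversion.

```shell
#!/bin/sh
# Hypothetical helper (not part of GDAL): convert one Shapefile to a GeoJSON
# file alongside it, using the same ogr2ogr flags as the steps above.
shp_to_geojson() {
  shp="$1"
  if [ ! -f "$shp" ]; then
    echo "no such Shapefile: $shp" >&2
    return 1
  fi

  # parcels.shp -> parcels.geojson
  out="${shp%.shp}.geojson"
  if [ -e "$out" ]; then
    echo "refusing to overwrite existing file: $out" >&2
    return 1
  fi

  # OGR2OGR is an assumed override for where the ogr2ogr binary lives;
  # by default the one on your PATH is used.
  "${OGR2OGR:-ogr2ogr}" -f GeoJSON -t_srs crs:84 "$out" "$shp" || return 1

  # Print the path of the new file so it can be passed to other commands.
  echo "$out"
}
```

Because the function prints the path of the file it wrote, the output can be handed straight to git add before committing and pushing to GitHub.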
In addition to converting the Shapefile over to GeoJSON, the other step in there, -t_srs crs:84, reprojects the data to WGS 84 longitude/latitude coordinates, so that by the time it hits GitHub it’s encoded with the right projection and can be mapped properly. Need to convert multiple Shapefiles in bulk? Just use this bulk Shapefile to GeoJSON conversion script. Windows users? You’re covered too.
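For a sense of what bulk conversion involves, here is a minimal sketch of a loop over every Shapefile in a directory. It is not the conversion script mentioned above, only an illustration: the function name convert_all_shapefiles and the OGR2OGR override variable are hypothetical, and it assumes the files end in a lowercase .shp extension.

```shell
#!/bin/sh
# Hypothetical helper: convert every .shp file in a directory to GeoJSON.
# OGR2OGR is an assumed override for the location of the ogr2ogr binary.
convert_all_shapefiles() {
  dir="${1:-.}"
  count=0
  for shp in "$dir"/*.shp; do
    # With no matches the glob stays literal, so skip anything that is not a file.
    [ -f "$shp" ] || continue

    out="${shp%.shp}.geojson"
    if [ -e "$out" ]; then
      # Leave earlier conversions alone rather than overwriting them.
      echo "skipping $shp: $out already exists" >&2
      continue
    fi

    # Same flags as the single-file command: GeoJSON output, CRS:84 coordinates.
    "${OGR2OGR:-ogr2ogr}" -f GeoJSON -t_srs crs:84 "$out" "$shp" || return 1
    count=$((count + 1))
  done
  echo "converted $count file(s)"
}
```

Calling convert_all_shapefiles with the path to a folder of unzipped Shapefiles writes one .geojson file next to each .shp file and reports how many were converted.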
Note: The same process should work for KML files as well, replacing
Why this is important
For one, you’re liberating public geodata that would otherwise be inaccessible to the average citizen and making it available in a dumb-simple point, click, zoom interface that anyone can use. For another, by putting the information on GitHub in an open, text-based format, civic hackers and subject-matter experts can begin treating that data like open source code — forking, merging, diffing, tracking changes over time — and all of a sudden we’ve opened up not just the data, but the entire collaborative ecosystem that now surrounds it.
Ben Balter is a Senior Manager of Product Management at GitHub, the world’s largest software development network, where he oversees the platform’s Community and Safety efforts. Named one of the top 25 most influential people in government and technology, Fed50’s Disruptor of the Year, and winner of the Open Source People’s Choice Award, Ben previously served as GitHub’s Government Evangelist, leading the efforts to encourage government at all levels to adopt open source philosophies for code, data, and policy development.