With GitHub natively supporting mapping and embeds, I recently wanted to put some of the free, publicly available government data published on data.dc.gov to use. The site has all sorts of great information, from bus routes to polling places to the location of every liquor license in DC. The only problem was that the data was stored in a proprietary and complex format known as a Shapefile, which arose in an age when the desktop ruled supreme and which requires a costly software subscription for many common uses.
Luckily, a strangely named piece of open source software known as ogr2ogr can convert the data into the more modern, more open GeoJSON format that GitHub supports, and the resulting map can be automatically rendered, not to mention more easily diffed.
If you’ve got a Mac, it only takes a few seconds:
- If you don’t already have it, install Homebrew by opening up Terminal and running:
$ ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
- Install gdal with the command:
$ brew install gdal
- Grab a Shapefile (distributed as a .zip file) from the DC Data Catalog or your favorite data source and unzip it someplace convenient
- cd into the directory with your shiny new unzipped Shapefile
- Run (replacing [name] with the name of your downloaded Shapefile):
$ ogr2ogr -f GeoJSON -t_srs crs:84 [name].geojson [name].shp
- Grab the resulting GeoJSON file and commit it to GitHub
- Navigate to the GeoJSON file on GitHub.com to browse your map
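The conversion step above can be wrapped in a small shell function if you expect to run it more than once. This is only a sketch: `geojson_name` and `shp_to_geojson` are hypothetical helper names, not part of gdal, and the skip-if-exists check is an assumption added so that a previous result is never written over.

```shell
#!/bin/sh
# Sketch only: geojson_name and shp_to_geojson are hypothetical helpers.

# Derive the output filename, e.g. parcels.shp -> parcels.geojson
geojson_name() {
  printf '%s.geojson\n' "${1%.shp}"
}

# Convert a single Shapefile using the same ogr2ogr flags as the step above.
shp_to_geojson() {
  shp="$1"
  out="$(geojson_name "$shp")"
  # Assumption: leave an existing output file alone rather than replace it.
  if [ -e "$out" ]; then
    echo "skipping $shp: $out already exists" >&2
    return 0
  fi
  ogr2ogr -f GeoJSON -t_srs crs:84 "$out" "$shp"
}
```

With the functions loaded into your shell, running `shp_to_geojson [name].shp` should leave a `[name].geojson` file next to the original, ready to commit.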
In addition to converting the Shapefile over to GeoJSON, the other step in there, -t_srs crs:84, ensures that by the time the data hits GitHub, it’s encoded with the right projection so it can be mapped properly. Need to convert multiple Shapefiles in bulk? Just use this bulk Shapefile to GeoJSON conversion script. Windows users? You’re covered too.
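The bulk conversion script mentioned above isn't reproduced here. As a rough illustration of the same idea, a loop along these lines would run the identical ogr2ogr command over every Shapefile in a directory. `convert_all` is a hypothetical name, and the script assumes the .shp files sit directly in the directory you pass in (the current directory by default).

```shell
#!/bin/sh
# Sketch only: convert_all is a hypothetical helper, not the linked script.

convert_all() {
  dir="${1:-.}"
  for shp in "$dir"/*.shp; do
    # If no .shp files matched, the pattern is left as-is; skip it.
    [ -e "$shp" ] || continue
    out="${shp%.shp}.geojson"
    echo "converting $shp -> $out"
    # Same flags as the single-file command in the steps above.
    ogr2ogr -f GeoJSON -t_srs crs:84 "$out" "$shp"
  done
}
```

Calling `convert_all path/to/unzipped-data` (a placeholder path) would then produce one .geojson file alongside each .shp file.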
Note: The same process should work for KML files as well, replacing
Why this is important
For one, you’re liberating public geodata that would otherwise be inaccessible to the average citizen and making it available in a dumb-simple point, click, zoom interface that anyone can use. For another, by putting the information on GitHub in an open, text-based format, civic hackers and subject-matter experts can begin treating that data like open source code — forking, merging, diffing, tracking changes over time — and all of a sudden we’ve opened up not just the data, but the entire collaborative ecosystem that now surrounds it.
Ben Balter is a Senior Technical Program Manager at GitHub, the world’s largest software development network. Previously, as the Senior Product Manager overseeing the platform’s Trust and Safety efforts, Ben shipped more than 500 features in support of community management, privacy, compliance, content moderation, product security, platform health, and open source workflows to ensure the GitHub community and platform remained safe, secure, and welcoming for all software developers. Before joining GitHub’s Product team, Ben served as GitHub’s Government Evangelist, leading the efforts to encourage more than 2,000 government organizations across 75 countries to adopt open source philosophies for code, data, and policy development.