How to convert Shapefiles to GeoJSON maps for use on GitHub (and why you should)
An automated process for converting ESRI Shapefiles to .geojson map files so that they can be more easily used with GitHub.com
With GitHub natively supporting mapping and embeds, I recently wanted to put some of the free, publicly available government data published on data.dc.gov to use. They have all sorts of great information, from bus routes to polling places to the location of every liquor license in DC. The only problem was that the data was stored as a Shapefile, a proprietary and complex format that arose in an age when the desktop ruled supreme and that requires a costly software subscription for many common uses.
Luckily, a strangely named piece of open source software known as ogr2ogr can convert the data into the more modern, more open GeoJSON format that GitHub supports, and the resulting map can be automatically rendered, not to mention more easily diffed.
If you’ve got a Mac, it only takes a few seconds
- If you don’t already have it, install Homebrew by opening up Terminal and running:
$ ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
- Install gdal with the command:
$ brew install gdal
- Grab a Shapefile (distributed as a .zip file) from the DC Data Catalog or your favorite data source and unzip it someplace convenient
- cd into the directory with your shiny new unzipped Shapefile
- Run (replacing [name] with the name of your downloaded Shapefile):
$ ogr2ogr -f GeoJSON -t_srs crs:84 [name].geojson [name].shp
- Grab the resulting GeoJSON file and commit it to GitHub
- Navigate to the GeoJSON file on GitHub.com to browse your map
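Before committing, it can be worth a quick check that the conversion actually produced something usable. The sketch below is only a rough smoke test, not a real validator: check_geojson is a hypothetical helper name, and it assumes the output is a GeoJSON FeatureCollection, which is what ogr2ogr normally writes for a Shapefile layer.

```shell
#!/bin/sh
# Rough smoke test for a converted file (hypothetical helper, not part of GDAL).
# Assumption: the file should contain a top-level FeatureCollection.
check_geojson() {
  file="$1"

  # -s is true only if the file exists and is not empty.
  if [ ! -s "$file" ]; then
    echo "$file is missing or empty" >&2
    return 1
  fi

  # Look for the FeatureCollection marker anywhere in the file.
  if ! grep -q '"FeatureCollection"' "$file"; then
    echo "$file does not look like a GeoJSON FeatureCollection" >&2
    return 1
  fi

  echo "$file looks like a GeoJSON FeatureCollection"
}

# Usage:
#   check_geojson "[name].geojson" && git add "[name].geojson"
```

This only catches an empty or obviously wrong file; the real test is still whether the map renders on GitHub.com.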
In addition to converting the Shapefile over to GeoJSON, the other option in the ogr2ogr command, -t_srs crs:84, reprojects the data to WGS 84 longitude/latitude coordinates, so that by the time the data hits GitHub, it’s encoded with the right projection and can be mapped properly. Need to convert multiple Shapefiles in bulk? Just use this bulk Shapefile to GeoJSON conversion script. Windows users? You’re covered too.
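The linked bulk conversion script isn't reproduced here, but the idea is simple enough to sketch. The following is a minimal illustration, not the script referenced above: convert_shapefiles is a hypothetical helper name, and it assumes ogr2ogr is on your PATH and that every .shp file in the given directory should get a matching .geojson file next to it.

```shell
#!/bin/sh
# Sketch: convert every Shapefile in a directory to GeoJSON.
# convert_shapefiles is a hypothetical helper name, not part of GDAL.
convert_shapefiles() {
  dir="${1:-.}"   # default to the current directory

  for shp in "$dir"/*.shp; do
    # If nothing matched, the literal pattern is left behind; skip it.
    [ -e "$shp" ] || continue

    out="${shp%.shp}.geojson"

    # Assumed behavior: leave already-converted files alone so the
    # function is safe to re-run.
    if [ -e "$out" ]; then
      echo "Skipping $shp ($out already exists)"
      continue
    fi

    # Same flags as the single-file command: GeoJSON output,
    # reprojected to CRS:84 (WGS 84 longitude/latitude).
    ogr2ogr -f GeoJSON -t_srs crs:84 "$out" "$shp" || return 1
    echo "Converted $shp -> $out"
  done
}

# Usage:
#   convert_shapefiles path/to/unzipped/shapefiles
```

Skipping existing output is a design choice for the sketch; delete the stale .geojson file first if you want a fresh conversion.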
Note: The same process should work for KML files as well, replacing
Why this is important
For one, you’re liberating public geodata that would otherwise be inaccessible to the average citizen and making it available in a dumb-simple point, click, zoom interface that anyone can use. For another, by putting the information on GitHub in an open, text-based format, civic hackers and subject-matter experts can begin treating that data like open source code — forking, merging, diffing, tracking changes over time — and all of a sudden we’ve opened up not just the data, but the entire collaborative ecosystem that now surrounds it.
Ben Balter is Chief of Staff for Security and Engineering at GitHub, the world’s largest software development platform. Previously, as a Staff Technical Program manager for Enterprise and Compliance, Ben managed GitHub’s on-premises and SaaS enterprise offerings, and as the Senior Product Manager overseeing the platform’s Trust and Safety efforts, Ben shipped more than 500 features in support of community management, privacy, compliance, content moderation, product security, platform health, and open source workflows to ensure the GitHub community and platform remained safe, secure, and welcoming for all software developers. Before joining GitHub’s Product team, Ben served as GitHub’s Government Evangelist, leading the efforts to encourage more than 2,000 government organizations across 75 countries to adopt open source philosophies for code, data, and policy development.