Test your content

When you treat content as code, you unlock continuous integration for prose. Use tools like HTML Proofer and Travis CI to automatically test every link, image, and change.

I’ve written before about how we should treat prose and data with the same respect that developers treat code and how Jekyll forces you to do just that, but there’s another workflow that version-controlled, collaborative content enables: continuous integration.

Continuous integration (CI) is the idea that for every change, whether proposed or realized, a standard battery of tests runs to confirm the change does what you intend it to do. In most developer tools, like GitHub, you get instant feedback: a green light to let you know that is in fact the case, or a red light if something went awry, detailing exactly what doesn’t match your specified expectations.
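To make the green light/red light idea concrete, here is a minimal sketch in Ruby. It is not how Travis or GitHub work internally; the `run_checks` helper, the sample `page`, and the check names are all hypothetical and exist only to illustrate a battery of tests producing a single pass/fail result along with the details of what failed.

```ruby
# Hypothetical sketch of a CI run: each check is a named lambda that returns
# true (pass) or false (fail). Real CI services run shell commands instead.
def run_checks(checks)
  # Keep only the checks whose lambda returned a falsy value.
  failures = checks.reject { |_name, check| check.call }.keys
  status = failures.empty? ? :green : :red
  { status: status, failures: failures }
end

# Made-up page data and site paths, purely for illustration.
page = { title: "Test your content", links: ["/about/", "/404-me/"] }
known_paths = ["/", "/about/"]

checks = {
  "page has a title"     => -> { !page[:title].empty? },
  "internal links exist" => -> { page[:links].all? { |l| known_paths.include?(l) } }
}

result = run_checks(checks)
puts "#{result[:status]}: #{result[:failures].join(', ')}"
```

The useful part is less the status than the list of failures: a red light that names the failing expectation tells you what to fix.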

When you treat content as code, you get the opportunity to co-opt best-of-breed developer workflows, and continuous integration is no exception. We’ve all been there. You make a simple change and end up breaking two or three other things. Links get out of date. Images move. Imagine if every time you made a change, all those things that you consciously or subconsciously worry about were automatically checked. Are my links accurate? Did any of my images break? Does this darn thing actually render the way I want it to?

With CI services like Travis CI, whether public or private, adding continuous integration to a repository of prose content, whether an entire site or a collection of HTML or Markdown files, becomes trivial, especially when you use open source tools like HTML Proofer.

Let’s say you have a Jekyll site, versioned on GitHub and published on GitHub Pages, and you’d like Travis to give your content a quick checkup every time you make a change. First, you’ll want to add the following to your site’s Gemfile:

group :test do
  gem 'html-proofer'
  gem 'rake'
end

Next, create a file called Rakefile in your site’s root, and add the following:

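require 'html/proofer'

# Build the site, then run HTML Proofer against the generated output.
# Note: html-proofer 3.0 and later renamed this API; there, use
# require 'html-proofer' and HTMLProofer.check_directory("./_site").run
task :test do
  sh "bundle exec jekyll build"
  HTML::Proofer.new("./_site").run
end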

After that, you’ll want to configure Travis by adding a .travis.yml file with the following contents:

language: ruby
# If Travis fails on the html/proofer require, try "bundle exec rake test" instead.
script: "rake test"

And finally, you need to head over to travis-ci.org/profile to enable Travis for your repository.

Now, each time you push, Travis will verify that your images render and have alt attributes, that your links are valid (including internal anchors), and that every JavaScript file you reference actually exists. With some additional configuration, it can also check whether your page has a favicon or whether the HTML is even valid.
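If you’re curious what checks like these amount to, here’s a deliberately simplified sketch of two of them: images without alt text and internal anchors that point nowhere. This is not HTML Proofer’s implementation (it parses HTML properly and also checks external URLs); the `check_html` method and the sample markup are hypothetical, and the regular expressions only handle simple, well-formed tags.

```ruby
# Simplified illustration only: uses regular expressions and the standard
# library to mimic two checks on a single HTML string.

# Returns a list of human-readable problems found in the given HTML.
def check_html(html)
  problems = []

  # Collect every id="..." so internal anchors can be validated against them.
  ids = html.scan(/\bid=["']([^"']+)["']/i).flatten

  # Image check: every <img> should carry a non-empty alt attribute.
  html.scan(/<img\b[^>]*>/i).each do |tag|
    problems << "image missing alt text: #{tag}" unless tag =~ /\balt=["'][^"']+["']/i
  end

  # Link check: every href="#fragment" should point at an existing id.
  html.scan(/\bhref=["']#([^"']+)["']/i).flatten.each do |fragment|
    problems << "broken internal anchor: ##{fragment}" unless ids.include?(fragment)
  end

  problems
end

# Made-up markup with one broken anchor and one image lacking alt text.
sample = <<~HTML
  <h2 id="intro">Intro</h2>
  <a href="#intro">Back to intro</a>
  <a href="#missing">Nowhere</a>
  <img src="logo.png" alt="Site logo">
  <img src="chart.png">
HTML

puts check_html(sample)
```

Running a check like this over every generated file, and failing the build when the list isn’t empty, is the essence of what the CI step buys you.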

You can see this in action on this site. Each time I make a change (or someone proposes one), every link and image is checked to confirm nothing broke. You’ll get something that looks like:

Running ["ScriptCheck", "LinkCheck", "ImageCheck"] checks on ./_site on *.html...
Checking 1187 external links...
Ran on 120 files!
HTML-Proofer finished successfully.

And with that, you can merge confidently.

Beyond “does this thing work?”

Having accurate links and images is a great baseline (one that, sadly, as I’ve learned through my own continuous integration, many sites don’t check), but what about checking the things you can’t see, like accessibility? In the case of §508 compliance, I wrote Ra11y, but automated tools exist to check all sorts of things.

If you regularly author content for the web, especially if it’s collaborative, I’d encourage you to take a look at what developer tools and philosophies you can co-opt for your own workflows, CI or otherwise. Your content deserves it.

Originally published May 22, 2015
