Test your content

When you treat content as code, you unlock continuous integration for prose. Use tools like HTML Proofer and Travis CI to automatically test every link, image, and change.

I’ve written before about how we should treat prose and data with the same respect that developers give code, and how Jekyll forces you to do just that, but there’s another workflow that version-controlled, collaborative content enables: continuous integration.

Continuous integration (CI) is the idea that for every change, whether proposed or realized, a standard battery of tests runs to confirm the change does what you intend it to do. In most developer tools, like GitHub, you get instant feedback: a green light to let you know that is in fact the case, or a red light if something went awry, detailing exactly what doesn’t match your specified expectations.
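As a rough illustration of that green light or red light, here is a minimal Ruby sketch. The check names and the `run_checks` helper are hypothetical, made up for this example; the only real convention it leans on is that CI services such as Travis generally treat a non-zero exit status from the test command as a failed build.

```ruby
# Hypothetical checks: each one is a name plus a lambda that
# returns true (pass) or false (fail). Real checks would inspect your site.
CHECKS = {
  "links resolve" => -> { true },
  "images have alt text" => -> { false }
}

# Runs every check and returns [exit_status, names_of_failed_checks].
# Status 0 means every check passed; 1 means at least one failed.
def run_checks(checks)
  failures = checks.reject { |_name, check| check.call }.keys
  status = failures.empty? ? 0 : 1
  [status, failures]
end

status, failures = run_checks(CHECKS)

if failures.empty?
  puts "green: all checks passed"
else
  puts "red: #{failures.join(', ')}"
end

# A real CI script would end with `exit status` so the service
# can tell a passing run from a failing one.
```

The point of returning a status rather than only printing is that the CI service never reads your output; it only looks at how the command exited.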

When you treat content as code, you get the opportunity to co-opt best-of-breed developer workflows, and continuous integration is no exception. We’ve all been there. You make a simple change and end up breaking two or three other things. Links get out of date. Images move. Imagine if every time you made a change, all those things that you consciously or subconsciously worry about were automatically checked. Are my links accurate? Did any of my images break? Does this darn thing actually render the way I want it to?

With CI services like Travis CI, whether public or private, adding continuous integration to a repository of prose content, whether an entire site or a collection of HTML or Markdown files, becomes trivial, especially when you use open source tools like HTML Proofer.

Let’s say you have a Jekyll site, versioned on GitHub and published on GitHub Pages, and you’d like Travis to give your content a quick checkup every time you make a change. First, you’ll want to add the following to your site’s Gemfile:

group :test do
  gem 'html-proofer'
  gem 'rake'
end

Next, create a file called Rakefile in your site’s root, and add the following:

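require 'html/proofer'

# Build the site, then check the generated HTML in ./_site.
# Newer html-proofer releases (3.x and later) renamed this API:
# require 'html-proofer' and call HTMLProofer.check_directory("./_site").run
task :test do
  sh "bundle exec jekyll build"
  HTML::Proofer.new("./_site").run
end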

After that, you’ll want to configure Travis by adding a .travis.yml file with the following contents:

language: ruby
# If Travis fails on the html/proofer require, use "bundle exec rake test" instead
script: "rake test"

And finally, you need to head over to travis-ci.org/profile to enable Travis for your repository.

Now, each time you push, Travis is going to verify all sorts of things, like whether your images render and have alt attributes, whether your links are valid (including internal anchors), and whether all the JavaScript files you reference actually exist. With some additional configuration, you can have it check other things too, like whether your page has a favicon, or whether the HTML is even valid.
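To give a feel for what two of those checks involve, here is a minimal sketch in plain Ruby. It is not how HTML Proofer is implemented (a real tool parses the HTML properly, while this uses regular expressions on a single string), and the helper names and sample page are hypothetical.

```ruby
# Returns the <img> tags in the page that have no alt attribute.
def images_missing_alt(html)
  html.scan(/<img\b[^>]*>/i).reject { |tag| tag =~ /\balt\s*=/i }
end

# Returns internal anchor targets (href="#...") that have
# no element with a matching id on the same page.
def broken_internal_anchors(html)
  ids = html.scan(/\bid\s*=\s*["']([^"']+)["']/i).flatten
  targets = html.scan(/\bhref\s*=\s*["']#([^"']+)["']/i).flatten
  targets.uniq - ids
end

# Hypothetical page with one image lacking alt text
# and one link pointing at an anchor that does not exist.
SAMPLE_PAGE = <<~HTML
  <h2 id="setup">Setup</h2>
  <img src="/logo.png" alt="Site logo">
  <img src="/chart.png">
  <a href="#setup">Jump to setup</a>
  <a href="#missing">Jump to a section that no longer exists</a>
HTML

puts images_missing_alt(SAMPLE_PAGE).inspect
puts broken_internal_anchors(SAMPLE_PAGE).inspect
```

On the sample page, the first call reports the chart image and the second reports the `missing` anchor, which is the kind of breakage that is easy to introduce and hard to notice by eye.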

You can see this in action on this site. Each time I make a change (or someone proposes one), every link and image is checked to confirm nothing broke. You’ll get something that looks like:

Running ["ScriptCheck", "LinkCheck", "ImageCheck"] checks on ./_site on *.html...
Checking 1187 external links...
Ran on 120 files!
HTML-Proofer finished successfully.

And with that, you can merge confidently.

Beyond “does this thing work?”

Having accurate links and images is a great baseline (that sadly, as I’ve learned through my own continuous integration, many sites don’t check), but what about checking the things you can’t see, like accessibility? In the case of §508 compliance, I wrote Ra11y, but automated tools exist to check all sorts of things.

If you regularly author content for the web, especially if it’s collaborative, I’d encourage you to take a look at what developer tools and philosophies you can co-opt for your own workflows, CI or otherwise. Your content deserves it.

Originally published May 22, 2015

Ben Balter

I'm Ben Balter — I write here about engineering leadership, open source, and showing your work. My open source projects have hundreds of millions of downloads. I was the Director of Hubber Enablement at GitHub, where I helped thousands of GitHubbers do their best remote work. Before this role: Chief of Staff for Security, enterprise PM, and GitHub's first Government Evangelist. Before GitHub: attorney, Presidential Innovation Fellow, and member of the White House's first agile development team.
