Explain like I’m five: Jekyll collections

TL;DR: Collections extend Jekyll’s post and page publishing functionality, and bring Jekyll’s zen-like simplicity to all sorts of other types of content that aren’t dated, but have a relationship with one another.

Collections are Jekyll’s most powerful and simultaneously least understood feature. If you’re not familiar with Jekyll, Jekyll is a static site generator. Think of it like a content management system (CMS), without all the complexity and headache. No need to build a giant content-strangling Rube Goldberg machine to “manage” content, if all you’re doing at the end of the day is putting out HTML, JavaScript, and CSS, the building blocks of the internet. As a result, Jekyll gets out of your way and allows you to concentrate on what truly matters: your content.

Posts and pages

Most Jekyll sites are organized around two types of content, posts and pages.

  • Posts are organized reverse chronologically. You might use them for blog posts on a personal blog, or articles on a news site. You can recognize a post by its filename. Posts live in the _posts folder, and are always named in the form of YYYY-MM-DD-post-title.md. Because posts are dated, they’re traditionally not updated regularly once published.

  • Pages are documents that don’t have a relationship with one another. They can live anywhere within the site’s source directory and don’t have a set naming pattern. If you have a personal blog, you might have an index.html page (the site’s main page which is used to list posts), or an about me page, to name two examples. Because pages aren’t date specific, pages are often updated over time to maintain accuracy.

The problem is, not everything you might want to publish using Jekyll falls cleanly into those two categories of content. As I noted in the original pitch, “If people are using blog posts for a non-blog post thing, Jekyll has already failed”. That’s where Jekyll’s collections come in.

Everything that’s not a post or a page can be represented as a collection

Collections add another possibility, or use case, outside of Jekyll’s post- and page-publishing functionality, and have the potential to bring Jekyll’s zen-like simplicity to all sorts of other types of content that aren’t dated (as with posts), but have a set relationship with one another (hence the name, “collection”). If you’re familiar with traditional CMSs, you can think of collections like WordPress custom post types or Drupal custom content types, except you don’t need to program a specific class or learn any server-side languages, and the syntax used to specify them is easy to read.

What, then, might you use collections for? Let’s say you’re making a site for a bakery and want to list the different cupcake varieties you sell. You might use a collection called “cupcakes”. You’d create a _cupcakes folder, and add files like chocolate.md and vanilla.md to it. And just like posts or pages, your list of cupcakes would be accessible as site.cupcakes.
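To make that concrete, here’s a minimal sketch of a page that lists every cupcake. It assumes each file in _cupcakes sets a title in its front matter; the page itself (say, cupcakes.html) and the title key are placeholders of mine, not something Jekyll requires.

```liquid
---
title: Our cupcakes
---
{% comment %} Loop over every document Jekyll read from the _cupcakes folder {% endcomment %}
<ul>
  {% for cupcake in site.cupcakes %}
    <li>{{ cupcake.title }}</li>
  {% endfor %}
</ul>
```

Each cupcake in the loop exposes whatever keys you put in that file’s front matter, so a price or description key would be read the same way.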

You wouldn’t want to use posts here, because cupcakes aren’t chronological, and you likely wouldn’t want to just use a page either, because a cupcake is a notably different animal than a document that lists your location and hours. Each cupcake in the cupcakes collection is related to the others in the sense that they’re all cupcakes.

Collections are a very new feature to Jekyll, and according to the official Jekyll documentation on collections, may be subject to change. You shouldn’t let that put you off using them, though: Jekyll is open source, which means you can trust the community to work together toward the best common-case solution.

Collections in practice

But what if one day you decided to expand your offerings and sell cookies in addition to cupcakes? Simply introduce a “cookies” collection, adding chocolate-chip.md and peanut-butter.md to a _cookies directory, exposing the cookies as site.cookies. You’ll notice the collections concept start to show its value here. Pages wouldn’t make sense, because you’d want to be able to list cupcakes and cookies separately, and besides both being baked goods, a cookie doesn’t really share a relationship with a cupcake, at least not in the same sense that cookies share with one another.

Of course, at this stage you could instead choose a more generic products collection and develop Liquid layouts for it, so that you and other developers get the basic functionality needed to display all products, with specific includes for cupcakes and cookies.
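As a rough sketch of what that shared layout might look like: the layout name (_layouts/product.html), the kind front matter key, and the two include files are all hypothetical names I’m making up for illustration.

```liquid
{% comment %} _layouts/product.html: hypothetical layout shared by every product {% endcomment %}
<h1>{{ page.title }}</h1>

{% comment %} "kind" is an assumed front matter key set on each product document {% endcomment %}
{% if page.kind == "cupcake" %}
  {% include cupcake.html %}
{% elsif page.kind == "cookie" %}
  {% include cookie.html %}
{% endif %}

{{ content }}
```

The layout handles what every product has in common, and each include (living in _includes) handles whatever is specific to one kind of baked good.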

Abstractly, because they’re not output by default, you can think of collections somewhat like Jekyll’s _data folder support, but more robust: they have the potential to generate content and to be placed into their own specific part of your Jekyll site. Like _data files, they can support arbitrary key/value pairs through front matter, but they also support a full content body (like posts and pages), and can be broken out into separate files. If I wanted to break out my bakery’s hours, I might have a _data/hours.yml file that looked something like this:

monday: 9-5
tuesday: 9-5
wednesday: 9-5
thursday: 9-5
friday: 9-3

That makes sense, because my bakery’s hours are a relatively small dataset. But trying to represent all my baked goods in that format (or worse, as posts) would quickly get out of hand. That type of information is better represented as individual Markdown files with front matter, not one giant YAML file that will quickly become unwieldy. And rather than creating the data plus pages to display it, or the data plus a plugin to turn it into pages, using collections allows the site owner to focus on the content.
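For comparison, reading that small hours file back out takes only a few lines of Liquid. This is a sketch that assumes the file lives at _data/hours.yml as described above; looping over a data file of key/value pairs gives you one [key, value] pair per iteration.

```liquid
<ul>
  {% for entry in site.data.hours %}
    {% comment %} entry[0] is the day, entry[1] is the hours string {% endcomment %}
    <li>{{ entry[0] | capitalize }}: {{ entry[1] }}</li>
  {% endfor %}
</ul>
```

That works nicely for five short lines of data, which is exactly why it stops being pleasant once each entry needs its own body text.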

For a more concrete example, take a look at the source for choosealicense.com, a site which helps explain open source licenses like the MIT or GPL license. There are pages like “about” and “terms of service”, but the actual licenses live in a licenses collection and are displayed via a licenses page.

Other use-cases

Of course, this is not the only use case, which is one of the benefits of collections. You can turn on output to have a page generated automatically for each document in a collection, or use the where filter to pull specific documents out of a collection and add common content or devices to your site.
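Here’s a small sketch of the where approach, sticking with the bakery. It assumes each cupcake has a frosting key in its front matter, which is a name I’ve invented for the example.

```liquid
{% comment %} "frosting" is a hypothetical front matter key on each cupcake {% endcomment %}
{% assign buttercream = site.cupcakes | where: "frosting", "buttercream" %}
<ul>
  {% for cupcake in buttercream %}
    <li>{{ cupcake.title }}</li>
  {% endfor %}
</ul>
```

Because the filtered list is just another array, you could drop a loop like this into any page or include where that subset is useful.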

Using collections

The examples above were a slight simplification. There’s one other step. Before you can use a collection, you need to tell Jekyll about it. Going back to our bakery example above, I might have a _config.yml file that looks something like this:

collections:
  - cupcakes
  - cookies

This tells Jekyll to look in the _cupcakes and _cookies folders for documents, and to read them into the appropriate collection, including YAML front matter, just as it would posts (but again, without the date, because collection documents aren’t date specific).

By default, collections are read in (and exposed as site.[collection], an alias per-collection), but not included in the final site; at least not individually like you might expect posts or pages to. If you wanted a page for each type of cupcake, you’d have to modify the _config.yml a bit:

collections:
  cupcakes:
    output: true
    permalink: /cupcakes/:path/

That way, _cupcakes/chocolate.md is output as cupcakes/chocolate/index.html when the site is built, and would be accessible as example.com/cupcakes/chocolate/. The other advantage is that, because the data is now structured and machine readable (rather than in plain text), you could also use the jsonify filter to output that same information as an API for use elsewhere.
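One way that might look is a file such as cupcakes.json at the root of the site (the file name is my own choice). The empty front matter tells Jekyll to process the file, and this sketch assumes each cupcake has a title and that output is turned on for the collection, so each cupcake has a url.

```liquid
---
---
[
  {% for cupcake in site.cupcakes %}
    {
      "title": {{ cupcake.title | jsonify }},
      "url": {{ cupcake.url | jsonify }}
    }{% unless forloop.last %},{% endunless %}
  {% endfor %}
]
```

Running each value through jsonify takes care of quoting and escaping, and the forloop.last check keeps a trailing comma from breaking the JSON.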

When to use a post, a page, or a collection

I like to think the decision looks roughly like this:

+-------------------------------------+         +----------------+
| Can the things be logically grouped?|---No--->|    Use pages   |
+-------------------------------------+         +----------------+
                   |
                  Yes
                   |
                   V
+-------------------------------------+         +----------------+
|      Are they grouped by date?      |---No--->|Use a collection|
+-------------------------------------+         +----------------+
                   |
                  Yes
                   |
                   V
+-------------------------------------+
|            Use posts                |
+-------------------------------------+

So if you’re not about to open a bakery (if you do, please send cookies), what might you use collections for? In short, any discrete group of “things” that can be logically grouped by a common theme (that’s not their date). Here are a few examples:

  • Listing employees on your company’s “about” page (or a project’s maintainers)
  • Documenting methods in an open source project (or the projects that use it, or the plugins available)
  • Organizing jobs on your résumé (or talks given, papers written)
  • Articles on a support site
  • Recipes on your personal blog (or restaurant reviews, or dishes on a menu)
  • Students in a class (or courses being offered, or listing the faculty)
  • Cheats, tips, tricks and walkthroughs for games (by platform)
  • Creating re-usable content snippets for your site such as testimonials, forms, sentences, buzz-words or call-outs
  • And honestly just about anything else

Collections are a powerful (and often misunderstood) Jekyll feature, but hopefully you’ve now got an idea or two for your next Jekyll project. Of course, if you’re looking to dig into collections, be sure to check out the formal documentation for a much more in-depth explanation.

Happy (organized and machine-readable) publishing!

Originally published February 20, 2015

Ben Balter is the Director of Engineering Operations and Culture at GitHub, the world’s largest software development platform. Previously, as Chief of Staff for Security, he managed the office of the Chief Security Officer, improving overall business effectiveness of the Security organization through portfolio management, strategy, planning, culture, and values. As a Staff Technical Program manager for Enterprise and Compliance, Ben managed GitHub’s on-premises and SaaS enterprise offerings, and as the Senior Product Manager overseeing the platform’s Trust and Safety efforts, Ben shipped more than 500 features in support of community management, privacy, compliance, content moderation, product security, platform health, and open source workflows to ensure the GitHub community and platform remained safe, secure, and welcoming for all software developers. Before joining GitHub’s Product team, Ben served as GitHub’s Government Evangelist, leading the efforts to encourage more than 2,000 government organizations across 75 countries to adopt open source philosophies for code, data, and policy development. More about the author →
