While the internet has long had a soft spot for clever 404 pages, it’s rare to see one that’s actually helpful, especially for static sites, like those built with Jekyll or hosted on GitHub Pages, that make dynamic searches more difficult. Great 404 pages should help visitors find what they’re looking for.
Here’s how I updated the 404 (not found) pages on my own site to resolve typos and suggest other pages potentially relevant to the visitor’s intended URL, in case you’d like to implement the same or similar functionality on your own site:
How my 404 page suggests alternate URLs
If you were to click an invalid link or typo a URL on my site, the following would occur:
- You’d see a 404 - not found page [1]
- Your browser would retrieve and parse my site’s sitemap.xml [2]
- Your browser would find the valid path that has the shortest edit distance from the path you requested
- Your browser would update the 404 page with a link to the suggested path
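To make "shortest edit distance" concrete, here’s a minimal sketch of the matching step. My site relies on the fastest-levenshtein package for this; the hand-rolled levenshtein and closestPath functions below are hypothetical stand-ins for illustration only, and the candidate paths are made-up examples.

```typescript
// Classic dynamic-programming Levenshtein distance, keeping one row at a time.
// (Illustrative stand-in; the site itself uses the fastest-levenshtein package.)
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // delete a character from `a`
        curr[j - 1] + 1, // insert a character into `a`
        prev[j - 1] + cost, // substitute (or keep, if the characters match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: return the candidate with the smallest edit distance
// to the requested path, or undefined if there are no candidates.
function closestPath(requested: string, candidates: string[]): string | undefined {
  let best: string | undefined;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    const distance = levenshtein(requested, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

// Example with made-up sitemap paths
const suggestion = closestPath('/2022/06/30/unhelpful-404s-for-jekyll', [
  '/',
  '/about/',
  '/2022/06/30/helpful-404s-for-jekyll-and-github-pages/',
]);
console.log(suggestion);
```

Because every path in the sitemap is a candidate, the visitor always gets a suggestion, even when the closest match isn’t especially close. A real implementation might want a distance threshold before suggesting anything.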
What it looks like
Let’s say you tried to navigate to a path that doesn’t exist, like /2022/06/30/unhelpful-404s-for-jekyll. Along with a list of recent posts, the experience would look something like this:
The page you are trying to view does not exist. Perhaps you’re looking for /2022/06/30/helpful-404s-for-jekyll-and-github-pages/?
How it works
This functionality is driven by a surprisingly small amount of JavaScript (really TypeScript):
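import { closest } from 'fastest-levenshtein';

// Placeholder element on the 404 page that will hold the suggested link
const div = document.getElementById('four-oh-four-suggestion');

if (div) {
  const xhr = new XMLHttpRequest();
  xhr.onload = () => {
    const xml = xhr.responseXML;
    if (xhr.status === 200 && xml) {
      // Collect every <loc> URL listed in the sitemap, skipping empty entries
      const urls = Array.from(xml.querySelectorAll('urlset > url > loc'))
        .map((el) => el.textContent)
        .filter((loc): loc is string => Boolean(loc));
      // Pick the sitemap URL with the shortest edit distance from the requested URL
      const url = new URL(closest(window.location.href, urls));
      div.innerHTML = `<a href="${url.href}">${url.pathname}</a>`;
    } else {
      // Fall back to the home page if the sitemap can't be loaded or parsed
      div.innerHTML = '<a href="/">/</a>';
    }
  };
  xhr.open('GET', `${window.location.protocol}//${window.location.host}/sitemap.xml`);
  xhr.send();
}
The v0.1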
Could it be written better? Absolutely (but it works!). For now, I’m using fastest-levenshtein to find the closest path to the one requested, and the lower-level XMLHttpRequest and querySelectorAll to retrieve and parse the XML sitemap.
Along with better error handling, this could also be implemented with the more modern fetch API to retrieve the sitemap and something like fast-xml-parser to more properly parse the XML, but my modern JavaScript knowledge is limited. [3] If you’d like to take a pass at a better implementation, pull requests are always welcome.
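As a rough starting point, here’s what the fetch-based retrieval might look like. This is a sketch, not the code running on my site: extractLocs and fetchSitemapUrls are hypothetical names, and to stay dependency-free it pulls the loc values out with a regular expression rather than a real XML parser such as fast-xml-parser, which is an assumption that only holds for simple, generated sitemaps.

```typescript
// Hypothetical helper: pull every <loc> value out of the sitemap's XML text.
// Assumption: a simple generated sitemap. A regular expression is not a real
// XML parser (no entity decoding, no CDATA handling), so treat it as a stand-in.
function extractLocs(sitemapXml: string): string[] {
  const locs: string[] = [];
  const pattern = /<loc>\s*([^<]+?)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(sitemapXml)) !== null) {
    locs.push(match[1]);
  }
  return locs;
}

// Hypothetical helper: retrieve the sitemap with fetch and return its URLs.
// Rejects on network errors and on non-2xx responses so the caller can fall back.
async function fetchSitemapUrls(sitemapUrl: string): Promise<string[]> {
  const response = await fetch(sitemapUrl);
  if (!response.ok) {
    throw new Error(`Sitemap request failed with status ${response.status}`);
  }
  return extractLocs(await response.text());
}

// Example input, using a made-up domain
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc> https://example.com/about/ </loc></url>
</urlset>`;
const sampleLocs = extractLocs(sampleSitemap);
console.log(sampleLocs);
```

On the 404 page, the array returned by fetchSitemapUrls('/sitemap.xml') could be handed to closest exactly as before, with a catch handler that falls back to linking to the home page if the request fails.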
Conclusion
When I click on a broken link, the site that I land on should point me in the right direction. After all, typo’d or updated URLs are not uncommon, and the site I’m visiting knows more about its own content and structure than I ever will. While it’s still true that everything should have a URL, sometimes those URLs change or get lost in translation. Although you might hope a visitor would never see one, great 404 pages go that extra step and help visitors find what they’re looking for. If you’re interested in implementing the same functionality on your own site, the code above is part of the retlab Jekyll theme, and is licensed under The MIT License.
Footnotes
[1] When a visitor tries to access a URL that does not exist, GitHub Pages will serve the 404.html file in the site’s root directory, if one exists.
[2] Generated automatically by the Jekyll Sitemap plugin. The same implementation would work with any other static site (or static site generator), so long as your site has a comprehensive sitemap.xml.
[3] I’m proud to say that no jQuery was harmed in the making of this functionality.