I needed a quick-and-easy way to parse Microsoft Word’s footnote format into a more web-friendly format for a recent project. After a bit of regular expression hacking, I was able to build a WordPress plugin to automatically convert content pasted from Word into a format readable by Andrew Nacin’s popular Simple Footnotes plugin.
The process is surprisingly simple given WordPress’s extensive filter API. The first step is to grab the footnotes from Word’s pasted markup with a regular expression.
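As a rough illustration of that extraction step, here is a minimal sketch. The sample markup and the pattern are assumptions about what Word typically produces when pasted (anchors named `_ftnref1` and `_ftn1`); the real plugin's pattern may differ, so check it against your own pasted content.

```php
<?php
// Sample pasted content. The markup shape is an assumption, not taken from the plugin.
$content = <<<HTML
<p>First claim.<a href="#_ftn1" name="_ftnref1">[1]</a> Second claim.<a href="#_ftn2" name="_ftnref2">[2]</a></p>
<p><a href="#_ftnref1" name="_ftn1">[1]</a> Source for the first claim.</p>
<p><a href="#_ftnref2" name="_ftn2">[2]</a> Source for the second claim.</p>
HTML;

// Group 1 captures the footnote number, group 2 the footnote text up to </p>.
$pattern = '/<p><a href="#_ftnref(\d+)" name="_ftn\1">\[\1\]<\/a>\s*(.*?)<\/p>/s';

// PREG_SET_ORDER yields one entry per footnote: [full match, number, text].
preg_match_all($pattern, $content, $footnotes, PREG_SET_ORDER);

foreach ($footnotes as $footnote) {
    echo $footnote[1], ': ', $footnote[2], PHP_EOL;
}
```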
This creates an array ($footnotes) with both the footnote number and the text of each footnote. We then need a way to replace the in-text references with the parsed footnotes so that Simple Footnotes can understand them. I did this by creating two arrays, a find array and a replace array, holding each Word-style footnote reference and its Simple Footnotes-formatted counterpart.
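A sketch of how the two arrays might be built is below. The hard-coded $footnotes entries, the reference markup, and the use of preg_replace() are illustrative assumptions; the [ref]…[/ref] shortcode is the one Simple Footnotes reads.

```php
<?php
// Hypothetical parsed footnotes, each entry shaped as [full match, number, text].
$footnotes = array(
    array('<p><a href="#_ftnref1" name="_ftn1">[1]</a> Source for the first claim.</p>', '1', 'Source for the first claim.'),
    array('<p><a href="#_ftnref2" name="_ftn2">[2]</a> Source for the second claim.</p>', '2', 'Source for the second claim.'),
);

$find    = array();
$replace = array();

foreach ($footnotes as $footnote) {
    $number = (int) $footnote[1];

    // Pattern for the in-text reference (assumed Word markup).
    $find[] = '/<a href="#_ftn' . $number . '" name="_ftnref' . $number . '">\[' . $number . '\]<\/a>/';

    // Escape backslashes and dollar signs so preg_replace() does not
    // read them as back-references inside the footnote text.
    $text      = strtr($footnote[2], array('\\' => '\\\\', '$' => '\\$'));
    $replace[] = '[ref]' . $text . '[/ref]';
}

$content = '<p>First claim.<a href="#_ftn1" name="_ftnref1">[1]</a> Second claim.<a href="#_ftn2" name="_ftnref2">[2]</a></p>';
$result  = preg_replace($find, $replace, $content);

echo $result, PHP_EOL;
```

Because the two arrays are filled in the same loop, entry N of the find array always lines up with entry N of the replace array.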
Finally, so that the entire replacement can be done in a single pass, push a final find/replace pair onto the end of the arrays to remove the original footnotes.
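Putting the steps together, the whole conversion could look something like the sketch below. The function name word_footnotes_to_shortcodes() and both patterns are hypothetical, written against the assumed Word markup rather than copied from the plugin.

```php
<?php
/**
 * Sketch: convert Word-style footnotes into Simple Footnotes shortcodes.
 * The function name and the markup patterns are illustrative assumptions.
 */
function word_footnotes_to_shortcodes($content) {
    // Matches one footnote paragraph, plus any whitespace before it.
    $definition = '/\s*<p><a href="#_ftnref(\d+)" name="_ftn\1">\[\1\]<\/a>\s*(.*?)<\/p>/s';

    if (!preg_match_all($definition, $content, $footnotes, PREG_SET_ORDER)) {
        return $content; // Nothing that looks like a Word footnote.
    }

    $find    = array();
    $replace = array();

    foreach ($footnotes as $footnote) {
        $number = (int) $footnote[1];
        $find[] = '/<a href="#_ftn' . $number . '" name="_ftnref' . $number . '">\[' . $number . '\]<\/a>/';
        // Keep literal backslashes and dollar signs from acting as back-references.
        $text      = strtr($footnote[2], array('\\' => '\\\\', '$' => '\\$'));
        $replace[] = '[ref]' . $text . '[/ref]';
    }

    // Final pair: strip the original footnote paragraphs.
    $find[]    = $definition;
    $replace[] = '';

    // One preg_replace() call applies every pair in order.
    return preg_replace($find, $replace, $content);
}

$sample = '<p>Claim.<a href="#_ftn1" name="_ftnref1">[1]</a></p>' . "\n"
        . '<p><a href="#_ftnref1" name="_ftn1">[1]</a> Source for the claim.</p>';

echo word_footnotes_to_shortcodes($sample), PHP_EOL;
```

Inside a plugin, a function like this would be registered with add_filter(); which hook suits best depends on when the conversion should run, and that choice is not shown here.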