Converting a Markdown Article into an Ebook

It's not something I've ever been particularly motivated to do before, but I recently found cause to turn one of my blog posts into an ebook.

A couple of years ago, I moved my site over to using Nikola and, since then, all my posts have been written using Markdown (often being drafted in Obsidian).

As well as being an extremely convenient medium to write posts in, Markdown also benefits from being extremely well supported by various tools, including Pandoc.

In this post, I'll detail the process that I used to convert a Markdown document into an EPUB e-book, ready for publishing to online book stores. I did this on Linux, but Pandoc supports Windows, macOS, ChromeOS and BSD as well.


Installing Pandoc

For me, installing pandoc was as simple as:

sudo apt-get install -y pandoc

If you're on a different operating system, you can find the answers you need in pandoc's installation instructions.


Preparing The Markdown

There are a few small changes that need to be made to the markdown in order to have it render nicely into an e-book.

When publishing to the web, I tend to avoid using a top-level header (i.e. <h1> or #), because my site's template will automatically insert the header for me - everything I write is <h2>/## or below.

However, when publishing in book format, I want some (if not all) of the page's sections to be treated as if they were the beginning of a chapter.

In order to achieve that, I moved all the headers up one level:

## Foo -> # Foo
### Bar -> ## Bar
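For a long article, the promotion could be scripted rather than done by hand. The following is a minimal sketch, not the method I used: promote_headings is a hypothetical helper name, and it assumes that no fenced code block in the document contains lines starting with two or more # characters (such lines would also be changed).

```shell
# Hypothetical helper: promote every ATX heading by one level.
# Reads Markdown on stdin and writes the result to stdout.
# Assumption: no code block contains lines beginning with "## " or deeper.
promote_headings() {
  # Drop one leading '#' from lines that start with two or more of them,
  # so "## Foo" becomes "# Foo"; existing "# Foo" lines are left alone.
  sed -E 's/^#(#+ )/\1/'
}
```

It could then be run as something like: promote_headings < article.md > book.md (both filenames are placeholders).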

When drafting, I also tend to include a horizontal rule between sections. It doesn't quite look right in an ebook, though, so I removed any occurrences of ----
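Removing the rules can also be automated. This is a rough sketch under a couple of assumptions: strip_rules is a hypothetical name, and a rule is taken to be a line of three or more dashes that follows a blank line. Dashes directly beneath a line of text are left alone, because Markdown treats those as a heading underline.

```shell
# Hypothetical helper: delete horizontal rules made of dashes.
# Reads Markdown on stdin and writes the result to stdout.
strip_rules() {
  awk '
    # A dash-only line preceded by a blank line is treated as a rule and dropped.
    /^---+[ \t]*$/ && prev ~ /^[ \t]*$/ { next }
    # Everything else is printed and remembered as the previous line.
    { print; prev = $0 }
  '
}
```

Note that a dash-only line at the very top of the file would also be removed, so this assumes the document does not open with a YAML metadata block.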

This is fairly specific to my setup, but I also had to go through and convert any image references to be relative. When drafting for the web, I use a path from root (e.g. /images/foo/bar.jpg) because that's what will be needed when published (and Obsidian handles this cleanly, because it looks from the root of its vault rather than the filesystem).

Pandoc, though, won't be able to find those images, because it'll work from the root of the filesystem.

The change is just to add a period (.) to the beginning of the path so that it becomes relative:

![an image](/images/foo/bar.jpg)

becomes

![an image](./images/foo/bar.jpg)
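If there are many images, the same change could be applied with sed. This is only a sketch: relativise_images is a hypothetical name, and it assumes images use the inline ![alt](path) syntax shown above, with pandoc later being run from the directory that contains the images folder.

```shell
# Hypothetical helper: make root-relative image paths relative.
# Reads Markdown on stdin and writes the result to stdout.
relativise_images() {
  # Match "![alt](" followed by a single "/" and insert "." before the slash.
  # Ordinary links, already-relative paths and "//host/..." URLs are untouched.
  sed -E 's#(!\[[^]]*\]\()/([^/])#\1./\2#g'
}
```

For example: relativise_images < article.md > book.md (placeholder filenames).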

To summarise, preparing the markdown involves:

  • Changing chapter headings to H1s (i.e. ensure they're prefixed with a single #)
  • Removing horizontal lines between chapters
  • Updating image paths to be relative (i.e. ./images/foo/bar rather than /images/foo/bar)

Adding E-Book Metadata

We also need to provide Pandoc with some information so that it can set the EPUB metadata appropriately.

To do this, we add metadata to the very top of the markdown file, in the following format:

% Title
% Author
% Date

For example:

% My Example e-book
% B Tasker
% Dec 12, 2023
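Rather than editing the source article, the title block could be prepended when building. The sketch below assumes a hypothetical helper called add_title_block; it simply writes the three % lines, a blank line, and then the original file to stdout.

```shell
# Hypothetical helper: emit a title block followed by the document.
# $1 = title, $2 = author, $3 = date, $4 = Markdown file
add_title_block() {
  # printf reuses the format for each argument, giving one "% ..." line apiece.
  printf '%% %s\n' "$1" "$2" "$3"
  # Blank line separating the title block from the body.
  printf '\n'
  cat "$4"
}
```

Usage might look like: add_title_block "My Example e-book" "B Tasker" "Dec 12, 2023" article.md > book.md (article.md and book.md are placeholders).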

Rendering the e-book

With the markdown file prepared, it's then just a case of invoking pandoc:

pandoc \
-f markdown amazon_ebook_absolute_links.md \
-t epub3 -o amazon_ebook.epub

Pandoc does its thing and generates an EPUB format ebook.

When viewed in an ebook viewer, the front page looks something like this:

Screenshot of the ebook cover page, it provides title, author and date


Adding a Cover Page

The default cover is functional, but somewhat underwhelming: ideally, the book should have an image for a cover.

You'll need to create an image for the cover yourself - there are tools online that can assist with this.

Once you have an image, you just need to pass it in whilst invoking pandoc:

pandoc \
-f markdown amazon_ebook_absolute_links.md \
--epub-cover-image=./images/Documentation/markdown_to_ebook/fake_cover.jpg \
-t epub3 -o amazon_ebook.epub
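If the book is likely to be rebuilt repeatedly, the two invocations could be wrapped in a small function. This is a sketch only: build_epub is a hypothetical name, the cover argument is optional, and the pandoc flags are the same ones used above. The check on the cover path is there so that a mistyped path is reported before pandoc is run.

```shell
# Hypothetical wrapper (not part of pandoc): build an EPUB, optionally with a cover.
# $1 = Markdown source, $2 = output .epub, $3 = optional cover image
build_epub() {
  src=$1
  out=$2
  cover=${3:-}

  # Fail early if a cover was requested but the file does not exist.
  if [ -n "$cover" ] && [ ! -f "$cover" ]; then
    echo "cover image not found: $cover" >&2
    return 1
  fi

  if [ -n "$cover" ]; then
    pandoc -f markdown "$src" --epub-cover-image="$cover" -t epub3 -o "$out"
  else
    pandoc -f markdown "$src" -t epub3 -o "$out"
  fi
}
```

It could be called as: build_epub amazon_ebook_absolute_links.md amazon_ebook.epub ./images/Documentation/markdown_to_ebook/fake_cover.jpg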

Now, when viewed in an ebook viewer, the first page is the cover image:

Screenshot of the ebook open in a reader, the first page is the cover image provided in the command line


The Result

The result should be a fully functional e-book.

The reader should list each of the chapters and sections in the table of contents.

Screenshot of the ebook reader with table of contents open, each of the sections and subsections is listed

Any images referenced within the markdown itself should also now be present. Depending on reader support, that can even include animated GIFs:

Gif showing a page open in an ebook reader with a gif embedded


Troubleshooting: Pandoc not rendering Markdown Headings

I initially had an issue with headers not rendering, resulting in the book having a single section consisting of partially rendered markdown:

Screenshot of misrendered markdown, a heading hasn't been picked up on and so is included as literal text

In my case, my source article had named anchors just before the headings:

<a name="introduction"></a>
# Introduction

Blah blah

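This was enough for the parser to skip them: by default, Pandoc's Markdown expects a blank line before a heading (its blank_before_header extension), so a heading that directly follows the anchor line is treated as ordinary text.

I removed the anchors (because they serve no practical purpose in an ebook), but it would also have been sufficient to add a blank line between the two: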

<a name="introduction"></a>

# Introduction

Blah blah
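To find other headings affected in the same way, a quick check along the following lines may help. It is a sketch rather than a complete linter: find_unspaced_headings is a hypothetical name, and it will also flag lines such as "# comment" inside code blocks, so the output needs a manual look.

```shell
# Hypothetical helper: list heading lines that directly follow a non-blank line.
# Reads the named files (or stdin) and prints "line number: heading".
find_unspaced_headings() {
  awk '
    # A heading after the first line whose previous line was not blank.
    NR > 1 && /^#+ / && prev !~ /^[ \t]*$/ { print NR ": " $0 }
    { prev = $0 }
  ' "$@"
}
```

Run against the example above (e.g. find_unspaced_headings article.md, where article.md is a placeholder), it would report the "# Introduction" line sitting directly beneath the anchor.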

Conclusion

Pandoc makes creating an e-book from Markdown incredibly straightforward. It's not just Markdown that can be used either - Pandoc supports a wide array of formats and even lets you define your own.

The resulting e-book can be edited with tools like Calibre or uploaded directly into online e-book stores for sale/promotion.