Ebook Formatting with XHTML and Calibre by Paul Salvette



Last week, we had a guest post on a simple way to get your ebook formatted without worrying about dinking around with HTML or special software tools. Today, Paul Salvette is going to give us a more in-depth version for those who like to control every aspect and don’t mind reading some guides and getting into the nitty gritty of customizing their ebooks.

This kind of ebook formatting is a big topic for one guest post, so many of the links here will take you over to Paul’s site where he’s written up detailed formatting guides. While you’re there (or here), check out his first novella, America Goes On.

Ebook Formatting with HTML and Calibre

Thanks to Lindsay for letting me out of the cage here in Thailand to do a guest post. I write this eBook formatting post, not as an eBook author, but as a reader. I am getting a bit perturbed by all the eBooks out there that have shoddy formatting: changing fonts, goofy page breaks, busted hyperlinks, etc. Self-publishers get ridiculed for being unprofessional by the suits in New York City, but I’ve found these errors in eBooks I shelled out $9.99 for from major publishing houses! There is no excuse to treat the customer with such contempt.

That is why those of us who self-publish have a real chance to set the standard for proper eBook formatting. You don’t have to be a nerd (although it helps) and you don’t need to spend a lot of time. If you have spent months writing, marketing, and updating your Twitter feed 40 times a day to make it big as a self-publisher, you can take the time to format an eBook properly. I did it with my debut novella, America Goes On, and you can do it too.

First off, you have to think about eBooks differently from regular books and even the manuscript sitting in your word processor. While your word processor has a defined font, a defined number of pages, and a fixed layout, an eBook has what’s called reflowable text. That means no matter what the dimensions of the eReading device, the text will wrap neatly into the next line. Not coincidentally, this is how a web browser reads HTML code. Try pressing Ctrl-U right now to see the source code of Lindsay’s most excellent blog. It may look a bit daunting, but you basically want to turn that manuscript in your word processor into this type of code.

There are two basic types of eBook formats that are commonly in use right now. There are the MOBI/PRC/AZW formats, which are primarily the domain of the Kindle Store. And then there is the EPUB format, which is used by everybody else (iBookstore, Barnes & Noble NOOK, etc.). It should be noted that Smashwords is a special beast, because it requires that you upload a document in Microsoft Word for their Meatgrinder, which converts your manuscript into the major formats for distribution. Both MOBI and EPUB are based on old and simple HTML code that geeks were using to write Star Trek fan fiction on the internet back in the mid-90s. It’s really nothing too advanced.

The first step toward getting your manuscript converted into an eBook is obtaining some free tools. These include a good text editor (I like Notepad++) and an open source program called Calibre, which converts HTML into EPUB and MOBI. Next, you need to learn a little bit about HTML programming, and more specifically XHTML programming (XHTML has more robust standards than its cousin HTML). Don’t worry, you don’t need to be like Matthew Broderick in WarGames and hack into the computer at NORAD. You just need a basic knowledge of XHTML, including how to wrap text in paragraph tags (<p> and </p>), add styles to text (such as different font sizes), add margins around text, align text, and maybe even add images. Basically, every function you use in your word processor, you need to learn how to do in XHTML. I whipped up an XHTML tutorial for those of us who didn’t receive daily wedgies in high school, and it’s very easy to follow. If you get confused, drop me a comment and I’ll be happy to help.
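To give a rough feel for what "more robust standards" means in practice, here is a small Python sketch (not from the tutorials mentioned above). It holds a tiny XHTML page with a centered heading and an indented paragraph, then checks whether the page parses as XML, which XHTML has to. The class name, CSS values, and sample text are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny XHTML page: one centered heading and one indented paragraph.
# The class name and CSS values are illustrative assumptions.
PAGE = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Sample Chapter</title>
<style type="text/css">
h1 { text-align: center; font-size: 1.5em; }
p.body { margin: 0; text-indent: 1.5em; }
</style>
</head>
<body>
<h1>Chapter One</h1>
<p class="body">It was a dark and stormy night.</p>
</body>
</html>
"""


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, which XHTML must."""
    try:
        ET.fromstring(markup.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


# A browser reading plain HTML will usually forgive a missing </p>,
# but in XHTML an unclosed tag is an error.
BROKEN = PAGE.replace("</p>", "")

print(is_well_formed(PAGE))    # True
print(is_well_formed(BROKEN))  # False
```

A check like this only proves the tags are balanced and properly nested. It says nothing about whether the book will look good on a given eReader.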

Once you know the basics of XHTML, you need to take the entire manuscript out of your word processor and into a text editor. Once you are in a text editor, you can guarantee that you will have perfectly clean XHTML code. Word processors like to leave nasty bits of formatting and corruption within your seemingly beautiful manuscript. This crap that gets hidden in your word processor will make your eBook look all screwed up on certain eReaders, guaranteeing that your readers will NOT come back and purchase your other works. If you work from a simple text editor, you can guarantee a perfect eBook. There are some tips, tricks, and best practices regarding copying a manuscript into a text editor and then coding XHTML inside the text editor. You can learn more about taking a sloppy manuscript and turning it into perfect XHTML in another tutorial I prepared for indie authors.
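If you would rather not type every paragraph tag by hand, the wrapping step can be scripted. The following Python sketch is one possible approach, not the method from the tutorial. It assumes the manuscript has been saved as plain text with a blank line between paragraphs, and the function name is made up for this example.

```python
import html
import re


def manuscript_to_paragraphs(text: str) -> str:
    """Wrap each blank-line-separated block of plain text in <p> tags."""
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Collapse stray line breaks and runs of spaces into single spaces.
        cleaned = " ".join(block.split())
        if cleaned:
            # Escape &, < and > so the result stays valid XHTML.
            paragraphs.append("<p>" + html.escape(cleaned, quote=False) + "</p>")
    return "\n".join(paragraphs)


sample = "Tom & Jerry ran.\r\n\r\nThe   end\nwas near."
print(manuscript_to_paragraphs(sample))
# <p>Tom &amp; Jerry ran.</p>
# <p>The end was near.</p>
```

The output still needs to be pasted into the body of a full XHTML file, and anything beyond plain paragraphs (headings, italics, scene breaks) has to be marked up separately.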

Once you have the XHTML file for your book, you can run it through Calibre to get MOBI and EPUB files that are ready to upload to the major markets. Calibre has some features that you need to familiarize yourself with, but it is a very user-friendly program, and I whipped up another tutorial that helps indie authors work with it. If you really want to geek out, you can even learn about regular expressions, which help you find and manipulate complex strings of text inside your text editor. This knowledge can cut your formatting time in half. You can also get real fancy and learn about building an EPUB from the ground up (without Calibre) and converting it into MOBI with a free program called KindleGen.
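As a taste of what regular expressions can do, here is a short Python sketch. It assumes a manuscript where chapter titles sit on their own line as "Chapter" plus a number and where emphasis was typed with underscores. Both conventions are assumptions for this example, so adjust the patterns to match your own manuscript. Similar patterns should work in the find-and-replace dialog of a text editor that supports regular expressions, though the replacement syntax may differ.

```python
import re


def mark_up(text: str) -> str:
    """Apply two find-and-replace passes to a plain-text manuscript."""
    # Turn a line that is exactly "Chapter <number>" into a heading.
    text = re.sub(r"^(Chapter \d+)$", r"<h1>\1</h1>", text, flags=re.MULTILINE)
    # Turn _underscored phrases_ into emphasised text.
    text = re.sub(r"_([^_\n]+)_", r"<em>\1</em>", text)
    return text


sample = "Chapter 1\nIt was _very_ late.\nSee Chapter 2 for more."
print(mark_up(sample))
# <h1>Chapter 1</h1>
# It was <em>very</em> late.
# See Chapter 2 for more.
```

Note that the third line is left alone: the heading pattern only matches when "Chapter 2" is the whole line, which is the kind of precision a plain find-and-replace cannot give you.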

Whatever path you choose, there is no excuse for a sloppy eBook. It takes years to learn how to write well, but it only takes a few days to learn how to format an eBook well. These knuckleheads charging $150 to format fiction should be ashamed of themselves. It took me less than an hour to format my first novella, and you can watch me do it on these video tutorials. I can only think of one other job where you can make $150 for less than an hour of work, and I don’t look that good in knee-high boots and a pink skirt.

I hope this gives you a decent overview of what’s in store for you when formatting an eBook, and I hope these tutorials will be useful for those of us in the self-publishing community. Please let me know if you have any questions, and drop me a line on Twitter or at my website. I’m happy to help, and I won’t charge you $150. Good luck formatting your eBook!


Comments (14)

Thank you for the information. I will have to do this for my first ebook soon and I will be sure to check out your links.

Sure, Sara. Let me know if you need any help.

I think I need to check out those videos and the site. Formatting my eBook took a bit of fannying about. One for Amazon, one for Smashwords – oh, wait, one of them looks iffy? Wait 24hrs, rinse, repeat, upload… still dodgy, preview in Kindle previewer again and so on and so on.

I’m not new to tech by any means – or HTML – but man alive, talk about tarnishing the polish of publishing your own book.


Getting the MOBI right is the worst. It has a tendency to screw up margin spacing. If you want to go the geek option, try getting a clean EPUB and then use KindleGen to convert to MOBI. It seemed to work alright for me. Let me know if you have any problems, and I’d be happy to help.

Maybe I’ve been blessed by the gods with good fortune, but I haven’t had any trouble with Amazon. A little with B&N, because graphics don’t always turn out just right once the file goes through them, though they look fine before it does.

Smashwords… My patience is wearing thin. The Meatgrinder is demeaning and pointless. A really cool technology four years ago. I could happily delete Word from my machine if not for them. You follow all their little rules, perfectly, and believe me, I’m OCD enough to do so, and it’s still a crapshoot sometimes. And if you’ve got a few chapter heading pics and a map, good luck.

This is great, Paul! I’ve always wanted to know how to do this formatting on my own. I’m definitely bookmarking this blog post so I can get to all your links in the future. Thanks so much!

Sure, Cathy. Let me know if you hit any snags. We’re all in this together!

Surely you jest! No, and don’t call me Shirley.

Anyway, this is far more work than I’d want to put in. I used to code web pages the hard way using HTML, but was quite happy with the invention of so many widgets and tools. Past strong, underline, newline, and paragraph, I’m lost. I wonder if those codes will show up in a comment.

The HTML is beyond my grasp. I have Calibre though. I’ve used it to convert ebooks over for my nook.

You can do it!

The only way to learn something is to start failing miserably! If you don’t think you can do it you’ve already given up.

But if you really don’t want to go through the work, I have a ebook formatting and conversion service that might interest you. Conversions are only $20 and I back it up with unlimited free revisions. Check it out!

I can do XHTML by hand, but haven’t found a need to with my ebooks. With the exception of Smashwords (curse their silly Meatgrinder system), I’ve had perfectly fine results using the following method (on Mac OS 10.6):

1. Scrivener: Export to EPUB.
2. Calibre: Convert EPUB to EPUB
3. Calibre: Convert converted EPUB to MOBI

I cannot vouch for Scrivener for Windows achieving the same results. Haven’t a clue. is useful for validating your epub, of course.

Sigil is a nice program for taking your epub and going through the code to clean up things if needed.

Sometimes strange things can happen to perfect files once they’re uploaded. Who knows. But a bit of patience and careful attention will usually clear them up. Usually, it’s hard to get a file with any graphics to appear perfectly on every device.

That’s an interesting method to get the file in a MOBI. The MOBI can be a real pain in the tuckus (it frequently screws up margins). I’ve heard of SIGIL, but I’ve never tried it out. I usually just do EPUB by hand. I agree, the Meatgrinder is a serious pain, but at least they’ve fixed a lot of their bugs with the NCX Table of Contents lately.

Hello! For those who need alternatives to Calibre, so they can easily convert webpages to pdf or epub to mobi format, I suggest to have a look at this free converter tool: Its interface is user-friendly and can generate fast results, try it.
