From the Ebook Formatting Files: Why Is My Mobi File So Huge?!

One of the most frequently asked questions we get from clients is “Why is the Mobi file so huge? Amazon charges a delivery fee and the Kindle book you made will eat all of my royalty. All of it. All. Of. It.”

The concern, of course, is well-placed: as a publisher—whether you’re a self-publisher or running a small press—you’re operating a small business and keeping your costs in check is important. There’s quite a lot that’s out of your control when doing business with Amazon’s KDP, so any time you can wrestle a little bit of it back is a good thing… for your readers and your bottom line.

What goes into an ebook file?

The best way to answer the “why is the Mobi file so big?” question is to first talk about what goes into an ebook file, in general.

Let’s examine what the components of the ebook are.

  • XML files. The acronym XML stands for “eXtensible Markup Language,” a computer language defined by a set of rules that makes it easy for both humans and machines to read the instructions contained within. The XML files in ebooks—both Mobi files and ePub files—are basic instructions on how the ebook is structured.

    First, there’s an XML file that contains instructions on how to find another XML file with more instructions. It’s like an invitation to a 90s rave!

    The aforementioned second XML file—called the Open Packaging Format, or OPF, file—is a set of instructions that contains the ebook’s identifying information, metadata (author, publisher, category, etc.), the list of individual files that make up the ebook, and the order in which those files should be played.

    Finally, there may also be an NCX file (short for Navigation Control file for XML) that sets waypoints in the ebook to help readers navigate through the content. These waypoints are typically at each chapter, but can be set anywhere in the content.

  • XHTML files. These files are the ebook’s content. In most cases, each chapter or content item (title page, copyright page, acknowledgments page, etc.) is given its own XHTML file.

    While it may sound counterintuitive to break each piece into its own file, the primary advantage to doing so is efficiency. A single XHTML file with all of the ebook’s content requires the ereader device software—whether it’s a Kindle, a tablet or phone—to read the full contents of the file and display only what can appear on the screen according to the personalized reading settings.

    This process repeats every time a reader turns the page. The longer the book, the longer it takes the device to re-read the contents to the end and decide what to show next.

    But there’s another thing at work here: one of the most power-hungry processes for any computer is displaying what it’s doing. Ereader devices are, by design, low-power machines. Reading long files consumes more power, which will likely drain device batteries faster.

  • CSS files. These files, known as Cascading Style Sheets, are a set of instructions that define what the content should look like. Basically, CSS is the “formatting” in “ebook formatting.”

There may be other files inside an ebook—like images, the cover, and font files—but the XML, XHTML and CSS files are a given with every ebook.

What goes into a Mobi file?

Here is where things can get complex because of the various types of Mobi files roaming the wild.

There’s the KF7 mobi file type that’s playable on all things named Kindle. There’s the KF8 mobi file type that’s also playable on every device and app named Kindle, but has a feature set that works only on select devices and apps (more on that in a moment). And there’s a rumored update to KF8, called KFX, for which details are unknown to us at the time of this posting.

We could go in-depth on KF7 and KF8, but let’s stick to the KF8 mobi file type, as it’s the one we make. You with us? We hope so because we’re going to ask you to hang onto your hats for this: there are three versions of your book inside each Mobi file we make.

Take deep breaths. It sounds worse than it really is. We’ll explain soon. But first, here are the things in a Mobi file:

  1. The first version is one that will play well on legacy Kindle devices and apps, namely Kindle 1, 2 and DX on the devices front, and Kindle Cloud Reader and the retail sample engine at the Amazon Web site. This version is a rather simplified one so that the Kindle legacy systems can still provide a pleasant reading experience to people who use them.

    At present, there’s little to no support for the legacy devices. And Amazon has assured us in the past that Kindle Cloud Reader and the retail sampling engine are KF8-ready, but that’s not the case.

  2. The second version is for KF8-ready devices and apps. This version has all the fancy stuff we can do with the features enabled in KF8: block quotes with margins on all sides, embedded fonts, drop caps, better rendering of numbered and bulleted lists, text transformations, and more.

  3. The third version is an ePub file built from the source files (or if an ePub file was used to create the Mobi file, then it uses that as the source). We remain unsure why this occurs, but our best guess is that it helps Amazon learn what authors, publishers and development shops like us are doing with the format so that it can make improvements to the KF8 specification.

The result is a complete Mobi file package that’s roughly three times as large as its ePub file counterpart. Amazon, then, delivers the version that’s supported by the reader’s device or app.

Wait! What?! THREE TIMES THE SIZE? Think of the royalties, man!

Amazon understands this is unfair. After all, it prefers high-quality ebooks that play nicely with the various Kindle ereading systems.

Rather than penalize authors and publishers for making beautiful full-featured ebooks, it calculates the delivery fee based on the smallest version, which is usually the KF7 version. You can verify this by checking the Rights & Pricing page at your KDP Bookshelf.

Here’s a screenshot from one of our KDP testing accounts:

[Screenshot: How Amazon calculates Kindle delivery fees]

Before anyone wonders about the size of the file noted in the screenshot, we should point out that it’s an ebook with 40 high resolution images. 😉
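If you'd like to see what the difference means in dollars, here is a rough Python sketch of the arithmetic. The numbers in it are assumptions for illustration and do not come from the screenshot: a delivery rate of $0.15 per megabyte, a 70% royalty rate, a royalty calculated on the list price minus the delivery fee, and rounding to the nearest cent. Check the Rights & Pricing page for the figures that apply to your own book and marketplace.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumed figures for illustration only; confirm them on your KDP pages.
ASSUMED_FEE_PER_MB = Decimal("0.15")
ASSUMED_ROYALTY_RATE = Decimal("0.70")
CENT = Decimal("0.01")


def delivery_fee(size_mb, fee_per_mb=ASSUMED_FEE_PER_MB):
    """Delivery fee for a file of size_mb megabytes, rounded to the cent."""
    fee = Decimal(str(size_mb)) * fee_per_mb
    return fee.quantize(CENT, rounding=ROUND_HALF_UP)


def royalty_per_sale(list_price, size_mb, rate=ASSUMED_ROYALTY_RATE):
    """Royalty on one sale: rate * (list price - delivery fee)."""
    net = Decimal(str(list_price)) - delivery_fee(size_mb)
    return (net * rate).quantize(CENT, rounding=ROUND_HALF_UP)


# A hypothetical $4.99 book: fee based on a 1.2 MB legacy version
# versus a 3.6 MB complete package (three times the size).
small = royalty_per_sale("4.99", "1.2")
large = royalty_per_sale("4.99", "3.6")
print(f"Fee on smallest version: royalty ${small}")
print(f"Fee on full package:     royalty ${large}")
```

Under those assumed figures, the fee on the 1.2 MB version is $0.18 and the royalty is $3.37, while a fee on the full 3.6 MB package would be $0.54 and leave $3.12. The gap grows with image-heavy books, which is why it matters that the smallest version sets the fee.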

So what does this mean for you?

It means you don’t have to worry about the Mobi file size for the Kindle books we make for you. Our process is designed to produce the smallest possible footprint for the legacy KF7 version.

Here’s how:

  • The text from your manuscript is extracted from the original file as-is using custom-built software. Any basic formatting—such as headings, lists, bold and italic text—is kept, as well as any literary devices or other fancy effects.

    The extraneous code that word processing apps insert is jettisoned, and we split the book into logical parts to ensure the ebooks are device- and reader-friendly.

  • Images, including your cover, are optimized for ereading system display. We ask for large, high resolution assets so that we can ensure the best possible versions go into the finished ebooks. This is important if you’re also getting a POD file from us, as low resolution images do not print well.

  • We create the smallest possible Cascading Style Sheet we can. These aren’t heavy files to begin with, but the more efficiency we can build into your ebooks the better.

  • We use Kindlegen, Amazon’s command line ebook creation software, and its built-in compression algorithms to produce Mobi files ready to upload to KDP.
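For the curious, the last step can be sketched in Python along these lines. This is an illustration and not our production script: the helper names and file names are hypothetical, and it assumes the kindlegen executable is installed and on your PATH. The -c2 flag asks Kindlegen for its highest compression setting, and -o sets the output file name, which Kindlegen expects as a bare file name with no directory path.

```python
import subprocess
from pathlib import Path


def build_kindlegen_command(source, output_name, compression=2):
    """Build the argument list for a Kindlegen run.

    source is the OPF, HTML or ePub file to convert. output_name must be a
    bare file name, because Kindlegen writes the result next to the source.
    """
    if compression not in (0, 1, 2):
        raise ValueError("compression must be 0, 1 or 2")
    if Path(output_name).name != output_name:
        raise ValueError("output_name must not contain a directory path")
    return ["kindlegen", str(source), f"-c{compression}", "-o", output_name]


def run_kindlegen(source, output_name, compression=2):
    """Run Kindlegen and return the completed process for inspection."""
    command = build_kindlegen_command(source, output_name, compression)
    # Kindlegen reports warnings and errors on standard output.
    return subprocess.run(command, capture_output=True, text=True)
```

A call such as run_kindlegen("book/content.opf", "book.mobi") would leave book.mobi in the book folder. Reading the returned output before uploading is worthwhile, since warnings often point to oversized images or missing files.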

The result is that your KDP royalty isn’t gobbled up by the delivery fee. Plus, your readers will thank you for the better-built, beautiful ebooks that turn pages fast and don’t run down their batteries.