CSS paged media

In paged media the content of the document is split into one or more discrete pages. Paged media include paper, transparencies, pages shown on a screen, Microsoft Word documents, and Adobe PDF (Portable Document Format) files, among others.

This needs to be tested with modern browsers. As far as I know, it is only supported in Chrome and, maybe, Firefox.

I am currently working on a sample page and an associated stylesheet to test the concept and see how well it works. See the bottom of this post for more information.

We no longer need to convert our documents into PDF or Word in order to print them. Using paged media and good layout and typography there is no reason not to expect high quality print results from our web content.

This is not a widely supported technology in browsers but it’s getting there and, when it does, it’ll make HTML an even better format to publish content.

There are formatters: programs that take an HTML file and its style sheet and output a file suitable for printing. Some of the formatters include:

This is not quite the ideal I had in mind when I first started looking at Paged Media, but until browser support gets better it may just have to do.

Before we jump into the code, there’s one great article from A List Apart that covers book making with CSS: Building books with CSS.

Defining Pages: the @page rule

CSS defines a “page box”, a box of finite dimensions where content is rendered. It is formed by two areas:

  • The page area: The page area includes the boxes laid out on that page. The edges of the page area act as the initial containing block for layout that occurs between page breaks.
  • The margin area: which surrounds the page area.

You can specify the dimensions, orientation, margins, etc. of a page box within an @page rule. The dimensions of the page box are set with the ‘size’ property. The dimensions of the page area are the dimensions of the page box minus the margin area.

For example, the following @page rule sets the page box size to 8.5 x 11 inches and creates a ‘1in’ margin on all sides between the page box edge and the page area:

<style type="text/css">
<!--
@page { size:8.5in 11in; margin: 1in; }
-->
</style>

You can use the margin, margin-top, margin-bottom, margin-left, and margin-right properties within the @page rule to set margins for your page, the same as you would in your regular CSS rules.

Setting Page Size:

The size property specifies the size and orientation of a page box. There are four values which can be used for page size:

  • auto: The page box takes the size and orientation of the target paper.
  • landscape: Horizontal layout; the longer sides of the page box are at the top and bottom.
  • portrait: Vertical layout; the longer sides of the page box are at the left and right.
  • length: Length values for the ‘size’ property create an absolute page box. Values are entered manually.
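
As a quick illustration of the four kinds of value, the sketch below shows one rule per value. The rules are alternatives and are not meant to be used together. The 8.5in by 11in dimensions are simply the letter-size values used elsewhere in this post.

```css
/* Alternative values for 'size'; pick one. If all four rules
   appeared in the same stylesheet, the last one would win. */

/* auto: use the size and orientation of the target paper */
@page { size: auto; }

/* landscape: longer sides at the top and bottom */
@page { size: landscape; }

/* portrait: longer sides at the left and right */
@page { size: portrait; }

/* length: an absolute page box, width first and then height */
@page { size: 8.5in 11in; }
```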

We’ll concentrate on length for this document. Once we’ve done some testing we will go back to how the auto, portrait and landscape values interact with the other parameters set up in the stylesheet.

The example below explicitly sets the dimensions of the page to 8.5in by 11in. The pages created from the example require paper that is 8.5″x11″ or larger.

<style type="text/css">
<!--
@page {
  size: 8.5in 11in;  /* width height */
}
-->
</style>

Once you create a named page layout, you can use it in your document by adding the page property to a style that is later applied to an element in your document. For example, this style renders all the tables in your document on landscape pages:

<style type="text/css">
<!--
@page { size : portrait }
@page rotated { size : landscape }
table { page : rotated }
-->
</style>

If the browser encounters a <table> element in your document and the current page layout is the default portrait layout, it will print the table in a new landscape page.

Left, right, and first pages:

When printing double-sided documents, the page boxes on left and right pages should be different. This can be expressed through the :left and :right CSS pseudo-classes, as shown below:

<style type="text/css">
@page :first {
  size: 8.5in 11in;
}
@page :left {
  margin-left: 2.5in;
  margin-right: 1in;
}

@page :right {
  margin-left: 1in;
  margin-right: 2.5in;
}
</style>

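The margins are mirror images of each other. With these values the wider margin falls on the outside edge of each page; to leave extra room for binding instead, put the wider margin on the spine side (the right edge of left pages and the left edge of right pages).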
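
If the goal is to leave room for binding, the wider margin has to sit on the spine side of each page. In a left-to-right document that is the right edge of left-hand pages and the left edge of right-hand pages. A minimal sketch, reusing the 2.5in and 1in values from above purely as placeholder measurements:

```css
/* Left-hand pages: the spine is on the right edge */
@page :left {
  margin-left: 1in;
  margin-right: 2.5in;
}

/* Right-hand pages: the spine is on the left edge */
@page :right {
  margin-left: 2.5in;
  margin-right: 1in;
}
```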

The first page has its own pseudo-class. Using :first we can style the first page independently from the rest of the content and make a first or title page look different, as in the example below:

<style type="text/css">
<!--
@page {
  margin: 1in;       /* All margins set to 1in */
}

@page :first {
  margin-top: 4in;   /* Top margin on the first page set to 4in */
}
-->
</style>

Controlling pagination

Unless you specify otherwise, a page break only happens when there is a change in the page format or when the content fills the current page. To force or suppress page breaks, use the page-break-before, page-break-after, and page-break-inside properties.

Keywords for both page-break-before and page-break-after properties are: auto, always, avoid, left, and right.

The keyword auto is the default; it generates page breaks as needed. The keyword always forces a page break before or after the element, while avoid suppresses a page break immediately before or after the element. The left and right keywords force one or two page breaks, so that the element is rendered on a left-hand or right-hand page.
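
To make the keywords concrete, here is a small sketch. The class names (.chapter, .appendix, .caption) are hypothetical and only there for illustration; they do not come from the test stylesheet.

```css
/* always: every chapter starts on a new page */
.chapter { page-break-before: always; }

/* left: insert one or two page breaks so that each appendix
   starts on a left-hand page */
.appendix { page-break-before: left; }

/* avoid: try to keep a caption on the same page as the content before it */
.caption { page-break-before: avoid; }

/* auto: the default, break only where the content requires it */
p { page-break-before: auto; page-break-after: auto; }
```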

Suppose your document uses level-1 headers to start new chapters, with sections denoted by level-2 headers. We want chapters to start on a new, right-hand page, but we don’t want section headers to be separated from the content that follows by a page break. You can achieve this using the following rules:

<style type="text/css">
<!--
h1 { page-break-before : right }
h2 { page-break-after : avoid }
-->
</style>

Use only the auto and avoid keywords with the page-break-inside property. To prevent tables from being broken across pages, if possible, you would use the following rule:

<style type="text/css">
<!--
table { page-break-inside : avoid }
-->
</style>

Controlling widows and orphans:

Widow
A paragraph-ending line that falls at the beginning of the following page/column, thus separated from the rest of the text.

Orphan
A paragraph-opening line that appears by itself at the bottom of a page/column.

A word, part of a word, or very short line that appears by itself at the end of a paragraph. Orphans result in too much white space between paragraphs or at the bottom of a page.

From Wikipedia

Generally, printed pages do not look attractive with single lines of text stranded at the top or bottom. Most printers try to leave at least two lines of text at the top or bottom of each page.

  • The orphans property specifies the minimum number of lines of a paragraph that must be left at the bottom of a page.
  • The widows property specifies the minimum number of lines of a paragraph that must be left at the top of a page.

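Here is an example that keeps at least 4 lines at the bottom and at least 2 lines at the top of each page. Both properties apply to block elements such as paragraphs rather than to the @page rule:

<style type="text/css">
p {
  orphans: 4;
  widows: 2;
}
</style>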

Initial stab at paged media stylesheet

See the test stylesheet for a possible way to make this work and the HTML document I’m using to experiment with the technology.

Expressing colors in CSS

As part of my research in Web Typography I got reacquainted with the many ways you can express colors in CSS. This was originally in my typography document but I think it’s better if I move it out to prevent an already long document from becoming unmanageable.

sRGB colors (3 or 6 digits)

This is what I always associated with colors in CSS and, until not too long ago, it was the only way to express colors. You can use either syntax but, as you can see in the example below, the 6-digit syntax allows for more precision in defining your colors.

The three definitions of a div container express the same color.

div {
  color: #0f0;   /* The color 'lime' defined using the 3-digit hexadecimal notation */
}

div {
  color: #00ff00; /* The color 'lime' defined using the 6-digit hexadecimal notation */
}

div {
  color: rgb(0, 255, 0); /* The color 'lime' expressed with RGB notation */
}
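
The precision difference comes from how the short form is expanded: each digit of a 3-digit value is doubled, so #0f0 becomes #00ff00. Any color whose channels are not made of repeated digits can only be written with 6 digits. A short sketch (the .note and .warning class names are made up for the example):

```css
/* 3-digit form: each digit is doubled, so #fa0 is the same as #ffaa00 */
.note {
  color: #fa0;
}

/* 6-digit form: #ffa500 (the 'orange' keyword) has no 3-digit equivalent
   because a5 is not a repeated digit */
.warning {
  color: #ffa500;
}
```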

RGBa colors

This should be run behind a Modernizr test as it’s not widely supported.

This allows us to fill areas with transparent color; the first three numbers represent the color in RGB values and the fourth represents a transparency value between 0 and 1 (zero being fully transparent and one being fully opaque). We have long had the opacity property, which is similar, but opacity forces all descendant elements to also become transparent and there is no way to fight it (except weird positional hacks). Cross-browser opacity is also a bit sloppy.

With RGBa, we can make a box transparent and leave its descendants alone:

div {
   background: rgba(200, 54, 54, 0.5); 
}

Declaring a fallback color

Not all browsers support RGBa, so if the design permits, you should declare a “fallback” color. This color will most likely be solid (fully opaque). Not declaring a fallback means no color will be applied in browsers that don’t support it. This fallback does fail in some really old browsers.

div {
   background: rgb(200, 54, 54); /* The Fallback */
   background: rgba(200, 54, 54, 0.5); 
}

The table below is taken from the Mozilla Documentation Project and credited according to a Creative Commons Attribution-Share Alike License.

Color keywords

Color keywords are case-insensitive identifiers which represent a specific color, e.g. red, blue, brown, lightseagreen. The name describes the color, though it is mostly artificial. The list of accepted values has varied a lot across the different specifications:

  • CSS Level 1 only accepted 16 basic colors, named the VGA colors as they were taken from the set of displayable colors on VGA graphic cards.
  • CSS Level 2 added the orange keyword.
  • From the beginning, browsers accepted other colors, mostly the X11 named colors list as some early browsers were X11 applications, though with a few differences. SVG 1.0 was the first standard to formally define these keywords; CSS Colors Level 3 also formally defined these keywords. They are often referred to as the extended color keywords, the X11 colors, or the SVG colors.

There are a few caveats to consider when using keywords:

  • Except the 16 basic colors which are common with HTML, the others cannot be used in HTML. HTML will convert these unknown values with a specific algorithm which will lead to completely different colors. These keywords should only be used in SVG & CSS
  • Unknown keywords make the CSS property invalid. Invalid properties being ignored, the color will have no effect. This is a different behavior than the one of HTML.
  • No keyword-defined colors in CSS have any transparency, they are plain, solid colors.
  • Several keywords denote the same colors:
    • darkgray / darkgrey
    • darkslategray / darkslategrey
    • dimgray / dimgrey
    • lightgray / lightgrey
    • lightslategray / lightslategrey
    • gray / grey
    • slategray / slategrey
  • Though the names of the keywords have been taken from the usual X11 color names, the colors may diverge from the corresponding system colors on X11 systems, as these are tailored for the specific hardware by the manufacturer.
Color keyword and RGB cubic coordinates, grouped by specification

CSS Level 1 (also in CSS Level 2 and CSS Colors Level 3):
  black rgb(  0,   0,   0)
  silver rgb(192, 192, 192)
  gray[*] rgb(128, 128, 128)
  white rgb(255, 255, 255)
  maroon rgb(128,   0,   0)
  red rgb(255,   0,   0)
  purple rgb(128,   0, 128)
  fuchsia rgb(255,   0, 255)
  green rgb(  0, 128,   0)
  lime rgb(  0, 255,   0)
  olive rgb(128, 128,   0)
  yellow rgb(255, 255,   0)
  navy rgb(  0,   0, 128)
  blue rgb(  0,   0, 255)
  teal rgb(  0, 128, 128)
  aqua rgb(  0, 255, 255)

CSS Level 2 (also in CSS Colors Level 3):
  orange rgb(255, 165,   0)

CSS Colors Level 3:
  aliceblue rgb(240, 248, 255)
  antiquewhite rgb(250, 235, 215)  
  aquamarine rgb(127, 255, 212)  
  azure rgb(240, 255, 255)  
  beige rgb(245, 245, 220)  
  bisque rgb(255, 228, 196)  
  blanchedalmond rgb(255, 235, 205)  
  blueviolet rgb(138,  43, 226)  
  brown rgb(165,  42,  42)  
  burlywood rgb(222, 184, 135)  
  cadetblue rgb( 95, 158, 160)  
  chartreuse rgb(127, 255,   0)  
  chocolate rgb(210, 105,  30)  
  coral rgb(255, 127,  80)  
  cornflowerblue rgb(100, 149, 237)  
  cornsilk rgb(255, 248, 220)  
  crimson rgb(220,  20,  60)  
  darkblue rgb(  0,   0, 139)  
  darkcyan rgb(  0, 139, 139)  
  darkgoldenrod rgb(184, 134,  11)  
  darkgray[*] rgb(169, 169, 169)  
  darkgreen rgb(  0, 100,   0)  
  darkgrey[*] rgb(169, 169, 169)  
  darkkhaki rgb(189, 183, 107)  
  darkmagenta rgb(139,   0, 139)  
  darkolivegreen rgb( 85, 107,  47)  
  darkorange rgb(255, 140,   0)  
  darkorchid rgb(153,  50, 204)  
  darkred rgb(139,   0,   0)  
  darksalmon rgb(233, 150, 122)  
  darkseagreen rgb(143, 188, 143)  
  darkslateblue rgb( 72,  61, 139)  
  darkslategray[*] rgb( 47,  79,  79)  
  darkslategrey[*] rgb( 47,  79,  79)  
  darkturquoise rgb(  0, 206, 209)  
  darkviolet rgb(148,   0, 211)  
  deeppink rgb(255,  20, 147)  
  deepskyblue rgb(  0, 191, 255)  
  dimgray[*] rgb(105, 105, 105)  
  dimgrey[*] rgb(105, 105, 105)  
  dodgerblue rgb( 30, 144, 255)  
  firebrick rgb(178,  34,  34)  
  floralwhite rgb(255, 250, 240)  
  forestgreen rgb( 34, 139,  34)  
  gainsboro rgb(220, 220, 220)  
  ghostwhite rgb(248, 248, 255)  
  gold rgb(255, 215,   0)  
  goldenrod rgb(218, 165,  32)  
  greenyellow rgb(173, 255,  47)  
  grey[*] rgb(128, 128, 128)
  honeydew rgb(240, 255, 240)  
  hotpink rgb(255, 105, 180)  
  indianred rgb(205,  92,  92)  
  indigo rgb( 75,   0, 130)  
  ivory rgb(255, 255, 240)  
  khaki rgb(240, 230, 140)  
  lavender rgb(230, 230, 250)  
  lavenderblush rgb(255, 240, 245)  
  lawngreen rgb(124, 252, 0)  
  lemonchiffon rgb(255, 250, 205)  
  lightblue rgb(173, 216, 230)  
  lightcoral rgb(240, 128, 128)  
  lightcyan rgb(224, 255, 255)  
  lightgoldenrodyellow rgb(250, 250, 210)  
  lightgray[*] rgb(211, 211, 211)  
  lightgreen rgb(144, 238, 144)  
  lightgrey[*] rgb(211, 211, 211)  
  lightpink rgb(255, 182, 193)  
  lightsalmon rgb(255, 160, 122)  
  lightseagreen rgb( 32, 178, 170)  
  lightskyblue rgb(135, 206, 250)  
  lightslategray[*] rgb(119, 136, 153)  
  lightslategrey[*] rgb(119, 136, 153)  
  lightsteelblue rgb(176, 196, 222)  
  lightyellow rgb(255, 255, 224)  
  limegreen rgb( 50, 205,  50)  
  linen rgb(250, 240, 230)  
  mediumaquamarine rgb(102, 205, 170)  
  mediumblue rgb(  0,   0, 205)  
  mediumorchid rgb(186,  85, 211)  
  mediumpurple rgb(147, 112, 219)  
  mediumseagreen rgb( 60, 179, 113)  
  mediumslateblue rgb(123, 104, 238)  
  mediumspringgreen rgb(  0, 250, 154)  
  mediumturquoise rgb( 72, 209, 204)  
  mediumvioletred rgb(199,  21, 133)  
  midnightblue rgb( 25,  25, 112)  
  mintcream rgb(245, 255, 250)  
  mistyrose rgb(255, 228, 225)  
  moccasin rgb(255, 228, 181)  
  navajowhite rgb(255, 222, 173)  
  oldlace rgb(253, 245, 230)  
  olivedrab rgb(107, 142,  35)  
  orangered rgb(255,  69,   0)  
  orchid rgb(218, 112, 214)  
  palegoldenrod rgb(238, 232, 170)  
  palegreen rgb(152, 251, 152)  
  paleturquoise rgb(175, 238, 238)  
  palevioletred rgb(219, 112, 147)  
  papayawhip rgb(255, 239, 213)  
  peachpuff rgb(255, 218, 185)  
  peru rgb(205, 133,  63)  
  pink rgb(255, 192, 203)  
  plum rgb(221, 160, 221)  
  powderblue rgb(176, 224, 230)  
  rosybrown rgb(188, 143, 143)  
  royalblue rgb( 65, 105, 225)  
  saddlebrown rgb(139,  69,  19)  
  salmon rgb(250, 128, 114)  
  sandybrown rgb(244, 164,  96)  
  seagreen rgb( 46, 139,  87)  
  seashell rgb(255, 245, 238)  
  sienna rgb(160,  82,  45)  
  skyblue rgb(135, 206, 235)  
  slateblue rgb(106,  90, 205)  
  slategray[*] rgb(112, 128, 144)  
  slategrey[*] rgb(112, 128, 144)  
  snow rgb(255, 250, 250)  
  springgreen rgb(  0, 255, 127)  
  steelblue rgb( 70, 130, 180)  
  tan rgb(210, 180, 140)  
  thistle rgb(216, 191, 216)  
  tomato rgb(255,  99,  71)  
  turquoise rgb( 64, 224, 208)  
  violet rgb(238, 130, 238)  
  wheat rgb(245, 222, 179)  
  whitesmoke rgb(245, 245, 245)  
  yellowgreen rgb(154, 205,  50)  

[*] The ‘e’-grey colors (with an e) (grey, darkgrey, darkslategrey, dimgrey, lightgrey, lightslategrey) are only supported since IE 8.0. IE 3 to IE 6 only support the ‘a’ variants: gray, darkgray, darkslategray, dimgray, lightgray, lightslategray.

See the Mozilla Documentation Project CSS Color page for more information.

HSLa colors

This needs to be used behind a Modernizr test; it is not fully supported.

Using HSLa is similar to RGBa in that you declare three values determining the color and then a fourth value for its transparency level. You can read more about browser support below, but basically any browser that supports rgba supports hsla too.

#some-element {
   background-color: hsla(170, 50%, 45%, 1);
}
  • Hue: Think of a color wheel. Around 0° and 360° are reds, 120° are greens and 240° are blues. Use anything between 0 and 360. Values above and below are taken modulo 360.
  • Saturation: 0% is grayscale. 100% is fully saturated (full color).
  • Lightness: 0% is completely dark (black). 100% is completely light (white). 50% is average lightness.
  • Alpha: Opacity/transparency value. 0 is fully transparent. 1 is fully opaque. 0.5 is 50% transparent.
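
The fallback technique shown for RGBa should work here too. The sketch below pairs an opaque rgb() fallback with the hsla() value and adds a few rules that illustrate the hue wheel. The class names are hypothetical, and the rgb() triple is my own rounding of hsl(170, 50%, 45%).

```css
#some-element {
  background-color: rgb(57, 172, 153);         /* fallback: roughly hsl(170, 50%, 45%), fully opaque */
  background-color: hsla(170, 50%, 45%, 0.5);  /* same hue, 50% transparent */
}

/* Walking around the color wheel at full saturation and average lightness */
.red   { color: hsl(0, 100%, 50%); }
.green { color: hsl(120, 100%, 50%); }
.blue  { color: hsl(240, 100%, 50%); }

/* Hue values wrap around: 480 - 360 = 120, so this is the same green */
.also-green { color: hsl(480, 100%, 50%); }
```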

Case Study: MOBI publishing from Docbook

BACKGROUND

In October of last year (2012) I was contacted by a German developer to consult on Docbook-based ebook design and publishing. Because I had recently started a full-time job I had to turn him down.

Four months later things had slowed down enough at work for me to reach out to the potential client and ask whether I could offer any assistance. I was sad to find out that he had given up and pretty much abandoned the project.

I offered to run the project for free in exchange for being allowed to write a case study on the experience.

This is the first time I have worked with content I didn’t author. It might have had something to do with some of the issues discussed throughout, or it might not.

Source material and initial challenges.

The source material was a set of Docbook XML files along with the corresponding images in the screenshot directory. Since I cloned the files from a GitHub repository I shouldn’t have been surprised that there were more images than those I needed, and in different formats. This only became apparent later in the process.

The main file (index.xml) used XInclude to reference the rest of the content. Working on an integrated file (Docbook and XInclude have their own separate namespaces) required some modifications to my XML validation process. It wasn’t anything major (just use a different grammar file) but it’s important to keep in mind when using XInclude in the future.

FILES TO NOTICE

  • build.xml The Apache Ant build script
  • local-style.css Additional CSS classes to make the ePub document look nicer. It will only work with Kindle Fire devices (those supporting the KF8 format)
  • docbook-custom.xsl The Docbook customization layer.

This project also makes use of open source fonts for the ePub3 and KF8 versions:

THE PROCESS

My standard Docbook to ePub process looks like this:

  1. Prepare the Docbook customization layer
  2. Clean the working directory from prior conversion files
  3. Validate the XML file using Jing
  4. Fix any XML/Docbook structural and syntax errors
  5. Repeat the XML validation as many times as necessary
  6. Use XSLTproc to convert the XML file to (X)HTML
  7. Create and test the CSS stylesheets you want to use with your content
  8. Copy images, fonts and additional CSS stylesheets to the book directory
  9. Zip the content according to the ePub specification requirements
  10. Validate the ePub file using epubcheck
  11. Fix any ePub validation errors and rerun epubcheck
  12. Convert to Mobi and KF8 using kindlegen
  13. Test on your target devices

THE TOOLS

Ant and the Ant build script

Rather than do all the steps manually I built an Ant build file that takes care of all the steps and some additional general housekeeping.

I picked Ant rather than a makefile because it is truly cross-platform. It is written in Java, which may be a problem for the most security-conscious people, but it is a command line application and in all the years I’ve used it, it has never caused a problem.

The Ant build script is an XML-based file that contains one or more targets to accomplish your project goals.

XSLTProc

I use XSLTProc as my transformation engine for a variety of reasons. It is part of the OS X Developer Tools (or you can install it using Homebrew if you don’t want the full multi-gig download from Apple) and it is the fastest XSLT 1.0 processor that I’m aware of.

It also handles XInclude without needing extensions or additional software.

For XSLT 2.0 and beyond I will most likely use Saxon. It is Java-based and authored by the editor of the XSLT specification at the W3C.

Jing

Until last year I used xmllint as my XML validator. When using a RelaxNG vocabulary such as Docbook, it is not enough. When I first encountered this problem it was suggested that I move to Jing, mostly because it was built to validate RelaxNG rather than the older W3C Schemas and DTDs.

In order to validate Docbook / XInclude documents you have to download a RelaxNG grammar file from the Docbook website.

epubcheck

A tool developed by the IDPF to validate ePub 2 and 3 documents. It is a pain in the ass sometimes, as it will not validate things we know are correct, but it also has a great group behind it and bugs are fixed fairly quickly.

Current version is 3.0 (final)

Kindlegen

Kindlegen is a command line tool that allows you to convert your ePub 3 file into a Kindle file for both the older eInk devices as well as the newer Kindle Fire readers.

Because I’ve automated my workflow I use Kindlegen rather than the Kindle Previewer for the automatic validation. I will still use Kindle Previewer to validate the content.

THE PROCESS IN DETAIL

Developing your customization layer.

As counterintuitive as this may sound this is always my first step. There are things I know I’ll need and they need to be used from the first test of the book conversion.

Some of the things I knew I needed for this particular project:

  • kindle.extensions: Provides an umbrella for all Kindle-specific changes the stylesheets need to make to the ePub. Since the target platform is the Kindle family of devices we need to make sure that whatever is Kindle specific is addressed

  • html.stylesheet: Adds additional CSS stylesheets to use with your book. This is important as the default stylesheet provides only minimal styling support

  • user.manifest.items: The template, empty by default, allows you to add additional elements to the book package manifest (package.opf). In this particular instance we use it to add fonts for the ePub3 and KF8 (Kindle Fire) formats. ePub2 and eInk Kindles (I believe all of them except the Paperwhite model) will ignore this material.

What a customization layer is, why you would create one and how to do it are best covered in Bob Stayton’s book Docbook XSL: The Complete Guide, particularly chapter 9.

Create and test the CSS stylesheets

At this time we can’t do much with the stylesheets other than set up the default fonts using @font-face CSS rules. My normal process includes adding a font for the body (usually a sans serif font like Adobe Code Sans) and a font for any preformatted content like program listings or screen captures (using Ubuntu Mono as my font).

There are times when this is all that is needed but more often than not you need to do additional CSS work based on the specific classes and structure created during the conversion process.
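
As a rough sketch of that starting point, the rules below declare one font for preformatted content and apply it. The font file path is hypothetical, and the selectors assume the converted (X)HTML marks listings and screens up as pre.programlisting and pre.screen; check the classes in your own output before relying on them.

```css
/* Hypothetical path: point src at wherever the font is copied in the book */
@font-face {
  font-family: "Ubuntu Mono";
  font-weight: normal;
  font-style: normal;
  src: url("fonts/UbuntuMono-Regular.ttf");
}

/* Assumed class names for preformatted Docbook output */
pre.programlisting,
pre.screen {
  font-family: "Ubuntu Mono", monospace;
}
```

The font file itself still has to be copied into the book directory and listed in the package manifest, as described for user.manifest.items above.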

Clean up the working directory from prior conversion files

I know it shouldn’t make a difference but I’ve learned over time to always clean up prior conversion artifacts from the working directory. This has saved a lot of grief when trying to identify problems with the CSS styles in particular. There is no real way to tell whether the results you are seeing come from the old or the new code, so I tend to err on the side of caution.

Validate the XML file using Jing

Fix any XML/Docbook structural and syntax errors

Repeat the XML validation as many times as necessary

Run the Docbook file (index.xml in this case) through Jing using a command similar to this:

java -jar /path/to/jing/jing.jar /path/to/jing/lib/docbookxi.rng index.xml

Before converting the Docbook file into our ePub (X)HTML we need to make sure that the file is valid XML or it’ll cause no end of problems later on. Validating the XML gives us the chance to work with the Docbook content and fix any issues that we find.

This is an iterative process: run Jing, fix any issues that it reports and then run Jing again. The idea is to have no issues in the Docbook XML code.

Use XSLTproc to convert the XML file to (X)HTML

We will use XSLTproc to convert the valid XML file into a set of (X)HTML pages. At this point I’m confident that the input XML file has no serious issues that will affect the remaining stages of the process.

Two important things to remember are to process XIncludes and to set up a base directory for the transformed content.

While XSLTProc supports XInclude, you have to tell it to process the includes as it does not do so by default.

XSLTProc, by default, will work in the directory where it is run. We need to tell it not to do that and to create a directory for our ePub content. We use the following command:

xsltproc --xinclude --stringparam base.dir OEBPS/ docbook-custom.xsl index.xml
  1. The first parameter (--xinclude) tells XSLTProc to process any XInclude references and add them to the document.

  2. The second parameter (--stringparam base.dir OEBPS/) passes the base.dir parameter to the stylesheets, which create the OEBPS directory if it doesn’t exist and write all the files to it.

  3. docbook-custom.xsl is the Docbook customization layer we created earlier in the process.

  4. index.xml refers to the main Docbook file. It contains all the XInclude references and will generate all our XHTML content.

Copy images, fonts and additional CSS stylesheets to the book directory

In the step above we only created the XHTML files with their references to images and other internal resources. We now need to copy all additional resources (in this case they are: CSS, fonts, and images) to the corresponding directory. For example, if we referenced the images like this:

images/sample.png

We would copy the image to:

OEBPS/images

Otherwise you will get validation errors later in the process. We need to do the same thing with all additional resources.

Zip the content according to the ePub specification requirements

Zipping the ePub book is a two-step process.

The first step is to add the mimetype file (generated by Docbook) into the ePub archive without any compression. This is required by the ePub specification.

zip -X0 mybook.epub mimetype

The second zip command adds the remaining content, the OEBPS and META-INF directories, with normal compression settings and excluding all extra file metadata; excluding the metadata is also required by the ePub specification.

zip -r -X9 mybook.epub META-INF OEBPS

Some people say that it’s OK to work both steps in a single zip command but I’ve had enough issues with it in the past that I always run the two commands even if it’s not needed.

Validate the ePub file using epubcheck

Fix any ePub validation errors and rerun epubcheck

Running epubcheck will detect a variety of ePub-specific errors that can happen with your HTML’s structure, the CSS, or whether content should be present or not, among other things.

These errors may or may not have been caught by Jing when we validated the XML structure of the document. I’m assuming that if we get this far the HTML product is valid and will pass structural validation without a problem.

This is also an iterative process. If there are errors you will have to go back to your HTML, CSS or XML package manifest, make any changes, zip the content again and validate it. I know it’s a pain in the ass but right now it’s the only way to do it.

To run epubcheck type the following command:

java -jar /path/to/epubcheck/epubcheck.jar mybook.epub

Convert to Mobi and KF8 using kindlegen

Once you have validated your ePub3 file you can use Kindlegen to convert it to both eInk (KF6) and Kindle Fire (KF8) formats. The resulting Mobi file contains both versions; you have no choice in the matter.

Kindlegen is a command line application, chosen because it is easier to merge with the command line based Ant workflow. There is another possibility worth exploring, but it’s beyond the scope of this essay.

To run Kindlegen, use a command like this:

/path/to/kindlegen mybook.epub

If there are any errors at this point the fixes have to be made on the XHTML documents, then zipped and validated as ePub before validating it again as Mobi. Warnings are, for the most part, OK. Make a note of what they are and, if so inclined, report them to the docbook-apps mailing list.

Test on your target devices

The final thing to do is test on as many actual devices as you can. The experience with the cloud readers is definitely not the same as reading on the device, particularly on the older Kindle and Kindle DX devices that only support the KF6 format (no fonts, no colors, no audio and video).

Once you are happy with the final product you can submit it through the KDP program.

Happy publishing!