Creating printable content from the web

February 27, 2017

# Creating printable content from the web

All the stylesheets are written in SCSS and converted to CSS through the build process.

In early 2015 I played with the idea of creating a custom XML vocabulary and the associated style sheets (XSLT, CSS and CSS Paged Media) to convert it to HTML, EPUB and PDF. It worked, the files were fairly easy to work with (at least for me, who created them) and it was a good way to learn how to write XML schemas and XSLT style sheets and to hone my skills with CSS. The [repository](https://github.com/caraya/xml-workflow) is available.

While it works, the project relies on a custom XML vocabulary and it would take some work to convert it to use (X)HTML5. I think I would start with the CSS stylesheet, change it to suit the needs of shorter content, integrate the PDF creation system into the existing Gulp build system and host the PDF content in a third party space like Amazon S3.

## Differences with @print media

As far as I understand, the difference is that the print media type (`@media print`) was designed to make existing content printable directly from the browser, not to create a separate printed version of the content we're working with.

In an ideal world Paged Media would be fully supported by all browsers and some of the work we're doing in this project would be unnecessary, but it is not an ideal world.

In this world we create pages for each type of content we want and associate portions of HTML to each of the page types we created. We can then add additional material as needed.
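As a minimal sketch of that association, the rules below define one page type and attach a portion of the HTML to it. The `sidebar` page name and the `data-type='sidebar'` attribute are hypothetical and only serve as an illustration; the stylesheet built in this article uses its own set of names.

```scss
/* Hypothetical page type; "sidebar" is an assumed name used only for this example */
@page sidebar {
  size: 8.5in 11in;
  margin: 1in 1.5in;

  /* Margin box that prints the current page number */
  @bottom-center {
    content: counter(page);
  }
}

/* Any section marked with the (assumed) data-type is laid out on "sidebar" pages */
section[data-type='sidebar'] {
  page: sidebar;
  page-break-before: always;
}
```

The `page` property is what ties a block of markup to a named `@page` rule, so adding a new kind of page always takes both pieces.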
We then use a third party tool (all commercial, unfortunately) to process the HTML with the paged media style sheet we just created and produce PDF or other formats, depending on the formatter.

For a model to follow when using the stylesheet check [O'Reilly's HTMLBook specification](https://oreillymedia.github.io/HTMLBook/). A good example of a book created with this specification and edited with [Atlas](https://oreillymedia.github.io/HTMLBook/) is Lea Verou's [CSS Secrets](http://shop.oreilly.com/product/0636920031123.do).

## Creating the paged media work

We'll break this process into three parts:

1. Creating the additional paged media style sheet
2. Modifying the HTML template and adding additional HTML to our markup
3. Creating additional Gulp tasks to automate the process

### Creating the additional paged media style sheet

This stylesheet sets up a printed stylesheet with a basic set of parameters. It works as is, but it's also meant as a starting point for printed media work and you can certainly refine it based on your needs.

The style sheet will control the print layout but not any other attribute controlled by CSS. In this case we've taken care of this in the stylesheet associated with the template; otherwise you'll have to provide your own styles.

We begin the process by defining our standard page as an 8.5 x 11 inch page (US Letter) and give it default margins of 1 inch. We also define a `@footnote` at-rule to increment the footnote counter on each page.

Next we add default attributes for the book, defined as `body[data-type="book"]`. In this element we define the default text color. Paged media agents (at least PrinceXML) support CMYK colors so we use that format for opaque black. If the rendering engine doesn't support CMYK we fall back to RGBA for the same color. If you're using Pantone colors you can use this handy [converter](http://rgb.to/pantone) to get the fallback RGBA color.
Lastly we define the default hyphenation.

The HTML version of the document centers the `.container` element and makes it 48em (768 pixels) wide. For printed media we want to work with the full width of the document so we remove the centering and make the width 100%. It will not affect the HTML version of the document because it's in a separate style sheet that will never be loaded together with the `main.scss` styles.

```scss
/* DEFINE THE DEFAULT PAGE */
@page {
  size: 8.5in 11in;
  margin: 1in;

  /* Footnote related attributes */
  @footnote {
    counter-increment: footnote;
    float: bottom;
    column-span: all;
    height: auto;
  }
}

/* Default definitions for the book */
body[data-type='book'] {
  /* RGBA fallback goes first; agents that understand cmyk() use the later declaration */
  color: rgba(0, 0, 0, 1);
  color: cmyk(0%, 0%, 100%, 100%);
  hyphens: auto;
}

.container {
  width: 100%;
}
```

This is a small block to define how we'll handle increasing the page number counter. The first part of the book and the first article will reset the page number counter (`page`) because we would normally have front matter using a different numbering system and we don't want that number to carry over to the beginning of our content.

The second selector in this block will not reset counters when an article child of body is followed by a part. We have to be explicit to use this rule because, otherwise, it would trigger the `:first-of-type` rule immediately above.

```scss
/* PAGE COUNTERS */
body[data-type='book'] > div[data-type='part']:first-of-type,
body[data-type='book'] > section[data-type='article']:first-of-type {
  /* Both counters in one declaration; a second counter-reset would override the first */
  counter-reset: page footnote;
}

body[data-type='book'] > section[data-type='article'] + div[data-type='part'] {
  counter-reset: none;
}
```

Other than book (defined as an attribute of body) and part (defined as an attribute of div elements) the rest of our book is defined in section elements. This is a two step process. In the first step (shown below) we associate a given type (`data-type`) with a named page we'll define later in the stylesheet.
In this block we also define page specific attributes for each type of page. Most of the pages break before (`page-break-before`) and after (`page-break-after`).

The only additional element that I've placed in this block is the dotted leader for the table of contents' page numbers. `section[data-type='toc'] nav ol li a:after` will place a dotted leader followed by the target content's page number. Because we want this to work only on the table of contents we have to be very specific. In most other circumstances I might shorten the selector.

To create an additional content type, create a new definition here and a `@page` definition later in the document.

```scss
/* MATCH THE PARTS OF OUR BOOK FILE TO EACH OF OUR PAGES DEFINED LATER */

/* Title Page */
section[data-type='titlepage'] {
  page: titlepage;
  page-break-before: always;
  page-break-after: always;
}

/* Copyright page */
section[data-type='copyright'] {
  page: copyright;
  page-break-before: always;
  page-break-after: always;
}

/* Dedication */
section[data-type='dedication'] {
  page: dedication;
  page-break-before: always;
  page-break-after: always;
}

/* TOC */
section[data-type='toc'] {
  page: toc;
  page-break-before: always;
  page-break-after: always;
}

/* Leader for toc page */
section[data-type='toc'] nav ol li a:after {
  content: leader(dotted) ' ' target-counter(attr(href, url), page);
}

/* Foreword */
section[data-type='foreword'] {
  page: foreword;
}

/* Preface */
section[data-type='preface'] {
  page: preface;
}

/* Part */
div[data-type='part'] {
  page: part;
}

/* Chapter */
section[data-type='article'] {
  page: article;
  page-break-before: always;
}

/* Appendix */
section[data-type='appendix'] {
  page: appendix;
  page-break-before: always;
}

/* Glossary */
section[data-type='glossary'] {
  page: glossary;
}

/* Bibliography */
section[data-type='bibliography'] {
  page: bibliography;
}

/* Index */
section[data-type='index'] {
  page: index;
}

/* Colophon */
section[data-type='colophon'] {
  page: colophon;
}
```

These are the page
definitions. We have two options: define both left and right versions of the page (`toc:right` and `toc:left`) for content that changes between both sides, or create a single page template (`toc`) that will apply equally to left and right sides.

It is here that we set specific attributes for the different types of pages we're working with. For example, the front matter pages are numbered with lowercase Roman numerals while the main body uses Arabic numerals (this is another reason why we reset the counter on the first part or article).

```scss
/* Common Front Matter Page Numbering in lowercase ROMAN numerals */
@page toc:right {
  @bottom-right-corner { content: counter(page, lower-roman); }
  @bottom-left-corner { content: normal; }
}

@page toc:left {
  @bottom-left-corner { content: counter(page, lower-roman); }
  @bottom-right-corner { content: normal; }
}

@page foreword:right {
  @bottom-center { content: counter(page, lower-roman); }
  @bottom-left-corner { content: normal; }
}

@page foreword:left {
  @bottom-left-corner { content: counter(page, lower-roman); }
  @bottom-right-corner { content: normal; }
}

@page preface:right {
  @bottom-center { content: counter(page, lower-roman); }
  @bottom-right-corner { content: normal; }
  @bottom-left-corner { content: normal; }
}

@page preface:left {
  @bottom-center { content: counter(page, lower-roman); }
  @bottom-right-corner { content: normal; }
  @bottom-left-corner { content: normal; }
}

/* Common Content Page Numbering in Arabic numerals 1... 199 */
@page titlepage {
  /* Need this to clean up page numbers in titlepage in Prince */
  margin-top: 18em;
  @bottom-right-corner { content: normal; }
  @bottom-left-corner { content: normal; }
}

@page dedication {
  /* Need this to clean up page numbers in dedication in Prince */
  page-break-before: always;
  margin-top: 18em;
  @bottom-right-corner { content: normal; }
  @bottom-left-corner { content: normal; }
}

@page article {
  @bottom-center {
    vertical-align: middle;
    text-align: center;
    content: element(heading);
  }
}

@page article:blank {
  /* Need this to clean up page numbers on blank pages in Prince */
  @top-center { content: "This page is intentionally left blank"; }
  @bottom-left-corner { content: normal; }
  @bottom-right-corner { content: normal; }
}

@page article:right {
  @bottom-right-corner { content: counter(page); }
  @bottom-left-corner { content: normal; }
}

@page article:left {
  @bottom-left-corner { content: counter(page); }
  @bottom-right-corner { content: normal; }
}

@page appendix:right {
  @bottom-right-corner { content: counter(page); }
  @bottom-left-corner { content: normal; }
}

@page appendix:left {
  @bottom-left-corner { content: counter(page); }
  @bottom-right-corner { content: normal; }
}

@page glossary:right {
  @bottom-right-corner { content: counter(page); }
  @bottom-left-corner { content: normal; }
}

@page glossary:left {
  @bottom-left-corner { content: counter(page); }
  @bottom-right-corner { content: normal; }
}

@page bibliography:right {
  @bottom-right-corner { content: counter(page); }
  @bottom-left-corner { content: normal; }
}

@page bibliography:left {
  @bottom-left-corner { content: counter(page); }
  @bottom-right-corner { content: normal; }
}

@page index:right {
  @bottom-right-corner { content: counter(page); }
  @bottom-left-corner { content: normal; }
}

@page index:left {
  @bottom-left-corner { content: counter(page); }
  @bottom-right-corner { content: normal; }
}
```

To create a running footer we'll use the [running()](https://www.w3.org/TR/css-gcpm-3/#running-syntax) value
to use the element with the `rh` class to do the following:

- Take it out of the regular flow of the document
- Use it as the text of the running header
- Make the text italic and center it

```scss
p.rh {
  position: running(heading);
  text-align: center;
  font-style: italic;
}
```

Footnotes are a little more complicated than most of what we've seen so far.

![](https://www.w3.org/TR/css-gcpm-3/footnote-diagram.001.jpg)

Footnote terminology as it relates to generated page media

Now that we have the terminology, let's dive into specifics:

- footnote element (`span.footnote`): The element containing the content of the footnote, which will be removed from the flow and displayed as a footnote.
- footnote marker (also known as footnote number) (`::footnote-marker`): A number or symbol adjacent to the footnote body, identifying the particular footnote. The footnote marker should use the same number or symbol as the corresponding footnote call, although the marker may contain additional punctuation (the content of the `::footnote-marker::after` selector).
- footnote body (content inside `span.footnote`): The footnote marker is placed before the footnote element, and together they represent the footnote body.
- footnote call (also known as footnote reference): A number or symbol, found in the main text, which points to the footnote body. The default is to use Arabic numbers but it can be customized to use other symbols.
- footnote area: The page area used to display footnotes. Usually located at the bottom of a page.
- footnote rule (also known as footnote separator): A horizontal rule is often used to separate the footnote area from the rest of the page. The separator (and the entire footnote area) cannot be rendered on a page with no footnotes.

The footnote element will be removed from the flow of text and replaced by the footnote call symbol. The footnote counter number increases.
The footnote body is placed in the footnote area in document order.

```scss
/* Footnotes */
span.footnote {
  float: footnote;
}

::footnote-marker {
  content: counter(footnote);
  list-style-position: inside;
}

::footnote-marker::after {
  content: '. ';
}

::footnote-call {
  content: counter(footnote);
  vertical-align: super;
  font-size: 65%;
}
```

Creating cross references is as simple as creating an internal link with the `xref` class like this: `
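On the stylesheet side, a cross reference can borrow the `target-counter()` technique already used for the table of contents leader. The rule below is only a sketch and is not part of the original stylesheet: it assumes the links are `a` elements carrying the `xref` class whose `href` points to an `id` inside the same document, and the "(page N)" wording is an arbitrary choice.

```scss
/* Assumed rule, for illustration only: append the page number of the link target */
a.xref::after {
  content: ' (page ' target-counter(attr(href, url), page) ')';
}
```

Because the page number is resolved by the formatter when it lays out the document, the reference stays correct when content moves to a different page.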

<%= contents %>