Unicode and multilingual support

Unicode is an international standard, maintained by the Unicode Consortium and kept in sync with ISO/IEC 10646, for the consistent encoding, representation, and handling of text in most of the world’s writing systems. It answers the question: How do I combine multiple languages in a single document?

In the past we had different encodings for different languages, codified in the ISO/IEC 8859 series of standards. English, for example, can be written with ISO/IEC 8859-1, which extends ASCII with the accented characters used by Western European languages. This is not sufficient for Eastern European languages, which use additional symbols, accents and punctuation. See the Wikipedia entry for ISO 8859 for details on which languages each part of the standard supports.

HTML and XHTML both come with a predefined set of named entities for Unicode characters (for example, &Upsilon; for the uppercase Greek letter Upsilon, Υ). But that is still no guarantee that your chosen font will have the glyph matching the character you need.
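When a character has no named entity, or you can’t remember the name, a numeric character reference built from its code point works as well. The sketch below is only an illustration: toNumericReferences is a hypothetical helper name, not part of any library, and it leaves plain ASCII untouched.

```javascript
// Convert every non-ASCII character to a hexadecimal numeric character reference.
// toNumericReferences is a hypothetical helper, not part of any library.
function toNumericReferences(text) {
  // Array.from iterates by code point, so characters outside the BMP stay whole.
  return Array.from(text)
    .map((ch) => {
      const codePoint = ch.codePointAt(0);
      return codePoint > 0x7f
        ? `&#x${codePoint.toString(16).toUpperCase()};`
        : ch;
    })
    .join("");
}

console.log(toNumericReferences("\u03A5")); // &#x3A5; (uppercase Upsilon)
```

The reference tells the browser which character you mean; whether a glyph is drawn for it still depends on the font.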

I’ve had a discussion in the XML Content/InDesign Publishing LinkedIn group and, as part of my research, discovered that:

  • The level of Unicode glyph support depends on the fonts available in the OS
  • Specific Unicode glyphs may not be available on all platforms

In my opinion, the best solution is still to use web fonts that you know contain all the characters your text uses and that you have tested with your content. This is doubly important for less common glyphs: make sure that every glyph you use is available in the typeface you’ve selected.

To make things easier on yourself, you can also subset the fonts you’re using so that only the characters you need are included in the font file. This is particularly useful for languages with large character sets, such as Japanese, (traditional) Chinese or Korean, where a font can contain thousands of glyphs.

When I want to use font subsets my favorite tool is Font Squirrel’s Webfont generator. I’ve documented how I use the generator and other possible subsetting solutions so I won’t go into too much detail of the mechanics… but I will cover the basics just to be sure.

Font Squirrel has three options for subsetting fonts. We will work with the custom options because they provide the most flexible approach.

Font Squirrel Font Subset Options when using custom subsetting

You can subset by character type, language, Unicode table, single characters or, if you know what you’re doing, by Unicode ranges. The last option is only useful if you know the exact range or ranges of characters that you need to support. Check the preview as well: it may show you that your chosen font doesn’t support the characters you need.
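If you don’t know which ranges your content needs, you can derive them from the text itself. The following is a minimal sketch, assuming you have the content available as a string; unicodeRangesFor is a hypothetical helper, not something Font Squirrel provides. It collects the code points used and collapses contiguous runs into U+XXXX-YYYY ranges.

```javascript
// List the Unicode ranges covering every code point used in `text`.
// unicodeRangesFor is a hypothetical helper, not part of Font Squirrel.
function unicodeRangesFor(text) {
  // Unique code points, sorted numerically.
  const points = [...new Set(Array.from(text, (ch) => ch.codePointAt(0)))]
    .sort((a, b) => a - b);
  const hex = (n) => n.toString(16).toUpperCase();
  const format = (start, end) =>
    start === end ? `U+${hex(start)}` : `U+${hex(start)}-${hex(end)}`;

  const ranges = [];
  let start = null;
  let prev = null;
  for (const cp of points) {
    if (start !== null && cp === prev + 1) {
      prev = cp; // still inside a contiguous run
      continue;
    }
    if (start !== null) ranges.push(format(start, prev));
    start = prev = cp; // begin a new run
  }
  if (start !== null) ranges.push(format(start, prev));
  return ranges.join(", ");
}

console.log(unicodeRangesFor("abc\u00E9")); // U+61-63, U+E9
```

Run it against a representative sample of your content and compare the result with what the subsetting preview shows.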

OpenType features

OpenType fonts have a set of features that make typographical work easier both in print and, where supported, on the web. The features listed below can be enabled from CSS; they are a subset of all the OpenType features available.

  • “c2sc” : small caps from caps: Substitutes capital letters with small caps
  • “calt” : contextual alternates: Applies a second substitution feature based on a match of a character pattern within a context of surrounding patterns
  • “clig” : contextual ligatures: Applies a second ligature feature based on a match of a character pattern within a context of surrounding patterns
  • “dlig” : discretionary ligatures: Ligatures to be applied at the user’s discretion
  • “hist” : historical character alternatives: Obsolete forms of characters to be applied at the user’s discretion
  • “hlig” : historical ligatures: Obsolete ligatures to be applied at the user’s discretion
  • “kern” : enable use of embedded kerning table
  • “liga” : common ligatures: Replaces a sequence of characters with a single ligature glyph
  • “nalt” : alternate annotation: Provides user access to circled digits, inverse letters etc.
  • “salt” : stylistic alternatives: Either replaces with, or displays list of, stylistic alternatives for a character
  • “smcp” : small caps: Substitutes lower-case letters with small caps versions
  • “ss01 — ss05”: alternate stylistic set 1 through 5: Replaces character with one from a font-specific set of alternatives
  • “swsh” : swashes: Either replaces character with or displays multiple swashed versions
  • “zero” : slashed-zero: Replaces the digit 0 with slashed 0

Adobe has more information about OpenType features, both the ones listed above and additional features used in desktop publishing. When in doubt, use the tag names above to look up a feature in Adobe’s list. Also make sure that the font you want to work with supports the features you intend to use.

The best way to check which OpenType features a font supports is to check the font specimen, if available. If not, contact the foundry or the font creator to check on feature support.

To enable these features in CSS use something like the code below to account for most modern browser versions:

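body {
  /* Older Firefox syntax: one quoted string of tag=value pairs */
  -moz-font-feature-settings: "liga=1, dlig=1";
  /* Prefixed variants using the standard syntax */
  -ms-font-feature-settings: "liga", "dlig";
  -webkit-font-feature-settings: "liga", "dlig";
  -o-font-feature-settings: "liga", "dlig";
  /* Standard property last so it wins where supported */
  font-feature-settings: "liga", "dlig";
}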

This feature is a prime candidate for Autoprefixer or some other automation tool. Repeating the same property-value pair for three or four prefixed alternatives is a nightmare. According to caniuse.com at the time of writing, only IE, Chrome/Opera and Firefox support the feature, and Chrome/Opera support it behind the -webkit- prefix.
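To give an idea of what that automation saves you, here is a rough sketch that expands a list of feature tags into the declarations you would otherwise type by hand. The function name featureDeclarations and its default prefix list are assumptions made for illustration; this is not how Autoprefixer works internally.

```javascript
// Expand feature tags into prefixed and unprefixed declarations.
// featureDeclarations and the default prefixes are illustrative assumptions.
function featureDeclarations(tags, prefixes = ["-moz-", "-ms-", "-webkit-", "-o-"]) {
  const standardValue = tags.map((tag) => `"${tag}"`).join(", ");
  // Older Firefox syntax: a single quoted string of tag=1 pairs.
  const oldMozValue = tags.map((tag) => `${tag}=1`).join(", ");

  const lines = [`-moz-font-feature-settings: "${oldMozValue}";`];
  for (const prefix of prefixes) {
    lines.push(`${prefix}font-feature-settings: ${standardValue};`);
  }
  // Unprefixed property goes last so it takes precedence where supported.
  lines.push(`font-feature-settings: ${standardValue};`);
  return lines;
}

console.log(featureDeclarations(["liga", "dlig"]).join("\n"));
```

In a real project a build step is a better home for this than hand-rolled code; the point is only that the repetition is mechanical.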

For a better explanation and specific examples, check this Typekit Help article.

Kerning


Kerning can refer to one of two things: spacing instructions that type designers put into font files to mitigate awkward character combinations, or spacing adjustments that graphic designers make as they typeset.

In this section we will only discuss the second definition of kerning. We will cover both the built-in CSS kerning features and Lettering.js, a JavaScript library that provides enhanced support for kerning and other formatting using CSS. The idea is to adjust the kerning without having to refactor the content.

Automatic kerning in CSS depends on OpenType features, particularly on the kern feature being enabled. Kerning data varies from font to font and is embedded in the font when it is created.

[codepen_embed height=”618″ theme_id=”2039″ slug_hash=”RPZrKv” default_tab=”result” user=”caraya”]See the Pen Automatic Kerning by Carlos Araya (@caraya) on CodePen.[/codepen_embed]

There may be situations where you want less uniform spacing based on specific attributes. We can control the spacing between words, as in example 8-2, below. The first paragraph has normal spacing; the second paragraph has words spaced by an extra 0.25em; and the third one has a negative spacing of 0.25em.

[codepen_embed height=”618″ theme_id=”2039″ slug_hash=”rVzxyz” default_tab=”result” user=”caraya”]See the Pen Kerning using word spacing by Carlos Araya (@caraya) on CodePen.[/codepen_embed]

Another way to work with kerning is to change letter spacing within words, as in example 8-3. Notice the difference between the example above using word spacing and the one below using letter spacing: when we adjust the spacing between letters the text looks significantly different than when we adjust the spacing between words.

[codepen_embed height=”475″ theme_id=”2039″ slug_hash=”ZGJQjX” default_tab=”result” user=”caraya”]See the Pen Kerning using letter spacing by Carlos Araya (@caraya) on CodePen.[/codepen_embed]

As with many other things on the web, different browsers have different levels of support for kerning. This makes the feature another prime candidate for Autoprefixer. If you’re not inclined to automate, you can do something like this:

div#kerningExample {
  border: 1px solid #cc092f;
  padding: 1em;
  width: 100%;
  text-rendering: optimizeLegibility;
  -moz-font-feature-settings: "kern";
  -moz-font-feature-settings: "kern=1";
  -ms-font-feature-settings: "kern";
  -o-font-feature-settings: "kern";
  -webkit-font-feature-settings: "kern";
  font-feature-settings: "kern";
}

There are two entries for Firefox (-moz-font-feature-settings) because the syntax for the property changed in an incompatible way between releases: older versions expect "kern=1" while newer ones expect "kern".

The version for Opera (-o-font-feature-settings) is for older versions before adopting Blink.

Using Lettering.js

One of the biggest drawbacks of CSS kerning is that it applies to all the letters and all the words in the elements a rule selects. If you want to work at a finer level you have to use libraries like Lettering.js.

Lettering.js is a jQuery plugin that automatically adds tags (span) and class names to letters, words or lines depending on how we call the plugin. We can then style each individual class as we see fit.

You can wrap spans around characters, words, lines and combinations of the three to get as tight a control over the text style as you need or want. Because it inserts a span and a class for every unit it wraps, designers should limit it to headings or small chunks of text, or it will affect performance.

Working with Lettering.js is a three step process.

Following the Lettering.js Wiki Example we will use this as our example text:

<h1 class="fancy_title">Some Title</h1>

First, we need to load and initialize the plugin as you would any other plugin:

<script src="path/to/jquery.min.js"></script>
<script src="path/to/jquery.lettering.min.js"></script>
<script>
$(document).ready(function() {
  $(".fancy_title").lettering();
});
</script>

Contrary to what most people tell you, I put all my script initializers and loaders at the bottom of my file to make sure that there is content loaded before I initialize code that will change it. It’s always been a catch-22.

If you let your content load before scripts and style sheets are loaded then the content will flash and change as the styles modify the way the content looks and behaves. But if you put the scripts and style sheets first then they will all have to load before the content is displayed, and that may take a long time (at least in web terms): up to several seconds.

Web performance patterns advise that you put Javascripts at the bottom of your page before your </body> tag. There is an unfortunate side effect where you may experiences a FOUT (Flash of Unstyled Text) when you’re manipulating your text after the DOM has loaded. Unfortunately, we found the best solution to avoid/minimize the FOUT caused by this plugin is to put your scripts (jQuery, Lettering.js) in the document <head>. On the one hand, your page will load slower. On the other hand, a flash/restyling makes your site feel slow. Users might ultimately feel the site is faster if they don’t see the FOUT.

Dave Rupert. Lettering.js Readme

The result will appear like this:

<h1 class="fancy_title">
  <span class="char1">S</span>
  <span class="char2">o</span>
  <span class="char3">m</span>
  <span class="char4">e</span>
  <span class="char5"> </span>
  <span class="char6">T</span>
  <span class="char7">i</span>
  <span class="char8">t</span>
  <span class="char9">l</span>
  <span class="char10">e</span>
</h1>

We can then style the result with CSS that looks something like this:

.fancy_title .char2 {
  margin-left: -.0125em;
  color: purple;
}

We can be even more detailed in how we break our content out and how much we style it. In the example below:

<h1>My Three Sons</h1>
<script>
  $("h1").lettering('words').children('span').lettering();
</script>

The lettering invocation will create spans for words and then it will split each word into its component characters, producing HTML like the one below:

<h1>
  <span class="word1">
    <span class="char1">M</span>
    <span class="char2">y</span>
  </span>
  <span class="word2">
    <span class="char1">T</span>
    <span class="char2">h</span>
    <span class="char3">r</span>
    <span class="char4">e</span>
    <span class="char5">e</span>
  </span>
  <span class="word3">
    <span class="char1">S</span>
    <span class="char2">o</span>
    <span class="char3">n</span>
    <span class="char4">s</span>
  </span>
</h1>
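If you want to preview the class names before wiring up the plugin, the numbering scheme is easy to reproduce. The sketch below is a DOM-free approximation of the markup shown above, not Lettering.js’s actual code; wrapChars and wrapWords are hypothetical helper names, and the input is assumed to be plain text without HTML special characters.

```javascript
// Approximate the markup pattern shown above without jQuery or a DOM.
// wrapChars and wrapWords are hypothetical helpers, not the plugin's API.
// Assumes plain text input: nothing is HTML-escaped.
function wrapChars(text) {
  return Array.from(text)
    .map((ch, i) => `<span class="char${i + 1}">${ch}</span>`)
    .join("");
}

function wrapWords(text) {
  // Each word gets a numbered span; character numbering restarts per word.
  return text
    .split(" ")
    .map((word, i) => `<span class="word${i + 1}">${wrapChars(word)}</span>`)
    .join(" ");
}

console.log(wrapWords("My Three Sons"));
```

Knowing the pattern up front lets you write selectors such as .word2 .char1 before the script has run.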

Although, as mentioned earlier, this is not good for larger chunks of text, the possibilities for headings and smaller pieces of text are only limited by your imagination.

Links and resources

From MDN

Other resources

If you really hate someone, teach them to recognize bad kerning

Text Alignment and Hyphenation

Given the same font, alignment and hyphenation have a definite impact on the way we read content and how it appears on-screen.

[codepen_embed height=”691″ theme_id=”2039″ slug_hash=”XbabOL” default_tab=”result” user=”caraya”]See the Pen Text Alignment Possibilities by Carlos Araya (@caraya) on CodePen.[/codepen_embed]

None of the paragraphs above is hyphenated. Note in particular how the Justified paragraph leaves larger gaps between words to accommodate the justification. It is the same, although not so noticeable, in the other paragraphs.

Although CSS supports hyphenation through the hyphens property, browser support is inconsistent (at the time of writing it only works in Firefox). A good alternative is to use a library such as Hyphenator to offer a consistent hyphenation experience.

Some of the Hyphenator’s drawbacks (also from http://mnater.github.io/Hyphenator/):

  • Hyphenator.js and the hyphenation patterns are quite large. Good compression and caching is vital.
  • Automatic hyphenation can’t be perfect: it may lead to misleading hyphenation like leg-ends (depends on the pattern quality)
  • There’s no support for special (aka non-standard) hyphenation (e.g. omaatje->oma-tje)
  • There’s no way for JavaScript to influence the algorithm for laying out text in the browser. Thus we can’t control how many hyphens occur on subsequent lines nor can we know which words actually have to be hyphenated. Hyphenator.js just hyphenates all of them.
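The last drawback makes more sense once you see the general technique: script-based hyphenation marks possible break points with soft hyphens (U+00AD) and leaves the actual line breaking to the browser. The sketch below illustrates the idea with a tiny, hypothetical lookup table (BREAK_POINTS); a real library derives the break points from language-specific patterns, and none of these names come from Hyphenator.

```javascript
const SOFT_HYPHEN = "\u00AD";

// Hypothetical break positions (character offsets) for two sample words.
// A real library computes these from language-specific patterns.
const BREAK_POINTS = new Map([
  ["hyphenation", [2, 6]], // hy-phen-ation
  ["typography", [2, 5, 7]], // ty-pog-ra-phy
]);

// Insert a soft hyphen at every known break point; other words are untouched.
function softHyphenate(text) {
  return text.replace(/[A-Za-z]+/g, (word) => {
    const breaks = BREAK_POINTS.get(word.toLowerCase());
    if (!breaks) return word;
    let result = "";
    let last = 0;
    for (const index of breaks) {
      result += word.slice(last, index) + SOFT_HYPHEN;
      last = index;
    }
    return result + word.slice(last);
  });
}

const sample = softHyphenate("Hyphenation matters");
// Soft hyphens are invisible unless the browser breaks the line there.
console.log(sample.split(SOFT_HYPHEN).join("-")); // Hy-phen-ation matters
```

Every known word gets its break points whether or not it ends up at the end of a line, which is why the script cannot tell which words were actually hyphenated on screen.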

[codepen_embed height=”689″ theme_id=”2039″ slug_hash=”PqKPvV” default_tab=”result” user=”caraya”]See the Pen Text Alignment Possibilities With Hyphenated Text. by Carlos Araya (@caraya) on CodePen.[/codepen_embed]

Advanced Typography

Elliot Jay Stocks presented on Advanced Typography at the Generate 2014 conference in London. The talk provides a lot of information about this subject.

Elliot Jay Stocks — Generate 2014 conference