How Does a Unicode Character Get Mapped to a Glyph in a Font?

By Xah Lee.

How does a Unicode character get mapped to a glyph in a font?

TrueType fonts consist of a number of sections, most importantly for this question a table of “glyphs” and a table (“cmap”) for mapping characters to those glyphs.

Long story short, the operating system uses the “cmap” table to convert character codes into glyph indexes, substituting the font's default “missing” glyph for any character that has no matching entry. Unfortunately, there are multiple versions of the font file specification (not to mention different types of fonts), and the “cmap” table can store the same mappings in several subtable formats and character encodings. As a result, doing the mapping, and doing it efficiently enough that text drawing stays fast, ends up being extremely complex.
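As a rough illustration of that lookup, here is a minimal Python sketch. It is loosely modeled on the segment idea used by one common “cmap” subtable format (format 4, in the simple case where the glyph index is the character code plus a per-segment delta). The segment data, the names SEGMENTS and glyph_index, and the glyph numbers are all made up for the example and do not come from any real font; a real renderer also has to pick a subtable and handle other formats.

```python
from bisect import bisect_left

# Hypothetical segments: (start_code, end_code, id_delta), sorted by end_code.
# The values are invented for illustration, not read from a real font.
SEGMENTS = [
    (0x0020, 0x007E, -29),  # printable ASCII  -> glyph indexes 3..97
    (0x00A0, 0x00FF, -62),  # Latin-1 letters  -> glyph indexes 98..193
]
END_CODES = [end for _, end, _ in SEGMENTS]

# In TrueType, glyph index 0 is the "missing character" glyph.
NOTDEF = 0


def glyph_index(code_point):
    """Return the glyph index for code_point, or NOTDEF if unmapped."""
    # Find the first segment whose end code is >= the code point.
    i = bisect_left(END_CODES, code_point)
    if i == len(SEGMENTS):
        return NOTDEF  # beyond the last segment
    start, _end, delta = SEGMENTS[i]
    if code_point < start:
        return NOTDEF  # falls in a gap between segments
    # Glyph index is the code plus the segment delta, modulo 65536.
    return (code_point + delta) & 0xFFFF


if __name__ == "__main__":
    for ch in ["A", "é", "中"]:
        print(f"U+{ord(ch):04X} -> glyph {glyph_index(ord(ch))}")
```

The binary search over segment end codes is one reason the lookup can stay fast even for fonts with many mapped ranges; any character that falls outside every segment simply gets the default glyph.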

A “code point” is the number Unicode assigns to a character, and it is completely independent of encodings and fonts. A particular code point is universal, but there are many encodings for it (UTF-8, UTF-16, etc.) and it will map to different glyph indexes in different fonts.
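To make the distinction concrete, this short Python sketch takes one code point (U+20AC, the euro sign, chosen arbitrarily for the example) and shows the different byte sequences it gets under three encodings. The variable names are just for illustration.

```python
ch = "€"  # U+20AC EURO SIGN

# The code point is a single number, the same everywhere.
code_point = ord(ch)

# The same code point serialized with three different encodings.
# Big-endian variants are used so that no byte-order mark is added.
utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")

print(f"code point: U+{code_point:04X}")
print("UTF-8 :", utf8.hex(" "))
print("UTF-16:", utf16.hex(" "))
print("UTF-32:", utf32.hex(" "))
```

The font never sees these bytes: text is decoded to code points first, and only the code point is looked up in the font's “cmap” table.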

Apple's developer documentation has a pretty good section on the details of TrueType fonts:

http://developer.apple.com/fonts/ttrefman/

Specifically:

Glyph table: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6glyf.html

Character map: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html

I also recommend an application called BabelMap, which gives you a lot of interesting information about fonts. Specifically look at Tools/Unicode Summary, Fonts/Font Analysis Utility, and Fonts/Font Information, where you can extract the entire glyph mapping table to the clipboard.

From Stack Overflow: http://stackoverflow.com/questions/3582944/how-does-a-unicode-character-get-mapped-to-a-glyph-in-a-font