Spec for Thai OpenType Font Creation
Glyph Set
- Latin glyphs (space…asciitilde, endash, emdash, quoteleft, quoteright,
  quotedblleft, quotedblright, bullet, ellipsis)
- dottedcircle
- Thai glyphs as listed in Unicode (uni0E01…uni0E3A, uni0E3F…uni0E5B)
- Larger alternate glyphs for Thai tone marks (uni0E48.low…uni0E4C.low),
  placed at the lower position when no upper vowel is present
- Descender-less alternate glyphs for ญ (uni0E0D.descless) and ฐ
  (uni0E10.descless)
- Alternate glyph for NIKHAHIT in the high position (uni0E4D.high), and
  optionally for MAITAIKHU (uni0E47.high), to prevent glyph class overload
  in the GSUB rules, as discussed below, and possibly to provide smaller
  variants
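
As an illustration, the uniXXXX names in the list above can be enumerated with a few lines of Python. This is only a sketch: the helper name uni and the list names are made up here, and the Latin range is left out because the list above abbreviates it.

```python
def uni(cp: int, suffix: str = "") -> str:
    """Return a uniXXXX glyph name, optionally with a variant suffix."""
    return f"uni{cp:04X}{suffix}"

# Thai glyphs as listed in Unicode: U+0E01..U+0E3A and U+0E3F..U+0E5B.
THAI_GLYPHS = ([uni(cp) for cp in range(0x0E01, 0x0E3B)]
               + [uni(cp) for cp in range(0x0E3F, 0x0E5C)])

# Low-position alternates for the tone marks U+0E48..U+0E4C.
LOW_TONE_GLYPHS = [uni(cp, ".low") for cp in range(0x0E48, 0x0E4D)]

# Descender-less alternates for ญ (U+0E0D) and ฐ (U+0E10).
DESCLESS_GLYPHS = [uni(0x0E0D, ".descless"), uni(0x0E10, ".descless")]

# High-position NIKHAHIT, plus the optional high MAITAIKHU.
HIGH_GLYPHS = [uni(0x0E4D, ".high")]
OPTIONAL_HIGH_GLYPHS = [uni(0x0E47, ".high")]

# Everything except the Latin glyphs, which are not enumerated here.
NON_LATIN_GLYPHS = (["dottedcircle"] + THAI_GLYPHS + LOW_TONE_GLYPHS
                    + DESCLESS_GLYPHS + HIGH_GLYPHS)
```

Such a list can serve as a checklist when comparing against the glyph names actually present in a font.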
Glyph Positioning (GPOS)
- Anchor classes: AboveBase, AboveMark, BelowBase
- Base characters
  - AboveBase base anchors: for ป ฝ ฟ and optionally ฬ
    (depending on the design of the glyph itself)
  - BelowBase base anchors: for ฎ ฏ
- Upper vowels (◌ั ◌ิ ◌ี ◌ึ ◌ื ◌ํ)
  - AboveBase mark anchor, for attaching to base characters
  - AboveMark base anchor, for placing tone/diacritic marks
- Diacritics (◌็ ◌๎)
  - AboveBase mark anchor, for attaching to base characters
- Topmost marks (◌่ ◌้ ◌๊ ◌๋ ◌์, ◌ํ high version [uni0E4D.high], and optionally uni0E47.high)
  - AboveMark mark anchor, for attaching to upper vowels
- Alternate tones/diacritics (uni0E48.low…uni0E4C.low)
  - AboveBase mark anchor, for attaching to base characters
- Lower vowels/marks (◌ุ ◌ู ◌ฺ)
  - BelowBase mark anchor, for attaching to base characters
Notes:
- NIKHAHIT (◌ํ, uni0E4D) falls into two categories: upper vowel and
  topmost mark. It can therefore serve both as the base mark and as the
  attaching mark in the AboveMark anchor class. This is not a problem for
  GPOS, but it can cause confusion in the GSUB rules. To prevent such
  overload, a separate variant glyph, uni0E4D.high, should be created
  for the mark version.
- MAITAIKHU (◌็, uni0E47) falls into two categories: topmost mark and
  diacritic. It therefore requires two mark anchors in total, one in the
  AboveBase class and one in the AboveMark class. Note that MAITAIKHU is
  used as a topmost mark only for writing some minority languages, such
  as Kuy. A careful implementation may provide a smaller alternate glyph
  for that purpose.
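
The anchor assignments above can be summarised as a small lookup table. The following Python sketch is illustrative only: the group names, ANCHOR_PLAN, and anchors_for are hypothetical, the optional ฬ anchor is omitted, and MAITAIKHU is assumed to have no separate uni0E47.high variant, so uni0E47 appears in two groups.

```python
# For each group: its glyphs, the classes in which the glyph is the
# attaching mark ("mark"), and the classes in which it offers a base
# anchor for something else to attach to ("base").
ANCHOR_PLAN = {
    "tall_base": {"glyphs": {"uni0E1B", "uni0E1D", "uni0E1F"},   # ป ฝ ฟ
                  "mark": set(), "base": {"AboveBase"}},
    "deep_base": {"glyphs": {"uni0E0E", "uni0E0F"},              # ฎ ฏ
                  "mark": set(), "base": {"BelowBase"}},
    "upper_vowel": {"glyphs": {"uni0E31", "uni0E34", "uni0E35",
                               "uni0E36", "uni0E37", "uni0E4D"},
                    "mark": {"AboveBase"}, "base": {"AboveMark"}},
    "diacritic": {"glyphs": {"uni0E47", "uni0E4E"},
                  "mark": {"AboveBase"}, "base": set()},
    # Topmost marks; uni0E47 is listed here too (assumption: no .high variant).
    "topmost": {"glyphs": {"uni0E48", "uni0E49", "uni0E4A", "uni0E4B",
                           "uni0E4C", "uni0E4D.high", "uni0E47"},
                "mark": {"AboveMark"}, "base": set()},
    "low_tone": {"glyphs": {"uni0E48.low", "uni0E49.low", "uni0E4A.low",
                            "uni0E4B.low", "uni0E4C.low"},
                 "mark": {"AboveBase"}, "base": set()},
    "lower": {"glyphs": {"uni0E38", "uni0E39", "uni0E3A"},
              "mark": {"BelowBase"}, "base": set()},
}

def anchors_for(glyph):
    """Return (mark anchor classes, base anchor classes) for a glyph name."""
    mark, base = set(), set()
    for group in ANCHOR_PLAN.values():
        if glyph in group["glyphs"]:
            mark |= group["mark"]
            base |= group["base"]
    return sorted(mark), sorted(base)
```

With this table, anchors_for("uni0E47") reports both AboveBase and AboveMark as mark anchors, which is the two-anchor case described in the note on MAITAIKHU, while uni0E4D and uni0E4D.high each keep a single role.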
Glyph Substitution (GSUB)
- The only feature that Thai rendering engines use is ‘ccmp’,
  so every substitution rule should be registered under that feature.
- Rules:
  - Reordering of lower marks (◌ุ|◌ู|◌ฺ) and topmost marks (◌่ ◌้ ◌๊ ◌๋ ◌์), plus selection of low variants
    - (Chain Contextual Substitution)
      [C] L T → T.low L
  - Glyph selection for topmost marks (◌่ ◌้ ◌๊ ◌๋ ◌์) for normal case
    - (Chain Contextual Substitution)
      [C] T → T.low
  - Glyph selection for base characters with descender part
    - (for Pali, as Single Substitution)
      {ญ|ฐ} → {ญ|ฐ}.descless
    - (for others, as Chain Contextual Substitution)
      {ญ|ฐ} [L] → {ญ|ฐ}.descless
  - Decomposition of SARA AM with tone marks combined
    - (Chain Contextual Substitution)
      [C] T [SARA-AM] → NIKHAHIT T
      SARA-AM → SARA-AA
  - Decomposition for SARA AM (◌ำ) for normal case
    - (Multiple Substitution)
      [C] SARA-AM → NIKHAHIT SARA-AA
  - Selection of the NIKHAHIT/MAITAIKHU high version when preceded by an upper vowel (◌ั ◌ิ ◌ี ◌ึ ◌ื)
    - (Chain Contextual Substitution)
      [U] {NIKHAHIT|MAITAIKHU} → {NIKHAHIT|MAITAIKHU}.high
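
To see what the rules above do to a glyph run, they can be simulated on lists of glyph names. The Python sketch below is not a GSUB implementation, and it makes several assumptions of its own: C is taken to be U+0E01..U+0E2E, the Pali single substitution is not modelled, all function names are hypothetical, and the order in apply_ccmp (SARA AM first, then descenders, then tone lowering, then high variants) is one order in which the rules do not interfere with each other, not an order stated by this spec.

```python
C = {f"uni{cp:04X}" for cp in range(0x0E01, 0x0E2F)}          # base consonants (assumed range)
L = {"uni0E38", "uni0E39", "uni0E3A"}                          # lower vowels/marks
T = {"uni0E48", "uni0E49", "uni0E4A", "uni0E4B", "uni0E4C"}    # topmost marks
U = {"uni0E31", "uni0E34", "uni0E35", "uni0E36", "uni0E37"}    # upper vowels
DESC = {"uni0E0D", "uni0E10"}                                  # ญ ฐ
BASES = C | {g + ".descless" for g in DESC}
SARA_AA, SARA_AM = "uni0E32", "uni0E33"
NIKHAHIT, MAITAIKHU = "uni0E4D", "uni0E47"

def decompose_sara_am(glyphs):
    """[C] T [SARA-AM] -> NIKHAHIT T, SARA-AM -> SARA-AA; else [C] SARA-AM -> NIKHAHIT SARA-AA."""
    out = []
    for g in glyphs:
        if g == SARA_AM and len(out) >= 2 and out[-1] in T and out[-2] in BASES:
            tone = out.pop()                    # move NIKHAHIT in front of the tone
            out += [NIKHAHIT, tone, SARA_AA]
        elif g == SARA_AM and out and out[-1] in BASES:
            out += [NIKHAHIT, SARA_AA]
        else:
            out.append(g)
    return out

def drop_descender(glyphs):
    """{ญ|ฐ} [L] -> {ญ|ฐ}.descless (contextual variant only)."""
    out = list(glyphs)
    for i in range(len(out) - 1):
        if out[i] in DESC and out[i + 1] in L:
            out[i] += ".descless"
    return out

def lower_tones(glyphs):
    """[C] L T -> T.low L, and [C] T -> T.low."""
    out = list(glyphs)
    for i in range(1, len(out)):
        if out[i - 1] not in BASES:
            continue
        if i + 1 < len(out) and out[i] in L and out[i + 1] in T:
            out[i], out[i + 1] = out[i + 1] + ".low", out[i]
        elif out[i] in T:
            out[i] += ".low"
    return out

def raise_after_upper_vowel(glyphs):
    """[U] {NIKHAHIT|MAITAIKHU} -> {NIKHAHIT|MAITAIKHU}.high."""
    out = list(glyphs)
    for i in range(1, len(out)):
        if out[i - 1] in U and out[i] in (NIKHAHIT, MAITAIKHU):
            out[i] += ".high"
    return out

def apply_ccmp(glyphs):
    """Apply the rules in the order assumed by this sketch."""
    for rule in (decompose_sara_am, drop_descender, lower_tones, raise_after_upper_vowel):
        glyphs = rule(glyphs)
    return glyphs
```

For example, apply_ccmp(["uni0E19", "uni0E49", "uni0E33"]) (น้ำ) moves NIKHAHIT in front of the tone mark and leaves the tone in its default high form, while apply_ccmp(["uni0E01", "uni0E39", "uni0E49"]) (กู้) swaps the lower vowel and the tone and selects the .low tone glyph.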
Copyright © 2004, 2005 by Theppitak Karoonboonyanan.
All rights reserved.