6.4.2.1.3. Character Decomposition Tags

The decomposition is a normative property of a character. The tags supplied with certain decompositions generally indicate formatting information. Where no such tag is given, the decomposition is designated as canonical. Conversely, the presence of a formatting tag also indicates that the decomposition is a compatibility decomposition and not a canonical decomposition. In the absence of other formatting information in a compatibility decomposition, the tag <compat> is used to distinguish it from canonical decompositions.
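As a minimal sketch of the rule above, the following Python separates the optional tag from the code points of a decomposition mapping. It assumes the mapping is read through the standard library's `unicodedata.decomposition`, which returns the mapping as a space-separated string; the helper name `split_decomposition` is hypothetical and used only for illustration.

```python
import unicodedata

def split_decomposition(ch):
    """Return (tag, code_points) for ch.

    tag is None when the mapping carries no tag (a canonical decomposition)
    and a string such as "<noBreak>" for a compatibility decomposition.
    Characters without any decomposition mapping yield (None, []).
    """
    fields = unicodedata.decomposition(ch).split()
    if not fields:
        return None, []
    tag = None
    if fields[0].startswith("<"):
        # A leading <tag> marks the mapping as a compatibility decomposition.
        tag = fields.pop(0)
    return tag, [int(field, 16) for field in fields]

print(split_decomposition("\u00C5"))  # untagged, hence canonical
print(split_decomposition("\u00A0"))  # tagged <noBreak>
print(split_decomposition("\uFB01"))  # tagged <compat>
```

Here U+00C5 has no tag and is therefore canonical, while U+00A0 and U+FB01 carry `<noBreak>` and `<compat>` respectively and are compatibility decompositions.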

In some instances a canonical decomposition or a compatibility decomposition may consist of a single character. For a canonical decomposition, this indicates that the character is a canonical equivalent of another single character. For a compatibility decomposition, this indicates that the character is a compatibility equivalent of another single character.
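The single-character case can be illustrated with a short sketch, again assuming the mapping is obtained from Python's `unicodedata.decomposition`. The function name `singleton_equivalent` is hypothetical; it reports whether a character decomposes to exactly one other character and which kind of equivalence that represents.

```python
import unicodedata

def singleton_equivalent(ch):
    """Return (kind, target) if ch decomposes to exactly one character.

    kind is "canonical" for an untagged mapping and "compatibility" for a
    tagged one. Returns None when ch has no mapping or maps to a sequence
    of more than one character.
    """
    fields = unicodedata.decomposition(ch).split()
    kind = "canonical"
    if fields and fields[0].startswith("<"):
        kind = "compatibility"
        fields = fields[1:]  # drop the tag, keep the code points
    if len(fields) != 1:
        return None
    return kind, chr(int(fields[0], 16))

print(singleton_equivalent("\u212B"))  # ANGSTROM SIGN
print(singleton_equivalent("\u00B5"))  # MICRO SIGN
```

ANGSTROM SIGN (U+212B) is a canonical equivalent of U+00C5, whereas MICRO SIGN (U+00B5) is a compatibility equivalent of U+03BC.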

The compatibility formatting tags used are:

Tag          Meaning
<font>       A font variant (e.g. a blackletter form)
<noBreak>    A no-break version of a space or hyphen
<initial>    An initial presentation form (Arabic)
<medial>     A medial presentation form (Arabic)
<final>      A final presentation form (Arabic)
<isolated>   An isolated presentation form (Arabic)
<circle>     An encircled form
<super>      A superscript form
<sub>        A subscript form
<vertical>   A vertical layout presentation form
<wide>       A wide (or zenkaku) compatibility character
<narrow>     A narrow (or hankaku) compatibility character
<small>      A small variant form (CNS compatibility)
<square>     A CJK squared font variant
<fraction>   A vulgar fraction form
<compat>     Otherwise unspecified compatibility character
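
The tags listed above can be looked up for individual characters with a few lines of Python. This is a sketch only: it assumes the data exposed by the standard `unicodedata` module, and the names `COMPATIBILITY_TAGS`, `decomposition_tag`, and `tags_in_range` are hypothetical.

```python
import unicodedata

# The sixteen compatibility formatting tags listed in the table.
COMPATIBILITY_TAGS = {
    "<font>", "<noBreak>", "<initial>", "<medial>", "<final>", "<isolated>",
    "<circle>", "<super>", "<sub>", "<vertical>", "<wide>", "<narrow>",
    "<small>", "<square>", "<fraction>", "<compat>",
}

def decomposition_tag(ch):
    """Return the leading <tag> of ch's decomposition, or None if untagged."""
    mapping = unicodedata.decomposition(ch)
    if mapping.startswith("<"):
        return mapping.split()[0]
    return None

def tags_in_range(first, last):
    """Collect the distinct tags used by code points first..last inclusive."""
    found = set()
    for code_point in range(first, last + 1):
        tag = decomposition_tag(chr(code_point))
        if tag is not None:
            found.add(tag)
    return found

# Look up the tag for a few sample characters.
for sample in "\u00BD\u2460\u00B2\uFF21":
    print(f"U+{ord(sample):04X} {decomposition_tag(sample)}")
```

Calling `tags_in_range(0x0000, 0xFFFF)` scans the Basic Multilingual Plane; comparing its result against `COMPATIBILITY_TAGS` is one way to check that no tag outside the table appears in the data.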