Sources for the original collection of CJK Compatibility Ideographs include:
• South Korean KS X 1001 (U+F900–U+FA0B, 268 characters; see that page for the explanation)
• Taiwanese Big5 (U+FA0C–U+FA0D, 2 characters)
• "IBM 32": 32 Japanese characters from IBM (U+FA0E–U+FA2D; see below)

In ensuing versions of the standard, more characters have been added to the block from:
• South Korean KS X 1001 (U+FA2E–U+FA2F, 2 characters)
• Japanese JIS X 0213 (U+FA30–U+FA6A, 59 characters)
• Japanese ARIB STD-B24 (U+FA6B–U+FA6D, 3 characters)
• North Korean KPS 10721-2000 (U+FA70–U+FAD9, 106 characters)
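
The ranges listed above can be written down as a small lookup table, which also makes it easy to confirm that each stated character count matches the size of its range. This is only an illustrative sketch: the labels and the names `SOURCE_RANGES`, `source_of` and `range_sizes` are hypothetical and not part of any standard or library.

```python
# Source ranges as listed above: (first code point, last code point, label).
# The labels are illustrative names chosen for this sketch.
SOURCE_RANGES = [
    (0xF900, 0xFA0B, "KS X 1001 (original)"),
    (0xFA0C, 0xFA0D, "Big5"),
    (0xFA0E, 0xFA2D, "IBM 32"),
    (0xFA2E, 0xFA2F, "KS X 1001 (later addition)"),
    (0xFA30, 0xFA6A, "JIS X 0213"),
    (0xFA6B, 0xFA6D, "ARIB STD-B24"),
    (0xFA70, 0xFAD9, "KPS 10721-2000"),
]


def source_of(char):
    """Return the source label for one character, or None if it is outside the listed ranges."""
    code_point = ord(char)
    for first, last, label in SOURCE_RANGES:
        if first <= code_point <= last:
            return label
    return None


def range_sizes():
    """Map each label to the number of code points in its range (inclusive)."""
    return {label: last - first + 1 for first, last, label in SOURCE_RANGES}


if __name__ == "__main__":
    for label, size in range_sizes().items():
        print(f"{label}: {size}")
    print(source_of("\uFA20"))
```

Code points that fall between the listed ranges, such as U+FA6E, are reported as `None` by this sketch.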

==The "IBM 32" characters==
IBM Japanese double-byte EBCDIC includes several kanji which do not exist in, or do not round-trip from, JIS X 0208. These were included as gaiji in extensions to Shift JIS and EUC-JP from IBM (e.g. code page 942), NEC, the Open Software Foundation, and Microsoft (e.g. Windows code page 932). However, they were not used as a source for the original
Unified Repertoire and Ordering (URO). Instead, 32 of the IBM extension kanji, those which had not been included in the URO from other sources, were included in the CJK Compatibility Ideographs block in the range U+FA0E–U+FA2D. Of these 32 characters:
• 19 are unifiable with characters in the URO, and are therefore compatibility ideographs in the strict sense.
• 12 are kokuji characters which are actually unified ideographs (they have the Unified_Ideograph property and do not change upon normalisation). In spite of their inclusion in the CJK Compatibility Ideographs block and their algorithmically generated character names beginning with "CJK COMPATIBILITY IDEOGRAPH", they are not duplicates of characters in the original CJK Unified Ideographs block in any respect; 11 of these 12 are completely non-duplicate, while one was later unintentionally duplicated in CJK Unified Ideographs Extension B. They are placed there because they do not have a URO encoding, yet IBM 32 is one of the encodings where duplicate encodings are of concern. All of them are rarely used or are variants of common kanji. They are U+FA0E, U+FA0F, U+FA11, U+FA13, U+FA14, U+FA1F, U+FA21, U+FA23, U+FA24, U+FA27, U+FA28 and U+FA29.
• Uniquely, U+FA20 is intended to be encoded as the kyūjitai form of a kokuji which received a separate encoding for a variant that is straightforwardly the (extended) shinjitai form, U+8612. The URO only encoded the shinjitai form, and uses its stroke count to place it in this position. It is furthermore one of the many variants of the jinmeiyō kanji (i.e. Kummerowia). U+FA20 was assigned a normalisation to U+8612, even though the 龜 and 亀 components, while both forms of radical 213, are not usually considered unifiable.

==Block==
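
Which code points in this block are true unified ideographs, rather than compatibility duplicates, can be checked mechanically: a compatibility ideograph carries a decomposition mapping to its unified counterpart, whereas a unified ideograph has none and is left unchanged by normalisation. The sketch below is illustrative only. It assumes Python's standard `unicodedata` module, whose data depends on the Unicode version bundled with the interpreter, and the helper names `undecomposed_in_block` and `nfc_target` are hypothetical.

```python
import unicodedata

# Extent of the CJK Compatibility Ideographs block.
BLOCK_FIRST, BLOCK_LAST = 0xF900, 0xFAFF


def undecomposed_in_block():
    """Return assigned code points in the block that have no decomposition mapping."""
    found = []
    for code_point in range(BLOCK_FIRST, BLOCK_LAST + 1):
        char = chr(code_point)
        # Skip code points that are unassigned in this Unicode version.
        if unicodedata.category(char) == "Cn":
            continue
        # An empty string means the character has no decomposition mapping.
        if unicodedata.decomposition(char) == "":
            found.append(code_point)
    return found


def nfc_target(code_point):
    """Return the code point that the given character becomes under NFC normalisation."""
    return ord(unicodedata.normalize("NFC", chr(code_point)))


if __name__ == "__main__":
    print([f"U+{cp:04X}" for cp in undecomposed_in_block()])
    print(f"U+FA20 -> U+{nfc_target(0xFA20):04X}")
```

Under these assumptions, the first function should report the twelve unified ideographs of the "IBM 32" range, and `nfc_target(0xFA20)` should give U+8612, matching the normalisation described for that character.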