The following are the code charts for KS X 1001 in Wansung layout. Where a pair of hexadecimal numbers is given, the smaller is used when the set is encoded over GL (0x21-0x7E), as in ISO-2022-KR once the Korean set has been shifted to, and the larger is used in the more typical case of encoding over GR (0xA1-0xFE), as in EUC-KR or UHC. Johab changes the arrangement to encode all 11172 Hangul clusters separately and in order.

To illustrate vendor differences in implementation, multiple Unicode mappings are shown for some characters. Apple's HangulTalk extensions to the Wansung plane (i.e. where both bytes are in the 0xA1-0xFE range) are shown, but other HangulTalk extension ranges are not. The additional codes for composed syllables in Unified Hangul Code, and IBM's extensions in IBM-949, are also not shown, since both fall outside the Wansung plane.
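
The relationship between a row-cell number and its two byte forms can be sketched in Python. The helper name `ks_x_1001_bytes` is hypothetical, and the final check leans on Python's built-in `euc_kr` codec rather than on the charts themselves; this is only an illustration of the GL/GR arithmetic described above, not a full ISO-2022-KR or EUC-KR encoder.

```python
def ks_x_1001_bytes(row, cell):
    """Return (GL form, GR form) for a KS X 1001 row-cell pair.

    Hypothetical helper: rows and cells are numbered 1 to 94.
    """
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must both be in 1..94")
    gl = bytes([0x20 + row, 0x20 + cell])  # 0x21-0x7E, e.g. ISO-2022-KR
    gr = bytes([0xA0 + row, 0xA0 + cell])  # 0xA1-0xFE, e.g. EUC-KR or UHC
    return gl, gr


# Row 16, cell 1 is the first precomposed Hangul syllable.
gl_form, gr_form = ks_x_1001_bytes(16, 1)
print(gl_form.hex(), gr_form.hex())  # 3021 b0a1
print(gr_form.decode("euc_kr"))      # 가
```

The two forms differ only in the high bit of each byte, which is why the charts can list them as a pair.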
=== Lead bytes ===
=== Non-Hanja non-precomposed sets ===
Rows 41 and 94 may be used for user-defined purposes.

[...] or U+223C (favoured by Microsoft). Compare the similar but not identical handling of the JIS wave dash, and the handling of the tilde in the next row. Except for the backslash, if two mappings are shown below, the first is used by Apple and the second is used by Microsoft. The mapping of the circled dot also differs. Microsoft updated its Unified Hangul Code implementation to add the 1998 additions, including the euro sign, but did not add the Korean postal mark when it was added to the standard.
==== Character set 0x23 / 0xA3 (row number 3, basic Latin / ISO 646-KR) ====
This set corresponds to KS X 1003 (the ISO 646 variant for Korean, a similar set to ASCII), but encoded as two-byte codes with a lead byte of 0x23 (or 0xA3 in GR-invoked (EUC) form). It includes the English (basic Latin) alphabet, western Arabic numerals and punctuation. Compare the Roman set of JIS X 0201, which differs by including a Yen sign rather than a Won sign. Contrast the third rows of KPS 9566 and of JIS X 0208, which follow the ISO 646 layout but only include letters and digits. Encodings such as EUC-KR and UHC combine KS X 1001 with single-byte ASCII or KS X 1003, and hence map the double-byte representations of these characters to the Halfwidth and Fullwidth Forms block in Unicode instead.
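
As a rough illustration of that alternative mapping, the sketch below decodes a row 3 code with Python's built-in `euc_kr` codec and compares it with the single-byte character. The helper name `row3_char` is hypothetical, and the mapping shown is the one Python ships, which may differ from other vendors' tables.

```python
import unicodedata


def row3_char(ascii_char):
    """Decode the row 3 (lead byte 0xA3) code at the same ISO 646 position.

    Hypothetical helper, relying on Python's euc_kr mapping table.
    """
    code = ord(ascii_char)
    if not 0x21 <= code <= 0x7E:
        raise ValueError("expected a printable ISO 646 character")
    # Setting the high bit moves the trail byte from GL to GR.
    return bytes([0xA3, code | 0x80]).decode("euc_kr")


wide_a = row3_char("A")
print(f"U+{ord(wide_a):04X}")                 # U+FF21
print(unicodedata.normalize("NFKC", wide_a))  # A

# The position that holds the backslash in ASCII holds a Won sign here.
print(unicodedata.name(row3_char("\\")))
```

Compatibility normalization (NFKC) folds the fullwidth letter back to its ASCII counterpart, which is one way software can treat the two representations as equivalent.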
==== Character set 0x24 / 0xA4 (row number 4, Hangul jamo) ====
This set includes modern Hangul consonants, followed by vowels, both ordered by South Korean collation customs, followed by obsolete consonants. When used individually, these characters map to the Unicode Hangul Compatibility Jamo block, and do not have a one-to-one mapping with the position-specific characters in the Hangul Jamo block. Compare with row 4 of the North Korean KPS 9566. Character 04-52 is a Hangul Filler (see above), used in combining sequences.
==== Character set 0x25 / 0xA5 (row number 5, Roman numerals and Greek) ====
This set contains Roman numerals and basic support for the Greek alphabet, without diacritics or the final sigma. Apple includes some additional punctuation in this row, as well as some black circled list markers continuing from those in row 6. Apple also includes some bracketed list markers continuing from those in rows 9 and 10. Compare row 11 of KPS 9566, which uses the same layout. Compare and contrast row 5 of JIS X 0208, which also uses the same layout, but in a different row.
==== Character set 0x2C / 0xAC (row number 12, Cyrillic) ====
This set contains the modern Russian alphabet, and is not necessarily sufficient to represent other forms of the Cyrillic script. Apple also includes some black boxed list markers. Compare row 5 of KPS 9566 and row 7 of JIS X 0208, which use the same layout (but in a different row).
==== Extended character set 0x2D / 0xAD (row number 13, Apple additional punctuation) ====

=== Precomposed Hangul sets (rows number 16 through 40) ===
Code points for precomposed Hangul are included in a continuous sorted block between code points 16-01 and 40-94 inclusive. Not all possible syllable clusters are included in this range. Compare the different ordering and availability in KPS 9566.

Initial+vowel+final syllables 뢨, 썅, 쏀, 쓩, and 쭁 are included but their initial+vowel counterparts 뢔, 쌰, 쎼, 쓔, and 쬬 are not. This can cause problems for input, because input methods have to pass through an initial+vowel syllable in order to reach an initial+vowel+final syllable (e.g. ㅎ → 하 → 한). Syllables that are not listed here may be represented using eight-byte composition sequences. All other modern-jamo clusters are assigned codes elsewhere by UHC, and all possible modern-jamo clusters are assigned codes by Johab.
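
The coverage gap and the composition sequences can be explored with the Python sketch below. The names `in_wansung` and `composition_sequence` are hypothetical, and the sequence layout (Hangul Filler, then initial, vowel and final from row 4, with a second filler standing in for a missing final) is an assumption based on the usual description of the KS X 1001 composition scheme, not something stated in the charts here.

```python
HANGUL_BASE = 0xAC00
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
VOWELS = "".join(chr(0x314F + i) for i in range(21))
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")
FILLER = "\u3164"  # Hangul Filler, character 04-52


def in_wansung(syllable):
    """True if the syllable has its own two-byte code in rows 16 to 40."""
    try:
        encoded = syllable.encode("euc_kr")
    except UnicodeEncodeError:
        return False
    # A codec may emit a longer composition sequence instead of failing,
    # so only a two-byte result counts as a precomposed code.
    return len(encoded) == 2


def composition_sequence(syllable):
    """Build an eight-byte sequence from row 4 jamo (assumed layout)."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < 11172:
        raise ValueError("not a modern precomposed Hangul syllable")
    initial = INITIALS[index // 588]
    vowel = VOWELS[(index % 588) // 28]
    final = FINALS[index % 28] or FILLER
    return (FILLER + initial + vowel + final).encode("euc_kr")


wansung_count = sum(in_wansung(chr(cp)) for cp in range(0xAC00, 0xD7A4))
print(wansung_count)                         # 2350, i.e. 25 rows of 94
print(in_wansung("뢨"), in_wansung("뢔"))    # True False
print(composition_sequence("뢔").hex(" "))
```

Counting two-byte results rather than catching errors alone keeps the check independent of whether a given codec falls back to composition sequences on its own.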
Statistics by jamo ; Vowels ; Final consonants
=== Hanja sets (rows number 42 through 93) ===
KS X 1001 encodes hanja with multiple pronunciations multiple times, once for each pronunciation. (Some pronunciations are inherited from Middle Chinese, and others are an effect of the initial sound rule.) One character, 樂, is encoded four times. The first 268 characters (U+F900–U+FA0B) in the CJK Compatibility Ideographs block correspond to these duplicates. In the table below, the first row-cell value (and reading) for each Hanja maps to the CJK Unified Ideographs block; others map to the CJK Compatibility Ideographs block.

== Johab encoding ==