EUC-CN which encodes GB 2312 the same way as EUC-CN, but deviates from the EUC structure by extending the lead byte range back to 0x8C, adding 31 IBM-selected characters in 0x8CE0 through 0x8CFE and adding 1880
user-defined characters with lead bytes 0x8D through 0xA0. IBM code page 1383 (CCSID 1383) comprises the single-byte
code page 367 and the double-byte code page 1382 (CPGID 1382 as CCSID 1382), which differs by conforming to the EUC structure, adding the 31 IBM-selected characters in 0xFEE0 through 0xFEFE instead, and including only 1360 user-defined characters, interspersed in the positions not used by GB 2312. The alternative CCSID 5479 is used for the pure EUC-CN code page: it uses CCSID 9574 as its double-byte set, which uses CPGID 1382 but excludes the IBM-selected and user-defined characters.
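The byte ranges given above for the double-byte code page 1380 can be checked against the stated character counts. The following is a minimal sketch, not an IBM-provided API: `classify_cp1380` is a hypothetical helper, and the trail byte range 0xA1 through 0xFE for the user-defined and GB 2312 areas is an assumption, chosen because it matches the EUC structure and reproduces the figure of 1880 (20 lead bytes times 94 trail bytes).

```python
# Sketch of the code page 1380 double-byte layout described in the text.
# classify_cp1380 is a hypothetical helper, not an IBM or library API.

# Assumption: outside the IBM-selected row, trail bytes follow the usual
# EUC range 0xA1..0xFE (94 values).
EUC_TRAIL = range(0xA1, 0xFF)


def classify_cp1380(lead: int, trail: int) -> str:
    """Return the region of code page 1380 that a two-byte code falls in."""
    if lead == 0x8C and 0xE0 <= trail <= 0xFE:
        return "ibm-selected"   # 0x8CE0 through 0x8CFE
    if 0x8D <= lead <= 0xA0 and trail in EUC_TRAIL:
        return "user-defined"   # lead bytes 0x8D through 0xA0
    if 0xA1 <= lead <= 0xFE and trail in EUC_TRAIL:
        return "gb2312-area"    # same positions as EUC-CN
    return "unassigned"


# Count the code points in each extension region.
ibm_selected = sum(
    classify_cp1380(0x8C, trail) == "ibm-selected" for trail in range(0x100)
)
user_defined = sum(
    classify_cp1380(lead, trail) == "user-defined"
    for lead in range(0x100)
    for trail in range(0x100)
)

print(ibm_selected)  # 31
print(user_defined)  # 1880
```

Both counts agree with the figures quoted for code page 1380, which suggests that the user-defined area uses full 94-byte rows.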
===GBK and GB 18030===
GBK is an extension to GB 2312. It defines an extended form of the EUC-CN encoding capable of representing a larger array of
CJK characters sourced largely from Unicode 1.1, including
traditional Chinese characters and characters used only in
Japanese. It is not, however, a true EUC code, because ASCII bytes may appear as trail bytes (and
C1 bytes, not limited to the single shifts, may appear as lead or trail bytes), due to a larger encoding space being required. Variants of GBK are implemented by
Windows code page 936 (the
Microsoft Windows code page for simplified Chinese), and by IBM's code page 1386. The Unicode-based character encoding GB 18030 defines an extension of GBK capable of encoding the entirety of
Unicode. However, Unicode encoded as GB 18030 is a
variable-length encoding which may use up to four bytes per character, due to an even larger encoding space being required. Being an extension of GBK, it is a superset of EUC-CN but is not itself a true EUC code. Being a Unicode encoding, its repertoire is identical to that of other
Unicode transformation formats such as
UTF-8.
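The relationship between the three encodings can be illustrated with Python's built-in `gb2312`, `gbk` and `gb18030` codecs. This is a rough sketch rather than a conformance test: the sample characters are chosen for illustration, `can_encode` is a hypothetical helper, and Python's `gbk` codec is only one GBK implementation, which may differ in detail from Windows code page 936 or IBM's code page 1386.

```python
# Illustrative comparison using Python's standard codecs only.

def can_encode(text: str, codec: str) -> bool:
    """Hypothetical helper: True if text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


zhong = "\u4e2d"       # U+4E2D, a simplified Chinese character in GB 2312
guo_trad = "\u570b"    # U+570B, a traditional Chinese character
ascii_trail = "\u4e02" # U+4E02, encoded in GBK with an ASCII trail byte
c1_char = "\u0080"     # U+0080, outside GBK's two-byte repertoire
emoji = "\U0001F600"   # a supplementary-plane character

# A GB 2312 character keeps the same two bytes in all three encodings.
same_bytes = (
    zhong.encode("gb2312") == zhong.encode("gbk") == zhong.encode("gb18030")
)

# GBK adds characters absent from GB 2312, such as traditional forms.
trad_in_gb2312 = can_encode(guo_trad, "gb2312")
trad_in_gbk = can_encode(guo_trad, "gbk")

# GBK trail bytes may fall in the ASCII range, unlike a true EUC code.
gbk_bytes = ascii_trail.encode("gbk")
trail_is_ascii = gbk_bytes[1] < 0x80

# GB 18030 uses four-byte sequences for the rest of Unicode.
four_byte_lengths = [len(c.encode("gb18030")) for c in (c1_char, emoji)]

print(same_bytes, trad_in_gb2312, trad_in_gbk, trail_is_ascii)
print(gbk_bytes.hex(), four_byte_lengths)
```

Because ASCII values can occur as trail bytes, a decoder for GBK or GB 18030 cannot locate character boundaries by testing the high bit of each byte, as it could for EUC-CN.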
===Mac OS Chinese Simplified===
Other EUC-CN variants deviating from the EUC mechanism include the
classic Mac OS Chinese Simplified script (known as Code page 10008 or x-mac-chinesesimp). It uses the bytes 0x80, 0x81, 0x82, 0xA0, 0xFD, 0xFE, and 0xFF for the
U with umlaut (ü), two special font metric characters, the
non-breaking space, the
copyright sign (©), the
trademark sign (™) and the ellipsis (...) respectively. This differs in what is regarded as a single-byte character versus the first byte of a two-byte character from both EUC (where, of those, 0xFD and 0xFE are defined as lead bytes) and GBK (where, of those, 0x81, 0x82, 0xFD and 0xFE are defined as lead bytes). This use of 0xA0, 0xFD, 0xFE and 0xFF matches
Apple's Shift_JIS variant. Besides these changes to the lead byte range, the other distinctive feature of the double-byte portion of Mac OS Chinese Simplified is the inclusion of two extensions to the basic GB 2312-80 set in rows 6 and 8. Both extensions are included by
GB 18030 (the successor to GB 2312).
==EUC-JP==