The original Big-5 includes only CJK logograms from the
Charts of Standard Forms of Common National Characters (4808 characters) and Less-Than-Common National Characters (6343 characters), but not characters used in people's names, place names, dialects,
chemistry, or
biology, nor Japanese kana. As a result, many programs that support Big-5 include extensions to address these gaps. The plethora of variations makes
UTF-8 (or
UTF-16, or the Chinese
GB 18030 standard, which is also a full Unicode Transformation Format and therefore not limited to simplified Chinese) a more consistent encoding for modern use.
==Vendor extensions==
===ETen extensions===
The ETen (倚天) Chinese operating system adds the following code points to support some characters present in the IBM 5550's code page but absent from generic Big5:
• 0xA3C0–0xA3E0: 33 control characters.
• 0xC6A1–0xC875: circled numbers 1–10, bracketed numbers 1–10, Roman numerals 1–9 (i–ix), CJK radical glyphs, Japanese hiragana, Japanese katakana, and Cyrillic characters.
• 0xF9D6–0xF9FE: the characters '碁', '銹', '恒', '裏', '墻', '粧' and '嫺', followed by 34 additional semigraphic symbols.
In some versions of ETen, there are extra graphical symbols and simplified Chinese characters.
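The three ETen regions listed above can be checked mechanically. The following is a minimal sketch, not taken from any ETen documentation: the function and label names are hypothetical, and it assumes the usual Big5 byte layout (lead bytes 0x81–0xFE, trail bytes 0x40–0x7E and 0xA1–0xFE), which this section does not restate.

```python
from typing import Optional

# The three ETen regions quoted in the text; the labels are illustrative only.
ETEN_REGIONS = [
    (0xA3C0, 0xA3E0, "control characters"),
    (0xC6A1, 0xC875, "symbols, radicals, kana, Cyrillic"),
    (0xF9D6, 0xF9FE, "extra ideographs and semigraphics"),
]


def is_big5_trail(byte: int) -> bool:
    """Assumed standard Big5 trail bytes: 0x40-0x7E and 0xA1-0xFE."""
    return 0x40 <= byte <= 0x7E or 0xA1 <= byte <= 0xFE


def eten_region(code: int) -> Optional[str]:
    """Return the label of the ETen region containing a two-byte code, else None."""
    lead, trail = code >> 8, code & 0xFF
    if not (0x81 <= lead <= 0xFE and is_big5_trail(trail)):
        return None  # not a well-formed Big5 byte pair
    for start, end, label in ETEN_REGIONS:
        if start <= code <= end:
            return label
    return None


def region_size(start: int, end: int) -> int:
    """Count the valid Big5 byte pairs between two codes, inclusive."""
    return sum(1 for code in range(start, end + 1) if is_big5_trail(code & 0xFF))


print(eten_region(0xF9D6))          # extra ideographs and semigraphics
print(region_size(0xA3C0, 0xA3E0))  # 33
print(region_size(0xF9D6, 0xF9FE))  # 41
```

The last two counts agree with the text: 33 control characters in the first region, and 7 ideographs plus 34 semigraphic symbols in the third.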
===Microsoft code pages===
Microsoft (微軟) created its own Big5 extension,
code page 950, for use with
Microsoft Windows; it supports the F9D6–F9FE code points from ETen's extensions. In some versions of Windows, the
euro currency symbol is mapped to Big-5 code point A3E1. After Microsoft's HKSCS patch is installed on traditional Chinese Windows (or on any version of Windows 2000 and above with the proper language pack), applications using code page 950 automatically use a hidden code page 951 table. The table supports all code points in HKSCS-2001, except for the compatibility code points specified by the standard.
===IBM code pages===
In contrast to Microsoft's code page 950, IBM's
CCSID 950 comprises single byte code page 1114 (CCSID 1114) and double byte code page 947 (CCSID 947). It incorporates ETEN extensions for lead bytes , , and , while omitting those with lead byte (which Microsoft includes), mapping them instead to the
Private Use Area as user-defined characters. It also includes two non-ETEN extension regions with trail bytes , i.e. outside the usual Big5 trail byte range but similar to the Big5+ trail byte range: area 5 has lead bytes and contains IBM-selected characters, while area 9 has lead bytes and is a user-defined region. IBM refers to the euro sign update of their Big-5 variant as CCSID 1370, which includes both single-byte () and double-byte () euro signs. It comprises single byte code page 1114 (CCSID 5210) and double byte code page 947 (CCSID 21427). For better compatibility with Microsoft's variant in
IBM Db2, IBM also defines the pure double-byte code page 1372 and the associated variable-width CCSID 1373, which corresponds to Microsoft's code page 950. IBM assigns CCSID 5471 to the HKSCS-2001 Big5 code page (with CPGID 1374, as CCSID 5470, as the double-byte component), CCSID 9567 to the HKSCS-2004 code page (with CPGID 1374, as CCSID 9566, as the double-byte component), and CCSID 13663 to the HKSCS-2008 code page (with CPGID 1374, as CCSID 13662, as the double-byte component), while CCSID 1375 is assigned to a growing HKSCS code page, currently equivalent to CCSID 13663.
===ChinaSea font===
ChinaSea fonts (中國海字集) are Traditional Chinese fonts made by ChinaSea. The fonts are rarely sold separately, but are bundled with other products, such as the Chinese version of
Microsoft Office 97. The fonts support Japanese kana,
kokuji, and other characters missing in Big-5. As a result, the ChinaSea extensions have become more popular than the government-supported extensions. Some Hong Kong
BBSes used the ChinaSea font encodings before the introduction of HKSCS.
==='Sakura' font===
The 'Sakura' font (日和字集 Sakura Version) was developed in Hong Kong and is designed to be compatible with HKSCS. It adds support for
kokuji and proprietary dingbats (including
Doraemon) not found in HKSCS.
===Unicode-at-on===
Unicode-at-on (
Unicode補完計畫), formerly known as BIG5 extension, extends Big-5 by altering code page tables, but uses the ChinaSea extensions starting with version 2. However, given the bankruptcy of ChinaSea, the project's late development, and the increasing popularity of HKSCS and
Unicode (the project is not compatible with HKSCS), the success of this extension is limited at best. Despite these problems, characters previously mapped to the Unicode Private Use Area are remapped to their standardized equivalents when exporting characters to Unicode format.
===OPG===
The web sites of the
Oriental Daily News and
Sun Daily, belonging to the
Oriental Press Group Limited (東方報業集團有限公司) in Hong Kong, used a downloadable font with a different Big-5 extension coding than the HKSCS.
==Official extensions==
===Taiwan Ministry of Education font===
The Taiwan Ministry of Education supplied its own font, the Taiwan Ministry of Education font (臺灣教育部造字檔), for internal use.
===Taiwan Council of Agriculture font===
The Executive Yuan introduced a 133-character custom font, the Taiwan Council of Agriculture font (臺灣農委會常用中文外字集), that includes 84 characters from the
fish radical and 7 from the
bird radical.
===Big5+===
The
Chinese Foundation for Digitization Technology (中文數位化技術推廣委員會) introduced Big5+ in 1997, which used over 20000 code points to incorporate all CJK logograms in Unicode 1.1. However, the extra code points exceeded the original Big-5 definition (Big5+ uses high byte values 81-FE and low byte values 40-7E and 80-FE), preventing it from being installed on Microsoft Windows without new codepage files.
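The byte ranges quoted for Big5+ can be compared with the original Big-5 layout to see where the incompatibility arises. The sketch below is illustrative only: the Big5+ ranges come from the text, while the original Big-5 trail byte ranges (0x40–0x7E and 0xA1–0xFE) are an assumption taken from the general Big-5 definition, and all names are hypothetical.

```python
# Big5+ byte ranges as quoted in the text.
BIG5PLUS_LEAD = [(0x81, 0xFE)]
BIG5PLUS_TRAIL = [(0x40, 0x7E), (0x80, 0xFE)]

# Assumption: trail byte ranges of the original Big-5 definition.
BIG5_TRAIL = [(0x40, 0x7E), (0xA1, 0xFE)]


def in_ranges(byte, ranges):
    """True if the byte falls inside any of the inclusive (low, high) ranges."""
    return any(low <= byte <= high for low, high in ranges)


def count(ranges):
    """Number of byte values covered by a list of inclusive ranges."""
    return sum(high - low + 1 for low, high in ranges)


# Total two-byte code points available to Big5+.
big5plus_capacity = count(BIG5PLUS_LEAD) * count(BIG5PLUS_TRAIL)

# Trail bytes that Big5+ accepts but the original Big-5 definition does not.
extra_trail = [
    b for b in range(0x100)
    if in_ranges(b, BIG5PLUS_TRAIL) and not in_ranges(b, BIG5_TRAIL)
]

print(big5plus_capacity)                          # 23940
print(hex(extra_trail[0]), hex(extra_trail[-1]))  # 0x80 0xa0
```

Under these assumptions, Big5+ has room for 126 × 190 = 23940 code points, consistent with "over 20000", and the 33 trail bytes 0x80–0xA0 are the ones that fall outside the original definition.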
===Big-5E===
To allow Windows users to use custom fonts, the Chinese Foundation for Digitization Technology introduced Big-5E, which added 3954 characters (in three blocks of code points: 8E40-A0FE, 8140-86DF, 86E0-875C) and removed the Japanese kana from the ETen extension. Unlike Big5+, Big-5E extends Big-5 within its original definition.
Mac OS X 10.3 and later supports Big-5E in the fonts LiHei Pro (儷黑 Pro.ttf) and LiSong Pro (儷宋 Pro.ttf).
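The figure of 3954 characters can be cross-checked against the three Big-5E blocks. The sketch below assumes the original Big-5 trail byte ranges (0x40–0x7E and 0xA1–0xFE), which this section does not spell out; the helper names are hypothetical.

```python
# The three Big-5E blocks quoted in the text, as inclusive two-byte ranges.
BIG5E_BLOCKS = [(0x8E40, 0xA0FE), (0x8140, 0x86DF), (0x86E0, 0x875C)]


def is_big5_trail(byte: int) -> bool:
    """Assumed original Big-5 trail bytes: 0x40-0x7E and 0xA1-0xFE."""
    return 0x40 <= byte <= 0x7E or 0xA1 <= byte <= 0xFE


def count_codes(start: int, end: int) -> int:
    """Count the byte pairs in [start, end] whose trail byte is valid Big-5."""
    return sum(1 for code in range(start, end + 1) if is_big5_trail(code & 0xFF))


block_sizes = [count_codes(start, end) for start, end in BIG5E_BLOCKS]
print(block_sizes)       # [2983, 911, 60]
print(sum(block_sizes))  # 3954
```

Counting only byte pairs with original Big-5 trail bytes gives exactly 3954, which fits the statement that Big-5E stays within the original definition.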
===Big5-2003===
The Chinese Foundation for Digitization Technology made a Big5 definition and put it into
CNS 11643 in note form, making it part of the official standard in Taiwan. Big5-2003 incorporates all Big-5 characters introduced in the 1984 ETEN extensions (code points A3C0-A3E0, C6A1-C7F2, and F9D6-F9FE) and the Euro symbol. Cyrillic characters were not included because the authority claimed CNS 11643 does not include such characters.
===CDP===
Academia Sinica made a Chinese Data Processing font (漢字構形資料庫) in the late 1990s; its latest release, version 2.5, included 112,533 characters, somewhat fewer than the
Mojikyo fonts.
===HKSCS===
Hong Kong also adopted Big5 for character encoding. However,
written Cantonese has its own characters not available in the normal Big5 character set. To solve this problem, the
Hong Kong Government created the Big5 extensions
Government Chinese Character Set (GCCS) in 1995 and
Hong Kong Supplementary Character Set in 1999. The Hong Kong extensions were commonly distributed as a patch. Microsoft still distributes them as a patch, but a full Unicode font is also available from the Hong Kong Government's web site. There are two encoding schemes of HKSCS: one for the Big-5 coding standard and the other for the
ISO 10646 standard. Subsequent editions include HKSCS-2001 and HKSCS-2004. HKSCS-2004 is technically aligned with ISO/IEC 10646:2003 and its Amendment 1, published in April 2004 by the International Organization for Standardization (ISO). HKSCS includes all the characters from the common ETen extension, plus some characters from simplified Chinese, place names, people's names, and Cantonese phrases (including
profanity). The most recent edition of HKSCS is HKSCS-2016; however, the last edition of HKSCS to encode all of its characters in Big5 was HKSCS-2008, while the characters added in more recent editions are mapped to ISO 10646 /
Unicode only (as a
CJK Unified Ideographs horizontal glyph extension where appropriate). Similarly, Macao needs characters that are included in neither Big5 nor HKSCS; hence the
Macao Supplementary Character Set (MSCS) was developed. It, too, is not encoded in Big5. The first batch of 121 MSCS characters was submitted for inclusion in or mapping to Unicode in 2009, and the first final version of MSCS was established in 2020.
==Kana and Cyrillic==