Code page 932 (Microsoft Windows)

Microsoft Windows code page 932, also called Windows-31J amongst other names, is the Microsoft Windows code page for the Japanese language, which is an extended variant of the Shift JIS Japanese character encoding. It contains standard 7-bit ASCII codes, and Japanese characters are indicated by the high bit of the first byte being set to 1. Some code points in this page require a second byte, so characters use either 8 or 16 bits for encoding.
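The one-or-two-byte structure can be illustrated with a minimal sketch that uses Python's built-in cp932 codec. The helper name split_cp932 is hypothetical, and the lead-byte ranges it checks (0x81 to 0x9F and 0xE0 to 0xFC) are an assumption stated here rather than taken from the text above; bytes outside those ranges, including the single-byte half-width katakana, are treated as one-byte characters.

```python
def split_cp932(data):
    """Split a code page 932 byte string into 1-byte and 2-byte units.

    Assumption: a byte in 0x81-0x9F or 0xE0-0xFC starts a two-byte
    character; every other byte stands alone.
    """
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        width = 2 if (0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC) else 1
        units.append(data[i:i + width])
        i += width
    return units


# ASCII letter, hiragana A, half-width katakana A
encoded = "A\u3042\uff71".encode("cp932")
units = split_cp932(encoded)
widths = [len(u) for u in units]
print(units, widths)
```

The sketch does no validation of trail bytes, so it is only suitable for input that is already known to be well-formed.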

Terminology
Microsoft's Shift JIS variant is known simply as "Code page 932" on Microsoft Windows. This is ambiguous, however, because IBM's code page 932, while also a Shift JIS variant, lacks the NEC and NEC-selected double-byte vendor extensions that are present in Microsoft's variant (although both include the IBM extensions) and preserves the 1978 ordering of JIS X 0208. The "Windows-31J" label is IANA's and is not recognized by Microsoft, which has historically used "shift_jis" instead. The W3C/WHATWG Encoding Standard used by HTML5 treats the label "shift_jis" interchangeably with "windows-31j", with the intent of being "compatible with deployed content", and matches Windows code page 932. Windows code page 932 is also called MS_Kanji, although IANA treats MS_Kanji as an alias for standard Shift JIS. Python, for example, uses the label MS-Kanji (or cp932) for Windows-932 and the label Shift_JIS (or sjis) for JIS X 0208-defined Shift JIS, without recognising the Windows-31J label. In Japanese editions of Windows, this code page is referred to as "ANSI", since it is the operating system's default 8-bit encoding, even though ANSI was not involved in its definition.
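The Python labels mentioned above can be checked with the standard codecs module. This is a small sketch, not a statement about every Python release: the set of labels tested is limited to the ones named in the text, and whether the label windows-31j resolves is assumed to depend on the Python version, so it is deliberately left out.

```python
import codecs

# Labels named in the text; each is resolved to Python's canonical codec name.
labels = ["cp932", "ms-kanji", "shift_jis", "sjis"]
resolved = {label: codecs.lookup(label).name for label in labels}

for label, name in resolved.items():
    print(f"{label!r:12} -> {name}")
```

In this sketch, the two Microsoft-oriented labels resolve to one codec and the two JIS X 0208-oriented labels to another, which is why choosing the label matters when decoding Windows-originated text.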
Differences from standard Shift JIS
Windows-31J is often mistaken for standard Shift JIS (as defined in JIS X 0208:1997 Appendix 1). The two are similar, but the distinction is significant for programmers who wish to avoid mojibake.

Double-byte character differences
[Figure: diagram comparing the repertoires of JIS X 0208, JIS X 0212, JIS X 0213, Windows-31J, the Microsoft standard repertoire and Unicode]

In addition to the standard JIS X 0201:1997 and JIS X 0208:1997 characters, Windows-31J includes several JIS X 0208 extensions, namely "NEC special characters (Row 13), NEC selection of IBM extensions (Rows 89 to 92), and IBM extensions (Rows 115 to 119)". This also differs from IBM-932, which does not include the NEC extensions or NEC selection. The NEC extensions also encode the entirety of the IBM repertoire, but in a separate extension within the 94×94 JIS X 0208 grid (in rows 89–92, besides the characters already included in NEC row 13), rather than using Shift JIS codes beyond the JIS X 0208 range; Windows code page 932 includes these 388 characters in both locations.

[…] to row 89 as used by JIS X 0208 with IBM/NEC extensions (beginning 纊, 褜, 鍈…). Consequently, Shift JIS-2004 is not compatible with Windows-31J.

In addition to the above, Microsoft uses different (but visually similar) Unicode mappings for several double-byte punctuation characters compared to standard Shift JIS. For example, the wave dash is mapped to U+FF5E rather than U+301C, a mapping that ibm-943_P15A-2003 follows but ibm-943_P130-1999 does not. Microsoft also uses a different mapping for the double-byte backslash.

Windows-31J maps the single byte 0x5C to U+005C REVERSE SOLIDUS (\), following ASCII. By contrast, 0x5C is mapped to U+00A5 YEN SIGN (¥) in ISO-646-JP and consequently in JIS X 0201, of which standard Shift JIS is an extension. Correspondingly, Windows-31J avoids duplicate encoding of the backslash by mapping the double-byte code 0x815F to U+FF3C FULLWIDTH REVERSE SOLIDUS, whereas standard Shift JIS maps it to U+005C.
For this reason, many Japanese fonts display U+005C as a yen symbol, which would normally be represented as U+00A5, rather than as a backslash per Unicode's suggested rendering. U+00A5 is one-way best-fit mapped onto 0x5C in Windows-932. However, code 0x5C in Windows-932 behaves as a reverse solidus (backslash) in all respects (e.g. in file paths on Windows systems) other than how it is displayed by some fonts. This is followed by the encoding named "ibm-943_P130-1999" in ICU. Code page 897 (and therefore also IBM-943 and IBM-932) additionally replaces certain C0 control characters with single-byte box-drawing characters; ICU maps these to control characters.
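Several of the differences described in this section can be observed with Python's built-in codecs. The sketch below treats Python's cp932 codec as a stand-in for Windows code page 932 and its shift_jis codec as a stand-in for JIS X 0208-based Shift JIS. Both are assumptions: Python's codecs are not guaranteed to match the Windows implementation or the JIS standard in every code point, so the backslash codes are decoded with cp932 only. The helper name encodable is hypothetical.

```python
def encodable(ch, encoding):
    """Return True if the character can be encoded in the given codec."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


WAVE = b"\x81\x60"          # double-byte wave dash position
BACKSLASH_2B = b"\x81\x5f"  # double-byte backslash position

wave_cp932 = WAVE.decode("cp932")           # Microsoft mapping
wave_sjis = WAVE.decode("shift_jis")        # JIS X 0208-based mapping
backslash_2b_cp932 = BACKSLASH_2B.decode("cp932")
single_5c_cp932 = b"\x5c".decode("cp932")   # single byte 0x5C

# NEC row 13 starts at 0x8740 in Shift JIS form (circled digit one).
circled_one = b"\x87\x40".decode("cp932")

# First kanji of the IBM repertoire, encoded in two places:
# NEC selection (row 89) and the IBM extension area.
nec_selected = b"\xed\x40".decode("cp932")
ibm_extension = b"\xfa\x5c".decode("cp932")

print(f"wave dash: cp932 U+{ord(wave_cp932):04X}, shift_jis U+{ord(wave_sjis):04X}")
print(f"0x815F in cp932: U+{ord(backslash_2b_cp932):04X}")
print(f"0x5C in cp932: U+{ord(single_5c_cp932):04X}")
print("circled one encodable in shift_jis:", encodable(circled_one, "shift_jis"))
print("duplicate encodings agree:", nec_selected == ibm_extension)
```

Because the same text can decode to different code points depending on the label, text converted through one codec and written back through the other may fail to round-trip.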