Windows-31J is often mistaken for standard Shift JIS (as defined in JIS X 0208:1997 Appendix 1). The two are similar, but the distinction matters to programmers who wish to avoid mojibake.
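
The practical effect can be illustrated with a minimal Python sketch. It assumes that Python's built-in `cp932` codec is an adequate stand-in for Windows-31J and `shift_jis` for standard Shift JIS; neither codec is guaranteed to follow its standard in every detail. The helper names `decode_both` and `describe` are hypothetical and exist only for this illustration.

```python
# Assumption: Python's built-in "cp932" codec stands in for Windows-31J and
# "shift_jis" for standard Shift JIS. Neither is guaranteed to follow its
# standard in every detail.
CODECS = ("cp932", "shift_jis")


def decode_both(raw: bytes) -> dict:
    """Decode the same bytes with each codec; None marks a decoding failure."""
    results = {}
    for codec in CODECS:
        try:
            results[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            results[codec] = None
    return results


def describe(raw: bytes) -> str:
    """Render the code points each codec produced for the given bytes."""
    parts = []
    for codec, text in decode_both(raw).items():
        if text is None:
            shown = "undecodable"
        else:
            shown = " ".join(f"U+{ord(ch):04X}" for ch in text)
        parts.append(f"{codec}: {shown}")
    return f"{raw.hex()} -> " + ", ".join(parts)


if __name__ == "__main__":
    # A double-byte code that both codecs accept but map differently.
    print(describe(b"\x81\x60"))
    # A double-byte code that only the Microsoft variant assigns.
    print(describe(b"\x87\x40"))
```

Under these assumptions, the first call reports U+FF5E for `cp932` and U+301C for `shift_jis`, and the second reports U+2460 for `cp932` and a decoding failure for `shift_jis`. Text written with one codec and read back with the other can therefore change or fail to decode.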
[Figure: Double-byte character differences, comparing the repertoires of JIS X 0208, JIS X 0212, JIS X 0213, Windows-31J, the Microsoft standard repertoire and Unicode.]

In addition to the standard
JIS X 0201:1997 and JIS X 0208:1997 characters, Windows-31J includes several JIS X 0208 extensions, namely "NEC special characters (Row 13), NEC selection of IBM extensions (Rows 89 to 92), and IBM extensions (Rows 115 to 119)". In this it also differs from IBM-932, which does not include the NEC extensions or the NEC selection. The NEC extensions also encode the entirety of the IBM repertoire, but in a separate extension within the 94×94 JIS X 0208 grid (in rows 89–92, besides the characters already included in NEC row 13), rather than using Shift JIS codes beyond the JIS X 0208 range; Windows code page 932 includes these 388 characters in both locations. JIS X 0213 assigns different characters to row 89 from those used by JIS X 0208 with IBM/NEC extensions (beginning 纊, 褜, 鍈…). Consequently, Shift JIS-2004 is not compatible with Windows-31J.

In addition to the above, Microsoft uses different (but visually similar) Unicode mappings for several double-byte punctuation characters compared to standard Shift JIS, such as the
wave dash being mapped to U+FF5E rather than U+301C, which is followed by ibm-943_P15A-2003 but not ibm-943_P130-1999, and a different mapping for the double-byte backslash. Windows-31J maps the single byte 0x5C to U+005C REVERSE SOLIDUS, as in ASCII. By contrast, 0x5C is mapped to U+00A5 YEN SIGN (¥) in ISO-646-JP and consequently JIS X 0201, of which standard Shift JIS is an extension. Correspondingly, Windows-31J avoids duplicate encoding of the backslash by mapping the double byte 0x815F to U+FF3C FULLWIDTH REVERSE SOLIDUS, whereas standard Shift JIS maps it to U+005C. For this reason, in many Japanese fonts, U+005C is displayed as a yen sign, which would normally be represented as U+00A5, rather than as a backslash per Unicode's suggested rendering. U+00A5 is one-way best-fit mapped onto 0x5C in Windows-932. However, code 0x5C in Windows-932 behaves as a reverse solidus (backslash) in all respects (e.g. in file paths on Windows systems) other than how it is displayed by some fonts. This is followed by the encoding named "ibm-943_P130-1999" in ICU. Code page 897 (and therefore also IBM-943 and IBM-932) also adds single-byte box-drawing characters replacing certain
C0 control characters; these are mapped to control characters in ICU.

==Layout==