ISO/IEC 646

ISO/IEC 646 Information technology — ISO 7-bit coded character set for information interchange, is an ISO/IEC standard in the field of character encoding. It is equivalent to the ECMA standard ECMA-6 and developed in cooperation with ASCII at least since 1964. The first version of ECMA-6 had been published in 1965, based on work the ECMA's Technical Committee TC1 had carried out since December 1960. The first edition of ISO/IEC 646 was published in 1973, and the most recent, third, edition in 1991.

History

ISO/IEC 646 and its predecessor ASCII (ASA X3.4) largely endorsed existing practice regarding character encodings in the telecommunications industry. As ASCII did not provide a number of characters needed for languages other than English, a number of national variants were made that substituted some less-used characters with needed ones. Due to the incompatibility of the various national variants, an International Reference Version (IRV) of ISO/IEC 646 was introduced, in an attempt to at least restrict the replaced set to the same characters in all variants. The original version (ISO 646 IRV) differed from ASCII only at code point 0x24, where ASCII's dollar sign ($) was replaced by the international currency symbol (¤).

The final 1991 version of the code, ISO/IEC 646:1991, is also known as ITU T.50, International Reference Alphabet or IRA, formerly International Alphabet No. 5 (IA5). This standard lets users assign the 12 variable characters (two alternative graphic characters and 10 nationally defined characters) as they see fit. Among the possible assignments, the ISO 646:1991 IRV (International Reference Version) is explicitly defined and is identical to ASCII.

Code page layout

The following table shows the ISO/IEC 646:1991 International Reference Version character set. Each character is shown with its Unicode equivalent. Code points open for substitution in national variants are shown with a grey background. Yellow background indicates a character that, in some variants, could be combined with a previous character as a diacritic using the backspace character, which may affect glyph choice. In addition to the invariant set restrictions, 0x23 is restricted to be either # or £ and 0x24 is restricted to be either $ or ¤. However, these restrictions are not followed by all national variants.
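The restrictions described above can be expressed as a small conformance check. The following Python sketch is illustrative only: the list of twelve national code points is an assumption based on the usual description of the invariant set, and `check_variant` and its `overrides` argument are hypothetical names, not part of any standard or library.

```python
# Code points that ISO/IEC 646 leaves open to national variants
# (assumed here; these are the ASCII positions of # $ @ [ \ ] ^ ` { | } ~).
NATIONAL_CODE_POINTS = frozenset(
    [0x23, 0x24, 0x40, 0x5B, 0x5C, 0x5D, 0x5E, 0x60, 0x7B, 0x7C, 0x7D, 0x7E]
)

# The two positions with restricted alternatives, as stated in the text.
RESTRICTED = {0x23: {"#", "£"}, 0x24: {"$", "¤"}}


def check_variant(overrides):
    """Return a list of problems found in a hypothetical variant.

    `overrides` maps a code point to the character the variant places there;
    positions that are not listed are assumed to keep their ASCII character.
    """
    problems = []
    for code, char in sorted(overrides.items()):
        if code not in NATIONAL_CODE_POINTS:
            # Outside the national positions, only the ASCII character is allowed.
            if char != chr(code):
                problems.append(f"0x{code:02X}: invariant position changed to {char!r}")
        elif code in RESTRICTED and char not in RESTRICTED[code]:
            problems.append(f"0x{code:02X}: {char!r} is not an allowed alternative")
    return problems
```

A variant that only swaps # for £ passes the check, while one that replaces an invariant letter or puts some other currency sign at 0x24 is reported. As the text notes, several real national variants would be flagged by the 0x23 and 0x24 rules.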

Composite Graphic Characters

According to ISO/IEC 646, every graphic character must be a spacing character; that is, it must advance the character position forward. As a result, non-spacing combining characters are not permitted in any national version. This is in contrast to later standards such as ISO/IEC 2022 and ISO/IEC 10646, which permit or include combining characters.

Several spacing characters can be used as diacritical marks when preceded or followed by a backspace C0 control to create accented letters, referred to as composite graphic characters in the standard. For example, a letter followed by a backspace and an apostrophe may be used to image that letter with an acute accent. This encoding method originated in the typewriter/teletype era, when use of backspace would overstrike a glyph, and may be considered deprecated.

This method is attested in the code charts for the IRV, as well as the GB, FR1, CA, and CA2 national versions, which note that ", ', ,, and ^ may behave as the diaeresis, acute accent, cedilla, and circumflex (rather than quotation marks, a comma, and an upward arrowhead), respectively, when preceded or followed by a backspace. The current PL-2002 standard explicitly directs the use of the backspace and apostrophe to form Polish letters with an acute accent. Some editions of ISO/IEC 646 also suggest that the solidus may be used with the equal sign to compose the not equal sign (≠), and that the underscore may be used to effect underlined text. The tilde character was similarly introduced as a diacritic, although the standard is silent about its use.

Later, when wider character sets gained more acceptance, ISO/IEC 8859, vendor-specific character sets and eventually Unicode became the preferred methods of coding accented letters.
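When converting legacy text to Unicode, backspace composition of this kind can be approximated by turning each overstrike sequence into a base letter plus a combining mark and then normalizing. The following Python sketch is one possible approach, not a method prescribed by the standard: the `DIACRITICS` mapping and the function name `compose_overstrikes` are assumptions made for illustration, and only the four characters named in the code charts are handled.

```python
import unicodedata

BACKSPACE = "\b"

# Spacing characters that may act as diacritics when combined with a
# backspace, mapped to Unicode combining marks (an assumed mapping).
DIACRITICS = {
    '"': "\u0308",  # diaeresis
    "'": "\u0301",  # acute accent
    ",": "\u0327",  # cedilla
    "^": "\u0302",  # circumflex
}


def compose_overstrikes(text):
    """Replace letter/backspace/mark (or mark/backspace/letter) sequences
    with a precomposed character where Unicode provides one."""
    out = []
    i = 0
    while i < len(text):
        if i + 2 < len(text) and text[i + 1] == BACKSPACE:
            first, second = text[i], text[i + 2]
            if first.isalpha() and second in DIACRITICS:
                out.append(unicodedata.normalize("NFC", first + DIACRITICS[second]))
                i += 3
                continue
            if first in DIACRITICS and second.isalpha():
                out.append(unicodedata.normalize("NFC", second + DIACRITICS[first]))
                i += 3
                continue
        # Not an overstrike sequence: copy the character unchanged.
        out.append(text[i])
        i += 1
    return "".join(out)
```

Both orders are accepted because the text says the mark may precede or follow the backspace. Characters outside such sequences, including ordinary commas and apostrophes, pass through untouched.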

Variant codes and descriptions

ISO/IEC 646 national variants

Some national variants of ISO/IEC 646 are as follows:

National derivatives

Some national character sets also exist which are based on ISO/IEC 646 but do not strictly follow its invariant set (see also § Derivatives for other alphabets):

Control characters

All the variants listed above are solely graphical character sets, and are to be used with a C0 control character set such as listed in the following table:

Associated supplementary character sets

The following table lists supplementary graphical character sets defined by the same standard as specific ISO/IEC 646 variants. These would be selected by using a mechanism such as shift out or the NATS super shift (single shift), or by setting the eighth bit in environments where one was available:

Variant comparison chart

The specifics of the changes for some of these variants are given in the following table. Character assignments unchanged across all listed variants (i.e. which remain the same as ASCII) are not shown. For ease of comparison, variants detailed include national variants of ISO/IEC 646, DEC's closely related National Replacement Character Set (NRCS) series used on VT200 terminals, the related European World System Teletext encoding series defined in ETS 300 706, and a few other closely related encodings based on ISO/IEC 646. Individual code charts are linked from the second column. The cells with non-white background emphasize the differences from US-ASCII (also the Basic Latin subset of ISO/IEC 10646 and Unicode).
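Because each variant differs from ASCII only at a handful of code points, decoding one can be sketched as ASCII plus a small override table. The Python example below uses the German variant as an illustration; the `DE_OVERRIDES` table is written from memory of DIN 66003 and should be treated as an assumption to verify against the code chart. The function names are hypothetical.

```python
# Substitutions for the German variant (DIN 66003), given as an
# illustrative assumption; check the code chart before relying on it.
DE_OVERRIDES = {
    0x40: "§", 0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü", 0x7E: "ß",
}


def decode_variant(data, overrides):
    """Decode 7-bit bytes as ASCII, applying the variant's substitutions."""
    chars = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError(f"not a 7-bit code: 0x{byte:02X}")
        chars.append(overrides.get(byte, chr(byte)))
    return "".join(chars)


def encode_variant(text, overrides):
    """Encode text into the variant, rejecting characters it cannot hold."""
    reverse = {char: code for code, char in overrides.items()}
    out = bytearray()
    for char in text:
        if char in reverse:
            out.append(reverse[char])
        elif ord(char) < 0x80 and ord(char) not in overrides:
            out.append(ord(char))
        else:
            # Either non-ASCII, or an ASCII character the variant replaced.
            raise ValueError(f"{char!r} is not available in this variant")
    return bytes(out)
```

Under this table the bytes `Gr}~e` decode to "Grüße", and a left brace cannot be encoded at all because its position is taken by ä. That loss of ASCII characters is the incompatibility the comparison chart documents.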

Related encoding families

National Replacement Character Set

The National Replacement Character Set (NRCS) is a family of 7-bit encodings introduced in 1983 by DEC with the VT200 series of computer terminals. It is closely related to ISO/IEC 646, being based on a similar invariant subset of ASCII, differing in retaining $ as invariant but not _. All NRCS variants except Swiss retain _ in its ASCII position, and are therefore in conformance with ISO/IEC 646. Several NRCS variants are identical to ISO/IEC 646 variants, and others are very similar, with the exception of the Dutch variant.

World System Teletext

The European telecommunications standard ETS 300 706, "Enhanced Teletext specification", defines Latin, Greek, Cyrillic, Arabic, and Hebrew code sets with several national variants for both Latin and Cyrillic. Code page 1052 replaces a few ASCII characters from code page 1054.

Derivatives for other alphabets

Some 7-bit character sets for non-Latin alphabets are derived from the ISO/IEC 646 standard. These do not themselves constitute ISO/IEC 646, because they do not follow its invariant code points (often replacing the letters of at least one case): the alphabets they support need more encoding space than the national code points provide. Examples include:

• 7-bit Turkmen (ISO-IR-230).
• 7-bit Greek.
  • ELOT 927 (ISO-IR-088) maps the Greek alphabet in alphabetical order. ISO-IR-018 maps the Greek alphabet over both letter cases using a different scheme (not in alphabetical order, but trying where possible to match Greek letters over Roman letters which correspond in some sense), and ISO-IR-019 maps the Greek uppercase alphabet over the Latin lowercase letters using the same scheme as ISO-IR-018.
  • The lower half of the Symbol font character encoding uses its own scheme for mapping Greek letters of both cases over the ASCII Roman letters, also trying to map Greek letters over Roman letters which correspond in some sense, but making different decisions in this regard (see chart below). It also replaces invariant code points 0x22 and 0x27 and five national code points with mathematical symbols. Although not intended for use in typesetting Greek prose, it is sometimes used for that purpose.
  • ISO-IR-027
• 7-bit Cyrillic
  • KOI-7 or Short KOI, used for Russian. The Cyrillic characters are mapped to positions 0x60–0x7E, on top of the Latin lowercase letters, matching homologous letters where possible (where в is mapped to w, not v). Superseded by the KOI-8 variants.
  • SRPSCII and MAKSCII, Cyrillic variants of YUSCII (the Latin variant is YU/ISO-IR-141 in the chart above), used for Serbian and Macedonian respectively. Largely homologous to the Latin variant of YUSCII (following Serbian digraphia rules), except for Љ (lj), Њ (nj), Џ (dž), and ѕ (dz), which correspond to digraphs in Latin-script orthography, and are mapped over letters which are not used in Serbian or Macedonian (q, w, x, y).
  • The World System Teletext G0 sets for Russian/Bulgarian and Ukrainian are similar to KOI-7, with some modifications. The corresponding G0 set for Serbian Cyrillic uses a scheme based on the Teletext encoding for Latin-script Serbo-Croatian and Slovene, as opposed to the significantly different YUSCII.
• 7-bit Hebrew, SI 960. The Hebrew alphabet is mapped to positions 0x60–0x7A, on top of the lowercase Latin letters (and grave accent for aleph). 7-bit Hebrew was always stored in visual order. This mapping with the high bit set, i.e. with the Hebrew letters in 0xE0–0xFA, is ISO/IEC 8859-8. The World System Teletext encoding for Hebrew uses the same letter mappings, but uses BS_Viewdata as its base encoding (whereas SI 960 uses US-ASCII) and includes a shekel sign at 0x7B.
• 7-bit Arabic, ASMO 449 (ISO-IR-089). The Arabic alphabet is mapped to positions 0x41–0x5A and 0x60–0x6A, on top of both uppercase and lowercase Latin letters.

A comparison of some of these encodings is below. Only one case is shown, except in instances where the cases are mapped to different letters. In such instances, the mapping with the smallest code is shown first. Possible transcriptions are given for some letters; where this is omitted, the letter can be considered to correspond to the Roman one which it is mapped over.
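The relationship between SI 960 and ISO/IEC 8859-8 described in the list above suggests a simple way to decode 7-bit Hebrew: set the high bit on positions 0x60–0x7A and reuse an ISO/IEC 8859-8 decoder. The Python sketch below assumes the standard library's `iso8859_8` codec and a hypothetical function name `decode_si960`. It does not reorder the text, so output stays in the visual order in which it was stored.

```python
def decode_si960(data):
    """Decode 7-bit Hebrew (SI 960) by shifting 0x60-0x7A into 0xE0-0xFA
    and decoding the result as ISO/IEC 8859-8. Other bytes stay ASCII."""
    if any(b > 0x7F for b in data):
        raise ValueError("SI 960 is a 7-bit code; found a byte above 0x7F")
    shifted = bytes(b | 0x80 if 0x60 <= b <= 0x7A else b for b in data)
    return shifted.decode("iso8859_8")
```

Position 0x60 (the grave accent in ASCII) comes out as aleph and 0x7A as tav, while uppercase Latin letters, digits and punctuation are unaffected. Bytes 0x7B–0x7E are left as their ASCII characters here, which matches the statement that SI 960 uses US-ASCII as its base.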

Source: Wikipedia ↗
