The main use of UTF-32 is in internal APIs where the data is single code points or
glyphs, rather than strings of characters. For instance, in modern text rendering it is common that the last step is to build a list of structures, each containing (x, y) coordinates, attributes, and a single UTF-32 code point identifying the glyph to draw. Often non-Unicode information is stored in the "unused" 11 bits of each 32-bit word.

Use of UTF-32 strings on Windows (where wchar_t is 16 bits) is almost non-existent. On Unix systems, UTF-32 strings are sometimes, but rarely, used internally by applications, because the type wchar_t is defined there as 32-bit. UTF-32 is also forbidden as an HTML character encoding.
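As a concrete illustration, a positioned-glyph record of the kind described above might look like the following C++ sketch. The struct and field names are hypothetical, and packing renderer-private flags into the 11 bits above the 21-bit code point via bit-fields is one possible layout, not any particular library's:

    #include <cstdint>

    // Hypothetical record for one glyph to draw. A Unicode code point needs
    // at most 21 bits (the maximum is U+10FFFF), so 11 bits of the 32-bit
    // word remain free for non-Unicode, renderer-private attributes.
    struct PositionedGlyph {
        float    x, y;              // where to draw the glyph
        uint32_t codepoint : 21;    // UTF-32 code point identifying the glyph
        uint32_t flags     : 11;    // e.g. bold, underline, right-to-left
    };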
== Programming languages ==

Python versions up to 3.2 can be compiled to use UTF-32 strings instead of UTF-16. From version 3.3 onward, Unicode strings are stored in UTF-32 if there is at least one non-BMP character in the string, but with leading zero bytes optimized away "depending on the [code point] with the largest Unicode ordinal (1, 2, or 4 bytes)" to make all code points that size. One visible consequence is that, unlike in most programming languages, a pair of surrogate escapes is not equivalent to the single code point they would encode in UTF-16: in Python, "\U0001F51F" != "\ud83d\udd1f".
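The width-selection rule quoted above can be sketched as follows. This is a simplified illustration of the idea, not CPython's actual implementation, and the function name is made up:

    #include <algorithm>
    #include <cstddef>
    #include <string>

    // Chooses the per-code-point storage size (1, 2, or 4 bytes) that is
    // just large enough for the largest Unicode ordinal in the string.
    std::size_t element_width(const std::u32string& s) {
        char32_t max_cp = s.empty() ? 0 : *std::max_element(s.begin(), s.end());
        if (max_cp <= 0xFF)   return 1;  // all code points fit in Latin-1
        if (max_cp <= 0xFFFF) return 2;  // all code points are in the BMP
        return 4;                        // at least one non-BMP code point
    }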
Julia programming language moved away from built-in UTF-32 support with its 1.0 release, simplifying the language to having only UTF-8 strings (with all the other encodings considered legacy and moved out of the standard library into packages), following the "UTF-8 Everywhere Manifesto".
C++11 has two built-in data types that use UTF-32: char32_t, which stores a single character in UTF-32, and u32string, which stores a string of UTF-32-encoded characters. A UTF-32 character or string literal is marked with U before the literal:

    #include <string>

    char32_t UTF32_character = U'🔟';  // also written as U'\U0001F51F'
    std::u32string UTF32_string = U"UTF-32-encoded string";  // the literal has type const char32_t[]
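A minimal usage example: u32string counts and indexes whole code points, so even a character outside the BMP occupies a single element (in a UTF-16 u16string it would occupy two):

    #include <iostream>
    #include <string>

    int main() {
        std::u32string s = U"🔟";                      // one non-BMP code point
        std::cout << s.size() << '\n';                 // prints 1: one code point, one element
        std::cout << (s[0] == U'\U0001F51F') << '\n';  // prints 1 (true)
    }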
C# has a UTF32Encoding class, which represents Unicode characters as UTF-32-encoded bytes rather than as a string.

== Variants ==