==Transmitting binary data as text==
A binary-to-text encoding enables transmitting data over a communication channel that does not allow arbitrary binary data (such as email or NNTP) or is not 8-bit clean. Such a channel typically uses a communications protocol designed to carry human-readable text (such as English-language text). Often such a protocol supports only 7-bit character values (and within that range avoids certain control codes), may require line breaks at certain maximum intervals, and may not preserve whitespace. Thus, only the 94 printable ASCII characters are safe to use to convey data.

The
ASCII text-encoding standard uses 7 bits to encode characters. With this it is possible to encode 128 (i.e. 2⁷) unique values (0–127) to represent the alphabetic, numeric, and punctuation characters commonly used in English, plus a selection of non-printable control characters. For example, the capital letter A is represented as 65 (41₁₆, 100 0001₂), the numeral 2 is 50 (32₁₆, 011 0010₂), the right curly brace } is 125 (7D₁₆, 111 1101₂), and the carriage return control character CR is 13 (0D₁₆, 000 1101₂).

In contrast, most computers store data in memory organized in eight-bit
bytes (a.k.a. octets). Files that contain machine-executable code and non-textual data typically contain all 256 possible eight-bit byte values. Many computer programs came to rely on this distinction between seven-bit text and eight-bit binary data, and would not function properly if non-ASCII characters appeared in data that was expected to include only ASCII text. For example, if the value of the eighth bit is not preserved, the program might interpret a byte value above 127 as a flag telling it to perform some function.

It is often desirable to send non-textual data through a text-based system, for example when attaching an image to an email message. To accomplish this, the 8-bit data is encoded as 7-bit ASCII characters, generally using only the printable ASCII characters (alphanumeric and punctuation). Upon arrival at its destination, the text is decoded back to its 8-bit form. This process is referred to as binary-to-text encoding. Many programs, such as PGP and GNU Privacy Guard, perform this conversion to allow for data transport.
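The encode, transmit, decode cycle described above can be illustrated with a minimal Python sketch. It makes several assumptions that are not in the text: base64 is chosen here simply as one widely available binary-to-text encoding, the helper names `strip_high_bit` and `send_as_text` are hypothetical, and the channel that is not 8-bit clean is simulated by clearing the eighth bit of every byte.

```python
import base64


def strip_high_bit(data: bytes) -> bytes:
    """Simulate a channel that is not 8-bit clean by clearing the eighth bit."""
    return bytes(b & 0x7F for b in data)


def send_as_text(data: bytes) -> bytes:
    """Encode, pass through the simulated channel, and decode at the destination."""
    encoded = base64.b64encode(data)      # 8-bit data -> printable ASCII
    received = strip_high_bit(encoded)    # 7-bit text passes through unchanged
    return base64.b64decode(received)     # back to the original 8-bit form


payload = bytes(range(256))               # every possible eight-bit byte value

# Sent raw, every byte above 127 is corrupted by the channel.
assert strip_high_bit(payload) != payload

# Sent as base64 text, the data arrives unchanged.
assert send_as_text(payload) == payload

# The encoded form uses only printable ASCII characters (0x21 to 0x7E).
assert all(0x21 <= b <= 0x7E for b in base64.b64encode(payload))
```

The price of this robustness is size: base64 represents every 3 input bytes as 4 output characters, so the encoded text is roughly one third larger than the original data.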
==Encoding plain text==
Binary-to-text encoding methods are also used as a mechanism for encoding plain text. Some systems can handle only a limited character set: not only are they not 8-bit clean, some cannot even handle every printable ASCII character. Other systems limit the number of characters that may appear between line breaks, such as the "1000 characters per line" limit of some Simple Mail Transfer Protocol software, as permitted by the SMTP specification. Still others add headers or trailers to the text. A few poorly regarded but still-used protocols use in-band signaling, causing confusion if specific patterns appear in the message. The best-known such pattern is the string "From " (including the trailing space) at the beginning of a line, used to separate mail messages in the mbox file format.

By using a binary-to-text encoding on messages that are already plain text, then decoding on the other end, one can make such systems appear to be completely
transparent. This is sometimes referred to as "ASCII armoring". For example, the ViewState component of ASP.NET uses base64 encoding to safely transmit text via HTTP POST, in order to avoid delimiter collision.

==Examples==
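As a first illustration, the ASCII armoring of plain text described earlier can be sketched in Python. This is a sketch under stated assumptions, not a description of any particular mail system: base64 is used as the armoring encoding, the helper names `armor` and `dearmor` are hypothetical, and the sample message is invented for the demonstration. The standard-library function `base64.encodebytes` wraps its output at 76 characters per line, which keeps the armored text well under typical line-length limits.

```python
import base64


def armor(text: str) -> str:
    """Encode plain text as base64 lines of at most 76 characters."""
    return base64.encodebytes(text.encode("utf-8")).decode("ascii")


def dearmor(armored: str) -> str:
    """Recover the original text from its armored form."""
    return base64.decodebytes(armored.encode("ascii")).decode("utf-8")


# A message with a line starting with "From " and a line of 2000 characters.
message = "Hello,\nFrom now on the report is due on Mondays.\n" + "x" * 2000 + "\n"
armored = armor(message)

# No armored line can be mistaken for an mbox separator, and none is too long.
assert not any(line.startswith("From ") for line in armored.splitlines())
assert max(len(line) for line in armored.splitlines()) <= 76

# Decoding restores the text exactly, including the long line.
assert dearmor(armored) == message
```

Because the base64 alphabet contains no space character, the pattern "From " cannot occur anywhere in the armored text, whatever the original message contains.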