On most operating systems, the name text file refers to a file format that allows only plain text content with very little formatting (e.g., no bold or italic types). Such files can be viewed and edited on text terminals or in simple text editors. Text files usually have the MIME type text/plain, often with additional information indicating an encoding.
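
As an illustration of the "additional information indicating an encoding", the sketch below splits a media type string into its type and its charset parameter using Python's standard `email.message` module. The helper name `parse_content_type` and the sample values are assumptions made for this example only.

```python
from email.message import Message


def parse_content_type(value):
    """Split a media type string into (type, charset).

    Hypothetical helper for illustration; charset is None when the
    value carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors return lower-cased values.
    return msg.get_content_type(), msg.get_content_charset()


if __name__ == "__main__":
    print(parse_content_type("text/plain; charset=UTF-8"))
    print(parse_content_type("text/plain"))
```

When the charset parameter is absent, the consumer has to fall back on some default or guess the encoding from the content.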
=== Microsoft Windows text files ===
DOS and Microsoft Windows use a common text file format, with each line of text separated by a two-character combination: carriage return (CR) and line feed (LF). It is common for the last line of text not to be terminated with a CR-LF marker, and many text editors (including Notepad) do not automatically insert one on the last line.

On Microsoft Windows operating systems, a file is regarded as a text file if the suffix of the name of the file (the "filename extension") is .txt. However, many other suffixes are used for text files with specific purposes. For example, source code for computer programs is usually kept in text files that have file name suffixes indicating the
programming language in which the source is written.

Most Microsoft Windows text files use ANSI, OEM, or Unicode (UTF-16 or UTF-8) encoding. What Microsoft Windows terminology calls "ANSI encodings" are usually single-byte ISO/IEC 8859 encodings (that is, "ANSI" in the Microsoft Notepad menus really means the "System Code Page", a non-Unicode legacy encoding), except in locales such as Chinese, Japanese and Korean that require double-byte character sets. ANSI encodings were traditionally used as default system locales within Microsoft Windows, before the transition to Unicode. By contrast, OEM encodings, also known as DOS code pages, were defined by IBM for use in the original IBM PC text mode display system. They typically include graphical and line-drawing characters common in DOS applications.

"Unicode"-encoded Microsoft Windows text files contain text in UTF-16 Unicode Transformation Format. Such files normally begin with a byte order mark (BOM), which communicates the endianness of the file content. Although UTF-8 does not suffer from endianness problems, many Microsoft Windows programs (e.g., Notepad) prepend a BOM to the contents of UTF-8-encoded files, to differentiate UTF-8 encoding from other 8-bit encodings.
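
The role of the BOM can be sketched in a few lines of Python. This is a minimal illustration, not how any particular Windows program works: the function names are hypothetical, the fallback code page `cp1252` is an assumption (a common Western European "ANSI" code page), and UTF-32 is deliberately ignored because its little-endian BOM begins with the same two bytes as the UTF-16 one.

```python
import codecs

# BOMs for the Unicode encodings discussed above, paired with the
# codec used to decode the bytes that follow the BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return (encoding, bom_length), or (None, 0) if no BOM is found."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0


def decode_windows_text(data, fallback="cp1252"):
    """Decode bytes using the BOM if present, else a legacy code page.

    The fallback is an assumption: without a BOM the real encoding
    cannot be known from the bytes alone.
    """
    encoding, skip = sniff_bom(data)
    if encoding is None:
        return data.decode(fallback)
    return data[skip:].decode(encoding)


if __name__ == "__main__":
    print(decode_windows_text(b"\xef\xbb\xbfhello"))   # UTF-8 with BOM
    print(decode_windows_text(b"\xff\xfeh\x00i\x00"))  # UTF-16 little-endian
```

A file without a BOM is decoded with the fallback code page here, which mirrors the ambiguity described above: 8-bit encodings cannot be told apart reliably by inspection.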
=== Unix text files ===
On Unix-like operating systems, the text file format is precisely described: POSIX defines a text file as a file that contains characters organized into zero or more lines, where lines are sequences of zero or more non-newline characters plus a terminating newline character, normally LF. Additionally, POSIX defines a '''printable file''' as a text file whose characters are printable or space or backspace according to regional rules. This excludes most control characters, which are not printable.
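
The line rule in this definition can be checked mechanically. The sketch below covers only the part of the definition quoted above (every line ends with LF); the function names are hypothetical, and the locale-dependent "printable" test is not attempted.

```python
def is_posix_text(data):
    """True if data consists of zero or more LF-terminated lines.

    An empty file has zero lines and qualifies; a non-empty file
    qualifies only if its final byte is LF.
    """
    return data == b"" or data.endswith(b"\n")


def split_posix_lines(data):
    """Return the lines of a POSIX text file without their terminators."""
    if not is_posix_text(data):
        raise ValueError("last line is not terminated by a newline")
    # split() yields one trailing empty item after the final LF; drop it.
    return data.split(b"\n")[:-1] if data else []


if __name__ == "__main__":
    print(is_posix_text(b"first\nsecond\n"))  # every line terminated
    print(is_posix_text(b"first\nsecond"))    # final line unterminated
    print(split_posix_lines(b"a\n\nb\n"))
```

Under this rule, a file whose last line lacks a terminator is not a text file in the POSIX sense, which is the opposite of the common DOS and Windows convention.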
=== Apple Macintosh text files ===
Prior to the advent of macOS, the classic Mac OS system regarded the content of a file (the data fork) as a text file when its resource fork indicated that the type of the file was "TEXT". Lines of classic Mac OS text files are terminated with CR characters. Being a Unix-like system, macOS uses Unix format for text files.
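
Because the three families use different line terminators (CR-LF on DOS and Windows, LF on Unix-like systems including macOS, CR on classic Mac OS), tools that exchange text files often normalize them. The following is a minimal sketch with hypothetical function names; it works on already-decoded strings and assumes the text was read without newline translation (for example with `open(path, newline="")` in Python).

```python
import re

# CR-LF must be tried first so that it is not counted as CR plus LF.
_EOL = re.compile(r"\r\n|\r|\n")

_NAMES = {"\r\n": "CRLF", "\r": "CR", "\n": "LF"}


def count_line_endings(text):
    """Count how often each terminator style occurs in text."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    for match in _EOL.findall(text):
        counts[_NAMES[match]] += 1
    return counts


def normalize_newlines(text, newline="\n"):
    """Rewrite every CR-LF, CR, or LF terminator as `newline`."""
    # A function is used as the replacement so that `newline` is
    # inserted literally, without template processing.
    return _EOL.sub(lambda _match: newline, text)


if __name__ == "__main__":
    mixed = "dos\r\nclassic mac\runix\n"
    print(count_line_endings(mixed))
    print(repr(normalize_newlines(mixed)))
```

Passing `newline="\r\n"` converts in the other direction, for example when preparing a file for a program that expects the DOS and Windows convention.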
The Uniform Type Identifier (UTI) used for text files in macOS is "public.plain-text". Additional, more specific UTIs are "public.utf8-plain-text" for UTF-8-encoded text, "public.utf16-external-plain-text" and "public.utf16-plain-text" for UTF-16-encoded text, and "com.apple.traditional-mac-plain-text" for classic Mac OS text files.

== Rendering ==