MarketDesign of the FAT file system
Company Profile

Design of the FAT file system

The FAT file system is a file system used on MS-DOS and Windows 9x family of operating systems. It continues to be used on mobile devices and embedded systems, and thus is a well-suited file system for data exchange between computers and devices of almost any type and age from 1981 through to the present.

Structure
A FAT file system is composed of four regions: FAT uses little-endian format for all entries in the header (except for, where explicitly mentioned, some entries on Atari ST boot sectors) and the FAT(s). It is possible to allocate more FAT sectors than necessary for the number of clusters. The end of the last sector of each FAT copy can be unused if there are no corresponding clusters. The total number of sectors (as noted in the boot record) can be larger than the number of sectors used by data (clusters × sectors per cluster), FATs (number of FATs × sectors per FAT), the root directory (n/a for FAT32), and hidden sectors including the boot sector: this would result in unused sectors at the end of the volume. If a partition contains more sectors than the total number of sectors occupied by the file system it would also result in unused sectors, at the end of the partition, after the volume. == Reserved sectors area ==
Reserved sectors area
Boot Sector On non-partitioned storage devices, such as floppy disks, the Boot Sector (VBR) is the first sector (logical sector 0 with physical CHS address 0/0/1 or LBA address 0). For partitioned storage devices such as hard disks, the Boot Sector is the first sector of a partition, as specified in the partition table of the device. BIOS Parameter Block DOS 3.0 BPB: The following extensions were documented since DOS 3.0, however, they were already supported by some issues of DOS 2.11. FS Information Sector The "FS Information Sector" was introduced in FAT32 for speeding up access times of certain operations (in particular, getting the amount of free space). It is located at a logical sector number specified in the FAT32 EBPB boot record at position (usually logical sector 1, immediately after the boot record itself). The sector's data may be outdated and not reflect the current media contents, because not all operating systems update or use this sector, and even if they do, the contents is not valid when the medium has been ejected without properly unmounting the volume or after a power-failure. Therefore, operating systems should first inspect a volume's optional shutdown status bitflags residing in the FAT entry of cluster 1 or the FAT32 EBPB at offset and ignore the data stored in the FS information sector, if these bitflags indicate that the volume was not properly unmounted before. This does not cause any problems other than a possible speed penalty for the first free space query or data cluster allocation; see fragmentation. If this sector is present on a FAT32 volume, the minimum allowed logical sector size is 512 bytes, whereas otherwise it would be 128 bytes. Some FAT32 implementations support a slight variation of Microsoft's specification by making the FS information sector optional by specifying a value of (or ) in the entry at offset . == FAT region ==
FAT region
File Allocation Table Cluster map A volume's data area is divided into identically sized clusters—small blocks of contiguous space. Cluster sizes vary depending on the type of FAT file system being used and the size of the drive; typical cluster sizes range from 2 to . Each file may occupy one or more clusters depending on its size. Thus, a file is represented in the FAT by a singly linked list. The FAT is made up of entries that are 12, 16, or 32 bits long, depending on whether it's FAT12, FAT16, or FAT32. The consecutive entries correspond to consecutive clusters, and the value of the entry is a link telling what the next cluster in the particular file is. Looking at the entry corresponding to that cluster tells where the next one is, and so on, until there's an entry that indicates that it corresponds to the last cluster in the cluster chain holding that file. The clusters are not necessarily adjacent to one another on the disk's surface but are often instead fragmented throughout the Data Region. Each version of the FAT file system uses a different size for FAT entries. Smaller numbers result in a smaller FAT, but waste space in large partitions by needing to allocate in large clusters. The FAT12 file system uses 12 bits per FAT entry, thus two entries span 3 bytes. It is consistently little-endian: if those three bytes are considered as one little-endian 24-bit number, the 12 least significant bits represent the first entry (e.g. cluster 0) and the 12 most significant bits the second (e.g. cluster 1). In other words, while the low eight bits of the first cluster in the row are stored in the first byte, the top four bits are stored in the low nibble of the second byte, whereas the low four bits of the subsequent cluster in the row are stored in the high nibble of the second byte and its higher eight bits in the third byte. The FAT16 file system uses 16 bits per FAT entry, thus one entry spans two bytes in little-endian byte order: The FAT32 file system uses 32 bits per FAT entry, thus one entry spans four bytes in little-endian byte order. The four top bits of each entry are reserved for other purposes; they are cleared during formatting and should not be changed otherwise. They must be masked off before interpreting the entry as 28-bit cluster address. The File Allocation Table (FAT) is a contiguous number of sectors immediately following the area of reserved sectors. It represents a list of entries that map to each cluster on the volume. Each entry records one of four things: • the cluster number of the next cluster in a chain • a special end of cluster-chain (EOC) entry that indicates the end of a chain • a special entry to mark a bad cluster • a zero to note that the cluster is unused For very early versions of DOS to recognize the file system, the system must have been booted from the volume or the volume's FAT must start with the volume's second sector (logical sector 1 with physical CHS address 0/0/2 or LBA address 1), that is, immediately following the boot sector. Operating systems assume this hard-wired location of the FAT in order to find the FAT ID in the FAT's cluster 0 entry on DOS 1.0-1.1 FAT diskettes, where no valid BPB is found. Special entries The first two entries in a FAT store special values: The first entry (cluster 0 in the FAT) holds the FAT ID since MS-DOS 1.20 and PC DOS 1.1 (allowed values - with - reserved) in bits 7-0, which is also copied into the BPB of the boot sector, offset since DOS 2.0. The remaining 4 bits (if FAT12), 8 bits (if FAT16) or 20 bits (if FAT32, the 4 MSB bits are zero) of this entry are always 1. These values were arranged so that the entry would also function as a "trap-all" end-of-chain marker for all data clusters holding a value of zero. Additionally, for FAT IDs other than (and ) it is possible to determine the correct nibble and byte order (to be) used by the file system driver, however, the FAT file system officially uses a little-endian representation only and there are no known implementations of variants using big-endian values instead. 86-DOS 0.42 up to MS-DOS 1.14 used hard-wired drive profiles instead of a FAT ID, but used this byte to distinguish between media formatted with 32-byte or 16-byte directory entries, as they were used prior to 86-DOS 0.42. The second entry (cluster 1 in the FAT) nominally stores the end-of-cluster-chain marker as used by the formater, but typically always holds / / , that is, with the exception of bits 31-28 on FAT32 volumes these bits are normally always set. Some Microsoft operating systems, however, set these bits if the volume is not the volume holding the running operating system (that is, use instead of here). (In conjunction with alternative end-of-chain markers the lowest bits 2-0 can become zero for the lowest allowed end-of-chain marker / / ; bit 3 should be reserved as well given that clusters / / and higher are officially reserved. Some operating systems may not be able to mount some volumes if any of these bits are not set, therefore the default end-of-chain marker should not be changed.) For DOS 1 and 2, the entry was documented as reserved for future use. Since DOS 7.1 the two most-significant bits of this cluster entry may hold two optional bitflags representing the current volume status on FAT16 and FAT32, but not on FAT12 volumes. These bitflags are not supported by all operating systems, but operating systems supporting this feature would set these bits on shutdown and clear the most significant bit on startup: If bit 15 (on FAT16) or bit 27 (on FAT32) is not set when mounting the volume, the volume was not properly unmounted before shutdown or ejection and thus is in an unknown and possibly "dirty" state. On FAT32 volumes, the FS Information Sector may hold outdated data and thus should not be used. The operating system would then typically run SCANDISK or CHKDSK on the next startup (but not on insertion of removable media) to ensure and possibly reestablish the volume's integrity. If bit 14 (on FAT16) or bit 26 (on FAT32) is cleared, the operating system has encountered disk I/O errors on startup, a possible indication for bad sectors. Operating systems aware of this extension will interpret this as a recommendation to carry out a surface scan (SCANDISK) on the next boot. (A similar set of bitflags exists in the FAT12/FAT16 EBPB at offset or the FAT32 EBPB at offset . While the cluster 1 entry can be accessed by file system drivers once they have mounted the volume, the EBPB entry is available even when the volume is not mounted and thus easier to use by disk block device drivers or partitioning tools.) If the number of FATs in the BPB is not set to 2, the second cluster entry in the first FAT (cluster 1) may also reflect the status of a TFAT volume for TFAT-aware operating systems. If the cluster 1 entry in that FAT holds the value 0, this may indicate that the second FAT represents the last known valid transaction state and should be copied over the first FAT, whereas the first FAT should be copied over the second FAT if all bits are set. Some non-standard FAT12/FAT16 implementations utilize the cluster 1 entry to store the starting cluster of a variable-sized root directory (typically 2). This may occur when the number of root directory entries in the BPB holds a value of 0 and no FAT32 EBPB is found (no signature or at offset ). This extension, however, is not supported by mainstream operating systems, as it conflicts with other possible uses of the cluster 1 entry. Most conflicts can be ruled out if this extension is only allowed =2 and -->for FAT12 with less than and FAT16 volumes with less than clusters and 2 FATs. Because these first two FAT entries store special values, there are no data clusters 0 or 1. The first data cluster (after the root directory if FAT12/FAT16) is cluster 2, marking the beginning of the data area. Cluster values FAT entry values: FAT32 uses 28 bits for cluster numbers. The remaining 4 bits in the 32-bit FAT entry are usually zero, but are reserved and should be left untouched. A standard conformant FAT32 file system driver or maintenance tool must not rely on the upper 4 bits to be zero and it must strip them off before evaluating the cluster number in order to cope with possible future expansions where these bits may be used for other purposes. They must not be cleared by the file system driver when allocating new clusters, but should be cleared during a reformat. == Root directory region ==
Root directory region
The root directory table in FAT12 and FAT16 file systems occupies the special Root Directory Region location. == Data region ==
Data region
Aside from the root directory table in FAT12 and FAT16 file systems, which occupies the special Root Directory Region location, all directory tables are stored in the data region. The actual number of entries in a directory stored in the data region can grow by adding another cluster to the chain in the FAT. Directory table A directory table is a special type of file that represents a directory (also known as a folder). Since 86-DOS 0.42, each file or (since MS-DOS 1.40 and PC DOS 2.0) subdirectory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the address of the first cluster of the file/directory's data, the size of the file/directory, and the date and (since PC DOS 1.1) also the time of last modification. Earlier versions of 86-DOS used 16-byte directory entries only, supporting no files larger than 16 MB and no time of last modification. The FAT file system itself does not impose any limits on the depth of a subdirectory tree for as long as there are free clusters available to allocate the subdirectories, however, the internal Current Directory Structure (CDS) under MS-DOS/PC DOS limits the absolute path of a directory to 66 characters (including the drive letter, but excluding the NUL byte delimiter),A:\", therefore their 63 characters equals our 66 characters --> thereby limiting the maximum supported depth of subdirectories to 32, whatever occurs earlier. Concurrent DOS, Multiuser DOS and DR DOS 3.31 to 6.0 (up to including the 1992-11 updates) do not store absolute paths to working directories internally and therefore do not show this limitation. The same applies to Atari GEMDOS, but the Atari Desktop does not support more than 8 sub-directory levels. Most applications aware of this extension support paths up to at least 127 bytes. FlexOS, 4680 OS and 4690 OS support a length of up to 127 bytes as well, allowing depths down to 60 levels. PalmDOS, DR DOS 6.0 (since BDOS 7.1) and higher, Novell DOS, and OpenDOS sport a MS-DOS-compatible CDS and therefore have the same length limits as MS-DOS/PC DOS. Each entry can be preceded by "fake entries" to support a VFAT long filename (LFN); see further below. Legal characters for DOS short filenames include the following: • Upper case letters A–Z • Numbers 0–9 • Space (though trailing spaces in either the base name or the extension are considered to be padding and not a part of the file name; also filenames with space in them could not easily be used on the DOS command line prior to Windows 95 because of the lack of a suitable escaping system). Another exception are the internal commands MKDIR/MD and RMDIR/RD under DR-DOS which accept single arguments and therefore allow spaces to be entered. • ! # $ % & ' ( ) - @ ^ _ ` { } ~ • Characters 128–228 • Characters 230–255 This excludes the following ASCII characters: • " * / : ? \ | Windows/MS-DOS has no shell escape character • + , . ; = [ ] Allowed in long file names only • Lower case letters a–zStored as A–Z; allowed in long file names • Control characters 0–31 • Character 127 (DEL) Character 229 () was not allowed as first character in a filename in DOS 1 and 2 due to its use as free entry marker. A special case was added to circumvent this limitation with DOS 3.0 and higher. The following additional characters are allowed on Atari's GEMDOS, but should be avoided for compatibility with MS-DOS/PC DOS: • " + , ; [ ] | The semicolon (;) should be avoided in filenames under DR DOS 3.31 and higher, PalmDOS, Novell DOS, OpenDOS, Concurrent DOS, Multiuser DOS, System Manager and REAL/32, because it may conflict with the syntax to specify file and directory passwords: "...\DIRSPEC.EXT;DIRPWD\FILESPEC.EXT;FILEPWD". The operating system will strip off one (and also two—since DR-DOS 7.02) semicolons and pending passwords from the filenames before storing them on disk. (The command processor 4DOS uses semicolons for include lists and requires the semicolon to be doubled for password protected files with any commands supporting wildcards.) The at-sign character (@) is used for filelists by many DR-DOS, PalmDOS, Novell DOS, OpenDOS and Multiuser DOS, System Manager and REAL/32 commands, as well as by 4DOS and may therefore sometimes be difficult to use in filenames. Under Multiuser DOS and REAL/32, the exclamation mark (!) is not a valid filename character since it is used to separate multiple commands in a single command line. Under IBM 4680 OS and 4690 OS, the following characters are not allowed in filenames: • ? * : . ; , [ ] ! + = " - / \ | Additionally, the following special characters are not allowed in the first, fourth, fifth and eight character of a filename, as they conflict with the host command processor (HCP) and input sequence table build file names: • @ # ( ) { } $ & The DOS file names are in the current OEM character set: this can have surprising effects if characters handled in one way for a given code page are interpreted differently for another code page (DOS command CHCP) with respect to lower and upper case, sorting, or validity as file name character. Directory entry Before Microsoft added support for long filenames and creation/access time stamps, bytes – of the directory entry were used by other operating systems to store additional metadata, most notably the operating systems of the Digital Research family stored file passwords, access rights, owner IDs, and file deletion data there. While Microsoft's newer extensions are not fully compatible with these extensions by default, most of them can coexist in third-party FAT implementations (at least on FAT12 and FAT16 volumes). 32-byte directory entries, both in the Root Directory Region and in subdirectories, are of the following format (see also 8.3 filename): Versions of DOS prior to 5.0 start scanning directory tables from the top of the directory table to the bottom. In order to increase chances for successful file undeletion, DOS 5.0 and higher will remember the position of the last written directory entry and use this as a starting point for directory table scans. Under DR DOS 6.0 and higher, including PalmDOS, Novell DOS and OpenDOS, the volume attribute is set for pending delete files and directories under DELWATCH. An attribute combination of is used to designate a VFAT long file name entry since MS-DOS 7.0. Older versions of DOS can mistake this for a directory volume label, as they take the first entry with volume attribute set as volume label. This problem can be avoided if a directory volume label is enforced as part of the format process; for this reason some disk tools explicitly write dummy "NO␠NAME␠␠␠␠" directory volume labels when the user does not specify a volume label. Since volume labels normally don't have the system attribute set at the same time, it is possible to distinguish between volume labels and VFAT LFN entries. The attribute combination could occasionally also occur as part of a valid pending delete file under DELWATCH, however on FAT12 and FAT16 volumes, VFAT LFN entries always have the cluster value at set to and the length entry at is never , whereas the entry at is always non-zero for pending delete files under DELWATCH. This check does not work on FAT32 volumes. • CP/M-86 and DOS Plus store user attributes F1'—F4' here. (DOS Plus 1.2 with BDOS 4.1 supports passwords only on CP/M media, not on FAT12 or FAT16 media. While DOS Plus 2.1 supported logical sectored FATs with a partition type , FAT16B and FAT32 volumes were not supported by these operation systems. Even if a partition would have been converted to FAT16B it would still not be larger than 32 MB. Therefore, this usage does not conflict with FAT32.IFS, FAT16+ or FAT32+ as they can never occur on the same type of volume.): • MSX-DOS 2: For a deleted file, the original first character of the filename. For the same feature in various other operating systems, see offset if enabled in MSX boot sectors at sector offset . MSX-DOS supported FAT12 volumes only, but third-party extensions for FAT16 volumes exist. Therefore, this usage does not conflict with FAT32.IFS and FAT32+ below. It does not conflict with the usage for user attributes under CP/M-86 and DOS Plus as well, since they are no longer important for deleted files. • Windows NT and later versions uses bits 3 and 4 to encode case information (see VFAT); otherwise 0. • DR-DOS 7.0x reserved bits other than 3 and 4 for internal purposes since 1997. The value should be set to 0 by formatting tools and must not be changed by disk tools. • On FAT32 volumes under OS/2 and eComStation the third-party FAT32.IFS driver utilizes this entry as a mark byte to indicate the presence of extra "␠EA.␠SF" files holding extended attributes with parameter /EAS. Version 0.70 to 0.96 used the magic values (no EAs), (normal EAs) and (critical EAs), whereas version 0.97 and higher since 2003-09 use , (normal EAs) and (critical EAs) as bitflags for compatibility with Windows NT. • First character of a deleted file under Novell DOS, OpenDOS and DR-DOS 7.02 and higher. A value of (229), as set by DELPURGE, will prohibit undeletion by UNDELETE, a value of will allow conventional undeletion asking the user for the missing first filename character. S/DOS 1 and PTS-DOS 6.51 and higher also support this feature if enabled with SAVENAME=ON in CONFIG.SYS. For the same feature in MSX-DOS, see offset . • Create time, fine resolution: 10 ms units, values from 0 to 199 (since DOS 7.0 with VFAT). Double usage for create time ms and file char does not create a conflict, since the creation time is no longer important for deleted files. • Under DR DOS 3.31 and higher including PalmDOS, Novell DOS and OpenDOS as well as under Concurrent DOS, Multiuser DOS, System Manager, and REAL/32 and possibly also under FlexOS, 4680 OS, 4690 OS any non-zero value indicates the password hash of a protected file, directory or volume label. The hash is calculated from the first eight characters of a password. If the file operation to be carried out requires a password as per the access rights bitmap stored at offset , the system tries to match the hash against the hash code of the currently set global password (by PASSWORD /G) or, if this fails, tries to extract a semicolon-appended password from the filespec passed to the operating system and checks it against the hash code stored here. A set password will be preserved even if a file is deleted and later undeleted. • Create time (since DOS 7.0 with VFAT). The hour, minute and second are encoded according to the following bitmap: :The seconds is recorded only to a 2 second resolution. Finer resolution for file creation is found at offset . If bits 15-11 > 23 or bits 10-5 > 59 or bits 4-0 > 29 here, or when bits 12-0 at offset hold an access bitmap and this is not a FAT32 volume or a volume using OS/2 Extended Attributes, then this entry actually holds a password hash, otherwise it can be assumed to be a file creation time. • FlexOS, 4680 OS and 4690 OS store a record size in the word at entry . This is mainly used for their special database-like file types random file, direct file, keyed file, and sequential file. If the record size is set to 0 (default) or 1, the operating systems assume a record granularity of 1 byte for the file, for which it will not perform record boundary checks in read/write operations. • With DELWATCH 2.00 and higher under Novell DOS 7, OpenDOS 7.01 and DR-DOS 7.02 and higher, this entry is used to store the last modified time stamp for pending delete files and directories. Cleared when file is undeleted or purged. See offset for a format description. • Create date (since DOS 7.0 with VFAT). The year, month and day are encoded according to the following bitmap: The usage for creation date for existing files does not conflict with last modified time for deleted files because they are never used at the same time. For the same reason, the usage for the record size of existing files and last modified time of deleted files does not conflict. Creation dates and record sizes cannot be used at the same time, however, both are stored only on file creation and never changed later on, thereby limiting the conflict to FlexOS, 4680 OS and 4690 OS systems accessing files created under foreign operating systems as well as potential display or file sorting problems on systems trying to interpret a record size as creation time. To avoid the conflict, the storage of creation dates should be an optional feature of operating systems supporting it. • FlexOS, 4680 OS, 4690 OS, Multiuser DOS, System Manager, REAL/32 and DR DOS 6.0 and higher with multi-user security enabled use this field to store owner IDs. Offset holds the user ID, the group ID of a file's creator. :In multi-user versions, system access requires a logon with account name and password, and the system assigns group and user IDs to running applications according to the previously set up and stored authorization info and inheritance rules. For 4680 OS and 4690 OS, group ID 1 is reserved for the system, group ID 2 for vendor, group ID 3 for the default user group. Background applications started by users have a group ID 2 and user ID 1, whereas operating system background tasks have group IDs 1 or 0 and user IDs 1 or 0. IBM 4680 BASIC and applications started as primary or secondary always get group ID 2 and user ID 1. When applications create files, the system will store their user ID and group ID and the required permissions with the file. • With DELWATCH 2.00 and higher under Novell DOS 7, OpenDOS 7.01 and DR-DOS 7.02 and higher, this entry is used to store the last modified date stamp for pending delete files and directories. Cleared when file is undeleted or purged. See offset for a format description. • Last access date (since DOS 7.0 if ACCDATE enabled in CONFIG.SYS for the corresponding drive); see offset for a format description. The usage for the owner IDs of existing files does not conflict with last modified date stamp for deleted files because they are never used at the same time. The usage of the last modified date stamp for deleted files does not conflict with access date since access dates are no longer important for deleted files. However, owner IDs and access dates cannot be used at the same time. • High two bytes of first cluster number in FAT32; with the low two bytes stored at offset . • Access rights bitmap for world/group/owner read/write/execute/delete protection for password protected files, directories (or volume labels) under DR DOS 3.31 and higher, including PalmDOS, Novell DOS and OpenDOS, and under FlexOS, 4680 OS, 4690 OS, Concurrent DOS, Multiuser DOS, System Manager, and REAL/32. :Typical values stored on a single-user system are (PASSWORD /N for all access rights "RWED"), (PASSWORD /D for access rights "RW?-"), (PASSWORD /W for access rights "R-?-") and (PASSWORD /R for files or PASSWORD /P for directories for access rights "--?-"). Bits 1, 5, 9, 12-15 will be preserved when changing access rights. If execute bits are set on systems other than FlexOS, 4680 OS or 4690 OS, they will be treated similar to read bits. (Some versions of PASSWORD allow to set passwords on volume labels (PASSWORD /V) as well.) :Single-user systems calculate the most restrictive rights of the three sets (DR DOS up to 5.0 used bits 0-3 only) and check if any of the requested file access types requires a permission and if a file password is stored. If not, file access is granted. Otherwise the stored password is checked against an optional global password provided by the operating system and an optional file password provided as part of the filename separated by a semicolon (not under FlexOS, 4680 OS, 4690 OS). If neither of them is provided, the request will fail. If one of them matches, the system will grant access (within the limits of the normal file attributes, that is, a read-only file can still not be opened for write this way), otherwise fail the request. :Under FlexOS, 4680 OS and 4690 OS the system assigns group and user IDs to applications when launched. When they request file access, their group and user IDs are compared with the group and user IDs of the file to be opened. If both IDs match, the application will be treated as file owner. If only the group ID matches, the operating system will grant group access to the application, and if the group ID does not match as well, it will grant world access. If an application's group ID and user ID are both 0, the operating system will bypass security checking. Once the permission class has been determined, the operating system will check if any of the access types of the requested file operation requires a permission according to the stored bitflags of the selected class owner, group or world in the file's directory entry. Owner, group and world access rights are independent and do not need to have diminishing access levels. Only, if none of the requested access types require a permission, the operating system will grant access, otherwise it fails. :If multiuser file / directory password security is enabled the system will not fail at this stage but perform the password checking mechanism for the selected permission class similar to the procedure described above. With multi-user security loaded many utilities since DR DOS 6.0 will provide an additional /U:name parameter. :File access rights bitmap: :File renames require either write or delete rights, IBM 4680 BASIC CHAIN requires execute rights. • Extended Attributes handle (used by OS/2 1.2 and higher as well as by Windows NT) in FAT12 and FAT16; first cluster of EA file or 0, if not used. A different method to store extended attributes has been devised for FAT32 volumes, see FAT32.IFS under offset . The storage of the high two bytes of the first cluster in a file on FAT32 partially conflicts with access rights bitmaps. • Last modified time (since PC DOS 1.1/MS-DOS 1.20); see offset for a format description. • Under Novell DOS, OpenDOS and DR-DOS 7.02 and higher, this entry holds the deletion time of pending delete files or directories under DELWATCH 2.00 or higher. The last modified time stamp is copied to for possible later restoration. See offset for a format description. • Last modified date; see offset for a format description. • Under Novell DOS, OpenDOS and DR-DOS 7.02 and higher, this entry holds the deletion date of pending delete files or directories under DELWATCH 2.00 or higher. The last modified date stamp is copied to for possible later restoration. See offset for a format description. Entries with the Volume Label flag, subdirectory ".." pointing to the FAT12 and FAT16 root, and empty files with size 0 should have first cluster 0. VFAT LFN entries also have this entry set to 0; on FAT12 and FAT16 volumes this can be used as part of a detection mechanism to distinguish between pending delete files under DELWATCH and VFAT LFNs; see above. VFAT LFN entries never store the value here. This can be used as part of a detection mechanism to distinguish between pending delete files under DELWATCH and VFAT LFNs; see above. The FlexOS-based operating systems IBM 4680 OS and IBM 4690 OS support unique distribution attributes stored in some bits of the previously reserved areas in the directory entries: • Local: Don't distribute file but keep on local controller only. • Mirror file on update: Distribute file to server only when file is updated. • Mirror file on close: Distribute file to server only when file is closed. • Compound file on update: Distribute file to all controllers when file is updated. • Compound file on close: Distribute file to all controllers when file is closed. Some incompatible extensions found in some operating systems include: == Size limits ==
Size limits
The FAT12, FAT16, FAT16B, and FAT32 variants of the FAT file systems have clear limits based on the number of clusters and the number of sectors per cluster (1, 2, 4, ..., 128). For the typical value of 512 bytes per sector: FAT12 requirements : 3 sectors on each copy of FAT for every 1,024 clusters FAT16 requirements : 1 sector on each copy of FAT for every 256 clusters FAT32 requirements : 1 sector on each copy of FAT for every 128 clusters FAT12 range : 1 to 4,084 clusters : 1 to 12 sectors per copy of FAT FAT16 range : 4,085 to 65,524 clusters : 16 to 256 sectors per copy of FAT FAT32 range : 65,525 to 268,435,444 clusters : 512 to 2,097,152 sectors per copy of FAT FAT12 minimum : 1 sector per cluster × 1 clusters = 512 bytes (0.5 KiB) FAT16 minimum : 1 sector per cluster × 4,085 clusters = 2,091,520 bytes (2,042.5 KB) FAT32 minimum : 1 sector per cluster × 65,525 clusters = 33,548,800 bytes (32,762.5 KB) FAT12 maximum : 64 sectors per cluster × 4,084 clusters = 133,824,512 bytes (≈ 127 MB) [FAT12 maximum : 128 sectors per cluster × 4,084 clusters = 267,694,024 bytes (≈ 255 MB)] FAT16 maximum : 64 sectors per cluster × 65,524 clusters = 2,147,090,432 bytes (≈2,047 MB) [FAT16 maximum : 128 sectors per cluster × 65,524 clusters = 4,294,180,864 bytes (≈4,095 MB)] FAT32 maximum : 8 sectors per cluster × 268,435,444 clusters = 1,099,511,578,624 bytes (≈1,024 GB) FAT32 maximum : 16 sectors per cluster × 268,173,557 clusters = 2,196,877,778,944 bytes (≈2,046 GB) [FAT32 maximum : 32 sectors per cluster × 134,152,181 clusters = 2,197,949,333,504 bytes (≈2,047 GB)] [FAT32 maximum : 64 sectors per cluster × 67,092,469 clusters = 2,198,486,024,192 bytes (≈2,047 GB)] [FAT32 maximum : 128 sectors per cluster × 33,550,325 clusters = 2,198,754,099,200 bytes (≈2,047 GB)] :Legend: 268435444+3 is , because FAT32 version 0 uses only 28 bits in the 32-bit cluster numbers, cluster numbers up to flag bad clusters or the end of a file, cluster number 0 flags a free cluster, and cluster number 1 is not used. Likewise 65524+3 is for FAT16, and 4084+3 is for FAT12. The number of sectors per cluster is a power of 2 fitting in a single byte, the smallest value is 1 (), the biggest value is 128 (). Lines in square brackets indicate the unusual cluster size 128, and for FAT32 the bigger than necessary cluster sizes 32 or 64. Because each FAT32 entry occupies 32 bits (4 bytes) the maximal number of clusters (268435444) requires 2097152 FAT sectors for a sector size of 512 bytes. 2097152 is , and storing this value needs more than two bytes. Therefore, FAT32 introduced a new 32-bit value in the FAT32 boot sector immediately following the 32-bit value for the total number of sectors introduced in the FAT16B variant. The boot record extensions introduced with DOS 4.0 start with a magic 40 () or 41 (). Typically FAT drivers look only at the number of clusters to distinguish FAT12, FAT16, and FAT32: the human readable strings identifying the FAT variant in the boot record are ignored, because they exist only for media formatted with DOS 4.0 or later. Determining the number of directory entries per cluster is straightforward. Each entry occupies 32 bytes; this results in 16 entries per sector for a sector size of 512 bytes. The DOS 5 RMDIR/RD command removes the initial "." (this directory) and ".." (parent directory) entries in subdirectories directly, therefore sector size 32 on a RAM disk is possible for FAT12, but requires 2 or more sectors per cluster. A FAT12 boot sector without the DOS 4 extensions needs 29 bytes before the first unnecessary FAT16B 32-bit number of hidden sectors, this leaves three bytes for the (on a RAM disk unused) boot code and the magic at the end of all boot sectors. On Windows NT the smallest supported sector size is 128. On Windows NT operating systems the FORMAT command options /A:128K and /A:256K correspond to the maximal cluster size 0x80 (128) with a sector size 1024 and 2048, respectively. For the common sector size 512 /A:64K yields 128 sectors per cluster. Both editions of each ECMA-107 and ISO/IEC 9293 specify a Max Cluster Number MAX determined by the formula MAX=1+trunc((TS-SSA)/SC), and reserve cluster numbers MAX+1 up to 4086 (, FAT12) and later 65526 (, FAT16) for future standardization. Microsoft's EFI FAT32 specification states that any FAT file system with less than 4085 clusters is FAT12, else any FAT file system with less than 65,525 clusters is FAT16, and otherwise it is FAT32. The entry for cluster 0 at the beginning of the FAT must be identical to the media descriptor byte found in the BPB, whereas the entry for cluster 1 reflects the end-of-chain value used by the formatter for cluster chains (, or ). The entries for cluster numbers 0 and 1 end at a byte boundary even for FAT12, e.g., for media descriptor . The first data cluster is 2, and consequently the last cluster MAX gets number MAX+1. This results in data cluster numbers 2...4085 () for FAT12, 2...65525 () for FAT16, and 2...268435445 () for FAT32. The only available values reserved for future standardization are therefore (FAT12) and (FAT16). As noted below "less than 4085" is also used for Linux implementations, or as Microsoft's FAT specification puts it: ...when it says <, it does not mean <=. Note also that the numbers are correct. The first number for FAT12 is 4085; the second number for FAT16 is 65525. These numbers and the "<" signs are not wrong." == Fragmentation ==
Fragmentation
The FAT file system does not contain built-in mechanisms which prevent newly written files from becoming scattered across the partition. On volumes where files are created and deleted frequently or their lengths often changed, the medium will become increasingly fragmented over time. While the design of the FAT file system does not cause any organizational overhead in disk structures or reduce the amount of free storage space with increased amounts of fragmentation, as it occurs with external fragmentation, the time required to read and write fragmented files will increase as the operating system will have to follow the cluster chains in the FAT (with parts having to be loaded into memory first in particular on large volumes) and read the corresponding data physically scattered over the whole medium reducing chances for the low-level block device driver to perform multi-sector disk I/O or initiate larger DMA transfers, thereby effectively increasing I/O protocol overhead as well as arm movement and head settle times inside the disk drive. Also, file operations will become slower with growing fragmentation as it takes increasingly longer for the operating system to find files or free clusters. Other file systems, e.g., HPFS or exFAT, use free space bitmaps that indicate used and available clusters, which could then be quickly looked up in order to find free contiguous areas. Another solution is the linkage of all free clusters into one or more lists (as is done in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead to performance penalties with large disks. In fact, seeking for files in large subdirectories or computing the free disk space on FAT volumes is one of the most resource intensive operations, as it requires reading the directory tables or even the entire FAT linearly. Since the total amount of clusters and the size of their entries in the FAT was still small on FAT12 and FAT16 volumes, this could still be tolerated on FAT12 and FAT16 volumes most of the time, considering that the introduction of more sophisticated disk structures would have also increased the complexity and memory footprint of real-mode operating systems with their minimum total memory requirements of 128 KB or less (such as with DOS) for which FAT has been designed and optimized originally. With the introduction of FAT32, long seek and scan times became more apparent, particularly on very large volumes. A possible justification suggested by Microsoft's Raymond Chen for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a "DIR" operation, which always displays the free disk space as the last line. Displaying this line took longer and longer as the number of clusters increased. FAT32 therefore introduced a special file system information sector where the previously computed amount of free space is preserved over power cycles, so that the free space counter needs to be recalculated only when a removable FAT32 formatted medium gets ejected without first unmounting it or if the system is switched off without properly shutting down the operating system, a problem mostly visible with pre-ATX-style PCs, on plain DOS systems and some battery-powered consumer products. With the huge cluster sizes (16 KB, 32 KB, 64 KB) forced by larger FAT partitions, internal fragmentation in form of disk space waste by file slack due to cluster overhang (as files are rarely exact multiples of cluster size) starts to be a problem as well, especially when there are a great many small files. Various optimizations and tweaks to the implementation of FAT file system drivers, block device drivers and disk tools have been devised to overcome most of the performance bottlenecks in the file system's inherent design without having to change the layout of the on-disk structures. They can be divided into on-line and off-line methods and work by trying to avoid fragmentation in the file system in the first place, deploying methods to better cope with existing fragmentation, and by reordering and optimizing the on-disk structures. With optimizations in place, the performance on FAT volumes can often reach that of more sophisticated file systems in practical scenarios, while at the same time retaining the advantage of being accessible even on very small or old systems. DOS 3.0 and higher will not immediately reuse disk space of deleted files for new allocations but instead seek for previously unused space before starting to use disk space of previously deleted files as well. This not only helps to maintain the integrity of deleted files for as long as possible but also speeds up file allocations and avoids fragmentation, since never before allocated disk space is always unfragmented. DOS accomplishes this by keeping a pointer to the last allocated cluster on each mounted volume in memory and starts searching for free space from this location upwards instead of at the beginning of the FAT, as it was still done by DOS 2.x. If the end of the FAT is reached, it would wrap around to continue the search at the beginning of the FAT until either free space has been found or the original position has been reached again without having found free space. These pointers are initialized to point to the start of the FATs after bootup, but on FAT32 volumes, DOS 7.1 and higher will attempt to retrieve the last position from the FS Information Sector. This mechanism is defeated, however, if an application often deletes and recreates temporary files as the operating system would then try to maintain the integrity of void data effectively causing more fragmentation in the end. In some DOS versions, the usage of a special API function to create temporary files can be used to avoid this problem. Additionally, directory entries of deleted files will be marked since DOS 3.0. DOS 5.0 and higher will start to reuse these entries only when previously unused directory entries have been used up in the table and the system would otherwise have to expand the table itself. Since DOS 3.3 the operating system provides means to improve the performance of file operations with FASTOPEN by keeping track of the position of recently opened files or directories in various forms of lists (MS-DOS/PC DOS) or hash tables (DR-DOS), which can reduce file seek and open times significantly. Before DOS 5.0 special care must be taken when using such mechanisms in conjunction with disk defragmentation software bypassing the file system or disk drivers. Windows NT will allocate disk space to files on FAT in advance, selecting large contiguous areas, but in case of a failure, files which were being appended will appear larger than they were ever written into, with a lot of random data at the end. Other high-level mechanisms may read in and process larger parts or the complete FAT on startup or on demand when needed and dynamically build up in-memory tree representations of the volume's file structures different from the on-disk structures. This may, on volumes with many free clusters, occupy even less memory than an image of the FAT itself. In particular on highly fragmented or filled volumes, seeks become much faster than with linear scans over the actual FAT, even if an image of the FAT would be stored in memory. Also, operating on the logically high level of files and cluster-chains instead of on sector or track level, it becomes possible to avoid some degree of file fragmentation in the first place or to carry out local file defragmentation and reordering of directory entries based on their names or access patterns in the background. Some of the perceived problems with fragmentation of FAT file systems also result from performance limitations of the underlying block device drivers, which becomes more visible the lesser memory is available for sector buffering and track blocking/deblocking: While the single-tasking DOS had provisions for multi-sector reads and track blocking/deblocking, the operating system and the traditional PC hard disk architecture (only one outstanding input/output request at a time and no DMA transfers) originally did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching next data while the application was processing the previous chunks. Such features became available later. Later DOS versions also provided built-in support for look-ahead sector buffering and came with dynamically loadable disk caching programs working on physical or logical sector level, often utilizing EMS or XMS memory and sometimes providing adaptive caching strategies or even run in protected mode through DPMS or Cloaking to increase performance by gaining direct access to the cached data in linear memory rather than through conventional DOS APIs. Write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a power failure or crash, made easier by the lack of hardware protection between applications and the system. == VFAT long file names ==
VFAT long file names
VFAT Long File Names (LFNs) are stored on a FAT file system using a trick: adding additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding ), which is a combination that is not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered as empty and is allowed to be deleted; such a situation appears if files created with long names are deleted from plain DOS. This method is very similar to the DELWATCH method to utilize the volume attribute to hide pending delete files for possible future undeletion since DR DOS 6.0 (1991) and higher. It is also similar to a method publicly discussed to store long filenames on Ataris and under Linux in 1992. Because older versions of DOS could mistake LFN names in the root directory for the volume label, VFAT was designed to create a blank volume label in the root directory before adding any LFN name entries (if a volume label did not already exist). Each phony entry can contain up to 13 UCS-2 characters (26 bytes) by using fields in the record which contain file size or time stamps (but not the starting cluster field, for compatibility with disk utilities, the starting cluster field is set to a value of 0. See 8.3 filename for additional explanations). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UCS-2 characters. If the position of the LFN's last character is not at a directory entry boundary (13, 26, 39, ...), then a terminator is added in the next character position. Then, if that terminator is also not at the boundary, remaining character positions are filled with . No directory entry containing a lone terminator will exist. LFN entries use the following format: If there are multiple LFN entries required to represent a file name, the entry representing the end of the filename comes first. The sequence number of this entry has bit 6 () set to represent that it is the last logical LFN entry, and it has the highest sequence number. The sequence number decreases in the following entries. The entry representing the start of the filename has sequence number 1. A value of is used to indicate that the entry is deleted. On FAT12 and FAT16 volumes, testing for the values at to be zero and at to be non-zero can be used to distinguish between VFAT LFNs and pending delete files under DELWATCH. For example, a filename like "File with very long filename.ext" would be formatted like this: A checksum also allows verification of whether a long file name matches the 8.3 name; such a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (pFCBName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the filename, and the last three are the extension. The dot is implicit. Any unused space in the filename is padded with space characters (ASCII ). For example, "Readme.txt" would be "README␠␠TXT".) unsigned char lfn_checksum(const unsigned char *pFCBName) { int i; unsigned char sum = 0; for (i = 11; i; i--) sum = ((sum & 1) > 1) + *pFCBName++; return sum; } If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions of Windows such as XP. Instead, two bits in byte of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support it. This creates a backwards-compatibility problem with older Windows versions (Windows 95 / 98 / 98 SE / ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported between operating systems, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing. == See also ==
tickerdossier.comtickerdossier.substack.com