Fragmentation occurs whenever a piece of data or free space is broken into several pieces. Fragmentation is often accepted in return for improvements in speed or simplicity. Analogous phenomena occur for other resources such as processors; see below.
Memory Memory fragmentation can be
internal and
external. Internal fragmentation occurs when
memory allocation provides more space than needed, causing waste of free space. External fragmentation occurs when free memory is fragmented, preventing newer, larger blocks to be allocated. In modern operating systems, the program receives fixed-size
pages which it then allocates among different parts of itself (often by an allocator such as malloc). As a result, there are two layers of
allocation that may each result in their own internal and external fragmentation: on the level of the memory allocator (e.g.
malloc) and on the level of pages. (Older operating systems might not use pages, but programs still tend to use a two-level system to avoid excessive calls to the OS.)
Internal fragmentation greatly reduces slack space. Due to the rules governing
memory allocation, more computer memory is sometimes
allocated than is needed. When this happens, the excess memory goes to waste. In this scenario, the unusable memory, known as
slack space, is contained within an allocated region. This arrangement, termed fixed partitions, suffers from inefficient memory use - any process, no matter how small, occupies an entire partition. This waste is called
internal fragmentation. •
Memory paging creates internal fragmentation because an entire
page frame (typically 4–64 KiB) will be allocated whether or not that much storage is needed. • malloc and similar allocators tend to provide a degree of
alignment, so that memory can only be provided to programs in chunks (usually a multiple of 4 bytes). As a result if a program requests perhaps 29 bytes, it will actually get a chunk of 32 bytes. Internal fragmentation on the allocator level is very difficult to reclaim after it has occurred; usually the best way to remove it is with a design change. For example, in
dynamic memory allocation,
memory pools drastically cut internal fragmentation by spreading the space overhead over a larger number of objects. Internal fragmentation on the page level (which is external fragmentation on the allocator level) can be reclaimed by moving allocated blocks. This requires editing
pointers to reflect the moving of allocated blocks, which is difficult for non-garbage-collected languages.
Compacting garbage collectors perform pointer fixups as it compacts the virtual address space.
External fragmentation External fragmentation arises when free memory is separated into small blocks and is interspersed by allocated memory. It is a weakness of certain storage allocation algorithms, when they fail to order memory used by programs efficiently. The result is that, although free storage is available, it is effectively unusable because it is divided into pieces that are too small individually to satisfy the demands of the application. The term "external" refers to the fact that the unusable storage is outside the allocated regions. For example, consider a situation wherein a program allocates three continuous blocks of memory and then frees the middle block. The memory allocator can use this free block of memory for future allocations. However, it cannot use this block if the memory to be allocated is larger in size than this free block. External fragmentation on the page level can be reclaimed by an application's own allocator or garbage collector as it frees up and hands back (uncommits) pages to the OS. External fragmentation on the page level can also be reclaimed by the operating system by moving pages in the physical address space and editing the
page table to match. This is called
page migration. The application is oblivious to the process because the virtual memory addresses remain unchanged.
File systems ''''
occurs when a collection of data in memory is broken up into many pieces that are not close together. File systems interact with block devices such as disk drives. These devices manage data in small, fixed-sized units called blocks or sectors, which filesystems abstract into their own units called blocks'' or
clusters. When a file system is created, there is free space to store file blocks together
contiguously. This allows for rapid sequential file reads and writes. However, as files are added, removed, and changed in size, the free space becomes externally fragmented, leaving only small holes in which to place new data. This is
external fragmentation. File systems differ from memory in that they are designed with fragmentation in mind: when a new file is written, or when an existing file is extended, the operating system puts the new data in new non-contiguous groups of clusters (
extents) to fit into the available holes. The new data blocks are necessarily scattered, slowing access due to
seek time and
rotational latency of the read/write head, and incurring additional overhead to manage additional locations. This is called
file system fragmentation. The effect is even worse if a file which is divided into many small pieces is deleted, because this leaves similarly small regions of free spaces. When writing a new file of a known size, if there are any empty holes that are larger than that file, the operating system can avoid data fragmentation by putting the file into any one of those holes. There are a variety of algorithms for selecting which of those potential holes to put the file; each of them is a
heuristic approximate solution to the
bin packing problem. The "best fit" algorithm chooses the smallest hole that is big enough. The "worst fit" algorithm chooses the largest hole. The "
first-fit algorithm" chooses the first hole that is big enough. The "next fit" algorithm keeps track of where each file was written. The "next fit" algorithm is faster than "first fit," which is in turn faster than "best fit," which is the same speed as "worst fit". Just as compaction can eliminate external fragmentation, (external) data fragmentation can be eliminated by rearranging data storage so that related pieces are close together. For example, the primary job of a
defragmentation tool is to rearrange blocks on disk so that the blocks of each file are contiguous. Most defragmenting utilities also attempt to reduce or eliminate free space fragmentation. Some moving
garbage collectors, utilities that perform automatic memory management, will also move related objects close together (this is called
compacting) to improve cache performance. There are four kinds of systems that never experience (external) data fragmentation—they always store every file contiguously. All four kinds have significant disadvantages compared to systems that allow at least some temporary data fragmentation: • Simply write each file
contiguously. If there isn't already enough contiguous free space to hold the file, the system immediately fails to store the file—even when there are many little bits of free space from deleted files that add up to more than enough to store the file. • If there isn't already enough contiguous free space to hold the file, use a
copying collector to convert many little bits of free space into one contiguous free region big enough to hold the file. This takes a lot more time than breaking the file up into fragments and putting those fragments into the available free space. • Write the file into any free block, through
fixed-size blocks storage. If a programmer picks a fixed block size too small, the system immediately fails to store some files—files larger than the block size—even when there are many free blocks that add up to more than enough to store the file. If a programmer picks a block size too big, a lot of space is wasted on internal fragmentation. • Some systems avoid dynamic allocation entirely, pre-storing (contiguous) space for all possible files they will need—for example,
MultiFinder pre-allocates a chunk of RAM to each application as it was started according to how much RAM that application's programmer claimed it would need. Internal fragmentation is rarely a concern on file systems because it does not have a global effect of making the entire file system's file allocation worse. It does cause a storage space
overhead and slow down access to large sets of affected small files (the set as a whole can be considered "fragmented"). To combat this problem, filesystems such as ext4 and btrfs have an
inline data or
inline file feature to forgo allocating a cluster for these files and instead directly store it in the
inode (file record).
Comparison Compared to external fragmentation, overhead and internal fragmentation account for little loss in terms of wasted memory and reduced performance. It is defined as: : Fragmentation of 0% means that all the free memory is in a single large block; fragmentation is 90% (for example) when 100 MB free memory is present but largest free block of memory for storage is just 10 MB. External fragmentation tends to be less of a problem in file systems than in primary memory (RAM) storage systems, because programs usually require their RAM storage requests to be fulfilled with contiguous blocks, but file systems typically are designed to be able to use any collection of available blocks (fragments) to assemble a file which logically appears contiguous. Therefore, if a highly fragmented file or many small files are deleted from a full volume and then a new file with size equal to the newly freed space is created, the new file will simply reuse the same fragments that were freed by the deletion. If what was deleted was one file, the new file will be just as fragmented as that old file was, but in any case there will be no barrier to using all the (highly fragmented) free space to create the new file. In RAM, on the other hand, the storage systems used often cannot assemble a large block to meet a request from small noncontiguous free blocks, and so the request cannot be fulfilled and the program cannot proceed to do whatever it needed that memory for (unless it can reissue the request as a number of smaller separate requests). ==Problems==