A backup strategy requires an information repository, "a secondary storage space for data" that aggregates backups of data "sources". The repository could be as simple as a list of all backup media (DVDs, etc.) and the dates produced, or could include a computerized index, catalog, or
relational database.
3-2-1 Backup Rule The backup data needs to be stored, requiring a
backup rotation scheme.
Backup methods Unstructured An unstructured repository may simply be a stack of tapes, DVD-Rs or external HDDs with minimal information about what was backed up and when. This method is the easiest to implement, but unlikely to achieve a high level of recoverability as it lacks automation.
Full only/System imaging A repository using this backup method contains complete source data copies taken at one or more specific points in time. Copying
system images, this method is frequently used by computer technicians to record known good configurations. However, imaging is generally more useful as a way of deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems.
Incremental An
incremental backup stores data changed since a reference point in time. Duplicate copies of unchanged data are not copied. Typically a full backup of all files is made once or at infrequent intervals, serving as the reference point for an incremental repository. Subsequently, a number of incremental backups are made after successive time periods. Restores begin with the last full backup and then apply the incrementals in order. Some backup systems can create a synthetic full backup from a series of incrementals, thus providing the equivalent of frequently doing a full backup. When done to modify a single archive file, this speeds restores of recent versions of files.
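The restore procedure described above can be sketched as follows. This is a minimal illustration, not how any particular backup product works: it assumes each backup is an in-memory mapping from file path to contents, and the names `make_incremental` and `restore` are hypothetical. Applying every incremental to the full backup also yields the equivalent of a fresh full backup.

```python
from typing import Dict, List, Optional

# Assumption for this sketch: a backup is a mapping from file path to contents.
# In an incremental, a value of None marks a file deleted since the previous backup.
Backup = Dict[str, Optional[bytes]]


def make_incremental(previous_state: Dict[str, bytes],
                     current_state: Dict[str, bytes]) -> Backup:
    """Record only what changed between two states of the source."""
    delta: Backup = {}
    for path, data in current_state.items():
        if previous_state.get(path) != data:
            delta[path] = data      # new or modified file
    for path in previous_state:
        if path not in current_state:
            delta[path] = None      # deleted file
    return delta


def restore(full: Dict[str, bytes], incrementals: List[Backup]) -> Dict[str, bytes]:
    """Start from the full backup and apply each incremental in order."""
    state = dict(full)
    for delta in incrementals:
        for path, data in delta.items():
            if data is None:
                state.pop(path, None)
            else:
                state[path] = data
    return state


full = {"a.txt": b"v1", "b.txt": b"v1"}
day1 = {"a.txt": b"v2", "b.txt": b"v1"}   # a.txt modified
day2 = {"a.txt": b"v2", "c.txt": b"v1"}   # b.txt deleted, c.txt created

inc1 = make_incremental(full, day1)       # holds only a.txt
inc2 = make_incremental(day1, day2)       # holds c.txt and a deletion marker for b.txt
restored = restore(full, [inc1, inc2])    # equals the day2 state
```

Unchanged files never appear in an incremental, which is why the restore must always start from the full backup.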
Near-CDP Continuous Data Protection (CDP) refers to a backup that instantly saves a copy of every change made to the data. This allows restoration of data to any point in time and is the most comprehensive and advanced form of data protection. Near-CDP backup applications, often
marketed as "CDP", automatically take incremental backups at a specific interval, for example every 15 minutes, one hour, or 24 hours. They can therefore only allow restores to an interval boundary. Such applications are typically based on periodic snapshots:
read-only copies of the data frozen at a particular
point in time. Near-CDP (except for
Apple Time Machine)
intent-logs every change on the host system, often by saving byte or block-level differences rather than file-level differences. This backup method differs from simple
disk mirroring in that it enables a roll-back of the log and thus a restoration of old images of data. Intent-logging allows precautions for the consistency of live data, protecting
self-consistent files but requiring
applications "be quiesced and made ready for backup." Near-CDP is more practicable for ordinary personal backup applications, as opposed to
true CDP, which must be run in conjunction with a virtual machine or equivalent and is therefore generally used in enterprise client-server backups. Software may create copies of individual files such as written documents, multimedia projects, or user preferences, to prevent failed write events caused by power outages, operating system crashes, or exhausted disk space, from causing data loss. A common implementation is an appended
".bak" extension to the
file name.
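A minimal sketch of the ".bak" convention, assuming the application simply copies the existing file before overwriting it. The function name `write_with_backup` and the file name `notes.txt` are hypothetical, and real applications may use other strategies such as writing to a temporary file and renaming it.

```python
import os
import shutil
import tempfile


def write_with_backup(path: str, new_text: str) -> None:
    """Copy the existing file to '<name>.bak', then overwrite the original."""
    if os.path.exists(path):
        shutil.copy2(path, path + ".bak")   # copy2 also copies file metadata where possible
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(new_text)


workdir = tempfile.mkdtemp()
document = os.path.join(workdir, "notes.txt")

write_with_backup(document, "first draft")    # nothing to back up yet
write_with_backup(document, "second draft")   # the first draft is kept in notes.txt.bak

with open(document + ".bak", encoding="utf-8") as handle:
    previous = handle.read()
with open(document, encoding="utf-8") as handle:
    current = handle.read()
```

If the second write had failed partway through, the previous version would still be recoverable from the ".bak" copy.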
Reverse incremental A
reverse incremental backup method stores a recent archive file "mirror" of the source data and a series of differences between the "mirror" in its current state and its previous states. A reverse incremental backup method starts with a non-image full backup. After the full backup is performed, the system periodically synchronizes the full backup with the live copy, while storing the data necessary to reconstruct older versions. This can be done either using
hard links, as Apple Time Machine does, or using binary
diffs.
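The hard-link approach can be illustrated with a small sketch. It is an assumption-laden simplification, not Apple's implementation: each snapshot is a directory that looks like a complete copy, unchanged files are hard links to the file in the previous snapshot, and only new or modified files are copied. The function name `take_snapshot` and the directory layout are hypothetical, and the target filesystem is assumed to support hard links.

```python
import filecmp
import os
import shutil
import tempfile
from typing import Optional


def take_snapshot(source: str, snapshot: str, previous: Optional[str] = None) -> None:
    """Build a complete-looking snapshot; unchanged files are hard links into `previous`."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(snapshot, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(snapshot, rel, name))
            old = os.path.normpath(os.path.join(previous, rel, name)) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)        # unchanged: share storage with the older snapshot
            else:
                shutil.copy2(src, dst)   # new or modified: store a fresh copy


base = tempfile.mkdtemp()
source = os.path.join(base, "source")
os.makedirs(source)
for name, text in (("stable.txt", "unchanged"), ("report.txt", "draft 1")):
    with open(os.path.join(source, name), "w", encoding="utf-8") as handle:
        handle.write(text)

snap1 = os.path.join(base, "snap1")
take_snapshot(source, snap1)

with open(os.path.join(source, "report.txt"), "w", encoding="utf-8") as handle:
    handle.write("draft 2")

snap2 = os.path.join(base, "snap2")
take_snapshot(source, snap2, previous=snap1)

# stable.txt occupies storage once; report.txt exists in two distinct versions.
shared = os.path.samefile(os.path.join(snap1, "stable.txt"),
                          os.path.join(snap2, "stable.txt"))
separate = not os.path.samefile(os.path.join(snap1, "report.txt"),
                                os.path.join(snap2, "report.txt"))
```

Every snapshot directory can be browsed or restored on its own, while unchanged data is stored only once.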
Differential A
differential backup saves only the data that has changed since the last full backup. This means a maximum of two backups from the repository are used to restore the data. However, as time from the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system requires starting from the most recent full backup and then applying just the last differential backup. A differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Changes in files may be detected through a more recent date/time of last modification
file attribute, and/or changes in file size. Other variations of incremental backup include multi-level incrementals and block-level incrementals that compare parts of files instead of just entire files.
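The difference between the two selection rules comes down to which reference time is used. The sketch below assumes change detection by last-modification time only; the function name `changed_since`, the file names, and the timeline values are hypothetical.

```python
from typing import Dict, List


def changed_since(mtimes: Dict[str, float], reference_time: float) -> List[str]:
    """Return the files whose last-modification time is later than the reference."""
    return sorted(path for path, mtime in mtimes.items() if mtime > reference_time)


# Hypothetical timeline in arbitrary units: full backup at t=0,
# further backups at t=1 and t=2, and the next backup about to run at t=3.
mtimes = {
    "d.txt": -1.0,   # untouched since before the full backup
    "a.txt": 0.5,    # changed between t=0 and t=1
    "b.txt": 1.5,    # changed between t=1 and t=2
    "c.txt": 2.5,    # changed after t=2
}

last_full_backup = 0.0
last_backup_of_any_type = 2.0

differential = changed_since(mtimes, last_full_backup)         # a.txt, b.txt, c.txt
incremental = changed_since(mtimes, last_backup_of_any_type)   # c.txt only
```

The differential set keeps growing until the next full backup, whereas each incremental set stays small but depends on every backup before it.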
Storage media (Figure: a disc in a plastic cover, a USB flash drive and an
external hard drive.) Regardless of the repository model that is used, the data has to be copied onto an archive file data storage medium. The medium used is also referred to as the type of backup destination.
Magnetic tape Magnetic tape was for a long time the most commonly used medium for bulk data storage, backup, archiving, and interchange. It was previously a less expensive option, but this is no longer the case for smaller amounts of data. Tape is a
sequential access medium, so the rate of continuously writing or reading data can be very fast. While tape media itself has a low cost per unit of capacity,
tape drives are typically dozens of times as expensive as
hard disk drives and
optical drives. Tape media are generally
rotated on a schedule so at least one set is off-site in case something should happen to the building where the primary copy of the data lives. Many tape formats have been proprietary or specific to certain markets like mainframes or a particular brand of personal computer. By 2014
LTO had become the primary tape technology. The other remaining viable "super" format is the
IBM 3592 (also referred to as the TS11xx series). The
Oracle StorageTek T10000 was discontinued in 2016.
Hard disk The use of
hard disk storage has increased over time as it has become progressively cheaper. Hard disks are usually easy to use, widely available, and can be accessed quickly. In the mid-2000s, several drive manufacturers began to produce portable drives employing
ramp loading and accelerometer technology (sometimes termed a "shock sensor"), and by 2010 the industry average in drop tests for drives with that technology showed drives remaining intact and working after a 36-inch non-operating drop onto industrial carpeting. Some manufacturers also offer 'ruggedized' portable hard drives, which include a shock-absorbing case around the hard disk, and
claim a range of higher drop specifications. Over a period of years, hard disk backups are less stable than tape backups.
Optical storage Optical storage uses lasers to store and retrieve data. Recordable
CDs, DVDs, and
Blu-ray Discs are commonly used with personal computers and are generally cheap. The capacities and speeds of these discs have typically been lower than hard disks or tapes. Advances in optical media may shrink that gap in the future. Many optical disc formats are
WORM type, which makes them useful for archival purposes since the data cannot be changed in any way, including by user error and by malware such as
ransomware. Moreover, optical discs are
not vulnerable to
head crashes, magnetism, imminent water ingress or
power surges, and a fault of the drive typically just halts the spinning. Optical media is
modular: the storage controller is not tied to the media itself, as it is with hard drives or flash storage (see
flash memory controller), allowing the disc to be removed and accessed through a different drive. However, recordable media may degrade earlier under long-term exposure to light. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity. A French study in 2008 indicated that the lifespan of typically-sold
CD-Rs was 2–10 years, but one manufacturer later estimated the longevity of its CD-Rs with a gold-sputtered layer to be as high as 100 years. Sony's
proprietary Optical Disc Archive Solid-state drive Solid-state drives (SSDs) use
integrated circuit assemblies to store data.
Flash memory,
thumb drives,
USB flash drives,
CompactFlash,
SmartMedia,
Memory Sticks, and
Secure Digital card devices are relatively expensive for their low capacity, but convenient for backing up relatively low data volumes. A solid-state drive does not contain any movable parts, making it less susceptible to physical damage, and can have high throughput, from around 500 Mbit/s up to 6 Gbit/s. Available SSDs have become more capacious and cheaper. Cloud-based backup (through services such as
Google Drive and
Microsoft OneDrive) provides a layer of data protection.
Online Online backup storage is typically the most accessible type of data storage, and can begin a restore in milliseconds. An internal hard disk or a
disk array (possibly connected to a
SAN) is an example of an online backup. This type of storage is convenient and speedy, but is vulnerable to being deleted or overwritten, either by accident, by malevolent action, or in the wake of a data-deleting
virus payload.
Near-line Nearline storage is typically less accessible and less expensive than online storage, but still useful for backup data storage. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage. An example is a
tape library with restore times ranging from seconds to a few minutes.
Off-line Off-line storage requires some direct action to provide access to the storage media: for example, inserting a tape into a tape drive or plugging in a cable. Because the data are not accessible via any computer except during the limited periods in which they are written or read back, they are largely immune to on-line backup failure modes. Access time varies depending on whether the media are on-site or off-site.
Off-site data protection Backup media may be sent to an
off-site vault to protect against a disaster or other site-specific problem. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. A data replica can be off-site but also on-line (e.g., an off-site
RAID mirror).
Backup site A
backup site or disaster recovery center is used to store data that can enable computer systems and networks to be restored and properly configured in the event of a disaster. Some organisations have their own data recovery centres, while others contract this out to a third-party. Due to high costs, backing up is rarely considered the preferred method of moving data to a DR site. A more typical way would be remote
disk mirroring, which keeps the DR data as up to date as possible.
Selection and extraction of data