Replication (computing)

Replication in computing can refer to: • Data replication, where the same data is stored on multiple storage devices • Computation replication, where the same computing task is executed many times. Computational tasks may be: • Replicated in space, where tasks are executed on separate devices • Replicated in time, where tasks are executed repeatedly on a single device Replication in space or in time is often linked to scheduling algorithms. Access to a replicated entity is typically uniform with access to a single non-replicated entity. The replication itself should be transparent to an external user. In a failure scenario, a failover of replicas should be hidden as much as possible with respect to quality of service. Computer scientists further describe replication as being either: • Active replication, which is performed by processing the same request at every replica • Passive replication, which involves processing every request on a single replica and transferring the result to the other replicas When one leader replica is designated via leader election to process all the requests, the system is using a primary-backup or primary-replica scheme, which is predominant in high-availability clusters. In comparison, if any replica can process a request and distribute a new state, the system is using a multi-primary or multi-master scheme. In the latter case, some form of distributed concurrency control must be used, such as a distributed lock manager. Load balancing differs from task replication, since it distributes a load of different computations across machines, and allows a single computation to be dropped in case of failure. Load balancing, however, sometimes uses data replication (especially multi-master replication) internally, to distribute its data among machines. Backup differs from replication in that the saved copy of data remains unchanged for a long period of time. Replicas, on the other hand, undergo frequent updates and quickly lose any historical state. Replication is one of the oldest and most important topics in the overall area of distributed systems. Data replication and computation replication both require processes to handle incoming events. Processes for data replication are passive and operate only to maintain the stored data, reply to read requests and apply updates. Computation replication is usually performed to provide fault-tolerance, and take over an operation if one component fails. In both cases, the underlying needs are to ensure that the replicas see the same events in equivalent orders, so that they stay in consistent states and any replica can respond to queries. Replication models in distributed systems Three widely cited models exist for data replication, each having its own properties and performance: • Transactional replication: used for replicating transactional data, such as a database. The one-copy serializability model is employed, which defines valid outcomes of a transaction on replicated data in accordance with the overall ACID (atomicity, consistency, isolation, durability) properties that transactional systems seek to guarantee. • State machine replication: assumes that the replicated process is a deterministic finite automaton and that atomic broadcast of every event is possible. It is based on distributed consensus and has a great deal in common with the transactional replication model. This is sometimes mistakenly used as a synonym of active replication. State machine replication is usually implemented by a replicated log consisting of multiple subsequent rounds of the Paxos algorithm. This was popularized by Google's Chubby system, and is the core behind the open-source Keyspace data store. • Virtual synchrony: involves a group of processes which cooperate to replicate in-memory data or to coordinate actions. The model defines a distributed entity called a process group. A process can join a group and is provided with a checkpoint containing the current state of the data replicated by group members. Processes can then send multicasts to the group and will see incoming multicasts in the identical order. Membership changes are handled as a special multicast that delivers a new "membership view" to the processes in the group. == Database replication ==