The protocol has two components, the Failure Detector Component and the Dissemination Component.

The Failure Detector Component functions as follows:
• Every T' time units, each node (N_1) sends a ping to a random other node (N_2) in its membership list.
• If N_1 receives a response from N_2, N_2 is considered healthy and N_1 updates its "last heard from" timestamp for N_2 to the current time.
• If N_1 does not receive a response, N_1 contacts k other nodes on its list (\{N_3,...,N_{2+k}\}) and requests that they ping N_2.
• If no successful response has been received after T' units of time, N_1 marks N_2 as failed.

The Dissemination Component functions as follows:
• Upon detecting a failed node N_2, N_1 sends a multicast message to the rest of the nodes in its membership list, with information about the failed node.
• Voluntary requests for a node to enter or leave the group are also sent via multicast.

== Properties ==
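The failure-detection round and the dissemination step of the protocol can be sketched in Python as follows. This is a minimal single-process sketch under stated assumptions, not a reference implementation: the names `probe_round`, `disseminate_failure`, `ping`, and `send` are hypothetical, `ping(src, dst)` is assumed to return `True` only if `dst` answers within the timeout, an indirect acknowledgement is assumed to refresh the "last heard from" timestamp, and multicast is modelled as a loop of individual sends.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical transport callbacks (assumptions, not part of any real library).
Ping = Callable[[str, str], bool]                 # ping(src, dst) -> answered in time?
Send = Callable[[str, Tuple[str, str]], None]     # send(dst, message)


def probe_round(self_id: str, members: List[str], last_heard: Dict[str, float],
                ping: Ping, k: int, now: float,
                rng: random.Random) -> Optional[str]:
    """Run one failure-detector round; return the node marked failed, or None."""
    candidates = [m for m in members if m != self_id]
    if not candidates:
        return None
    target = rng.choice(candidates)               # random other node (N_2)

    # Step 1: direct ping.
    if ping(self_id, target):
        last_heard[target] = now
        return None

    # Step 2: ask up to k other members to ping the target on our behalf.
    pool = [m for m in candidates if m != target]
    helpers = rng.sample(pool, min(k, len(pool)))
    for helper in helpers:
        # The request must reach the helper, and the helper must reach the target.
        if ping(self_id, helper) and ping(helper, target):
            last_heard[target] = now              # assumption: indirect ack counts
            return None

    # Step 3: no successful response, so the target is marked as failed.
    return target


def disseminate_failure(self_id: str, members: List[str], failed: str,
                        send: Send) -> None:
    """Drop the failed node locally and notify every remaining member."""
    members.remove(failed)
    for m in members:
        if m != self_id:
            send(m, ("failed", failed))           # stands in for one multicast
```

A caller would invoke `probe_round` once every T' time units and pass any returned node to `disseminate_failure`. Voluntary join and leave announcements could reuse the same `send` loop with a different message tag.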