Sequential algorithm Given a Graph G(V,E), it is easy to find a single MIS using the following algorithm: • Initialize I to an
empty set. • While V is not empty: • Choose a node v∈V; • Add v to the set I; • Remove from V the node v and all its neighbours. • Return I.
Random-selection parallel algorithm [Luby's Algorithm] The following algorithm finds a MIS in time O(log
n). • Initialize I to an empty set. • While V is not empty: • Choose a random set of vertices S ⊆ V, by selecting each vertex v independently with probability 1/(2d(v)), where d is the degree of v (the number of neighbours of v). • For every edge in E, if both its endpoints are in the random set S, then remove from S the endpoint whose degree is lower (i.e. has fewer neighbours). Break ties arbitrarily, e.g. using a
lexicographic order on the vertex names. • Add the set S to I. • Remove from V the set S and all the neighbours of nodes in S. • Return I.
ANALYSIS: For each node v, divide its neighbours to
lower neighbours (whose degree is lower than the degree of v) and
higher neighbours (whose degree is higher than the degree of v), breaking ties as in the algorithm. Call a node v
bad if more than 2/3 of its neighbors are higher neighbours. Call an edge
bad if both its endpoints are bad; otherwise the edge is
good. • At least 1/2 of all edges are always good. PROOF: Build a directed version of G by directing each edge to the node with the higher degree (breaking ties arbitrarily). So for every bad node, the number of out-going edges is more than 2 times the number of in-coming edges. So every bad edge, that enters a node v, can be matched to a distinct set of two edges that exit the node v. Hence the total number of edges is at least 2 times the number of bad edges. • For every good node u, the probability that a neighbour of u is selected to S is at least a certain positive constant. PROOF: the probability that NO neighbour of u is selected to S is at most the probability that none of u's
lower neighbours is selected. For each lower-neighbour v, the probability that it is not selected is (1-1/2d(v)), which is at most (1-1/2d(u)) (since d(u)>d(v)). The number of such neighbours is at least d(u)/3, since u is good. Hence the probability that no lower-neighbour is selected is at most 1-exp(-1/6). • For every node u that is selected to S, the probability that u will be removed from S is at most 1/2. PROOF: This probability is at most the probability that a higher-neighbour of u is selected to S. For each higher-neighbour v, the probability that it is selected is at most 1/2d(v), which is at most 1/2d(u) (since d(v)>d(u)). By union bound, the probability that no higher-neighbour is selected is at most d(u)/2d(u) = 1/2. • Hence, for every good node u, the probability that a neighbour of u is selected to S and remains in S is a certain positive constant. Hence, the probability that u is removed, in each step, is at least a positive constant. • Hence, for every good edge e, the probability that e is removed, in each step, is at least a positive constant. So the number of good edges drops by at least a constant factor each step. • Since at least half the edges are good, the total number of edges also drops by a constant factor each step. • Hence, the number of steps is O(log
m), where
m is the number of edges. This is bounded by O(\log(n)). A worst-case graph, in which the average number of steps is \Theta(\log(n)), is a graph made of
n/2 connected components, each with 2 nodes. The degree of all nodes is 1, so each node is selected with probability 1/2, and with probability 1/4 both nodes in a component are not chosen. Hence, the number of nodes drops by a factor of 4 each step, and the expected number of steps is \log_4(n).
Random-priority parallel algorithm The following algorithm is better than the previous one in that at least one new node is always added in each connected component: • Initialize I to an empty set. • While V is not empty: • Let W be the set of vertices in V with no earlier neighbours (based on the fixed ordering); • Add W to I; • Remove from V the nodes in the set W and all their neighbours. • Return I. Between the totally sequential and the totally parallel algorithms, there is a continuum of algorithms that are partly sequential and partly parallel. Given a fixed ordering on the nodes and a factor δ∈(0,1], the following algorithm returns the same MIS: • Initialize I to an empty set. • While V is not empty: • Select a factor δ∈(0,1]. • Let P be the set of δ
n nodes that are first in the fixed ordering. • Let W be a MIS on P using the totally parallel algorithm. • Add W to I; • Remove from V all the nodes in the prefix P, and all the neighbours of nodes in the set W. • Return I. Setting δ=1/
n gives the totally
sequential algorithm; setting δ=1 gives the totally parallel algorithm.
ANALYSIS: With a proper selection of the parameter δ in the partially parallel algorithm, it is possible to guarantee that it finishes after at most log(n) calls to the fully parallel algorithm, and the number of steps in each call is at most log(n). Hence the total run-time of the partially parallel algorithm is O(\log^2(n)). Hence the run-time of the fully parallel algorithm is also at most O(\log^2(n)). The main proof steps are: • If, in step
i, we select \delta = 2^i \log{(n)} / D, where
D is the maximum degree of a node in the graph, then
WHP all nodes remaining after step
i have degree at most D / 2^i. Thus, after log(
D) steps, all remaining nodes have degree 0 (since
D\delta = C \log{(n)}/d (for any constant
C), then
WHP the longest path in the directed graph determined by the fixed ordering has length O(\log(n)). Hence the fully parallel algorithm takes at most O(\log(n)) steps (since the longest path is a worst-case bound on the number of steps in that algorithm). • Combining these two facts gives that, if we select \delta = 2^i \log{(n)} / D, then WHP the run-time of the partially parallel algorithm is O(\log(D)\log(n)) = O(\log^2(n)). == Listing all maximal independent sets ==