The cost of list labeling depends on m, the size of the range from which labels are drawn. Suppose that no more than n items are stored in the list-labeling structure at any time. Four cases have been studied:
* m = 2^{\Omega(n)}
* m = n^{\Omega(1)}
* m = O(n)
* m = (1+\varepsilon) n
===Exponential labels===
In the exponential-label case, each inserted item can be given the label that is the average of its neighbors' labels. It takes \Omega(n) insertions before two items have adjacent labels, leaving no label available for an item between them. When this happens, all items are relabeled evenly across the space of all labels, which incurs O(n) relabeling cost. Thus the amortized relabeling cost in this case is O(1).
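The strategy above can be sketched as follows. The label-universe size M, the flat-list representation, and the function names are illustrative assumptions of this sketch, not part of any particular published algorithm.

```python
# Sketch of the exponential-label strategy: a new item gets the average of its
# neighbors' labels; when a gap closes, every item is relabeled evenly across
# the whole label space (the O(n) rebuild paid once per Omega(n) insertions).
M = 2 ** 64            # label universe {0, ..., M-1}; the case m = 2^{Omega(n)}
labels = []            # labels of the stored items, in list order

def relabel_evenly(extra=0):
    """Spread all labels evenly over {0, ..., M-1}, reserving room for `extra` items."""
    gap = M // (len(labels) + extra + 1)
    for j in range(len(labels)):
        labels[j] = (j + 1) * gap

def insert_after(i):
    """Insert a new item after position i (i == -1 inserts at the front)."""
    lo = labels[i] if i >= 0 else -1                   # virtual label below the front
    hi = labels[i + 1] if i + 1 < len(labels) else M   # virtual label past the back
    if hi - lo < 2:                                    # no free label strictly between
        relabel_evenly(extra=1)                        # O(n) rebuild, then retry
        insert_after(i)
        return
    labels.insert(i + 1, (lo + hi) // 2)               # average of neighboring labels
```

Because M is exponentially larger than the number of items, a gap can be halved \Omega(n) times before the rebuild triggers, which is what makes the amortized cost O(1).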
===Polynomial labels===
The other cases of list labeling can be solved via balanced binary search trees. Consider T, a binary search tree of height h on the set S of stored items. We can label every node in the tree via a
path label as follows: Let \sigma(X) be the sequence of left and right edges on the root-to-X path, encoded as bits. So if X is in the left subtree of the root, the high-order bit of \sigma(X) is 0, and if it is in the right subtree of the root, the high-order bit of \sigma(X) is 1. Once we reach X, we complete \sigma(X) to a length of h+1 bits as follows. If X is a leaf, we append 0s as the low-order bits until \sigma(X) has h+1 bits. If X is an internal node, we append one 0 and then 1s as the low-order bits until \sigma(X) has h+1 bits. The important properties of \sigma() are that the labels lie in the range \{0, 1, \ldots, 2^{h+1}-1\}, and that for two nodes with keys X and Y in T, if X < Y then \sigma(X) < \sigma(Y). To see this latter property, first suppose that the least common ancestor of X and Y is neither X nor Y. Then \sigma(X) and \sigma(Y) share bits until their least common ancestor. If X < Y, then because T is a search tree, X will be in the left subtree and will have a next bit of 0, whereas Y will be in the right subtree and will have a next bit of 1, so \sigma(X) < \sigma(Y). Suppose instead that,
without loss of generality, the least common ancestor of X and Y is X, and that X has depth d. If Y is in the left subtree of X (so Y < X), then \sigma(X) and \sigma(Y) share the first d+1 bits. The remaining bits of \sigma(X) are all 1s, whereas the remaining bits of \sigma(Y) must contain a 0, so \sigma(Y) < \sigma(X). If instead Y is in the right subtree of X (so X < Y), then \sigma(X) and \sigma(Y) share the first d bits, the (d+1)st bit of \sigma(X) is 0, and the (d+1)st bit of \sigma(Y) is 1. Hence \sigma(X) < \sigma(Y). We conclude that the \sigma() function fulfills the monotonicity property of the label() function. Thus if we can balance the
binary tree to a height of \log(m) - 1, we will have a solution to the list-labeling problem with labels in the range \{0,\ldots,m-1\}.
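The path-label construction can be illustrated concretely. Representing the root-to-X path as a string of 'L'/'R' edges, and passing the height and leaf flag explicitly, are assumptions of this sketch.

```python
# Illustrative computation of the path label sigma(X) for a tree of height h.
# `path` is the root-to-X path as a string of 'L'/'R' edges (an assumption of
# this sketch); `is_leaf` says whether X is a leaf or an internal node.
def sigma(path, h, is_leaf):
    bits = ['0' if step == 'L' else '1' for step in path]  # edges, high-order first
    if is_leaf:
        bits += ['0'] * (h + 1 - len(bits))       # leaves: pad with low-order 0s
    else:
        bits += ['0'] + ['1'] * (h - len(bits))   # internal: one 0, then 1s
    return int(''.join(bits), 2)                  # label in {0, ..., 2^(h+1)-1}
```

For a complete tree of height 2, the in-order sequence of nodes LL, L, LR, root, RL, R, RR receives labels 0 through 6, matching the monotonicity property above.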
===Weight-balanced trees===
In order to use a
self-balancing binary search tree to solve the list labeling problem, we need to first define the cost function of a balancing operation on insertion or deletion to equal the number of labels that are changed, since every rebalancing operation of the tree would have to also update all path labels in the subtree rooted at the site of the rebalance. So, for example, rotating a node with a subtree of size k, which can be done in constant time under usual circumstances, requires \Omega(k) path label updates. In particular, if the node being rotated is the root then the rotation would take time linear in the size of the whole tree. With that much time the entire tree could be rebuilt. We will see below that there are self-balancing binary search tree data structures that cause an appropriate number of label updates during rebalancing. A
weight-balanced tree BB[\alpha] is defined as follows. For every node X in a rooted tree T, define size(X) to be the number of nodes in the subtree rooted at X, and let the left and right children of X be X.left and X.right, respectively. A tree T is \alpha-weight-balanced if for every internal node X in T, size(X.left) \ge \lfloor \alpha \cdot size(X)\rfloor and size(X.right) \ge \lfloor \alpha \cdot size(X)\rfloor. The height of a BB[\alpha] tree with n nodes is at most \log_{1/(1-\alpha)} n = -\log(n)/\log(1-\alpha). Therefore, in order to achieve a height of \log(m) - 1 and thereby solve the list-labeling problem, we need \alpha = 1 - n^{-1/(\log(m)-1)}. A
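A minimal sketch of the weight-balance check and of the choice of \alpha; the dictionary-based node representation and the function names are assumptions of this sketch. It tests each child's subtree size against an \lfloor \alpha \cdot size(X) \rfloor threshold at every internal node X.

```python
import math

def size(node):
    """Number of nodes in the subtree rooted at node (None is the empty tree)."""
    if node is None:
        return 0
    return 1 + size(node["left"]) + size(node["right"])

def is_alpha_balanced(node, alpha):
    """Check the weight-balance condition at every internal node of the tree."""
    if node is None or (node["left"] is None and node["right"] is None):
        return True                  # empty trees and leaves are trivially balanced
    s = size(node)
    if size(node["left"]) < math.floor(alpha * s):
        return False
    if size(node["right"]) < math.floor(alpha * s):
        return False
    return (is_alpha_balanced(node["left"], alpha)
            and is_alpha_balanced(node["right"], alpha))

def alpha_for_labels(n, m):
    """The alpha for which the height bound log_{1/(1-alpha)} n equals log2(m) - 1."""
    return 1 - n ** (-1 / (math.log2(m) - 1))
```

With both children required to hold at least an \alpha fraction of the subtree, the heavier child holds at most a 1-\alpha fraction, which is where the \log_{1/(1-\alpha)} n height bound comes from.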
scapegoat tree is a weight-balanced tree in which, whenever a node no longer satisfies the weight-balance condition, the entire subtree rooted at that node is rebuilt. This rebalancing scheme is ideal for list labeling, since the cost of rebalancing now equals the cost of relabeling. The amortized cost of an insertion or deletion is (1+ 1/(1-2\alpha))\log_{1/(1-\alpha)} n + O(1). For the list-labeling problem, the cost becomes:
* m = n^{\Omega(1)}: with \alpha = \Theta(1), the cost of list labeling is amortized O(\log n). (Folklore, a modification of Itai, Konheim, and Rodeh.)
* m = O(n): with \alpha = 1/2 - \Theta(1/\log n), the cost of list labeling is amortized O(\log^2 n). This bound was first achieved by Itai, Konheim, and Rodeh.
* m = (1+\varepsilon) n: if m is a power of two, then we can set \alpha = 1/2 - \Theta(\varepsilon/\log n), and the cost of list labeling is amortized O(\varepsilon^{-1}\log^2 n). A more careful algorithm can achieve this bound even when m is not a power of two.

==Lower bounds and open problems==