AVL tree

In computer science, an AVL tree is a self-balancing binary search tree. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.

Definition
Balance factor
In a binary tree the balance factor of a node X is defined to be the height difference
:\mathrm{BF}(X) := \mathrm{Height}(\mathrm{RightSubtree}(X)) - \mathrm{Height}(\mathrm{LeftSubtree}(X))
of its two child subtrees. A binary tree is defined to be an AVL tree if the invariant
:\mathrm{BF}(X) \in \{-1,0,1\}
holds for every node X in the tree. A node X with \mathrm{BF}(X)<0 is called "left-heavy", one with \mathrm{BF}(X)>0 is called "right-heavy", and one with \mathrm{BF}(X)=0 is sometimes simply called "balanced".

Properties
Balance factors can be kept up-to-date by knowing the previous balance factors and the change in height; it is not necessary to know the absolute height. For holding the AVL balance information, two bits per node are sufficient.

The height h (counted as the maximal number of levels) of an AVL tree with n nodes lies in the interval
:\log_2(n+1) \le h < \log_\varphi(n+2) + b
where \varphi := \tfrac{1+\sqrt 5}2 \approx 1.618 is the golden ratio and b := \frac{\log_2 5}{2 \log_2 \varphi} - 2 \approx -0.3277. This is because an AVL tree of height h contains at least F_{h+2}-1 nodes, where \{F_n\}_{n\in\N} is the Fibonacci sequence with the seed values F_1=F_2=1.
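The definition above can be checked mechanically. The following is a minimal sketch in C, assuming a hypothetical `node` struct that stores no balance information at all, so heights are recomputed on every call. The names `height`, `balance_factor` and `is_avl` are illustrative and do not come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; no balance data is stored. */
typedef struct node {
    int key;
    struct node *left, *right;
} node;

/* Height counted in levels: the empty tree has height 0, a leaf height 1. */
int height(const node *t) {
    if (t == NULL) return 0;
    int hl = height(t->left), hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* BF(X) = Height(RightSubtree(X)) - Height(LeftSubtree(X)). */
int balance_factor(const node *x) {
    assert(x != NULL);
    return height(x->right) - height(x->left);
}

/* Returns 1 if BF(X) is in {-1, 0, 1} for every node X, otherwise 0.
   Only the height balance is checked, not the search-tree key order. */
int is_avl(const node *t) {
    if (t == NULL) return 1;
    int bf = balance_factor(t);
    return bf >= -1 && bf <= 1 && is_avl(t->left) && is_avl(t->right);
}
```

A chain of three nodes linked only through left children has a root with balance factor -2, so `is_avl` rejects it. Recomputing heights this way is wasteful; a real implementation would keep the balance factor or the height in each node.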
Operations
Read-only operations of an AVL tree involve carrying out the same actions as would be carried out on an unbalanced binary search tree, but modifications have to observe and restore the height balance of the subtrees.

Searching
Searching for a specific key in an AVL tree can be done the same way as in any balanced or unbalanced binary search tree. In order for search to work effectively it has to employ a comparison function which establishes a total order (or at least a total preorder) on the set of keys. The number of comparisons required for a successful search is limited by the height h, and for an unsuccessful search it is very close to h, so both are in O(log n), where n is the number of nodes in the tree.

Traversal
As a read-only operation the traversal of an AVL tree functions the same way as on any other binary tree. Exploring all nodes of the tree visits each link exactly twice: one downward visit to enter the subtree rooted by that node, another visit upward to leave that node's subtree after having explored it. Once a node has been found in an AVL tree, the next or previous node can be accessed in amortized constant time.

After an insertion, it is necessary to check each of the inserted node's ancestors for consistency with the invariants of AVL trees: this is called "retracing". This is achieved by considering the balance factor of each node. Since with a single insertion the height of an AVL subtree cannot increase by more than one, the temporary balance factor of a node after an insertion will be in the range [–2, +2]. For each node checked, if the temporary balance factor remains in the range from –1 to +1 then only an update of the balance factor and no rotation is necessary. However, if the temporary balance factor is ±2, the subtree rooted at this node is AVL unbalanced, and a rotation is needed.

The function Join on two AVL trees TL and TR and a key k will return a tree containing all elements in TL and TR, as well as k. It requires k to be greater than all keys in TL and smaller than all keys in TR.
If the heights of the two trees differ by at most one, Join simply creates a new node with left subtree TL, root k and right subtree TR. Otherwise, suppose that TL is higher than TR by more than one (the other case is symmetric). Join follows the right spine of TL until it reaches a node c which is balanced with TR. At this point a new node with left child c, root k and right child TR is created to replace c. The new node satisfies the AVL invariant, and its height is one greater than that of c. The increase in height can increase the height of its ancestors, possibly invalidating the AVL invariant of those nodes. This can be fixed either with a double rotation if invalid at the parent or a single left rotation if invalid higher in the tree, in both cases restoring the height for any further ancestor nodes. Join will therefore require at most two rotations. The cost of this function is proportional to the difference of the heights of the two input trees.

    function JoinRightAVL(TL, k, TR)
        (l, k', c) = expose(TL)
        if (Height(c) <= Height(TR)+1)
            T' = Node(c, k, TR)
            if (Height(T') <= Height(l)+1) then return Node(l, k', T')
            else return rotateLeft(Node(l, k', rotateRight(T')))
        else
            T' = JoinRightAVL(c, k, TR)
            T'' = Node(l, k', T')
            if (Height(T') <= Height(l)+1) return T''
            else return rotateLeft(T'')

    function JoinLeftAVL(TL, k, TR)
        /* symmetric to JoinRightAVL */

    function Join(TL, k, TR)
        if (Height(TL)>Height(TR)+1) return JoinRightAVL(TL, k, TR)
        if (Height(TR)>Height(TL)+1) return JoinLeftAVL(TL, k, TR)
        return Node(TL, k, TR)

Here Height(v) is the height of a subtree (node) v. (l,k,r) = expose(v) extracts v's left child l, the key k of v's root, and the right child r. Node(l,k,r) means to create a node of left child l, key k, and right child r.

To split an AVL tree into two smaller trees, those smaller than key k, and those greater than key k, first draw a path from the root by inserting k into the AVL tree. After this insertion, all values less than k will be found on the left of the path, and all values greater than k will be found on the right.
By applying Join, all the subtrees on the left side are merged bottom-up using keys on the path as intermediate nodes from bottom to top to form the left tree, and the right part is symmetric. The cost of Split is O(log n), the order of the height of the tree.

    function Split(T, k)
        if (T = nil) return (nil, false, nil)
        (L,m,R) = expose(T)
        if (k = m) return (L, true, R)
        if (k<m)
            (L',b,R') = Split(L, k)
            return (L', b, Join(R', m, R))
        if (k>m)
            (L',b,R') = Split(R, k)
            return (Join(L, m, L'), b, R')

The union of two AVL trees t1 and t2 representing sets A and B is an AVL tree t that represents A ∪ B.

    function Union(t1, t2):
        if t1 = nil:
            return t2
        if t2 = nil:
            return t1
        (t<, b, t>) = Split(t2, t1.root)
        return Join(Union(left(t1), t<), t1.root, Union(right(t1), t>))

Here, Split is presumed to return two trees: one holding the keys less than its input key, one holding the greater keys. (The algorithm is non-destructive, but an in-place destructive version exists as well.)

The algorithm for intersection or difference is similar, but requires the Join2 helper routine that is the same as Join but without the middle key. Based on the new functions for union, intersection or difference, either one key or multiple keys can be inserted into or deleted from the AVL tree. Since Split calls Join but does not deal with the balancing criteria of AVL trees directly, such an implementation is usually called the "join-based" implementation.

The complexity of each of union, intersection and difference is \text{O}\left(m \log \left({n\over m}+1\right)\right) for AVL trees of sizes m and n \; (\ge m). More importantly, since the recursive calls to union, intersection or difference are independent of each other, they can be executed in parallel with a parallel depth \text{O}(\log m\log n). When m=1, the join-based implementation has the same computational DAG as single-element insertion and deletion.
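The Join and Split routines can be turned into compilable C along the following lines. This is only a sketch under stated assumptions: the `node` layout with a stored `height` field is hypothetical, the version is the in-place (destructive) one, and the key k is passed as an already allocated node so that Node(l, k, r) becomes a relinking step (`make`). The helper names `make`, `new_node` and `insert` are illustrative.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout: height is stored per node, counted in levels. */
typedef struct node {
    int key, height;
    struct node *left, *right;
} node;

int height(const node *t) { return t ? t->height : 0; }

/* Node(l, k, r): the node n carries the key k and is relinked in place. */
node *make(node *n, node *l, node *r) {
    n->left = l;
    n->right = r;
    n->height = 1 + (height(l) > height(r) ? height(l) : height(r));
    return n;
}

node *new_node(int key) {
    node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->key = key;
    return make(n, NULL, NULL);
}

node *rotate_left(node *x) {   /* the right child of x becomes the root */
    node *z = x->right;
    make(x, x->left, z->left);
    return make(z, x, z->right);
}

node *rotate_right(node *x) {  /* the left child of x becomes the root */
    node *z = x->left;
    make(x, z->right, x->right);
    return make(z, z->left, x);
}

/* JoinRightAVL: assumes height(tl) > height(tr) + 1; tl itself carries k'. */
node *join_right(node *tl, node *k, node *tr) {
    node *l = tl->left, *c = tl->right;
    if (height(c) <= height(tr) + 1) {
        node *t1 = make(k, c, tr);
        if (height(t1) <= height(l) + 1) return make(tl, l, t1);
        return rotate_left(make(tl, l, rotate_right(t1)));
    }
    node *t1 = join_right(c, k, tr);
    node *t2 = make(tl, l, t1);
    return height(t1) <= height(l) + 1 ? t2 : rotate_left(t2);
}

/* JoinLeftAVL: mirror image, assumes height(tr) > height(tl) + 1. */
node *join_left(node *tl, node *k, node *tr) {
    node *c = tr->left, *r = tr->right;
    if (height(c) <= height(tl) + 1) {
        node *t1 = make(k, tl, c);
        if (height(t1) <= height(r) + 1) return make(tr, t1, r);
        return rotate_right(make(tr, rotate_left(t1), r));
    }
    node *t1 = join_left(tl, k, c);
    node *t2 = make(tr, t1, r);
    return height(t1) <= height(r) + 1 ? t2 : rotate_right(t2);
}

node *join(node *tl, node *k, node *tr) {
    if (height(tl) > height(tr) + 1) return join_right(tl, k, tr);
    if (height(tr) > height(tl) + 1) return join_left(tl, k, tr);
    return make(k, tl, tr);
}

/* Split: *lo receives keys < key, *hi keys > key; returns 1 if key was found. */
int split(node *t, int key, node **lo, node **hi) {
    if (t == NULL) { *lo = *hi = NULL; return 0; }
    node *l = t->left, *r = t->right, *part;
    int found;
    if (key == t->key) { *lo = l; *hi = r; free(t); return 1; }
    if (key < t->key) {
        found = split(l, key, lo, &part);
        *hi = join(part, t, r);
    } else {
        found = split(r, key, &part, hi);
        *lo = join(l, t, part);
    }
    return found;
}

/* Join-based insertion of one key; an existing equal key is replaced. */
node *insert(node *t, int key) {
    node *lo, *hi;
    split(t, key, &lo, &hi);
    return join(lo, new_node(key), hi);
}
```

Because `split` reuses the nodes on the search path as the middle keys of the joins, the input tree must not be used after the call. A Union in the style of the pseudocode could be layered on top of `split` and `join` in the same way as `insert`.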
Rebalancing
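If during a modifying operation the height difference between two child subtrees changes, this may, as long as it is less than 2, be reflected by an update of the balance information at the parent. During insert and delete operations a temporary height difference of 2 may arise, which means that the parent subtree has to be rebalanced by tree rotations. Rotations move the keys only "vertically", so the in-order sequence of the keys is fully preserved.

Let X be the node that has a temporary balance factor of –2 or +2, and let Z be its higher child (see figures 2 and 3). In case of insertion, the insertion has happened to one of Z's children in a way that Z's height has increased. In case of deletion, the deletion has happened to the sibling t1 of Z in a way so that t1's height, being already lower, has decreased. (This is the only case where Z's balance factor may also be 0.)

There are four possible variants of the violation:
• Right Right: Z is a right child of its parent X and BF(Z) ≥ 0
• Left Left: Z is a left child of its parent X and BF(Z) ≤ 0
• Right Left: Z is a right child of its parent X and BF(Z) < 0
• Left Right: Z is a left child of its parent X and BF(Z) > 0
And the rebalancing is performed differently:
• Right Right: X is rebalanced with a simple rotation rotate_Left (see figure 2)
• Left Left: X is rebalanced with a simple rotation rotate_Right (mirror image of figure 2)
• Right Left: X is rebalanced with a double rotation rotate_RightLeft (see figure 3)
• Left Right: X is rebalanced with a double rotation rotate_LeftRight (mirror image of figure 3)
Thereby, the situations are denoted as C B, where C (= child direction) and B (= balance) come from the set { Left, Right } with Right := –Left. The balance violation of case C = B is repaired by a simple rotation rotate_(–C), whereas the case C ≠ B is repaired by a double rotation rotate_CB. The cost of a rotation, either simple or double, is constant.

Simple rotation
Figure 2 shows a Right Right situation. In its upper half, node X has two child trees with a balance factor of +2. Moreover, the inner child t23 of Z (i.e., left child when Z is right child, or right child when Z is left child) is not higher than its sibling t4. This can happen by a height increase of subtree t4 or by a height decrease of subtree t1. In the latter case, the pale situation where t23 has the same height as t4 may also occur.

The result of the left rotation is shown in the lower half of the figure. Three links (thick edges in figure 2) and two balance factors are to be updated. As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2 and after the rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2, where it is again when t23 and t4 were of the same height. Otherwise the leaf layer reaches level h+1, so that the height of the rotated tree decreases.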
Code snippet of a simple left rotation:

    node *rotate_Left(node *X, node *Z) {
        // Z is by 2 higher than its sibling
        node *t23 = left_child(Z); // Inner child of Z
        right_child(X) = t23;
        if (t23 != null)
            parent(t23) = X;
        left_child(Z) = X;
        parent(X) = Z;
        // 1st case, BF(Z) == 0,
        //   only happens with deletion, not insertion:
        if (BF(Z) == 0) { // t23 has been of same height as t4
            BF(X) = +1;   // t23 now higher
            BF(Z) = -1;   // t4 now lower than X
        } else {
            // 2nd case happens with insertion or deletion:
            BF(X) = 0;
            BF(Z) = 0;
        }
        return Z; // return new root of rotated subtree
    }

Double rotation
Figure 3 shows a Right Left situation. In its upper third, node X has two child trees with a balance factor of +2. But unlike figure 2, the inner child Y of Z is higher than its sibling t4. This can happen by the insertion of Y itself or a height increase of one of its subtrees t2 or t3 (with the consequence that they are of different height) or by a height decrease of subtree t1. In the latter case, it may also occur that t2 and t3 are of the same height.

The result of the first, the right, rotation is shown in the middle third of the figure. (With respect to the balance factors, this rotation is not of the same kind as the other AVL single rotations, because the height difference between Y and t4 is only 1.) The result of the final left rotation is shown in the lower third of the figure. Five links (thick edges in figure 3) and three balance factors are to be updated. As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2 and after the double rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2 and after the double rotation it is at level h+1, so that the height of the rotated tree decreases.
Code snippet of a right-left double rotation:

    node *rotate_RightLeft(node *X, node *Z) {
        // Z is by 2 higher than its sibling
        node *Y = left_child(Z); // Inner child of Z
        // Y is by 1 higher than sibling
        node *t3 = right_child(Y);
        left_child(Z) = t3;
        if (t3 != null)
            parent(t3) = Z;
        right_child(Y) = Z;
        parent(Z) = Y;
        node *t2 = left_child(Y);
        right_child(X) = t2;
        if (t2 != null)
            parent(t2) = X;
        left_child(Y) = X;
        parent(X) = Y;
        // 1st case, BF(Y) == 0
        if (BF(Y) == 0) {
            BF(X) = 0;
            BF(Z) = 0;
        } else if (BF(Y) > 0) {
            // t3 was higher
            BF(X) = -1; // t1 now higher
            BF(Z) = 0;
        } else {
            // t2 was higher
            BF(X) = 0;
            BF(Z) = +1; // t4 now higher
        }
        BF(Y) = 0;
        return Y; // return new root of rotated subtree
    }
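To see the four cases working together, the following sketch combines the rotations with an insertion routine. It is an illustration under assumptions, not the snippets above made literal: the `node` struct is hypothetical, parent pointers are replaced by recursion (the retracing happens as the recursive calls return), and the simple rotations set both balance factors to 0 because the BF(Z) = 0 case only arises with deletion, which is not covered here. The checker `verify` is likewise an illustrative helper.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout: bf = Height(right) - Height(left). */
typedef struct node {
    int key, bf;
    struct node *left, *right;
} node;

/* Right Right case after an insertion (BF(Z) == +1): simple left rotation. */
node *rotate_left(node *x) {
    node *z = x->right;
    x->right = z->left;
    z->left = x;
    x->bf = z->bf = 0;
    return z;
}

/* Left Left case after an insertion (BF(Z) == -1): simple right rotation. */
node *rotate_right(node *x) {
    node *z = x->left;
    x->left = z->right;
    z->right = x;
    x->bf = z->bf = 0;
    return z;
}

/* Right Left case: double rotation, Y is the inner grandchild. */
node *rotate_right_left(node *x) {
    node *z = x->right, *y = z->left;
    z->left = y->right;
    y->right = z;
    x->right = y->left;
    y->left = x;
    x->bf = (y->bf > 0) ? -1 : 0;
    z->bf = (y->bf < 0) ? +1 : 0;
    y->bf = 0;
    return y;
}

/* Left Right case: mirror image of rotate_right_left. */
node *rotate_left_right(node *x) {
    node *z = x->left, *y = z->right;
    z->right = y->left;
    y->left = z;
    x->left = y->right;
    y->right = x;
    x->bf = (y->bf < 0) ? +1 : 0;
    z->bf = (y->bf > 0) ? -1 : 0;
    y->bf = 0;
    return y;
}

/* Inserts key below t; *grew reports whether the subtree height increased. */
node *insert(node *t, int key, int *grew) {
    if (t == NULL) {
        node *n = malloc(sizeof *n);
        assert(n != NULL);
        n->key = key;
        n->bf = 0;
        n->left = n->right = NULL;
        *grew = 1;
        return n;
    }
    *grew = 0;
    if (key < t->key) {
        t->left = insert(t->left, key, grew);
        if (!*grew) return t;
        if (t->bf >= 0) { t->bf--; *grew = (t->bf != 0); return t; }
        *grew = 0;                    /* temporary BF of -2: rotate */
        return t->left->bf < 0 ? rotate_right(t) : rotate_left_right(t);
    }
    if (key > t->key) {
        t->right = insert(t->right, key, grew);
        if (!*grew) return t;
        if (t->bf <= 0) { t->bf++; *grew = (t->bf != 0); return t; }
        *grew = 0;                    /* temporary BF of +2: rotate */
        return t->right->bf > 0 ? rotate_left(t) : rotate_right_left(t);
    }
    return t;                         /* key already present */
}

/* Returns the height of t, or -1 if any stored balance factor is wrong. */
int verify(const node *t) {
    if (t == NULL) return 0;
    int hl = verify(t->left), hr = verify(t->right);
    if (hl < 0 || hr < 0 || hr - hl != t->bf || t->bf < -1 || t->bf > 1)
        return -1;
    return 1 + (hl > hr ? hl : hr);
}
```

Inserting the keys 1 to 7 in ascending order, which would degenerate into a list in an unbalanced search tree, yields a tree of height 3 with key 4 at the root.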
Comparison to other structures
Both AVL trees and red–black (RB) trees are self-balancing binary search trees and they are related mathematically. Indeed, every AVL tree can be colored red–black, but there are RB trees which are not AVL balanced. For maintaining the AVL (or RB) tree's invariants, rotations play an important role. In the worst case, even without rotations, AVL or RB insertions or deletions require O(log n) inspections and/or updates to AVL balance factors (or RB colors). RB insertions and deletions and AVL insertions require from zero to three tail-recursive rotations and run in amortized O(1) time, thus equally constant on average. AVL deletions requiring O(log n) rotations in the worst case are also O(1) on average. RB trees require storing one bit of information (the color) in each node, while AVL trees mostly use two bits for the balance factor, although, when stored at the children, one bit with meaning «lower than sibling» suffices. The bigger difference between the two data structures is their height limit.

For a tree of size n ≥ 1
• an AVL tree's height is at most
: \begin{array}{ll} h & \leqq \; c \log_2 (n + d) + b \\ & < \; c \log_2 (n + 2) + b \end{array}
:where \varphi := \tfrac{1+\sqrt 5}2 \approx 1.618 is the golden ratio, c := \tfrac 1{\log_2 \varphi} \approx 1.440, b := \tfrac{c}2 \log_2 5 - 2 \approx -0.328, and d := 1+\tfrac{1}{\varphi^4\sqrt{5}} \approx 1.065.
• an RB tree's height is at most
: \begin{array}{ll} h & \leqq \; 2\log_2(n+1) \end{array} .

AVL trees are more rigidly balanced than RB trees with an asymptotic relation AVL/RB ≈0.720 of the maximal heights. For insertions and deletions, Ben Pfaff shows in 79 measurements a relation of AVL/RB between 0.677 and 1.077 with median ≈0.947 and geometric mean ≈0.910.
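The two height limits can be compared numerically without floating point. The sketch below is an illustration with made-up function names: it obtains the largest possible AVL height for n nodes from the Fibonacci relation (an AVL tree of height h has at least F_{h+2} - 1 nodes), and evaluates the RB bound 2 log_2(n+1) rounded down.

```c
#include <assert.h>

/* Minimum number of nodes of an AVL tree of height h: F(h+2) - 1. */
unsigned long long avl_min_nodes(int h) {
    unsigned long long a = 1, b = 1;      /* F(1), F(2) */
    for (int i = 0; i < h; i++) {         /* advance to F(h+1), F(h+2) */
        unsigned long long next = a + b;
        a = b;
        b = next;
    }
    return b - 1;
}

/* Largest height that an AVL tree with n nodes can have. */
int avl_max_height(unsigned long long n) {
    int h = 0;
    while (avl_min_nodes(h + 1) <= n)
        h++;
    return h;
}

/* floor(2 * log2(n + 1)), computed as floor(log2((n + 1)^2)). */
int rb_height_bound(unsigned long long n) {
    assert(n < (1ULL << 31));             /* keeps (n + 1)^2 within 64 bits */
    unsigned long long m = (n + 1) * (n + 1);
    int bits = 0;
    while (m > 1) {
        m >>= 1;
        bits++;
    }
    return bits;
}
```

For n = 1023 this gives an AVL height of at most 14 against an RB bound of 20, a ratio of 0.7, close to the asymptotic value of about 0.720.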