Red-Black Tree Files: rbtree.c, rbtree.h Like the AVL tree, the red-black tree is a balanced binary search tree. A description of the binary search trees and the AVL can be found in the files bst.txt and avl.txt. Nodes in a red-back tree are the same as for a binary search tree, with the addition of a colour field which specifies the colour of a node. The balance of a red-black tree is maintained by colouring nodes red and black in such a way that the following properties are satisfied: Root Property: The root node is black. External Property: Every external node is black. Note: An `external node' is just a NULL-node. A NULL-node is used in place of a child node which does not exist. Depending on the implementation, a non-existent child may be specified by a NULL pointer, rather than a NULL-node, but for this description, NULL nodes (i.e. external nodes) are used. Internal Property: The children of a red node are black. Depth Property: All external nodes have the same number of black ancestors. Other balanced search trees such as the AVL tree and 2-3 tree use a different set of rules in order to maintain the trees balance. After inserting or deleting an item from an AVL or 2-3 tree, the tree can become unbalanced, requiring some of the tree to be restructured. An AVL tree may require several rotation operations in order to maintain the trees balance after inserting or removing an item, and a 2-3 tree may require several splitting or merging operation after inserting or removing a node. At worst, the AVL and 2-3 tree can require O(log n) structural changes to be made to the tree after inserting or deleting an item. Here n is the number of items in the tree. As for the 2-3 tree and AVL tree, the red-black tree can require structural changes in order to maintain its colouring balance properties. However, the red-black tree also allows node colours to be changed when maintain its colouring balance. In the worst case, the red-black tree requires O(log n) recolourings and O(1) structural changes after inserting or deleting an item. The advantage of the red-black tree over an AVL tree or 2-3 tree is that structural changes are only O(1) after inserting or deleting an item. To completely describe the red-black tree, a description, of the find(), insert(), and delete() operations will be given. The find() and find_min() Operations ------------------------------------ Since the red-black tree is just a binary tree which is balanced, the find() and find_min() operations are exactly the same as for a normal binary tree. Refer to the file bst.txt for a description of the binary search trees find() operation. The insert() Operation ---------------------- Insertion begins the same as for a normal binary tree. The effect of insertion is to replace an external node with a new internal node z, holding the inserted element, x, and its key, k, and having two external children. The colouring of nodes is as follows: - If z is the root node, then z is coloured black, else z is coloured red. - The children of z, being external nodes, are coloured black. This preserves the root, external, and depth properties of the tree, T, but the internal property may be violated as follows: - If z is not the root of T, and the parent, v, of z is red, then we have a parent and child which are both red. This violates the internal property, and is called the double-red problem. Suppose there is a double-red problem after insertion, then the parent, u of v is black. This is the case becuase the internal property was previously satisfied, and since v is red, the root property states v must have a parent node. There are two cases to consider when fixing a double-red problem: Case 1: The sibling, w of v is black. In this case the problem is fixed by performing a trinode restructuring. This is done by the operation restructure(z), defined below: The restructure(z) Operation ---------------------------- - Take node z, its parent, v, and grandparent, u, and label them as a, b, and c, in left to right order. That is, a, b, and c are visited in this order by an in-order traversal. - Replace node u, with node b, and make nodes a and c children of b, keeping the same left to right order. Child nodes previously under node b can be assigned to nodes a or c appropriately. After the restructure(z) operation, node b is coloured black and nodes and and c are coloured red. This eliminates the double-red problem. Case2: The sibling, w of v is red. In this case, the double-red problem can be fixed by recolouring nodes. This is done by colouring v and w black, and their parent, u red (unless u is the root node, in which case, it remains black). This eliminates the double-red problem at node z, but the double-red problem, may reoccur at z's grandparent, u, if u has a red parent. Thus, recolouring can propagate up the tree, until the double-red problem is eliminated by a final recolouring, or a trinode restructuring. We see that at worst, the number of recolourings steps is half the height of the tree, and since tree height is O(log n), the number of recolourings is at worst O(log n). Also, at most one restructuring operation can occur, so at worst O(1) time is spent on restructuring the tree. The delete() Operation ---------------------- The delete operation begins the same as for a normal binary tree. This searches for a node, u, which holds the item to be deleted. As in the binary tree, if u has less than two children, then u is deleted, and the lone child (if any) and its subtree move up to replace u. However, if u has two children, then the minimum (that is, left-most) node, v, in the right child subtree of u replaces node u, so that the removal appears to be at the old position of v. Thus, for all deletion cases, the removal appears at a node with less that two children. That is, we only need to consider the effect of removing a node, v, with an external child, w. Removal of a node, v, with an external child, w, proceeds as follows. Let r be the sibling of w and x be the parent of v. - Node v, and its external child, w, are removed, and node r replaces v as the child of x. - If v was red, then r must be black, and the depth property is preserved since a red node has been lost. - Otherwise if r is red, then v must have been black, and the depth property is preserved by colouring r black. - Otherwise both v and r are black, so to preserve the depth property, r would need to be coloured double-black after v was deleted. This colouring violation is called the double-black problem, and must be fixed by recolouring or restructuring part of the red-black tree. There are three separate cases for fixing the double-black problem appearing at node r. Let x and y denote the parent and sibling of r respectively: Case 1: The sibling, y, of r is black and has a red child, z. In this case, trinode restructuring is performed using the operation restructure(z). See the description of insert() above for an explanation of the restructure(z) operation. The restructure operation takes the nodes x, y, and z, and labels them left to right as a, b, and c, then replaces x with b, and makes a and c the children of b. After restructuring, a and c are coloured black, b is given the former colour of x. Node r remains black. This restructuring and recolouring eliminates the double-black problem. Case 2: The sibling, y, of r is black and both children of y are black. In this case, recolouring is performed. Node r remains black. Node y is coloured red, and if node x is red it is coloured black, eliminating the double-black problem. However, if node x was black, it would need to be coloured double-black. Thus the double-black problem can reappear at node x. We see that the recolouring either eliminates the double black problem or propagates it up to the parent of r. If the double-black problem reoccurs, the three cases are reconsidered at the parent, x. Case 3: The sibling, y, of r is red. In this case, an adjustment operation is performed as follows. If y is the right child of x, let z be the right child of y, otherwise let z be the left child of y. Trinode restructuring is then done using restructure(z). This makes y the parent of x. Node y is then coloured black, and x is coloured red. Node r remains black. This is only an adjustment operation and has not yet eliminated the double-black problem. The sibling of r is now black, so now either Case 1 or Case 2 applies, but with a different node for y. Since x is red, if Case 2 applies, then the double-black problem will not reoccur after the recolouring performed under Case 2. Thus, after the adjustment operation, just one application of either Case 1 or Case 2 is needed to fix the double-black problem. The number of times Case 2 can be repeated is bounded by the height of the tree, and since tree height is O(log n), the worst case number of recolourings performed is O(log n). Also, since Case 1 and Case 3 do not repeat, only O(1) worst case restructuring steps are performed. The delete-min() Operation -------------------------- After the minimum node is located, this the process is exactly the same as for the delete() operation. However, we can note that the restructure(z) operation on a node, z, with parent y, and grandparent x, has fewer left to right orderings of x, y, and z to consider. This is because y is always the right child of x, which means that x is always first in the left to right order.