Iterative Red Black Trees
Red Black Trees
Balanced binary search trees are pervasive in modern software. Providing worst case performance of O(logN) red black trees are by far the most pervasive choice when it comes to their implementation. They are famously complicated to implement, and a quick look in Cormen et. al. "Introduction to Algorithms" will back this up. More recently, as i have covered in another article, the left leaning variety of red black trees introduced in 2008 has made for a straight forward implementation, which i covered in a previous article. What's not immediately obvious from the literature is that because their are less valid left leaning red/black trees than their are valid non-left leaning (from here on called 'strict') red black trees. This leads to the tree requiring more restructuring operations - rotations and color flips - than they would if the left leaning condition was not enforced. Despite the algorithms in Cormen et. al. being the de facto standard when it comes to library implementations, as anyone who's looked at sys/tree.h on BSD or the linux kernel tree source code can confirm, there are less complicated implementations that work just as well and are far easier to understand. The implementation method I'm going to be covering here is based on the algorithms dating back to at least 1983 from Sedgewicks first edition of "Algorithms". An updated version of this algorithm appeared in his 1993 second edition "Algorithms in C++" and it also seems to the bases for the Red/Black Tree implementation in Mark Allen Weiss's "Data Structures and Algorithms". It makes use of sentinel nodes as CLRS does, but thats about where the similarities end. Where CLRS uses internal parent pointers, Sedgewick's implementation uses external ancestor pointers which greatly simplifies the implementation. That being said, Lets start the show.
Sentinel nodes in Binary Search Trees
The popular implementation of implementing BST's is having initializing the root node to null, and empty left or right pointers to null as well. In a Red/Black tree however it is common place to use a null object and this implementation is no different. We will be using a head node, and all empty links will point to the sentinel node 'z'. The following code initializes an empty tree: I want to point out an important part about initializing the sentinel nodes, notice the values of the key set to the head sentinel and the null object sentinel (z). head's key is set to std::numeric_limits<int>::min(), for those unfamiliar with this value, it is the same as INT_MIN in C and java, It is the lowest possible value that can be represented by an integer value. Conversely, the null object is set to std::numeric_limtis<int>::max(), again analagous to INT_MAX, that highest value that can be represented by an integer. The reason for the sentinels being set to these values is so that at the beginning of a search through the tree any value entered into the tree will always be greater than the head nodes key, and thus end up going down the right link, and any value will be less than the null objects key, thus stopping the search.
Helper Functions
Before getting into the real "meat" of the algorithms for this data structure i want to just point out two helper functions that will greatly increase the readability of the code, remember, computers compile code, but humans need to be able to read it too! two functions that will help with the readability of the following algorithms are isRed() and isNil() which are used for identifying red nodes and the null object node as we iterate through the tree:
Iterative Insertion in Red/Black Binary Search Trees
Insertion into a red black tree proceeds very similarly to insertion into a regular binary search tree built in the iterative fashion. The primary differences are the tracking of the grand parent and great grand parent nodes (parents nodes are tracked in regular red/black trees as well) and the check in the inner loop. This check searches the tree on the way down for nodes that correspond to the "four node" isometry in the 2-3-4 tree that red black trees correspond to, as four nods need to be broken up to maintain balance (more on this below). Once the insertion place is found and the node is placed, the splitNode() function is called again, as this is the function that enforces the balancing rules of our red black tree.
Maintaining Balance
There are six cases that can arise during the insertion of a node into a red/black tree that need to be addressed in order to maintain proper balance. As mentioned above, the key to maintaining balance in Red/Black Trees is carefully choreographed color flips and rotations. As we enter the splitNode() function, the first operation that takes places is a colorflip, setting the color of 'x' to red and it's right and left children to black. What happens next is like a beautiful dance. With 'x' set to red, we now check if 'x's parent is red, if 'p' is red, than the "no double red" principle has been violated and we must restructure the tree. We proceed by setting 'x's grandparent to red as well and then determine which way the tree is "leaning by comparing the value of x's key to both its parent and great grandparent. Depending on the result of this check we then execute either a double rotation beginning at 'x's parent via its grandparent or a single rotation of x via its great grand parent. We then set x's color to black. The last step is to ensure that the root node of our tree is color black, as it may have been recolored during the restructuring. When this operation is complete, our tree is enforcing the red/black principles and is in balance. It might sound like a lot, and at first glance its not obvious, nor is it an intuitive operation, but given time its certainly not the "incomprehensible" set of operations people are often led to believe that it is.
Visualizing the difference
By using a simple preorder traversal we can see the change that takes place to the structure of the tree. Recursive traversal of a binary search tree is simple:
function preorder_traverse(node) if node is not null print node preorder_traverse(node's left child) preorder_traverse(node's right child)
As translated to C++: If we enter the charachters of the string "AREDBLACKEXAMPLE" into an unbalanced binary search tree, a preorder traversal yields the following output:
A R E D B A A C L K E E M L P X
But a preorder traversal of our Red Black Tree gives us this output:
E B A A A D C L E E K R M L P X
So how do we know that the resulting tree is more balanced? To begin with think about how a binary search tree is built. A node's right child has a key value that is less than its self, and a left child's key that is greater than its self. If the root node of our tree is 'A' than every single node must than fall on its right side. But lets use a modified breadth first traversal to get a better idea of the balance of each tree. A level order traversal of our unbalanced tree produces this output:
A R E X D L B K M A C E L P A E
As we can see, by the first two lines, the root node, 'A', only has one child, 'R', and by following the BST ordering rules we know thus that the entire tree is falling on the right side of 'A'. Now lets take a look at what out balanced tree produces:
E B L A D E R A A C E K M X L P
Much better!, each level is full, leading to a squatter tree. It's this property that results in Red Black Trees being able to provide worst case o(logN) performance! Needless to say, the RedBlack Tree is a clear winner when one needs to implement a binary search tree.
The whole shebang...
Full code for the Red/Black Tree from this article:
-
Lossless Compression Part III: Huffman Coding
-
Lossless Compression Part II: The LZ77 Algorithm
-
Lossless Compression Part I: Working with Bits in a Byte Oriented World
-
Bottom Up AVL Tree: The OG Self-Balancing Binary Search Tree
-
A Different Take on Merge Sort: Binary Search Trees?
-
Deleting Entries From Open-Address Hash tables
-
Transform any Binary Search Tree in to a Sorted Linked List
-
From Regular Expressions To NFA by way of Abstract Syntax Trees: Thompsons Construction
-
Extending Sedgewick's explicit Digraph NFA
-
Iterative In-order B Tree Traversal
Leave A Comment