Validating Red/Black Trees
Red/Black Trees are ubiquitous in computer science, and anyone who has taken more than a cursory glance at this site will know that I have spent a fair bit of time studying red/black trees, their various properties, characteristics and different methods of implementing them. This last point also required that I have a way of quickly determining if a given algorithm produces a valid Red/Black tree.
In previous posts I have discussed various tree visualization algorithms, culminating in my TreeDraw project, which I use to generate the various binary search tree graphics in my posts. In this post I want to discuss the algorithms used to ensure a given tree satisfies all of the properties of a Red/Black or Left Leaning Red/Black tree, as well as gather performance statistics of various Trees. When combined with the TreeDraw algorithm, I am able to instantly gather information like the following when comparing two trees:
Red/Black Tree:
Tree Size: 63
Left rotations: 16
Right rotations: 19
Color Flips: 35
Insertions requiring rotations(%): 30.1587
Average Rotations / Insertion: 1.42105 (Right: 0.789474 / Left: 0.631579)
Max Single Insertion rotations: 2
+ Tree is black balanced.
+ Tree is valid 2-3-4 tree
+ Tree is a valid BST
-------------------------
Left Leaning Red/Black Tree:
Tree Size: 63
Left rotations: 38
Right rotations: 28
Color Flips: 42
Insertions requiring rotations(%): 46.0317
Average Rotations / Insertion: 1.68966 (Right: 0.724138 / Left: 0.965517)
Max Single Insertion rotations: 5
+ Tree is black balanced.
+ Tree is valid 2-3 tree.
+ Tree is a valid BST
This is very useful information, especially when fine-tuning an algorithm, or comparing differing implementations. The statistics regarding the number of rotations and color flips are very straightforward, of real interest is the final three tests performed on both trees. It is these three tests which determine if the provided tree is a valid Red/Black Tree (or Left Leaning Red/Black). But what is a valid Red/Black Tree?
On page 574 of "Algorithms in C++"(3rd. ed) Sedgewick provides the definition of a A Red/Black tree as: "a binary search tree in which each node is marked to be either red or black, with the additional restriction that no two red nodes appear consecutively. A balanced red-black BST is a red-black BST in which all paths from root to leaf have the same number of black nodes." In the 4th addition, the Left Leaning invariant is introduced with the additional rule that no right child may be a red node(All red nodes "lean left").
Ok there's a bit to unpack there, lets start with the simplest: Red/Black Tree's are binary search tree's. At the very least, a tree must be a valid binary search tree, or we don't even to need to bother with any more complicated tests.
Is the tree a BST?
All binary search trees must posses the "binary search tree property", that is, a binary tree where the value of a given nodes key is greater then the key of its left child, and less then the key of its right child. We can check the validity of a binary search tree by performing this check on each node visited during a pre-order traversal, exiting if we encounter a node that doesn't satisfy the BST property.
bool isBST(node* x) {
if (x == nullptr)
return true;
if (x->left && x->key < x->left->key)
return false;
if (x->right && x->right->key < x->key)
return false;
return isBST(x->left) && isBST(x->right);
}
This algorithm neatly exemplifies the paradigm we will be using for the other validation algorithms as well, mainly, if a provided node is null, we return true: an empty tree is always a valid tree. We then perform the test on the invariant we are validating, and then recur on the left and right branches of the current node.
Now that we've determined if the tree is a valid binary search tree, lets validate that it conforms to the red/black property laid out above.
Does the BST follow the color rules?
We have two color invariants that we need to validate. First we must validate that the tree maps to a 2-3-4 tree by checking that no two red nodes occur consecutively. This is as simple as checking if the current node is red, that neither of it's children are red.
bool is234(node* x) {
if (x == nullptr)
return true;
if (isRed(x)) {
if (isRed(x->left) || isRed(x->right))
return false;
}
return is234(x->left) && is234(x->right);
}
The code for validating Left Leaning Red/Black trees or Parity Seeking Red/Black Tree's proceeds in a similar fashion, with the addition of another test to ensure that no red child has a red sibling(2-3 Red/Black Trees), or in the case of LLRB tree's, that all red nodes are their parent's left child. The full code is available on my github and linked at the bottom.
Is the Red/Black Tree black-balanced?
This is actually the trickiest of the three cases to validate. We want to know if every path in the tree has the same number of black nodes. At first glance this is actually much easier than it appears, but it requires a few insights to arrive at the proper solution. Like all things, it helps to take a step back, and try to simplify things. Before we can tell if all paths have the same number of black nodes, we need to know how many black nodes one path has. We know we the path to the min node in a binary search tree is obtained by following the left pointer from the root to a null node, so counting the black nodes we encounter while traversing to the left most node gets us our baseline.
bool blackBalanced() {
int numBlack = 0;
node* x = root;
while (x->left) {
if (!isRed(x)) numBlack += 1;
x = x->left;
}
return blackBal(root, numBlack);
}
Now we can take this baseline and perform.. you guessed it, a traversal of the tree! As we traverse down a path, we subtract one from our black count when we encounter a black node. Any time we arrive at a leaf, numBlack should equal zerom, otherwise the two paths have differing numbers of black nodes and are thus unbalanced. As the stack unwinds and we backtrack up the path the black count is restored before continuing back down another path in the tree.
bool blackBal(node* x, int numblack) {
if (x == nullptr)
return numblack == 0;
if (!isRed(x)) numblack--;
return blackBal(x->left, numblack) && blackBal(x->right, numblack);
}
Well, Is it Valid?
In order for a tree to be considered a valid red black tree it needs for all three of the above methods to return as true, this is as simple as combining the three tests into one boolean expression:
bool isRedBlackTree() {
return isBST(root) && is234(root) && isBalanced();
}
It should go without saying that we need not traverse the tree three separate times as is done by the example just shown. All of the above methods can be performed together during a single traversal:
bool validate(node* root) {
node* x = root;
int blackNodes = 0;
while (x != nullptr) {
if (!isRed(x)) blackNodes++;
x = x->left;
}
return validate(root, blackNodes);
}
bool validate(node* x, int bb) {
if (x == nullptr)
return bb == 0;
//update black balance
if (!isRed(x)) bb--;
//check color rules
if (isRed(x)) {
if (isRed(x->left) || isRed(x->right))
return false;
}
//validate BST
if (x->left != nullptr && x->info < x->left->info)
return false;
if (x->right != nullptr && x->right->info < x->info)
return false;
return validate(x->left, bb) && validate(x->right, bb);
}
Using these methods you can know play around with different balancing strategies, fine tune your own red/black algorithms, or just explore the effects of applying diferent rotations, and validate if the result is a red/black tree. Until next time, happy hacking!
-
Ternary Search Tries: String Specific Ordered Symbol Tables
-
Digital Search Trees
-
Lossless Compression Part III: Huffman Coding
-
Lossless Compression Part II: The LZ77 Algorithm
-
Lossless Compression Part I: Working with Bits in a Byte Oriented World
-
Bottom Up AVL Tree: The OG Self-Balancing Binary Search Tree
-
A Different Take on Merge Sort: Binary Search Trees?
-
Deleting Entries From Open-Address Hash tables
-
Transform any Binary Search Tree in to a Sorted Linked List
-
From Regular Expressions To NFA by way of Abstract Syntax Trees: Thompsons Construction
Leave A Comment