Top-Down Deletion for Red/Black Trees
Since their introduction in 1978, Red/Black Trees have gone on to become the dominant ordered collection based container. Be it for symbol tables or sets, imperative or functional Red/Black Trees can be found everywhere. Unlike AVL tree's which strive to maintain optimal balance for fast searching, Red/Black trees utilize a "good-enough" balance strategy in an effort to keep insertion and deletion as fast as possible while minimizing the amount of restructuring needed to achieve it. This has led to a bit of folk wisdom amongst programmers on when to use which balanced search tree: Red/Black trees when updates are the most frequent operation performed, and AVL trees when searching dominates.

A Red/Black Tree
One of the selling points of Red/Black trees has always been that they would be better suited to a concurrent environment than AVL trees because of their ability to be implemented top-down so that fewer nodes would need "locking" during updates. Unfortunatly, while working examples for top-down insertion abound[5][6][11], the deletion algorithm has historically been another story entirely. In todays post I'm going to talk about how this came to be, as well as do my part to alleviate it by presenting my implementation of a single-pass top down red/black tree deletion and the resources I used in it's development.
A Stroll Through the Literature
Depending on where you first encounter an algorithmic description of Red/Black trees the code will likely have descended from one of two lineages:
1) Code which descends from the original Sedgewick/Guibas paper[1] and Sedgewicks venerable "Algorithms" books[4][5][6]
2) Those which can be traced back to the listing in "Introduction to Algorithms" - AKA the "CLRS" book[2].
Prior to 2008, this meant the algorithms were implented as either top-down without the use of parent pointers - as presented in "Probem Solving With Data Structures in Java"[11] or bottom-up with parent pointers - as can be found in "Data Structures And Algorithms in C++" by Goodrich & Tamassia[12]. Despite the different approaches, both lineages model 2-3-4 trees with red links being allowed to slant in either direction. For the most part both lineages were presented as iterative algorithms, with the exception of the implementation presented in the 3rd edition of "Algorithms", which was a top-down recursive implementation.
Red/Black Trees in the Wild
With few exceptions almost every production implementation of Red/Black Trees such as can be found in the Linux Kernel[9] and the various C++ STL distributions[10] are derived from the CLRS lineage, as is Java's util.TreeMap. There are a few different reasons which contribute to the CLRS implementation being the dominant "real world" choice.
The first and most cited reason is that the number of rotations performed for each insertion and deletion is bounded in the bottom up algorithms. This means each insertion requires at most one rotation, with at most 2 rotations required for each deletion. Other methods may require up to lgN rotations, as AVL trees do. Another frequently cited reason for the use of parent pointers applies to all BST's in that they allow for the implementation of iterator objects which use O(1) space instead of the O(height(T)) needed for storing their path otherwise. Lastly, and albeit a somewhat less important reason is that while there was an (incomplete) description of a deletion algorithm in the 1978 paper, prior to the 4th edition of "Algorithms" none of Sedgewick's books included a Red/Black tree deletion algorithm - a wait time of over 30 years after the first paper!
Help On The Way
This last issue was finally addressed (well, partially at least) In 2008 when Sedgewick published a paper reviving an idea first leveraged by Arne Anderson of enforcing a constraint on the direction which red links are allowed to lean[4]. This is relevant because it greatly reduces the number of balancing cases which need to be handled. The result was Left Leaning Red/Black trees, which would replace the original Red/Black tree in the 4th edition of his book "Algorithms"[5]. Unlike the first three editions of the text, this one included an implementation of a balanced delete algorithm.
I say it was only partially addressed because the algorithm presented only worked for the 2-3 tree based left-leaning variant of the trees. It does not work for the general class of 2-3-4 Red/Black Trees that up to the point had been the Red/Black tree. It is also not a "true" top-down algorithm in that it requires rotations on the way back up the tree as the recursion unwinds. Despite these differences, the 2008 algorithm does leverage many of the same general ideas originally first proposed in the 1978 paper.
The Driving Idea
As a reminder, a Red/Black tree is considered "Balanced" if every path from root to leaf contains the same number of black links, with no two reds occuring in a row. Deletion of an internal node in a Binary Search Tree can be accomplished by replacing a node with it's in-order successor, which is then removed from the tree.
Top down deletion relies on a simple but clever premise which arises naturally from a few observations about the nature of both Red/Black trees and the wider family of binary search tree's in general:
1) We know that deleting a leaf node from a binary search tree requires no additional work besides the actual pruning of the leaf.
2) We know that deleting a red node from a red/black tree does not effect the black height of the tree and so does not leave the tree in an unbalanced state.
It then holds that if we can reduce every deletion to a case of deleting a red leaf then we should be able to delete any value from the tree in a single top-down pass. We also know from the fundamental theorem of rotations in binary search trees that any BST can be converted to any other BST through the careful application of rotations. It then follows that we should be able to transform a red black tree through a combination of rotations and color flips as well.
An illustration of pushing redness down the search path, as depicted in Algorithms[4]. In this image, the red node is being moved from the right side of the search path to the left.
While making no mention of deletion in the first or second editions of "Algorithms", this approach was mentioned very briefly in as two exercises in the third edition of "Algorithms"[5]. The first exercise encourages the reader to "develop an implementation of the search procedure which uses rotations and color changes on the way down the tree to ensure that the node at the bottom of the search path is not 2-node". The second exercise is to use this procedure in implementing a deletion procedure by "continuing to a 3 or 4 node after finding the node to be deleted using the successor to replace the node to be deleted"
A quick check confers that the description in the exercise shares the same strategy given in the original 1978 paper and the 2008 LLRB paper, as such we can be confident in our approach that this is how top down red/black deletion is intended to be performed.
Implementation Details
In order to keep the code as simple and readable as possible, I have decided to use a recursive implementation without the use of parent pointers in the node structure. This greatly reduces the amount of pointer swizzling that needs to happen during rotations.
const bool red = true;
const bool black = false;
template <class K>
struct rbnode {
K info;
bool color;
rbnode* left;
rbnode* right;
rbnode(K k = K(), bool c = red, rbnode* ll = nullptr, rbnode* rr = nullptr) : info(k), color(c), left(ll), right(rr) { }
};
I am aware of the "array of children trick" used to simplify mirror cases, though I will not be using it as I find it hinders the readability of the code. As such, all mirror cases are handled explicitly with nodes having left and right child pointers.
link rotL(link h) {
link x = h->right; h->right = x->left; x->left = h;
x->color = h->color;
h->color = red;
return x;
}
link rotR(link h) {
link x = h->left; h->left = x->right; x->right = h;
x->color = h->color;
h->color = red;
return x;
}
Aside from explicit changes to the root node, a nodes color should only be changed by a call to a rotation or color flip procedure, streamlining the algorithms. With the implementation details out of the way, lets move to the deltion algorithm.
Top-Down BST Deletion
What's great about this method, is that it is almost identical to normal deletion in a binary search tree. So long as we maintain the invariant that the node to be deleted is a red leaf, everything proceeds identically to a normal top down deletion. I would be remiss if i didnt say that "so long as..." is doing an outsized amount of work, but we'll cover it all and in the end its really not so bad.
void erase(K data) {
if (empty())
return;
if (isBlack(root->left) && isBlack(root->right)) {
root->color = red;
}
root = eraseR(root, data);
if (root != nullptr) root->color = black;
}
To help maintain our invariant, we begin by coloring the root node red. We can change the color of the root node without upsetting the balance, as any change to the root effects all paths equally, as such it gives us a leg up to start with a red node to push along our path.
link eraseR(link h, K key) {
if (h == nullptr)
return h;
h = pushRedDown(h, key);
Since we want to maintain our current-node-is-red invariant at all times, it is performed as soon as we know the current node is not null and before we do any comparisons. Only after the path is prepared do we continue searching down the tree for the desired value.
if (key < h->info) {
h->left = eraseR(h->left, key);
} else if (key > h->info) {
h->right = eraseR(h->right, key);
} else {
// found the key
When we find the value we want to delete from the tree, we proceed one of two ways based on it's position in the greater tree. If the key we're looking for also happens to be a leaf node, we simply remove it and were done, safe in the knowledge that it was a red leaf.
if (h->right == nullptr) {
link t = h;
h = h->left;
delete t;
} else {
link tmp = min(h->right);
h->info = tmp->info;
h->right = eraseR(h->right, h->info);
}
}
return h;
}
If the key we want is at an internal node or has one child, we replace it's value by that of it's in-order successor - not the whole node mind you, just the data - and then continue moving down the tree to finish by deleting the node we just copied the replacement data from, which by the time we get to it, will be a red leaf node.
And now to address the elephant in the room...
Pushing Red Nodes "Down" the Search Path
This is where all the "magic" happens when it comes to top down balancing. We have 2 "main cases" we need to handle, with each case having left and right mirrors. When we prepare for insertion, we are predominantly concerned with the color of the current-nodes parent-nodes sibling node. Conversely, when we prepare the path for deletion we are concerned with current-nodes sibling-nodes color. It's critical to realize that when we call pushRedDown(), we're passing it the parent of what will become the current node. Remember: we're preparing the path for the future to make deletion easier.
To help keep clear which node is which and their respective roles in the tree, we assign aliases so the target node - the node that will become the "current" node is labeled 'x', and it's sibling is labeled 's'. The node passed to the procedure is 'p', as it is the parent node of both 'x' and 's'.
link pushRedDown(link p, K key) {
link x, s;
if (key < p->info) {
x = p->left;
s = p->right;
} else {
x = p->right;
s = p->left;
}
With everything labeled we can start our balancing preparations. Of the two cases, the first case is the "simple" case: The target node is black, and it's sibling is red. This situation is resolved with a single rotation towards the direction of traversal, so if we are heading to the left node, we perform a left rotation, and if we are traversing the right link we perform a right rotation.
if (isBlack(x) && isRed(s)) {
p = (key < p->info) ? rotL(p):rotR(p);
}
x = (key < p->info) ? p->left:p->right;
if (x && isBlack(x) && isBlack(x->left) && isBlack(x->right)) {
p = (key < p->info) ? pushLeft(p):pushRight(p);
}
return p;
}
After performing this rotation we need to re-establish which node is which before testing for the second case, as the orientation and parant node may have changed due to the previous rotation. Case 2 occurs when the target node is a 2-node, meaning its colored black and so are both of its children. This case requires a bit more complex restructuring logic, but nothing too crazy, and as you'll see shortly, nothing we haven't actually seen before.
We resolve Case 2 through the use of pushRight() and pushLeft(), which through a combination of color flips and single or double rotations ensure that the node we traverse to is either a 3 or 4 node instead of a 2 node. The colorFlip procedure is refactored to be more general. Instead of always setting the root to red and the children to black, it instead assigns the inverse of whatever they are currently set to. This is significant because colorFlips as had previously been used for insertion served to split 4-nodes to ensure there was room for insertion at the bottom of the path. During a deletion however we are doing the opposite: we're merging nodes to ensure the node we want to delete is a red node.
link colorFlip(link h) {
h->color = !h->color;
if (h->left)
h->left->color = !h->left->color;
if (h->right)
h->right->color = !h->right->color;
return h;
}
We begin by performing a colorFlip, and then check if the result of the colorFlip landed us in an orientations which would violate our red/black properties. These happen to be the same orientations which we would rotate on if we were doing an insertion, and so they are handled with the same rotations as for insertion!
link pushRight(link h) {
h = colorFlip(h);
if (h->left) {
if (isRed(h->left) && isRed(h->left->right)) {
h->left = rotL(h->left);
h = rotR(h);
h = colorFlip(h);
} else if (isRed(h->left) && isRed(h->left->left)) {
h = rotR(h);
h = colorFlip(h);
}
}
return h;
}
link pushLeft(link h) {
h = colorFlip(h);
if (h->right) {
if (isRed(h->right) && isRed(h->right->left)) {
h->right = rotR(h->right);
h = rotL(h);
h = colorFlip(h);
} else if (isRed(h->right) && isRed(h->right->right)) {
h = rotL(h);
h = colorFlip(h);
}
}
return h;
}
If we do perform a rotation, we perform one more colorFlip() to clean up and that's it! Thanks to these very simple rules once our path is prepared and the red leaf is removed we are truly done: A single-pass top-down algorithm for deletion in red/black trees.
Testing It Out
After augmenting the tree with counters to record the number of rotations and colorflips I put together a test suite to run that is comprised of inserting and then erasing values in various configurations. I use random sequences, sorted sequence, reverse-ordered sequence, erasing the tree by repeatedly erasing the minimum or maximum, etc.
int main(int argc, char* argv[]) {
srand(time(0));
int N = (rand() % 100) + 50 + 1;
cout<<"Random Sequence: ";
randomSequence(N);
cout<<"Sorted Sequence: ";
preSorted(N);
cout<<"Reverse Ordered: ";
reverseOrder(N);
cout<<"Remove Maximum: ";
removeMax(N);
cout<<"Remove Minimum: ";
removeMin(N);
return 0;
}
After each call to erase() I validate that tree still remains a Red/Black Tree with no double black or double red violations.
Random Sequence: Removed 111 of 111 nodes successfully, and failed 0 times.
Left Rotations: 45
Right Rotations: 36
Color Flips: 68
Sorted Sequence: Removed 111 of 111 nodes successfully, and failed 0 times.
Left Rotations: 151
Right Rotations: 1
Color Flips: 208
Reverse Ordered: Removed 111 of 111 nodes successfully, and failed 0 times.
Left Rotations: 30
Right Rotations: 23
Color Flips: 66
Remove Maximum: Removed 111 of 111 nodes successfully, and failed 0 times.
Left Rotations: 43
Right Rotations: 138
Color Flips: 204
Remove Minimum: Removed 111 of 111 nodes successfully, and failed 0 times.
Left Rotations: 90
Right Rotations: 51
Color Flips: 208
So there you have it, red/black deletion explained as simply as I can make it. Thats all I've got for today, I truly hope you got something from this post. I know actual working, understandable red/black tree code for deletion is seriously rare, and it was something of a personal quest to bring this algorithm to everyone in as easily digestable a way possible - which I believe I have accomplished.. Until next time, Happy Hacking!
Further Reading
The code presented in this article is available on GitHub
-
Top-Down Deletion for Red/Black Trees
-
Implementing Closures in Bytecode VMs: Heap Allocated Activation Records & Access Links
-
Pascal & Bernoulli & Floyd: Triangles
-
A Quick tour of MGCLex
-
Compiling Regular Expressions for "The VM Approach"
-
Composable Linked Digraphs: An efficient NFA Data Structure for Thompsons Construction
-
Improving the Space Efficiency of Suffix Arrays
-
Augmenting B+ Trees For Order Statistics
-
Top-Down AST Construction of Regular Expressions with Recursive Descent
-
Balanced Deletion for in-memory B+ Trees
Leave A Comment