BST Deletion: Removal By Merge

When it comes to algorithms for removing an entry from a binary search tree there are two general strategies you take: removal by copying and removal by merge. Deletion by copying - sometimes referred to as "hibbard deletion" - is by far the more common of the two strategies employed and the method that I have always used when implementing binary search trees.

Hibbard deletion works by replacing the node to be deleted with its in-order successor (or predecessor) allowing the tree to remain in sorted order after deletion. In todays post I'm going to take a look at the other strategy for removing an item without invalidating the binary search property: deletion by merge. Instead of replacing a node with its in-order successor, we instead merge the node to be removeds two children, and replacing the node to be deleted with the resulting tree.

So how is it done? Well, as you can probably glean from the name, the main operation central to this algorithm is the merging of two trees.

Joining 2 BSTs

The first step in merging two trees is to check if either tree (or both) are null. In the event that one of the trees to be merged is null, we simply return the other tree, as there is nothing to merge. 

Assuming we do have two trees to merge, we begin by obtaining a pointer to the minimum node of the rhs tree and the min nodes parent.  (Mind you, this is an arbitrary decision: we could proceed by finding the maximum node, etc of the lhs tree.)

treenode* merge(treenode* lhs, treenode* rhs) {
    if (lhs == nullptr) return rhs;
    if (rhs == nullptr) return lhs;
    treenode* curr = rhs, *parent = nullptr;
    while (curr->left != nullptr) {
        parent = curr;
        curr = curr->left;
    }

The minimum node (curr) is unlinked from it's parent and will become the new root node of the merged trees. The right child of the min node is placed in position we removed the min node from, while simultaneously setting lhs and rhs to curr's left and right child pointers respectively.

    if (parent != nullptr) {
        parent->left = curr->right;
        curr->right = rhs;
    }
    curr->left = lhs;
    return curr;
}

At this point the trees have been joined with curr being the new root node.

Erasing an Entry by Merge

Using this method to implement deletion is rather straight forward. We recursively search the tree in the normal fashion to find the node with the value we want to delete. Once we have found that node, we replace it by calling join on its left and right children, creating a new sub-tree to replace the node being removed.

node* erase(node* curr, int info) {
    if (curr == nullptr)
        return curr;
    if (info < curr->info) {
        curr->left = erase(curr->left, info);
    } else if (info > curr->info) {
        curr->right = erase(curr->right, info);
    } else {
        node* tmp = curr;
        curr = merge(curr->left, curr->right);
        delete tmp;
    }
    return curr;
}

That's it! Really. So if this implementation of delete is so straight forward, then how come hibbards algorithm (deletion by copying) is the preferred method? The answer to that is two fold: performance characteristics and adapatability.

With regards to performance characteristics, while both methods can leave the tree unbalanced, deletion by merging has the potential to actually make the tree grow in height in some pathological cases. As for adaptability, hibbards deletion is more easily adapted for use in both AVL trees as well as some varieties of Red Black Trees, so now understanding how it works will come in handy later, in fact, deletion in B-Tree's follows a similar strategy to hibbard deletion as well. 

That's what i've got for today, until next time Happy Hacking.


Leave A Comment