Randomized Binary Search Trees: Balance through Chaos
Self Balanced Binary Search Trees
When it comes to balanced search trees, their is a wealth to choose from: Red/Black and its associated variants, AVL trees, scape goat trees, splay trees, and the list goes on, and not to mention the non-binary variants: 2-3 trees, B-trees, etc etc. The thing is all of the above methods come with their own quirks and complexities that can be difficult to understand, or hard to implement.
A relative new comer to the ranks of balanced binary search trees exploits one of the lesser explored properties of binary sort trees: a BST build from random keys is USUALLY fairly well balanced. What if we could take ordered data and insert it into a BST tree randomly while preserving the BST property? The answer is the Randomized Binary Search Tree(RBST).
Randomized Binary Search Trees
RBSTs combine regular insertion and root insertion, chosen via coin flip to build a Binary Search Tree that is fairly well balanced with high probability. Being both a randomized algorithm AND a probabilistic data structure is something would initially make alot of people say "no thanks", and im not claiming this to be a "miracle data structure", but it IS someting that is theoretically interesting, as well as practical in its simplicity. Lets take a look at the parts to see how it ticks.
The trick to RBSTs is the root insertion component. Root insertion in binary search trees makes use of tree rotations to build the tree from the top instead of inserting at the bottom. The tree rotations are the same code one would use for a red/black tree minus the coloring info, or to balance an AVL tree, minus the bookkeeping, as shown in the following pseudo code:
rotateLeft(node* x): node *y = x->right; x->right = y->left; y->left = x; x = y; rotateRight(node* x): node* y = x->left; x->left = y->right; y->right = x; y = x;
Seeing as their is no balance info or color info to track, RBSTs use a standard BST node:
struct node: node* left node* right int key
Proving Randomized BST Insertion Builds a Balanced Tree.
We can prove that a randomized BST builds a balanced tree very easily. Its well known that when data is entered into a binary search tree in sorted order, it devolves into a linked list. So if we were to code up a quick loop and call insert on the digits from 0 - 9 into a regular BST we would end up with a skewed tree, in essence, a linked list. We also know that if we conducted this same experiment with a red/black tree or AVL tree that we would get a fairly, if not perfectly balanced BST.
Following this line of reasoning, we will implement a Randomized Binary Search Tree and run this test. As stated above, we need a standard BST insertion routine, a root insertion routine, and a routine to choose between the two randomly and insert our value on the tree. We also need the left rotate and right rotate functions to help with our root insertion.
Lets start with our node structure and regular insertion function. I've coded the following examples in C, as i wanted to take a break from all the OOP i've been doing in java and C++ lately. A little procedural programming never hurt anyone ;)
Standard BST inseertion:
I'm not really going to go into detail on the above snippit, as I have an entire article series on building binary search trees available on the 'Articles' page.
Inserting a node at the root works in an interesting way. It utilizes tree rotations to swap its way down the tree instead of traversing the tree from top to bottom as the above code does. The book "Algorithms in C++"(available in Java and C as well) offers a rather lengthy explanation of the process involved.
And the final piece we need is the coinflip procedure to determine if we will use bottom insertion or root insertion on a given value to make our tree "randomized":
-
Pratt Parsing: Top-Down Operator Precedence
-
Generating P-Code by AST Traversal
-
Ternary Search Tries: String Specific Ordered Symbol Tables
-
Digital Search Trees
-
Lossless Compression Part III: Huffman Coding
-
Lossless Compression Part II: The LZ77 Algorithm
-
Lossless Compression Part I: Working with Bits in a Byte Oriented World
-
Bottom Up AVL Tree: The OG Self-Balancing Binary Search Tree
-
A Different Take on Merge Sort: Binary Search Trees?
-
Deleting Entries From Open-Address Hash tables
Leave A Comment