The Skip List: a probabilistic data structure, Part 1
Exploring the 'Skip List' data structure as an alternative to self balancing Binary Search Trees for key/value pairs
What's a 'Skip List'?
A skip list is a data structure that was developed due to some of the 'unpleasant' aspects of implementing self balancing binary search trees (AVL Tree, Red/Black Tree, Splay Tree) - mainly, they're not so easy to code correctly. For parallel applications, Binary Search Trees are notoriously difficult, if not impossible to implement (mutex locking BST's is a serious challenge.). And while non-self balanced Binary Search Trees are easy to implement, there is a good likelihood that they will become unbalanced leading to a significant degradation of performance.
Enter the Skip List. The skip list was created in the late 1980s, making it the new kid on the block as far data structures go. Comprised of 'levels' of linked lists, they can be thought of linked lists with 'fast lanes'. The top (or bottom, depending on your view) layer is a standard linked list containing every value inserted into the list, each successive layer has fewer and fewer data inserted elements so that search no longer takes O(n) as it does in standard linked lists. Visually, they look like this:
The above representation is shown using singly linked lists. Skip lists can be, and often are, implemented as a doubly linked list for a gain in performance, and on occasion, the masochists amongst us implement them as an orthogonal linked list.
Don't let the visual representation fool you: when we insert 6 into the above skip list, we don't create 3 seperate '6' node objects and put one on each level. While we could do this 'naive' implementation, instead the 'levels' are represented by an array of pointers which cuts down significantly on the space complexity, while still allowing the quick search/traversal of the list which justifies its existence. This is also why we don't strictly need up and down pointers a'la an orthogonal implementation: we can jump levels using array indexes.
Implementation of a Skip List
*note: As shown here, we're using C++ to implement the SkipList ADT. If you would like to use C instead, the examples shown here can be found as C available here.
Like all linked structures, the place to begin when implementing a Skip List is the node type. To accommodate the multiple layered structure of the Skip List, we replace the standard 'next' and 'prev' pointers with pointers to arrays. Other than that, they are identical to a standard doubly linked list node.
Keep in mind that SkipNode will be a nested class of the SkipList class, which is where we will define keyType and itemType using class templates:
const int max_level = 7; template <typename keyType, typename valueType> class SkipList { private: class SkipNode { public: keyType key; valueType value; SkipNode *next[max_level]; SkipNode *prev[max_level]; SkipNode(keyType k) : key(k) { } SkipNode(keyType k, valueType v) : key(k), value(v) { } }; int sz; unsigned int randLevel(); unsigned int coinFlip(); void showLevel(int n); SkipNode* search(int key); public: SkipNode *head[max_level]; SkipNode *z; random_device rd; SkipList() { sz = 0; z = new SkipNode(INT_MAX); for (auto i = 0; i < max_level; i++) { head[i] = new SkipNode(INT_MIN); z->next[i] = z; head[i]->next[i] = z; head[i]->prev[i] = head[i]; z->prev[i] = head[i]; } } inline bool empty(int level) { return head[level]->next[level] == z; } void insert(keyType, valueType); void show(); bool find(int key); int size() { return sz; } };
Insertion
Inserting into a SkipList proceeds almost identically to insertion into a regular linked list, the only difference is that we have to assign a depth with which to propagate or node on the skip list, which is chosen at randomly. The original description says "by flipping a coin", So thats what we will do.
First, we implement the two helper functions, randLevel() and coinFlip().
//generate a random number between 0 and 1 and return it
template <typename keyType, typename valueType> unsigned int SkipList<keyType, valueType>::coinFlip() { mt19937_64 gen(rd()); uniform_int_distribution<int> distrib(0, 1); return distrib(gen); } //repeatedly call coinFlip() until tails, tallying the total of heads template <typename keyType, typename valueType> unsigned int SkipList<keyType, valueType>::randLevel() { //count coin flips until tails uint n = 0; while (coin_flip()) //if heads, we increment { n++; } //return number of coin flips as the number of levels to use return n; }
With those functions in place were ready to insert into our Skip List:
template <typename keyType, typename valueType> void SkipList<keyType, valueType>::insert(keyType key, valueType val) { SkipNode *n = new SkipNode(key, val); int ins_level = randLevel(); if (ins_level > max_level) { ins_level = max_level; } //propagate the insertion through each level for (auto i = 0; i <= ins_level; i++) { //insert as in a normally sorted linked list for (SkipNode *t = head[i]; t != z; t = t->next[i]) { //find proper position if (key > t->key && key < t->next[i]->key) { n->depth++; n->next[i] = t->next[i]; n->prev[i] = t->prev[i]; t->next[i] = n; t->next[i]->prev[i] = n; } } } sz++; //increment list size. }
We now can fill out skip list with data!
Displaying a Skip List
Displaying a Skip List is a straight foward operation: Print a level of the skip list, drop down a level, print that level, etc until we;ve covered the entire height of the list. I seperated this out into 2 functions showLevel() and show(), which isnt totally necessary, but gives some nice flexability.
template <typename keyType, typename valueType> void SkipList<keyType, valueType>::showLevel(int level) { SkipNode *t; if (level > max_level) { cout<<"Level exceeds max.\n"; return; } else if (!empty(level)) { for (t = head[level]->next[level]; t != z; t = t->next[level]) { cout<<t->key<<" "; } } else { cout<<"(empty)"; } cout<<endl; } template <typename keyType, typename valueType> void SkipList<keyType, valueType>::show() { //display each level for (auto i = max_level-1; i > -1; i--) { cout<<"Level "<<i+1<<": "; showLevel(i); } }
Lets give our Skip List a test drive:
#include <iostream> #include "skiplist.h" using namespace std; int main() { srand(time(0)); SkipList<int, string> sList; for (auto i = 0; i < 15; i++) { auto q = rand() % (100 - 1) - 1; sList.insert(q, to_string(q)); } sList.show(); return 0; } Output: Level 7: (empty) Level 6: 37 Level 5: 34 37 Level 4: 34 37 Level 3: 24 34 37 Level 2: 24 34 37 56 72 82 94 Level 1: 15 24 34 37 39 49 56 61 67 72 77 82 88 94 97
Success!
and thats it for part one, in Part 2 we'll cover searching and deletion, stay tuned!
Full source code for the Skip List created in this article is available on my github:
C++ version:
https://github.com/maxgoren/examples/blob/main/skiplist.cpp
https://github.com/maxgoren/examples/blob/main/skiplist.h
C version:
-
How Many Hanukkah Candles?
-
Pratt Parsing: Top-Down Operator Precedence
-
Generating P-Code by AST Traversal
-
Ternary Search Tries: String Specific Ordered Symbol Tables
-
Digital Search Trees
-
Lossless Compression Part III: Huffman Coding
-
Lossless Compression Part II: The LZ77 Algorithm
-
Lossless Compression Part I: Working with Bits in a Byte Oriented World
-
Bottom Up AVL Tree: The OG Self-Balancing Binary Search Tree
-
A Different Take on Merge Sort: Binary Search Trees?
Leave A Comment