Algorithm Visualization Techniques & Data Observation

Oftentimes when designing or learning a new algorithm, it is beneficial to have a visual representation of what is “going on inside” the algorithm. As with self-balancing binary search trees, when troubleshooting a data structure or algorithm it is often helpful to know how the data is currently organized compared to how it should be organized. This can be viewed as an extension of “print statement debugging”.

For this post, I will be modifying quicksort so that we can observe which slice of the array is being worked on. As an aside, some algorithms generate cool patterns which are fun to watch anyway, regardless of any other benefits.

Visualizing the Effect of Pivot Selection on Quicksort

It is well known that the choice of pivot value for the partitioning phase of quicksort greatly impacts its running time, and much work has been done to improve pivot selection. Our goal is to observe the effect of pivot choice by visualizing the breakdown of the array slices currently being sorted. To do this we need to keep track of a few variables during execution:

  • l – the left bound of the array slice
  • r – the right bound of the array slice
  • n – the total number of items in the array
  • p – the index of the pivot value

To produce the visualization, on entry to the algorithm we will call a print method that prints a string of characters representing the array during the current call. If the slot being displayed is not part of the slice being sorted, it prints a blank space; if the slot IS in the slice being sorted, it prints a dash (‘-’); and for the pivot position it prints ‘P’.

// Render one row of the visualization: blanks outside the current
// slice, dashes inside it, and 'P' at the pivot position.
void print(int l, int r, int piv, int n) {
    for (int i = 0; i < n; i++) {
        if (i == piv) cout<<"P";
        else if (i >= l && i <= r) {
            cout<<"-";
        } else {
            cout<<" ";
        }
    }
    cout<<endl;
}

In order to get a baseline we need to see how the algorithm behaves using the ‘default’ pivot selection (partitioning on the rightmost element of the slice):

void quicksort(vector<int>& a, int l, int r, int n) {
    if (r > l) {
        int i = partition(a, l, r);
        print(l, r, i, n);
        quicksort(a, l, i - 1, n);
        quicksort(a, i + 1, r, n);
    }
}
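The code above leans on a `partition` helper (and later an `exch` swap helper) that the post doesn't show. A minimal sketch of what they might look like, assuming the conventional scheme that partitions on the rightmost element `a[r]` (the actual implementation may differ):

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Swap two array slots -- a thin wrapper, matching the exch(a, i, j)
// calls used later in the post.
void exch(vector<int>& a, int i, int j) {
    swap(a[i], a[j]);
}

// Partition a[l..r] around the pivot value a[r]: after the call,
// everything left of the returned index is <= the pivot, everything
// right of it is >= the pivot, and the pivot sits at the returned index.
int partition(vector<int>& a, int l, int r) {
    int v = a[r];      // pivot value: rightmost element of the slice
    int i = l - 1;
    for (int j = l; j < r; j++) {
        if (a[j] < v) exch(a, ++i, j);
    }
    exch(a, i + 1, r); // move the pivot into its final position
    return i + 1;
}
```

With a[r] as the pivot, an already-sorted input degenerates to quadratic behavior, which is exactly why pivot choice is worth visualizing.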

max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$ ./rec
------------------------------------P-------------
-------------------------------P----
---------------P---------------
------------P--
--P---------
P-
-------P-
---P---
P--
P-
P--
-P
P-
--------------P
---P----------
P--
-P
P---------
-------P-
-P-----
-P---
P--
-P
P---
P--
P-
----------P--
------P---
----P-
-P--
-P
P--
-P
-P
max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$

Pretty neat. Notice how the slice being worked on steadily moves from left to right. Let’s see what happens when we modify quicksort to always recur into the bigger array slice first:

 
void quicksort2(vector<int>& a, int l, int r, int n) {
    if (r > l) {
        int i = partition(a, l, r);
        print(l, r, i, n);
        if (i - l > r - i) {  // left slice (size i - l) is the bigger one
            quicksort2(a, l, i - 1, n);
            quicksort2(a, i + 1, r, n);
        } else {
            quicksort2(a, i + 1, r, n);
            quicksort2(a, l, i - 1, n);
        }
    }
}
max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$ ./rec
------------------------------------P-------------
-------------------------------P----
---------------P---------------
--------------P
---P----------
P--
-P
P---------
-------P-
-P-----
-P---
P--
-P
------------P--
--P---------
-------P-
---P---
P--
P-
P--
-P
P-
P-
P---
P--
P-
----------P--
------P---
----P-
-P--
-P
P--
-P
-P
max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$

Wow, that looks much different! Note, though, that the trace has exactly the same number of lines as before: changing the recursion order rearranges the sequence in which slices are visited, but it does not change how many partitions are performed. What changes is the traversal pattern, since the larger side of each split is now explored first. Now, let’s see what using the random pivot selection method does:

 
void quicksort2(vector<int>& a, int l, int r, int n) {
    if (r > l) {
        int p = l + rand() % (r - l + 1); // random index in [l, r]
        exch(a, p, r);                    // move it to a[r], where partition expects the pivot
        int i = partition(a, l, r);
        print(l, r, i, n);
        if (i - l > r - i) {
            quicksort2(a, l, i - 1, n);
            quicksort2(a, i + 1, r, n);
        } else {
            quicksort2(a, i + 1, r, n);
            quicksort2(a, l, i - 1, n);
        }
    }
}
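The pivot-index arithmetic is worth a sanity check: `rand() % (r - l + 1) + l` yields a uniform index in [l, r], whereas tacking an extra `+ 1` onto it can produce `r + 1` and read past the slice. Pulling it into a tiny helper (the name `randomPivot` is mine) makes the bounds easy to verify:

```cpp
#include <cstdlib>

// Uniform random index in the closed range [l, r].
// (rand()'s modulo bias is negligible for small ranges --
// more than good enough for a visualization.)
int randomPivot(int l, int r) {
    return l + rand() % (r - l + 1);
}
```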

max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$ ./rec
------------------------------P-------------------
------------P-----------------
---P-------------
-P-
-------P-----
-----P-
-P---
--P
P-
P----
-P--
-P
------P-----
----P
--P-
-P
P-----
----P
--P-
-P
-P-----------------
--------P--------
-----P--
--P--
P-
-P
-P
-P------
----P-
--P-
-P

Another interesting comparison is the sequence of slices generated by mergesort versus quicksort:

---------------M
-------M
---M
M----
--M
--M--
---M----
M--------
----M
--M
---
M--
---
--M--
M----
--M
---
M--
---
--M--
----M----
-------M--------
M--------------
-------M
---M
----
M----
--M
---
M--
---
--M--
---M----
M-------
----M
--M
---
M--
---
--M--
M---
--M
---
M-
--
--M-
----M---
-------M-------
---------------M--------------
Duration: 0.144514ms.
Sorted.
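The trace above comes from instrumenting mergesort the same way we instrumented quicksort. The post doesn't show that code, but a minimal sketch might look like the following, with 'M' marking the midpoint of the merge (the helper names are mine, and the exact lines printed depend on where in the recursion the print is called):

```cpp
#include <vector>
#include <iostream>
using namespace std;

// Same idea as quicksort's print(): blanks outside the slice,
// dashes inside it, and 'M' at the merge midpoint.
void printSlice(int l, int r, int m, int n) {
    for (int i = 0; i < n; i++) {
        if (i == m) cout << "M";
        else if (i >= l && i <= r) cout << "-";
        else cout << " ";
    }
    cout << endl;
}

// Stable merge of the sorted halves a[l..m] and a[m+1..r].
void merge(vector<int>& a, int l, int m, int r) {
    vector<int> aux(a.begin() + l, a.begin() + r + 1);
    int i = 0, j = m - l + 1, k = l;
    while (i <= m - l && j <= r - l)
        a[k++] = (aux[i] <= aux[j]) ? aux[i++] : aux[j++];
    while (i <= m - l) a[k++] = aux[i++];
    while (j <= r - l) a[k++] = aux[j++];
}

void mergesort(vector<int>& a, int l, int r, int n) {
    if (r <= l) return;
    int m = l + (r - l) / 2;
    mergesort(a, l, m, n);
    mergesort(a, m + 1, r, n);
    printSlice(l, r, m, n); // visualize the slice about to be merged
    merge(a, l, m, r);
}
```

Because mergesort always splits at the midpoint regardless of the data, its trace is deterministic for a given array size, unlike quicksort's.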

As you can see, the pattern generated by mergesort has a much more defined structure, as expected. Being able to see the effects of changes to an algorithm, or of swapping one algorithm for another entirely, helps build a deeper understanding both of how the algorithm works as a whole and of its place in the problem space. I also happen to think the little tornados in the output look pretty cool. Let’s try another metric: monitoring the depth of recursion.

Visualizing the Depth of Recursion

The ‘depth’ of recursion is the number of times a recursive method has called itself without yet returning. This matters because each call consumes stack space, and excessive or poorly implemented recursion can smash the stack by causing a stack overflow. To track the depth we will use a variable, ‘indent’, that is incremented on entry to the method and decremented on exit. Using this value we can print a trace of the recursion:

 
int indent = 0;

void printDepth() {
    for (int i = 0; i < indent; i++) {
        cout<<" ";
    }
    cout<<"@"<<endl;
}

void quicksort(vector<int>& a, int l, int r) {
    indent++;
    if (r > l) {
        printDepth();
        int i = partition(a, l, r);
        quicksort(a, l, i - 1);
        quicksort(a, i + 1, r);
        printDepth();
    }
    indent--;
}

max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$ ./rec
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$

When the ‘@’ symbol is trending towards the right, the stack is growing in size, and when the ‘@’ symbol starts moving back towards the left it represents the stack unwinding. For this example, let’s see what happens when we implement the small-slice insertion sort optimization:

void quicksort(vector<int>& a, int l, int r, int n) {
    indent++;
    if (r > l) {
        printDepth();
        if (r - l < 16) {
            inssort(a, l, r);  // small slice: finish with insertion sort
            printDepth();
            indent--;          // balance the increment before the early return
            return;
        }
        int i = partition(a, l, r);
        quicksort(a, l, i - 1, n);
        quicksort(a, i + 1, r, n);
        printDepth();
    }
    indent--;
}
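The `inssort` helper isn't shown in the post; it is presumably a plain insertion sort restricted to the closed slice [l, r]. A minimal sketch of what it could look like:

```cpp
#include <vector>
using namespace std;

// Straight insertion sort on the closed slice a[l..r] --
// a sketch of the inssort helper assumed above. Insertion sort
// is quick on the small, nearly-sorted slices quicksort leaves behind.
void inssort(vector<int>& a, int l, int r) {
    for (int i = l + 1; i <= r; i++) {
        int v = a[i];
        int j = i;
        while (j > l && a[j - 1] > v) {
            a[j] = a[j - 1];   // shift larger elements one slot right
            j--;
        }
        a[j] = v;              // drop v into its sorted position
    }
}
```

The cutoff of 16 used above is a common choice, but the best threshold varies by machine and data; it's worth experimenting with.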

max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$ ./rec
@
@
@
@
@
@
@
@
@
@
@
@
@
@
max@MaxGorenLaptop:/mnt/c/Users/mgoren/searchspace$

As you can see, dropping to insertion sort once a slice falls below a certain size results in FAR shallower recursion than continuing to recurse into ever-smaller slices.

The visualization of data structures and algorithms offers a unique way to better understand their inner workings. Do you have another technique you’d like to share? Drop a comment below! Until next time, Happy Hacking!