Mergesort: An efficient sorting method for linked lists

 

Merge Sorting Linked List's with C

When it comes to sorting linked lists merge sort is king. The nature of merge sort allows for very efficient implementations for use with linked lists, in fact even more so than with an array!

The O(N) space complexity of merge sort is reduced to O(1) when sorting an linked list instead of an array and the O(n log n) best/worst/average time complexity is preserved. Because merge sort process in items in a sequential manner it is both stable, AND efficient for use with singly linked lists, which don't support random access.

Merge sort is a classic divide and conquer algorithm which works by breaking a problem into small pieces, solving for them, and combining the results to arrive at a solution for the overall problem.

Mergesort

In the case of merge sort, a list is recursively split until it is comprised of singletons. A single item is its self a sorted list. We than combine these smaller lists into larger lists in sorted order until the entire list is reassembled but sorted

Recursively Splitting the list

There are at least two ways to split a list, one involves knowing ahead of time the length of the list, and the other only requires a pointer to the head of a list to split it. If the length of the list is not known, and you're still determined to split the list based on length, it will require another O(n) traversal of the list to obtain it.

It's much better to split the list without needing to know its length. This is done through the two pointer method, or "tortoise and hare" technique. The first thing we do is check if the list is either only one item, or empty since in those cases there is nothing to split.

 
 struct node* mergesort(struct node* h) {
    if (h == NULL || h->next == NULL)
        return h;
    struct node* fast = h->next;
    struct node* slow = h;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    struct node* front = h;
    struct node* back = slow->next;
    slow->next = NULL;
    front = mergesort(front);
    back = mergesort(back);
    return merge(front, back);
} 

The 'fast' pointer advances through the list twice as fast as the 'slow' pointer, so when the fast pointer has reached the end of the list, 'slow' is pointing to the middle of the list, and from there we have all we need to split the input list. We then call mergesort() recursively on both halves of the list and the process continues. Once the lists can't be split any further, we begin to merge them back together.

Rebuilding the List in Sorted Order Through Mergeing

Merging is the process of taking two sorted lists, and combining them into one sorted list. A popular implementation I see a lot of places uses a recursive merge algorithm, which while it works requires additional stack space, and in the case of a large list has the potential to overflow. Instead, I prefer the following iterative algorithm:

struct node* merge(struct node* a, struct node* b) {
    struct node dummy; struct node* c = &dummy;
    while (a != NULL && b != NULL) {
        if (b->info > a->info) {
            c->next = a; a = a->next; c = c->next;
        } else {
            c->next = b; b = b->next; c = c->next;
        }
    }
    c->next = (a == NULL) ? b:a;
    return dummy.next;
} 

Take note of the comparison, it uses b > a instead of a < b to preserve stability. Once it exits the loop, one of the two lists has been exhausted, so concatenating the non exhausted list to the end is an O(1) operation.

 
 int main() {
    struct node* head = random_list(15, 1, 80);
    print(head);
    head = mergesort(head);
    print(head);
    return 0;
}  



max@MaxGorenLaptop:~/DSA$ ./mergesort
4 60 51 28 43 62 10 13 27 16 34 36 58 7 24
4 7 10 13 16 24 27 28 34 36 43 51 58 60 62
max@MaxGorenLaptop:~/DSA$

See? Merge sort isn't so scary, and when it comes to linked lists, its the algorithm of choice. Until next time, Happy hacking!

 


Leave A Comment