For the sake of this article, I write with the assumption that you, as a reader, have some basic understanding of analyzing time and space complexity. If not, I highly recommend reading the Medium article, Asymptotic Runtime Complexity — Analyzing Time and Space written by my peer, Vincent Pang.
As a developer, I’m sure you will have encountered the need to sort data for a project at some point in your career. With the help of modern technology, this task becomes a bit trivial for the average developer. In the early days of computing, an algorithm with a time complexity of O(n²) may have taken hours to accomplish as opposed to the mere seconds that is possible today. However, we cannot just rely on the technology improving. As problem solvers, software engineers spend a great amount of time creating and optimizing solutions to enhance performance and thus provide a better user experience.
Sorting algorithms are a dime a dozen. The most common being comparison algorithms; solutions that organize elements based off direct comparison to another element or elements. Examples include: Quick sort, Merge sort, and Heap sort. As a human, it’s natural for us to sort items through comparison. As such, these algorithms come rather easy to us. It can be easy to accept these solutions as they are the most accessible through either being built into the programming language itself or just a more straightforward approach to implement. Though this does beg the question, can we do better?
Sorting efficiency varies from solution to solution. The methods above highlight what seems to be the average time complexity between comparison sorts, O(n*log(n)). The following will showcase Radix sort, an entirely different type of sorting; non-comparison based and running O(n) time (with the proper preparations).
What is Radix sort?
Before we can answer this question, we must understand that Radix sort further builds on another non-comparison based and “linear” algorithm, Counting sort. To put simply, Counting sort checks the number of occurrences an element appears in the input and rearranges those values to determine the correct and sorted position. The problem with this algorithm is that it heavily relies on inputs being in a specific range. Traditionally, Counting sort will run in O(n + k) so long as k (the range of values) is less than or equal to n (the size of the input). However, in the case that k is 1 to n², the time complexity easily shoots to O(n²) which is even worse than the O(n*log(n)) as seen from comparison sorting.
Okay, so how does Radix sort work?
Buckets! Radix sort does its magic through categorizing elements with regard to their radix; digits from the numerical representation of the elements.
- Get maximum number from input and use that determine how many times we run through the algorithm
- Loop k (number of digits in the maximum number) times and for each iteration:
- For every element, starting with the least significant digit (the ones place) increment the bucket using that digit as index.
- Go through the counts in the buckets and adjust for proper index. This is kind of wacky, we basically just want to see how many elements should come before it.
- Iterate through the array again and place the elements with respect to the counter in the bucket.
- Repeat for all other digit placements until the input is sorted.
Let’s take a look at sorting the input, [ 4, 54, 59, 36, 68, 37, 7, 31, 57, 91 ]
Note that there is also a link to the solution down below
Considering the following variables:
- d => number of digits in maximum number O(log(k))
- k => maximum possible value in input
- b => base for representing numbers (10 for decimal in this case)
- n => size of input array
We can determine time and space complexities to both be: O(d*(n + b)
If we limit k to k ≤ n^c where c is a constant and b === n, only then can we boil the complexities down to O(n). Although, this can be a bit misleading because the bulk of the algorithm requires upwards to three “for” loops iterating n times. Therefore, for smaller values of n, this algorithm may prove to be much slow than comparison based sorting.
Additionally, we can see that for situations where d === n, the complexities would reach O(n²).
Much like other sorting algorithms, Radix sort has its pros and cons. Under the right conditions, Radix sort can perform exceptionally well. Considering the limitations of k stated above as well as the space needed to perform the algorithm, Radix sort doesn’t seem that practical for relatively small amounts of data. For the average developer, this may as well just be a party trick of some sort.
So. Is this better? It can be. But that is the beauty of sorting algorithms and computer science in general. Having many different angles to approach a problem and many more to try to improve.