Sorting Algorithms - When Not To Use Them

Sorting Algorithms - When Not To Use Them

Algorithm Fridays Week 3 Solution

If you had 30 students in a class and you wanted them to stand in a line; you could decide to arrange the students in line using their names - students whose names start with the alphabet A stay in front of the line while those whose names start with the latter alphabets stay at the rear.

Or, you can decide to arrange them based on their heights; in this case, you would likely bring the shorter students to the front of the line and the taller students to the back of the line.

Both of these methods define some way of arranging the students based on some property. This process is called sorting.

In this article, I will start with a brief introduction about sorting and then highlight briefly the different sorting algorithms and their performance. Then, I will apply that knowledge to the algorithm problem we solved for Week 3 of Algorithm Fridays and in doing so, we will see if using the sort function was the most optimal solution for the problem.

Let's begin with a brief introduction of sorting.

What is sorting?

Sorting is the process of arranging data in a particular format based on some attribute, for easier data retrieval.

That's all? Yes that is all.

Using the example of the class students, ordering those students whether by their names or heights is sorting.

Sorting algorithms define the step-by-step approach that describes how the data is sorted. Like with most other algorithms, sorting algorithms help make data more analyzable or easily retrievable.

Now that we know what sorting is and what sorting algorithms are, let's take a look at the different sorting algorithms that are available.

The Different Sorting algorithms

There are different sorting algorithms and I want to share this non-exhaustive list of sorting algorithms.

Runtime complexity of sorting algorithms

The table below shows the runtime complexity of the different sorting algorithms using Big O notation :

Sort AlgorithmWorst Case Complexity
Insertion sortO(N2)
Bubble sortO(N2)
Quick sortO(N2)
Merge sortO(N log N)
Heap sortO(N log N)
Radix sortO(N)
Counting sortO(N)

One important thing to note from here is that the most optimal sorting algorithms have a O(N log N ) runtime complexity.

How do we apply this knowledge in developing a solution to the algorithm problem?

Algorithm Problem

Given the knowledge that we have now of the different kinds of sorting algorithms and their time complexity, let's look at the algorithm problem that more than 20 participants worked on for Week 3 of Algorithm Fridays:

Algorithm Fridays Week 3 Question.png

How sorting helps us solve the problem

Looking at the problem definition, a possible first instinct would be to sort the array and then find the first and last occurrences of val in the array.

Here's what that solution (without taking care of edge cases) would look like in JavaScript:

const findFirstAndLastPositionOfVal = (nums, val) => {
    const sortedNums = nums.sort(); // This makes this function O(N log N)
    const start = nums.indexOf(val);
    const end = nums.lastIndexOf(val);
    return [start, end];
}

However, knowing that sorting algorithms have a O(N log N) time complexity, the question we should be asking is "there a way we can solve the problem faster by doing less work?"

Do we really need to sort?

As software engineers developing solutions, it is our place to re-interpret and re-define product requirements to allow us develop the most optimal solutions.

Let's think a bit more about the problem. Why are we sorting the array?

Sorting helps arrange the numbers in increasing order, as that will help us find the positions of val more intuitively.

For instance, if arr = [-2, 3, 0, 7, 11, 3, 3, -19, ] and val = 3, the image below shows us that the array would give us arr = [-19, -2, 0, 3, 3, 3, 7, 11] and we can easily identify the start and end positions of val = 3 being [ 3, 5]

sorted array.png The position of elements in the sorted array

But do we really need to sort the array to solve this?

Since the goal of sorting is to help order the numbers so that all numbers less than val are placed in front of val, is it possible that we can achieve the same result without doing the 'work' of sorting - again remembering that we want to do the most with the least resources?

What if we focused on finding the numbers that are less than val since we are sure that those numbers would be placed in front of val if we sorted the array? Armed with that information and the number of occurrences of val in the array, we could find the start and end positions of val.

The illustration below will help explain this better:

startAndEndOfVal.png

The solution in JavaScript is shown below:

const findStartAndPositionOfVal = (nums = [], val) => {
    if (typeof val !== 'number') throw new Error('Invalid input');

    if (!Array.isArray(nums) || nums.length == 0) return [-1, -1];

    let countOfNumsLessThanVal = 0;
    let countOfVal = 0

    nums.forEach((num) => {
      if (num < val) countOfNumsLessThanVal++;
      if (num === val) countOfVal++;
    });

    if (!countOfVal) return [-1, -1] // val is not found in array

    if (!countOfNumsLessThanVal) {
       // Edge case: val is smallest number in array
       return [0, countOfVal - 1];
    }

    const startOfVal = countOfNumsLessThanVal;
    // subtract one because of zero-based indexing
    const endOfVal = startOfVal + countOfVal - 1;

    return [startOfVal, endOfVal ] ;
}

/** Test case 1: When nums is null, should return [-1, -1] **/
console.log(findStartAndPositionOfVal (null, 5)); //✔️

/** Test case 2: When nums is an empty array, should return [-1, -1] **/
console.log(findStartAndPositionOfVal ([], 5)); // ✔️

/** Test case 3: When nums has just one occurrence of val, should return [0, 0]  **/
console.log(findStartAndPositionOfVal([0], 0)); // ✔️

/** Test case 4: When nums has multiple occurrences of val,  should return [0, 2] **/
console.log(findStartAndPositionOfVal([3, 3, 3], 3)); // ✔️

/** Test case 5: When val is the smallest number in nums, should return [0, 2] **/
console.log(findStartAndPositionOfVal([0, 8, 0, 0, 3, 12], 0)); // ✔️

/** Test case 6: When val is the biggest number in nums, should return [5, 5]  **/
console.log(findStartAndPositionOfVal([0, -8, 0, 0, 0, 3], 3)); // ✔️

/** Test case 7: When val is not the smallest nor biggest number in nums, should return [1, 4] **/
console.log(findStartAndPositionOfVal([0, -8, 0, 0, 0, 3], 0)); //  ✔️

The much improved solution has a linear, O(N) time complexity, because we only have to iterate through the items in the array just once. If you want to further optimize it, you can use the two-pointer approach I talked about last week in this article.

Winners of the $20 award

After reviewing all the solutions submitted, we have four optimal solutions for Week 3 of Algorithm Fridays.

The winners are:

We picked these solutions because apart from being correct and robust, they all were able to solve the problem without having to sort the array - thus achieving a time complexity of O(N) which is better than O(N log N) caused by sorting.

Thank you to everyone who participated in Week 3 of Algorithm Fridays.

Conclusion

Sorting algorithms are powerful and best used when you want to arrange data in a specific format. However like with every other tool available to you, you should only use them when absolutely necessary because it comes with a cost.

Your goal should be to find ways to achieve the same results by doing less computation - less work.

I believe this was worth your read and I hope this challenges you to think a bit more deeply about the code you write when developing solutions.

Please feel free to share your comments or thought on this post or on the question tweet.

Thanks for reading and see you on Friday for Week 4 of Algorithm Fridays.

Have an amazing week ahead!

Further Reading