I believe by now you would have had ample knowledge about these whole sorting tandems, how and when to use them, their respective trade-offs, and as well as pros and cons.
In this article, we would discuss merge sorting, its strategies, and their uniqueness in the sorting space and ecosystem, as much as when to use it and other things you need to understand.
Also before we delve into all these details, it’s important to understand the concept of divide and conquer as it plays a vital role in a merge sorting algorithm.
What is Divide & Conquer?
It’s an algorithmic concept of breaking down a complex problem into units, more manageable sub-problems, solving them independently, and then combining the solutions to obtain the final result. Generally speaking, it’s an approach that I’ve learned and discovered is widely used in algorithm design and allows for efficient problem-solving in several domains:
It typically consists of three main steps:
Divide: The original problem is divided into smaller subproblems that are structurally similar to the original problem but of reduced size. This division can be performed recursively until the subproblems become trivial and easily solvable.
Conquer: Each subproblem is solved independently. This step involves applying the same algorithm or methodology recursively to the sub-problems until they reach a base case where a direct solution is available.
Combine: The solutions obtained from the conquered sub-problems are combined or merged to derive the solution for the original problem. This merging step aims to combine the smaller solutions into a single solution that satisfies the requirements of the overall problem.
It’s important to know that the application of this strategy is not only limited to implementing merge sort algorithm but binary search, quick sort and etc. It’s more pronounced in the merge sort algorithm.
Merge Sort — How unique is it?
If I should recommend any sorting algorithm, to sort large data sets (heavy-weight lifting), it would be a merge sorting algorithm just because of the following reasons:
Sorting Stability: Merge Sort is a stable sorting algorithm, meaning that elements with equal values maintain their relative order after sorting. This stability feature is crucial in various applications where preserving the original order of equal elements is quite essential.
Predictable Performance: Merge Sort has a consistent and predictable performance. Regardless of the initial order of the elements, Merge Sort guarantees a time complexity of O(n log n). This efficiency makes it suitable for scenarios with large datasets, where performance is a critical factor.
Efficient for External Sorting: Its divide-and-conquer approach, coupled with its efficient merging step, makes it particularly well-suited for external sorting. External sorting refers to the situation where the data being sorted exceeds the available memory capacity and needs to be stored on secondary storage. It minimizes disk I/O operations, making it an ideal choice for sorting external data.
And the list goes on and on, this sorting algorithm looks like an ideal algorithm, having all the best features in one.
Implementation
To implement a merge sorting algorithm, let’s consider an input array [4, 1, 4, 6, 7, 2, 3, 5]. We can use a recursive function to divide the array, sort the subarrays, and merge them back together, applying the principle of the divide-and-conquer strategy we can simply approach the sorting with the following steps:
Dividing the Array: To start, we divide the original array into two halves. Taking the example of [4, 1, 4, 6, 7, 2, 3, 5], we split it into two halves: [4, 1, 4, 6] and [7, 2, 3, 5]. We continue dividing these subarrays until we reach single-element subarrays.
Conquering: Sorting the sub-arrays Once we have the single-element subarrays, we begin sorting them. For example, within the left half, [4, 1] becomes [1, 4]. Similarly, within the right half, [7, 2] becomes [2, 7], and [3, 5] remains unchanged.
Merging the Sorted Sub-Arrays: Now that we have individually sorted subarrays, we merge them to obtain the final sorted array. We compare the elements from both halves and place them in the correct order. By merging [1, 4] and [2, 7], we get [1, 2, 4, 7]. Similarly, merging [3, 5] gives us [3, 5].
Next, we merge the two remaining sorted sub-arrays: [1, 2, 4, 7] and [3, 5]. Comparing the elements, we obtain the fully sorted array: [1, 2, 3, 4, 4, 5, 6, 7].
The following is the code sample for the input array:
// Java Solution
public static int[] mergeSort(int[] array) {
if(array.length <= 1) return array;
int mid = array.length / 2;
int[] left = Arrays.copyOfRange(array, 0, mid);
int[] right = Arrays.copyOfRange(array, mid, array.length);
return sorted(mergeSort(left), mergeSort(right));
}
public static int[] sorted(int[] m, int[] n) {
int[] p = new int[m.length + n.length];
int i = 0, j = 0, k = 0;
while(i < m.length && j < n.length) {
if(m[i] <= n[j]) {
p[k++] = m[i++];
} else {
p[k++] = n[j++];
}
}
while(i < m.length) {
p[k++] = m[i++];
}
while(j < n.length) {
p[k++] = n[j++];
}
return p;
}
// Output : [1, 2, 3, 4, 4, 5, 6, 7]
Conclusion:
Merge Sort leverages the divide-and-conquer approach to sort an array, we explored the divide-and-conquer strategy by dividing the input array into smaller subarrays, recursively sorting them, and then merging the sorted subarrays to obtain the final sorted array. merge sort’s efficiency and simplicity make it a valuable tool for sorting large datasets. By understanding the power of divide and conquer, we can appreciate the effectiveness of merge sort in tackling sorting problems.