Merge Sort
Consider the problem of sorting an array of numbers in order. Merge sort is to recursively divide the array in half, and then recombine the sorted subarrays.
- Divide the array in half.
- Recursively divide until each subarray has only one element.
- Merge the sorted subarrays.
The following example shows how an array is divided and then merged.
38 27 43 3 9 82 10 -- divide 38 27 43 3 | 9 82 10 38 27 | 43 3 | 9 82 | 10 38 | 27 | 43 | 3 | 9 | 82 | 10 -- each length=1 27 38 | 3 43 | 9 82 | 10 -- merge 3 27 38 43 | 9 10 82 3 9 10 27 38 43 82
The following method implements merge sort in Java. It is a recursive method which calls itself on the subarrays.
public static void mergeSort(int[] data, int first, int last) { if (first >= last) return; int mid = (first + last) / 2; mergeSort(data, first, mid); // sort the left half mergeSort(data, mid + 1, last); // sort the right half merge(data, first, last); // merge two sorted subarrays }
The following is the method for merging two sorted subarrays. The first subarray starts from the index first
and ends at the index mid
, and the second subarray starts from the index mid + 1
and ends at index last
. It keeps picking up the smaller element from the two subarrays, until one subarray is finished; and then it copies the rest of the other subarray to the result.
public static void merge(int[] data, int first, int last) { // A new array to hold the merged result int[] temp = new int[last - first + 1]; int mid = (first + last) / 2; int i = first, j = mid + 1, k = 0; // Copy smaller item from each subarray into temp until one // of the subarrays is exhausted while (i <= mid && j <= last) { if (data[i] < data[j]) temp[k++] = data[i++]; else temp[k++] = data[j++]; } // Copy remaining elements from first subarray, if any while (i <= mid) temp[k++] = data[i++]; // Copy remaining elements from second subarray, if any while (j <= last) temp[k++] = data[j++]; // Copy the result back into original array for (i = first; i <= last; i++) data[i] = temp[i - first]; }
Note that in the statement temp[k++] = data[j++]
, we use postfix increment operator ++. A postfix expression (like j++) evaluates to be the value of the variable before incremented. For example, the following statement
a = i++;
has equivalent consequence as the following two statements:
a = i; i = i + 1;
And the following
temp[k] = data[i++];
is equivalent to
temp[k] = data[i]; i = i + 1;
And
temp[k++] = data[i++];
is equivalent to
temp[k] = data[i]; k = k + 1; i = i + 1;
An alternative solution for merging is given below.
public static void merge(int[] data, int first, int last) { int mid = (first + last) / 2; int i = first, j = mid + 1; int len = last - first + 1; int[] temp = new int[len]; // Temporary storage for (int k = 0; k < len; k++) { if (i == mid + 1) // The left part is done temp[k] = data[j++]; else if (j == last + 1) // The right part is done temp[k] = data[i++]; else if (data[i] < data[j]) // Get one from the left temp[k] = data[i++]; else // Get one from the right temp[k] = data[j++]; } // Copy merged part back into the original array for (i = first; i &lt;= last; i++) data[i] = temp[i - first]; }
Suppose the array has length n. The merge sort approach recursively divides the array in half, and this generates totally log(n) levels of divisions, and in consequence, log(n) levels of merging. For each level, merging runs in linear time. The total running time is about O(n * log(n)). This is much more efficient than selection/insertion/bubble sorts which run in O(n^2) time.
Merge sort uses the idea of divide and conquer:
- Recursively break down a problem into two (or more) sub-problems of the same type, until they become simple enough to be solved directly.
- The solutions to the sub-problems are then combined to give a solution to the original problem.
Demos of merge sort:
http://www.sorting-algorithms.com/merge-sort
http://youtu.be/XaqR3G_NVoo