|
James Cook University - Department of Computer
Science
CP2001 Data Structures and Algorithms
Sorting Algorithms |
|
|
Some Efficent Sorting Algorithms
| Chapt 18 |
The main problem with simple sorting algorithms covered in CP1300 (such as bubble sort, insertion sort and selection sort) is their poor run-time performance -> requiring average-case O(N^2) comparisons.
The O(N^2) may not be a problem for small N (indeed, insertion sort is the best technique to use for N<20).
We consider some sorting algorithms that have a run-time performance better than O(N^2).
One of these we have already seen previously - heap sort, which achieves a guaranteed worst-case O(N log N) run-time and a not-too-bad average run-time.
In fact, heap sort achieves a much better spread of running times (ie,
between the worst-case and best-case behaviour) than quick sort.
However, the average-case performance of quick sort is about twice that
of heap sort (but still O(N log N)).
We now briefly look at a few other sorting algorithms:
Tree Sort
Tree sort sorts by inserting the unsorted values into a balanced binary tree (say, an AVL-tree) and then doing an inorder traversal of the tree. The inorder traversal visits the nodes in sorted order.
An example simple algorithm:
tree = AVL_tree_create() // O(1) cost
for i = 1 to N do // O(N log N) cost
AVL_insert(tree, A[i])
AVL_traverse_inorder(tree) // O(N) cost
The overall cost is dominated by the time to build the tree of
N items by repeated insertion into the tree (and local rotations
for an AVL tree, or colour flips for a red-black tree).
That is, the overall cost is O(N log N).
Merge Sort
Merge sort is a divide and conquer sorting technique.
The idea behind a divide and conquer technique is to have two phases to sorting:
The divide phase repeatedly divides the problem into smaller sub-problems until the problem is small enough to solve.
The conquer step solves the simpler sub-problems and reconstructs a solution to the overall problem.
As you can imagine, divide and conquer techniques are typically recursive in nature.
In merge sort we assume we know:
Merge sort uses the following recursive procedure:
mergeSort(List)
{
if (List is a single item) return; //go back up recursion tree
SplitList(List, unsortedListL, unsortedListR);
A = mergeSort(unsortedListL);
B = mergeSort(unsortedListR);
return mergeLists(A,B);
}
where SplitList() splits the unsorted list, List, into
two equal
sub-lists (unsortedListL and unsortedListR), and
where mergeLists merges the two sorted sub-lists A and B.
Due to the tree-like recursive nature of divide-and-conquer, the overall run-time
complexity is O(N log N).
This is due to the O(N) comparisons required at each level of the
recursion tree and the existence of O(log N) levels in the tree.
Note that merge sort guarantees O(N log N) worst-case behaviour - regardless of the distribution of the original data (eg, random, partially sorted, or already sorted).
In fact, merge sort has one the best worst-case behaviours of all sorting algorithms (for moderate-to-large N).
Disadvantages:
Quick sort is another recursive divide-and-conquer technique.
In each round of quick sort:
The first steps involving locating the pivot in its correct (final and sorted) position requires O(N) comparisons.
The last two steps perform the recursion (quick sort on the two
sub-arrays of values - one having values less than the pivot,
the other having values greater than the pivot).
This, as in the case of merge sort, consists of O(log N) recursions.
So, the overall run-time complexity is O(N log N).
The average-case and best-case performances are both O(N log N).
In fact, quick sort has the best average-case behaviour
of all sorting algorithms.
Unfortunately, if the array of values is sorted (or nearly sorted), the performance may go down to O(N^2) - due to the continuous scan of the pivot all the way to the other side of the array when the array is pre-sorted.
Therefore quick sort cannot guarantee a O(N log N) performance for all array types.
Also, like merge sort, quick sort uses the stack => stack overheads.
| Copyright © 1998 Olivier de Vel. All rights reserved. |
|