Lecture 32: Heaps, Heap Sort

Heaps and Priority Queues

A heap is a complete binary tree with a particular ordering property between the parent's key and the children's keys.  A complete binary tree has the maximum number of nodes at every level of the tree except the bottom.  In addition, all of the nodes at the bottom level of the tree are as far to the left as possible.

Min heap: each parent's key is less than or equal to its children's keys.  Max heap: each parent's key is greater than or equal to its children's keys.

One important application of heaps is to implement a priority queue.  When an item is added to the queue, a priority is specified.  When an item is dequeued, the item with the highest priority is removed.
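As a rough illustration of this behavior, the sketch below uses the standard java.util.PriorityQueue class, which is documented as being based on a priority heap.  The Task class, the task names, and the convention that a smaller number means a higher priority are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PriorityQueueDemo {
    // Hypothetical item type: a name paired with a numeric priority.
    // Assumption for this sketch: a smaller number means a higher priority.
    static final class Task {
        final String name;
        final int priority;

        Task(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Enqueues every task, then dequeues them all and returns the names
    // in the order they came out of the queue.
    static List<String> drainInPriorityOrder(List<Task> tasks) {
        PriorityQueue<Task> queue =
            new PriorityQueue<Task>(Comparator.comparingInt((Task t) -> t.priority));
        queue.addAll(tasks);

        List<String> order = new ArrayList<String>();
        while (!queue.isEmpty()) {
            // poll() removes the head of the queue, which is the least
            // element according to the comparator
            order.add(queue.poll().name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Task> tasks = Arrays.asList(
            new Task("write report", 3),
            new Task("fix outage", 1),
            new Task("answer email", 2));
        System.out.println(drainInPriorityOrder(tasks));
        // prints [fix outage, answer email, write report]
    }
}
```

The items come out in priority order regardless of the order in which they were added.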

Heap Operations

In the following discussion, we will assume that we are using a min heap.  However, the same algorithms also work with max heaps; the only difference is the ordering of parents and children.

find-min
Find the minimum element in the heap. This is always the root of the heap, so it takes O(1) time.
remove-min
Remove the minimum element in the heap. This can be done in O(log n) time.
insert
Insert an element in the heap. This can be done in O(log n) time.

Representing a Heap Using an Array

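A complete binary tree can be efficiently represented using an array.  Each element of the array represents one node in the tree, with the root at index 0.  For a node at index j, the left child is at index 2j + 1 and the right child is at index 2j + 2.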

Unlike the usual approach to representing a tree using objects and fields, the array-based approach makes it easy to find the parent of any node in the tree.  The parent is always at index

floor( (j-1) / 2 )

The floor function discards the fractional part of a number.  In Java, for any node other than the root (j >= 1), we can simply rely on integer division to discard the fractional part of dividing (j-1) by 2.  The root, at index 0, has no parent.
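The index arithmetic can be sketched as a few small helpers.  The class name HeapIndex and the isMinHeap check are illustrative additions, not part of the lecture's BinaryHeap class; the child indices 2j + 1 and 2j + 2 are the ones used by the removal code later in these notes.

```java
class HeapIndex {
    // Parent of the node at index j; only meaningful for j >= 1,
    // since the root (index 0) has no parent.
    static int parent(int j) {
        return (j - 1) / 2;
    }

    // Left and right children of the node at index j.
    static int left(int j) {
        return 2 * j + 1;
    }

    static int right(int j) {
        return 2 * j + 2;
    }

    // Returns true if the first n elements of keys satisfy the min-heap
    // ordering property: no element is smaller than its parent.
    static boolean isMinHeap(int[] keys, int n) {
        for (int j = 1; j < n; j++) {
            if (keys[parent(j)] > keys[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] keys = {1, 3, 2, 7, 4, 5};
        System.out.println(parent(4));                    // prints 1
        System.out.println(isMinHeap(keys, keys.length)); // prints true
    }
}
```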

Binary heap class

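import java.util.Comparator;

class BinaryHeap<E> {
    Object[] storage;          // array holding the tree, root at index 0
    int numElements;           // number of array slots currently in use
    Comparator<E> comparator;  // defines the ordering of the keys
}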

Inserting a new element

The basic idea for inserting an element in a min heap is that the element becomes the new rightmost leaf on the bottom level.  Since the new leaf may be out of order with respect to its parent, we may need to switch the new element with its parent.  This process of swapping the new element with its parent continues as long as it is less than its parent.

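@SuppressWarnings("unchecked")
void insert(E element) {
    if (numElements >= storage.length)
        grow();

    // Place the new element in the next free slot: the new rightmost leaf
    int index = numElements;
    storage[index] = element;

    // Swap the new element upward while it is less than its parent
    while (index > 0) {
        int parent = (index-1) / 2;
        if (comparator.compare((E) storage[parent], element) <= 0) {
            // Parent is no greater than the new element, so we're done
            break;
        }
        swap(index, parent);
        index = parent;
    }
    ++numElements;
}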
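The insert method relies on two helpers, grow() and swap(), that are not shown in these notes.  One plausible way to write them is sketched below.  The class name HeapStorageSketch, the initial capacity of 4, and the capacity-doubling policy are assumptions made for illustration.

```java
import java.util.Arrays;

class HeapStorageSketch {
    Object[] storage = new Object[4];  // assumed initial capacity
    int numElements = 0;

    // Assumed growth policy: double the capacity, keeping existing elements.
    // Doubling keeps the amortized cost of growing at O(1) per insertion.
    void grow() {
        storage = Arrays.copyOf(storage, Math.max(1, storage.length * 2));
    }

    // Exchange the elements stored at indices i and j.
    void swap(int i, int j) {
        Object tmp = storage[i];
        storage[i] = storage[j];
        storage[j] = tmp;
    }
}
```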

Removing the minimum element

To remove the minimum element we copy the rightmost leaf on the bottom level up to the root.  As in the case of insertion, moving the leaf to the root may violate the ordering property of the root with respect to its children.  To restore the heap, we repeatedly swap the moved element with its smaller child until either it is no greater than both children, or it is once again a leaf.

E removeMin() {
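    if (numElements == 0)
        throw new IllegalStateException("heap is empty");

    E min = (E) storage[0];

    // Move the last element of the array to the root
    storage[0] = storage[numElements-1];
    storage[numElements-1] = null;  // clear the vacated slot
    numElements--;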

    int index = 0;

    while (index < numElements) {
        int left = (2 * index) + 1;
        int right = (2 * index) + 2;

        // If there is no left child, then we've reached a leaf
        if (left >= numElements)
            break;

        // Find the minimum child
        int minChild = -1;
        if (right >= numElements) {
            // No right child
            minChild = left;
        } else {
            if (comparator.compare((E) storage[left], (E) storage[right]) < 0)
                minChild = left;
            else
                minChild = right;
        }

        if (comparator.compare((E) storage[index], (E) storage[minChild]) <= 0) {
            // Current element is no greater than its smaller child, so we're done
            break;
        }

        swap(index, minChild);
        index = minChild;
    }

    return min;
}
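To see insertion and removal working together, here is a compact, self-contained sketch of a min heap.  It follows the approach described above, but the class name MinHeapSketch, the size() method, the initial capacity, and the empty-heap exception are assumptions added so that the example can run on its own.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.NoSuchElementException;

class MinHeapSketch<E> {
    private Object[] storage = new Object[4];  // assumed initial capacity
    private int numElements = 0;
    private final Comparator<E> comparator;

    MinHeapSketch(Comparator<E> comparator) {
        this.comparator = comparator;
    }

    int size() {
        return numElements;
    }

    @SuppressWarnings("unchecked")
    private E at(int i) {
        return (E) storage[i];
    }

    private void swap(int i, int j) {
        Object tmp = storage[i];
        storage[i] = storage[j];
        storage[j] = tmp;
    }

    void insert(E element) {
        if (numElements >= storage.length) {
            storage = Arrays.copyOf(storage, storage.length * 2);
        }
        // The new element becomes the rightmost leaf on the bottom level
        int index = numElements++;
        storage[index] = element;
        // Swap upward while the element is less than its parent
        while (index > 0) {
            int parent = (index - 1) / 2;
            if (comparator.compare(at(parent), at(index)) <= 0) break;
            swap(index, parent);
            index = parent;
        }
    }

    E removeMin() {
        if (numElements == 0) throw new NoSuchElementException("heap is empty");
        E min = at(0);
        // Move the last leaf to the root and shrink the heap by one
        numElements--;
        storage[0] = storage[numElements];
        storage[numElements] = null;
        // Swap downward with the smaller child until order is restored
        int index = 0;
        while (true) {
            int left = 2 * index + 1;
            int right = left + 1;
            if (left >= numElements) break;  // reached a leaf
            int minChild = left;
            if (right < numElements && comparator.compare(at(right), at(left)) < 0) {
                minChild = right;
            }
            if (comparator.compare(at(index), at(minChild)) <= 0) break;
            swap(index, minChild);
            index = minChild;
        }
        return min;
    }

    public static void main(String[] args) {
        MinHeapSketch<Integer> heap =
            new MinHeapSketch<Integer>(Comparator.<Integer>naturalOrder());
        for (int x : new int[] {5, 3, 8, 1, 9, 2}) heap.insert(x);
        StringBuilder out = new StringBuilder();
        while (heap.size() > 0) out.append(heap.removeMin()).append(' ');
        System.out.println(out.toString().trim());  // prints 1 2 3 5 8 9
    }
}
```

Removing every element yields the keys in ascending order, which is the observation that heap sort builds on.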

Heap Sort

We have studied several sorting algorithms previously: Insertion Sort, Shell Sort, Quick Sort, and Merge Sort.  However, the one algorithm among them that has O(n log n) worst case running time (Merge Sort) requires O(n) additional storage.  Heap sort is a sorting algorithm that has O(n log n) worst case running time, but sorts the array in place, using only O(1) additional storage.

Here is a sketch of how the algorithm works.

Step 1.  Construct a max heap: O(n log n) time.  Starting at the beginning of the array, each element in turn is inserted into a max heap that is constructed at the front of the array.

[Diagram: progress of building the max heap.  The red element is the next element to be inserted, and the green elements are those that have already been arranged into a max heap.]

Step 2.  Once the entire array has been arranged as a max heap, we repeatedly perform the remove-max operation on the heap.  Each time the maximum element is removed, the heap shrinks by one, and the maximum is placed in the array element just vacated by the rightmost bottom leaf, at the end of the heap.  After every element has been removed from the max heap, the array is in sorted order.  This also takes O(n log n) time, so the overall algorithm is O(n log n).
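The two steps can be sketched in Java as follows.  This is a minimal version for an int array; the class name HeapSortSketch and the helper names siftDown and swap are assumptions, and the code is meant as an illustration of the steps above rather than a tuned implementation.

```java
import java.util.Arrays;

class HeapSortSketch {
    static void heapSort(int[] a) {
        // Step 1: build a max heap at the front of the array.
        // Before each iteration a[0..i-1] is a max heap; a[i] is inserted
        // by swapping it upward while it is greater than its parent.
        for (int i = 1; i < a.length; i++) {
            int index = i;
            while (index > 0) {
                int parent = (index - 1) / 2;
                if (a[parent] >= a[index]) break;
                swap(a, index, parent);
                index = parent;
            }
        }

        // Step 2: repeatedly remove the maximum.
        // The root (maximum) trades places with the last leaf, the heap
        // shrinks by one, and the new root is moved down into position.
        for (int heapSize = a.length; heapSize > 1; heapSize--) {
            swap(a, 0, heapSize - 1);
            siftDown(a, 0, heapSize - 1);
        }
    }

    // Restore the max-heap property within a[0..heapSize-1] by swapping
    // the element at index with its larger child until it fits.
    static void siftDown(int[] a, int index, int heapSize) {
        while (true) {
            int left = 2 * index + 1;
            if (left >= heapSize) break;  // reached a leaf
            int right = left + 1;
            int maxChild = (right < heapSize && a[right] > a[left]) ? right : left;
            if (a[index] >= a[maxChild]) break;
            swap(a, index, maxChild);
            index = maxChild;
        }
    }

    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 8, 1, 9, 2};
        heapSort(data);
        System.out.println(Arrays.toString(data));  // prints [1, 2, 3, 5, 8, 9]
    }
}
```

Because a max heap is used, each removed maximum lands at the end of the shrinking heap, so the array ends up in ascending order without any second array.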