Today we’ll learn about heaps. A heap structure is a tree that has a heap invariant. We’ll be focusing on binary min-heaps, which are a nice way to implement priority queues, a common way of managing tasks of varying priorities: elements are added in any order, but the “most important” elemnt is taken out each time. (For min-heaps, “most important” means “smallest”.)

It may help to think of priority queues as working on pairs of values and priorities, e.g., you might queue up tasks for the day, with lower priority meaning more important:

If enqueued in a priority queue in any order, these would come out in the order “eat”, “class”, “video games”, “laundry”.

Trees in text

But first, let’s get a textual notation for our trees.

A binary tree t is either:

Defining heaps

A binary tree t is a min-heap when:

Contrast the min-heap property with the search property. The search property amounts to “is sorted when traversed in-order”. Here, the min-heap property amounts to “the top most value is king of the hill, i.e., the minimal value in the whole tree”.

A few things:

You can take our notion of min-heap and convert

Exercise 1

Draw a min-heap with 7 elements. Now draw one with 8 elements. How many levels does each have?

Operations on min-heaps

We’re using min-heaps to implement priority queues, so we really only need two operations:

The enqueue operation corresponds to adding a new element to the min-heap. The tricky thing here is that we need to maintain the min-heap invariant, which means keeping our tree complete, i.e., all levels full except the last, which fills from the left.

The dequeue operation corresponds to removing a minimal element from the min-heap. This should be easy: the root of the tree is minimal! But just like the remove operation on BSTs, we have to fix up the tree: if we have the tree node(v, l, r) and we remove v, who is the new root?

It turns out both of these operations can be implemented simply using a notion of “fake it til you make it”. We’ll deliberately misplace elements in the heap (to maintain the completeness invariant) and then fix the tree up. To restore the ‘min’ part of the min-heap.

The enqueue operation a/k/a insert

Concretely, for enqueue, we’ll put the element in the appropriate position: the leftmost open spot on the last level (or as the leftmost element in a new level). Then we’ll start at that node and work our way up, asking: hey, is the node here less than its parent? If so, swap their values! We keep swapping until we’ve found a good spot.

Here’s an example. Suppose we have the follow min-heap (verify that it’s a min-heap!):

      5
    /   \
   10    6
  / 
 11

Suppose we want to enqueue 4 into this tree. The next available spot is the right child of the 10, so we’ll get this tree:

      5
    /   \
   10    6
  /  \
 11   4

Uh oh… that’s not a min-heap! So we have to fix it up. We’ll start at 4 and work our way up. We compare 4 with its parent, 10, and see that 4 is less… so we swap it up, finding:

      5
    /   \
   4     6
  / \
 11 10

We still don’t have a min-heap. We compare 4 with its parent again, finding that 4 < 5. So we swap once more:

      4
    /   \
   5     6
  / \
 11 10

Whew—fixed!

Exercise 2

Draw a min-heap with 7 nodes (you can use your earlier one). Find elements that take 0, 1, 2, and 3 swaps respectively. Could an element take 4 swaps?

The enqueue algorithm

Here’s some pseudocode for how to add something an element e to a min-heap:

  1. Put the element in the last valid position, i.e., the rightmost unused child node in the last level, or as the leftmost child in a new level. (Observe that the tree is still complete.)
  2. While the new node is strictly less than its parent, swap their elements and continue traversing upward until either (a) the parent is less than or equal to the new node, or (b) you’ve reached the root.

The algorithm has two phases: (1) insert the element, and (2) fix your mistakes. It’s O(log(n)), where n is the number of elements… because we swap from a leaf on a complete tree possibly up to the root—and we can only do that log(n) times before reaching the root, since a complete tree with n elements is log(n) high.

The dequeue operation a/k/a removeMin

Similarly, to dequeue, we’ll pop off the root—that’s the minimal element we want to dequeue—and we’ll replace it with the very last element in the heap, i.e, the rightmost non-empty node in the last level. Now, that’s probably a terrible choice for our root. So we’ll swap it down, pushing it towards whichever of its children are minimal.

Here’s another example. Suppose we start with the same tree:

      5
    /   \
   10    6
  / 
 11

Finding the minimum here is even easier than in search trees: it’s just the root! We’ll return 5 in the end, but first… we have to restore the invariant. We currently have this tree with a hole in it, marked XXX:

     XXX
    /   \
   10    6
  / 
 11

We’ll pick the ‘last’ element in the heap—that is, the rightmost element on the last level. Here, that’s 11. So now we have this tree:

     11
    /  \
   10   6

Unfortunately… it’s not a min-heap. We need to do some swapping. We look at our new root and ask: which child is the minimum? Here, it’s 6, because 6 < 10. If the minimum child is less than our node, we’ll swap. We continue this process until the minimum child is no greater than our node (or we have no children).

Since 6 < 11, we’ll swap, yielding:

     6
    / \
   10 11

A min-heap… success!

Exercise 3

Draw a min-heap that, when you dequeue, will swap 3 times. Before you start… how many levels do you need to start with to make 3 swaps possible?

The dequeue algorithm

Here’s some pseudocode for how to get a minimal element out of a min-heap:

  1. Save the root element to return later.

  2. Take the last element in the heap—the rightmost element in the lowest level—and make it the new root. (Observe that the tree is still complete).

  3. Starting at the root:

    a. Compute minimum of the two children of the current node.

    b. If the minimum is greater than the value at the current node, stop and return the original root saved earlier.

    c. Swap the current node and the minimum child’s values. Continue with (a), with the swapped child selecting the next ‘current’ node (i.e., if you the left child was the minimum, continue down the left subtree).

The algorithm has three phases: (1) remove the root and save it to return, (2) move the last element to the root, and (3) fix your mistakes. It’s O(log(n)), where n is the number of elements… because we swap from the root on a complete tree possibly down to a leaf—and we can only do that log(n) times before reaching a leaf, since a complete tree with n elements is log(n) high.

Implementing trees using arrays

As you’ve been thinking about these algorithms, you may wonder… how am I supposed to find the last-in-heap value? Isn’t that expensive? It’ll be hard to keep track of! It doesn’t seem fun to implement min-heaps using Nodes…

So, we won’t. Complete trees correspond really nicely to an array representation, which makes it easy to find the “last-in-heap” value for dequeue and the next available spot for enqueue.

Here’s the idea: we’ll flatten things in level order, left to right. So consider the following tree:

      a
     / \
    b   c
   / \ / \
  d  e f  g

We can represent it as the following array:

index0123456
valueabcdefg

Here’s another example:

    a
   / \
  b   z
 / \
 m g 
index01234
valueabzmg

Observe a few things:

This representation has some nice properties:

So we can implement a min-heap’s tree using… an ArrayList!

Finding parents and children

Concretely, we can use these rules to write down a complete tree as an array:

So the left child of index 0 is going to be 2*0+1=1, i.e., index 1. The right child of index 0 is going to be 2*0+2=2.

If a node has index i, its parent has index (i - 1)/2. Note that we mean integer division, which truncates. Let’s check our work:

Finally, note that the root node is its own parent: (0 - 1)/2 = -1/2 = 0. Be careful to avoid infinite loops!

Exercise 4

Encode your 7-value min-heap into an array.

Tying it all together

Implement a min-heap with the following signature:

public class Heap<E extends Comparable<? super E>> {
  private ArrayList<E> 
  public Heap(); // starts empty
  public int size();
  public void insert(E);
  public E removeMin(); // should throw an exception if empty
}

You’ll find it helpful to define (private) methods for swapping indices, as well as computing parent/child indices.