CMPU 102, Fall 2005 Lecture 11

Singly-linked lists, loop pre- and post-conditions, loop invariants

Implementing algorithms that deal with complex data structures like linked lists can be a challenge.  However, if we think carefully about the preconditions, postconditions, and loop invariants of an algorithm, the correct code practically writes itself.

Consider the following singly-linked list implementation:

class Node<E> {
    E payload;
    Node<E> next;
}

class LL<E> {
    Node<E> head;

    ...methods...
}

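At runtime this data structure will look something like this: the head field refers to the first node, each node's next field refers to the node that follows it, and the next field of the last node is null.

In this example, the linked list has three nodes, each of which has a String object as the payload.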
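To make that runtime picture concrete, here is a minimal sketch that builds such a three-node list by hand.  The wrapper class BuildDemo, the helper buildThree, and the payload strings "A", "B", and "C" are assumptions made for illustration; only Node and LL come from the lecture.

```java
// Sketch only: BuildDemo, buildThree, and the payload values are hypothetical.
class BuildDemo {
    static class Node<E> {
        E payload;
        Node<E> next;
    }

    static class LL<E> {
        Node<E> head;
    }

    // Builds the list back to front so every next link already has a target.
    static LL<String> buildThree() {
        Node<String> third = new Node<String>();
        third.payload = "C";
        third.next = null;      // last node: nothing follows it

        Node<String> second = new Node<String>();
        second.payload = "B";
        second.next = third;

        Node<String> first = new Node<String>();
        first.payload = "A";
        first.next = second;

        LL<String> list = new LL<String>();
        list.head = first;      // head refers to the first node
        return list;
    }

    public static void main(String[] args) {
        LL<String> list = buildThree();
        System.out.println(list.head.payload);
        System.out.println(list.head.next.payload);
        System.out.println(list.head.next.next.payload);
    }
}
```

Following head and then two next links reaches the third node; one more next link yields null, which marks the end of the list.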

Algorithm 1: Traversing the list

One algorithm we may want to perform is visiting each node of the linked list.  How can we do this?  A linked list may contain an arbitrary number of nodes.  So, we are clearly going to need to use a loop.

An excellent way to reason about an algorithm using a loop is to define the precondition, postcondition, and loop invariant of the algorithm.  In addition, we need to know what variable or variables will be used by the loop, and how they are checked and modified at each iteration.

Precondition
What must be true before the loop begins

Invariant
What must be true at each iteration of the loop

Postcondition
What must be true when the loop is finished

Increment
How the loop variable or variables are changed after each iteration of the loop.

For the algorithm traversing the list, we can define these as follows.  The loop variable will be cur, the current node.  We want the loop to cause cur to refer to each node in the list.

Precondition: cur == head
Postcondition: cur == null
Loop invariant: cur refers to a node in the list
Increment: cur = cur.next

The precondition cur == head ensures that we start at the beginning of the list, and the postcondition cur == null ensures that we eventually reach the end of the list.  The loop invariant ensures that each iteration of the loop has the desired effect: in this case we want the loop to visit each node of the list.  The loop increment specifies that after each iteration the loop will advance one position, possibly reaching the end of the list.

Once we have defined these properties for the algorithm, the code to implement the algorithm becomes trivial:

Node<E> cur = head; // make precondition true

assert cur == head; // check precondition
while (cur != null) { // continue while cur refers to a node
    assert cur != null; // check loop invariant

    // visit the node referred to by cur

    cur = cur.next; // loop increment
}
assert cur == null; // check postcondition

The assert statements both specify and check a precondition, postcondition, or invariant.  They are useful in debugging because, when assertions are enabled (for example by running the program with java -ea), an assert statement throws an AssertionError if its condition evaluates to false.
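The traversal can be tried out with a small self-contained program.  The sketch below is one possible version, not the lecture's own code: the class TraverseDemo and the helpers prepend and payloads are hypothetical names, and "visiting" a node is assumed to mean recording its payload.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: TraverseDemo, prepend, and payloads are hypothetical names.
class TraverseDemo {
    static class Node<E> {
        E payload;
        Node<E> next;
    }

    // Hypothetical helper: returns a new head node placed in front of oldHead.
    static <E> Node<E> prepend(E payload, Node<E> oldHead) {
        Node<E> node = new Node<E>();
        node.payload = payload;
        node.next = oldHead;
        return node;
    }

    // Traverses the list and returns the payloads in visiting order.
    static <E> List<E> payloads(Node<E> head) {
        List<E> visited = new ArrayList<E>();

        Node<E> cur = head;            // make precondition true
        assert cur == head;            // check precondition
        while (cur != null) {
            visited.add(cur.payload);  // "visit": record the payload
            cur = cur.next;            // loop increment
        }
        assert cur == null;            // check postcondition
        return visited;
    }

    public static void main(String[] args) {
        Node<String> head = prepend("A", prepend("B", prepend("C", null)));
        System.out.println(payloads(head)); // prints [A, B, C]
    }
}
```

Run it with java -ea TraverseDemo so that the assert statements are actually checked.  An empty list (head == null) needs no special case: the loop body never runs and the postcondition holds immediately.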

Algorithm 2: Inserting a node in the list

A more complicated algorithm is inserting a new node in the list at a specified position.  (We saw in Lecture 9 that it is easy to insert a node at the beginning of the list.)  There are several cases to consider in the insertion algorithm: the list may be empty, or the new node may belong at the beginning, in the middle, or at the end of a nonempty list.

To specify the position where the new node should be inserted, we will assume the existence of a method called insertBefore.  When we pass this method a node in the list, it will return true if the new node should be inserted immediately before the existing node.
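The lecture assumes insertBefore exists without defining it.  As one possible definition, suppose the list holds Integer payloads in ascending order and the new value is remembered in a field.  Everything in this sketch (the class SortedPosition, the field newValue, and the ordering rule) is an assumption made for illustration.

```java
// Sketch only: one hypothetical way to define insertBefore for a sorted list.
class SortedPosition {
    static class Node<E> {
        E payload;
        Node<E> next;
    }

    // Assumption: the value being inserted is stored here before the search.
    private final int newValue;

    SortedPosition(int newValue) {
        this.newValue = newValue;
    }

    // True if the new node belongs immediately before the existing node,
    // that is, if the new value is smaller than the existing payload.
    boolean insertBefore(Node<Integer> existing) {
        return newValue < existing.payload;
    }
}
```

With this rule, the first node for which insertBefore returns true is the first node whose payload is larger than the new value, so inserting there keeps the list sorted.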

As is often the case with linked list algorithms, it is best to consider each of these cases separately.  This will lead to a general algorithm of the following form:

Node<E> nodeToInsert = new Node<E>();
nodeToInsert.payload = element; // element (assumed name): the new element to add to the list
nodeToInsert.next = null;

if (head == null || insertBefore(head)) {
    // insert at the beginning (also covers the empty list)
    nodeToInsert.next = head;
    head = nodeToInsert;
} else {
    // insert in middle or at end
}

The code for inserting at the beginning of the list is very simple, and is described by the following figure.  (Note that the question mark can represent either an empty list (null) or a nonempty list.  The same code works for both cases.)

The complicated cases occur when we need to insert the new node in the middle or at the end of a nonempty list.  As noted earlier, the complication arises because we need a reference to both the node before and the node after the point where we are going to insert the new node.

One way to solve this problem is to use two loop variables: a "current" variable and a "previous" variable.  The previous variable always lags one step behind the current variable.  Here are the precondition, postcondition, invariant, and increment we will use:

Precondition: cur == head.next && prev == head
Postcondition: cur == null, prev refers to last node of list
Loop invariant: cur refers to a node in the list and prev.next == cur
Increment: prev = cur; cur = cur.next;

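The precondition is slightly different from that of the list traversal algorithm: cur starts out as the second node of the list.  This is because we want prev and cur to visit every pair of adjacent nodes in the list, and this cannot happen if cur starts at the beginning of the list.  Evaluating head.next is safe here because this code runs only in the else branch of the general algorithm, where head is known to be non-null.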

We can turn these conditions into code as follows:

// make preconditions true
Node<E> prev = head;
Node<E> cur = head.next;

// check precondition
assert prev == head && cur == head.next;
while (cur != null) { // check loop invariant
    assert cur != null && prev.next == cur;

    if (insertBefore(cur)) {
        // Insert between prev and cur
        nodeToInsert.next = cur;
        prev.next = nodeToInsert;
        return;
    }

    // loop increment
    prev = cur;
    cur = cur.next;
}
assert cur == null && prev != null;

// Insert at end: prev refers to the last node in the list
prev.next = nodeToInsert;

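The code to insert the new node between the nodes referred to by prev and cur works as follows: first the new node's next field is set to refer to cur, and then prev.next is redirected to refer to the new node.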

Note that if the node is inserted in the middle of the list, we return from the entire method.  Obviously, once the new node has been inserted, there is nothing else to do.

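Finally, the code to insert at the end of the list works as follows: when the loop finishes, prev refers to the last node of the list, so setting prev.next to the new node appends it.  The new node's next field is already null, so it correctly marks the new end of the list.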
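Putting the pieces of Algorithm 2 together gives a complete method that can be run.  This is a sketch under stated assumptions rather than the lecture's own implementation: the class names InsertDemo and SortedLL, the field pending, the toString method, and the choice of ascending Integer order for insertBefore are all hypothetical.

```java
// Sketch only: InsertDemo, SortedLL, pending, and the ordering are hypothetical.
class InsertDemo {
    static class Node<E> {
        E payload;
        Node<E> next;
    }

    static class SortedLL {
        Node<Integer> head;
        private int pending; // assumption: value currently being inserted

        // Assumed rule: keep payloads in ascending order.
        boolean insertBefore(Node<Integer> existing) {
            return pending < existing.payload;
        }

        void insert(int element) {
            pending = element;
            Node<Integer> nodeToInsert = new Node<Integer>();
            nodeToInsert.payload = element;
            nodeToInsert.next = null;

            // Empty list, or new node belongs at the beginning.
            if (head == null || insertBefore(head)) {
                nodeToInsert.next = head;
                head = nodeToInsert;
                return;
            }

            // Middle or end: head is non-null here, so head.next is safe.
            Node<Integer> prev = head;
            Node<Integer> cur = head.next;
            while (cur != null) {
                assert prev.next == cur;      // check loop invariant
                if (insertBefore(cur)) {
                    nodeToInsert.next = cur;  // link the new node in first
                    prev.next = nodeToInsert; // then redirect the old link
                    return;
                }
                prev = cur;                   // loop increment
                cur = cur.next;
            }
            assert cur == null && prev != null; // check postcondition
            prev.next = nodeToInsert;           // insert at end
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder("[");
            for (Node<Integer> cur = head; cur != null; cur = cur.next) {
                sb.append(cur.payload);
                if (cur.next != null) {
                    sb.append(", ");
                }
            }
            return sb.append("]").toString();
        }
    }
}
```

Inserting 5, 1, 9, and 3 in that order exercises all four cases in turn: the empty list, the beginning, the end, and the middle.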

A general rule about inserting a node in a linked list

What makes inserting into a linked list challenging is making sure that we don't "forget" to re-attach any part of the list.  If we're not careful, it is very easy to lose track of some part of the list.  As an illustration, consider what would happen when inserting a node at the beginning of the list if, rather than writing

nodeToInsert.next = head;
head = nodeToInsert;

we had written:

head = nodeToInsert;
nodeToInsert.next = head;

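This operation would yield the following result: head refers to the new node, and the new node's next field refers back to the new node itself, because head had already been reassigned when it was copied into nodeToInsert.next.

The entire previous contents of the list, represented by the question mark, have been lost!  For this reason, it is a good idea to always modify the new node first, creating its links into the existing structure of the list, before modifying the existing list structure.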
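The difference between the two orderings can be checked directly.  In this sketch the class OrderDemo and the method names insertCorrect and insertWrong are hypothetical; each method takes the old head and returns the new one, so the two results can be compared.

```java
// Sketch only: OrderDemo, insertCorrect, and insertWrong are hypothetical names.
class OrderDemo {
    static class Node<E> {
        E payload;
        Node<E> next;
    }

    // Correct order: link the new node into the list, then move head.
    static <E> Node<E> insertCorrect(Node<E> head, Node<E> nodeToInsert) {
        nodeToInsert.next = head;
        head = nodeToInsert;
        return head;
    }

    // Wrong order: head is overwritten first, so the old list is forgotten
    // and the new node ends up pointing at itself.
    static <E> Node<E> insertWrong(Node<E> head, Node<E> nodeToInsert) {
        head = nodeToInsert;
        nodeToInsert.next = head;
        return head;
    }

    public static void main(String[] args) {
        Node<String> oldHead = new Node<String>();
        oldHead.payload = "old";

        Node<String> good = new Node<String>();
        good.payload = "new";
        System.out.println(insertCorrect(oldHead, good).next == oldHead); // prints true

        Node<String> bad = new Node<String>();
        bad.payload = "new";
        Node<String> result = insertWrong(oldHead, bad);
        System.out.println(result.next == result); // prints true: a one-node cycle
    }
}
```

With the wrong order, a traversal starting at the returned head would never reach null, since the only node in the list is its own successor.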