CMPU 102, Spring 2006
Lecture 7

A *linked list* is a sequence
of *nodes*, each of which stores one element of the list.
The list element stored in a node is referred to as the node's *payload*.
Each node has a field storing a
reference to the next node in the list. The final node contains
the value **null** in its next reference, to indicate the end of the
linked list.

It would be cumbersome to require code that uses a linked list
to deal directly with nodes and next references. So, most data structures
implemented by a linked list use a *list header* object to
do all of the manipulation of the nodes and the elements they contain.
The list header will have a **head** field to point to the first
node in the list, and will define methods to access and manipulate the list.

Stacks can be represented using a linked list. The **head** field
points to the node whose payload is the top of the stack. Here
is how the stack formed by the operations

    push "C"
    push "B"
    push "A"

would look: **head** points to the node holding "A", whose next reference
points to the node holding "B", whose next reference points to the node
holding "C", whose next reference is **null**.

Here is a linked list implementation of a stack. Note that we need a separate
**Node** class to define the list nodes.

    class Node<E> {
        public E payload;
        public Node<E> next;
    }

    public class LinkedListStack<E> implements Stack<E> {
        private Node<E> head;

        public LinkedListStack() {
            head = null;
        }

        public void push(E element) {
            Node<E> node = new Node<E>();
            node.payload = element;
            node.next = head;
            head = node;
        }

        public E pop() {
            E result = getTop();
            head = head.next;
            return result;
        }

        public E getTop() {
            if (isEmpty())
                throw new IllegalStateException("stack is empty!");
            return head.payload;
        }

        public boolean isEmpty() {
            return head == null;
        }
    }

You may want to work through the **push** and **pop** methods
to see how the **head** and **next** fields of the list header
and node objects are updated. When modifying a linked list,
you have to be careful to update the references in the correct order.
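To see the update order in action, here is a self-contained demo of the stack above. The course's **Stack** interface isn't shown in these notes, so a minimal version is assumed here, and the **Node** and **LinkedListStack** classes are repeated in condensed form so the example compiles on its own.

```java
// Minimal assumed version of the course's Stack interface.
interface Stack<E> {
    void push(E element);
    E pop();
    E getTop();
    boolean isEmpty();
}

// Condensed copies of the lecture's classes so this example is self-contained.
class Node<E> { public E payload; public Node<E> next; }

class LinkedListStack<E> implements Stack<E> {
    private Node<E> head = null;

    public void push(E element) {
        Node<E> node = new Node<E>();
        node.payload = element;
        node.next = head;   // new node points at the old top...
        head = node;        // ...then becomes the new top
    }

    public E pop() {
        E result = getTop();
        head = head.next;   // unlink the old top node
        return result;
    }

    public E getTop() {
        if (isEmpty()) throw new IllegalStateException("stack is empty!");
        return head.payload;
    }

    public boolean isEmpty() { return head == null; }
}

public class StackDemo {
    public static void main(String[] args) {
        Stack<String> s = new LinkedListStack<String>();
        s.push("C");
        s.push("B");
        s.push("A");
        System.out.println(s.pop());     // prints A (last pushed, first popped)
        System.out.println(s.pop());     // prints B
        System.out.println(s.pop());     // prints C
        System.out.println(s.isEmpty()); // prints true
    }
}
```

Note that **push** must set the new node's **next** reference *before* overwriting **head**; doing it in the other order would lose the rest of the list.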

*Algorithm analysis* refers to examining an algorithm and determining,
as a function of the size of its input, how many steps the algorithm will
take to complete.

General rules:

- a program statement that is not a function call: 1 step
- a loop executing *n* iterations: *n* * *m*, where *m* is the cost of a single loop iteration
- a method call: however many steps the method body will take given the arguments passed to the method

Usually, we are interested in the *worst case*: the absolute upper
limit for how many steps the algorithm will take. Sometimes we may be
interested in the *average case*, although what constitutes the
*average* case can be difficult to define and complicated to analyze.
The worst case is usually fairly easy to figure out, and algorithms
with good worst case behavior give us confidence that our program will
run efficiently no matter what input we give it.

A simple example: a linear search for an array element matching a specified value:

    public static <E> int findElement(E[] array, E element) {
        for (int i = 0; i < array.length; i++) {
            if (array[i].equals(element))
                return i;
        }
        return -1;
    }

In the worst case (the array doesn't contain the element we're looking for), the loop will execute once for each element of the array. Each loop iteration executes 1 statement (the if). So, the worst case running time of this algorithm is

N + 1

where *N* is the length of the array. (We tacked on an extra step
for the return statement at the end.)
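We can check this count empirically. The sketch below (hypothetical class and counter names) is an instrumented copy of **findElement** that applies the counting rules above: one step per loop iteration, plus one for the final return.

```java
public class LinearSearchCount {
    static int steps;

    // Same linear search as above, instrumented to count steps.
    public static <E> int findElement(E[] array, E element) {
        steps = 0;
        for (int i = 0; i < array.length; i++) {
            steps++;                          // the if-statement: 1 step per iteration
            if (array[i].equals(element)) return i;
        }
        steps++;                              // the final return statement
        return -1;
    }

    public static void main(String[] args) {
        Integer[] a = {1, 2, 3, 4, 5};
        findElement(a, 99);                   // worst case: 99 is not present
        System.out.println(steps);            // prints 6, i.e. N + 1 with N = 5
    }
}
```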

A more complicated case: finding out whether or not an array contains duplicate elements:

    public static <E> boolean containsDuplicates(E[] array) {
        for (int i = 0; i < array.length; i++) {
            E element = array[i];
            for (int j = i + 1; j < array.length; j++) {
                E other = array[j];
                if (element.equals(other))
                    return true;
            }
        }
        return false;
    }

This algorithm is harder to analyze because the number of iterations of the inner loop is determined by which iteration of the outer loop is executing.

It is clear that in the worst case, the outer loop will execute
once for each element of the array. The inner loop executes once
for each element of the array past element *i*. We'll say that
the inner loop executes two statements, and that one statement
executes before the inner loop (element = array[i]). So, as a series,
the number of steps performed by the nested loops together is something like:

(1 + 2(N-1)) + (1 + 2(N-2)) + ... + (1 + 2(1)) + (1 + 2(0))

≈ N + 2(N * (N/2))

= N + N^{2}

(Recall that the sum of the series 1 + 2 + ... + *N*-2 + *N*-1 is exactly *N*(*N*-1)/2; we round it up to *N* * *N*/2 to keep the arithmetic simple.)

Tacking on an extra step for the final return statement and putting the terms in canonical order, we get a worst case cost of

N^{2} + N + 1

where *N* is the length of the array.
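The dominant part of this cost, the inner-loop comparisons, can also be checked empirically. The sketch below (hypothetical names) counts only the **equals** calls; for an array with no duplicates (the worst case) there are exactly *N*(*N*-1)/2 of them, one per pair of elements.

```java
public class DuplicateCount {
    static int comparisons;

    // Same duplicate check as above, instrumented to count equals() calls.
    public static <E> boolean containsDuplicates(E[] array) {
        comparisons = 0;
        for (int i = 0; i < array.length; i++) {
            E element = array[i];
            for (int j = i + 1; j < array.length; j++) {
                comparisons++;                // one comparison per inner iteration
                if (element.equals(array[j])) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Integer[] a = {1, 2, 3, 4, 5};        // no duplicates: worst case
        System.out.println(containsDuplicates(a)); // prints false
        int n = a.length;
        System.out.println(comparisons);      // prints 10
        System.out.println(n * (n - 1) / 2);  // prints 10
    }
}
```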

In analyzing an algorithm, we are generally interested in its *growth*
as *N* increases, rather than an exact number of steps. *Big-O*
refers to characterizing the growth of the exact cost T(n) of an algorithm
in relation to a simpler function f(n). Specifically,
the exact cost T(n) of an algorithm is O(f(n)) iff

there exists some constant C such that C * f(n) >= T(n) for all sufficiently large values of n

Visually, C * f(n) is an upper bound for T(n) once we reach some sufficiently large value of n.
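As a quick numeric sanity check of the definition (a sketch; the constant C = 3 is just one choice that happens to work), we can verify that T(n) = n^{2} + n + 1 from the duplicate-check analysis is O(n^{2}):

```java
public class BigODemo {
    public static void main(String[] args) {
        // Check that 3 * n^2 >= n^2 + n + 1 for every n from 1 to one million.
        boolean holds = true;
        for (long n = 1; n <= 1_000_000; n++) {
            long t = n * n + n + 1;           // exact cost T(n)
            long bound = 3 * n * n;           // C * f(n) with C = 3
            if (bound < t) holds = false;
        }
        System.out.println(holds);            // prints true
    }
}
```

(Algebraically, 3n^{2} >= n^{2} + n + 1 reduces to (2n + 1)(n - 1) >= 0, which holds for all n >= 1.)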

Finding the big-O bound for an algorithm is really easy, once you know its exact cost:

- Discard all terms except for the *high order* term
- Discard all constants

You can find the high order term according to the following inequalities (assume k > 1):

1 < log n < n^{1/k} < n < n^{k} < n^{k+1} < k^{n}
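Tabulating a few of the functions in this chain shows how quickly they separate as n grows (this small sketch uses log base 2):

```java
public class GrowthDemo {
    public static void main(String[] args) {
        // Compare the growth of log n, n^2, and 2^n at a few input sizes.
        System.out.println("n      log n    n^2        2^n");
        for (int n : new int[] {10, 20, 30}) {
            double log = Math.log(n) / Math.log(2);
            long square = (long) n * n;
            long power = 1L << n;             // 2^n as a bit shift
            System.out.printf("%-6d %-8.2f %-10d %d%n", n, log, square, power);
        }
    }
}
```

By n = 30, 2^{n} is already over a billion while n^{2} is only 900, which is why an exponential-time algorithm is unusable for all but tiny inputs.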

So, our algorithm to search an array for a specified element is O(N),
and the algorithm to determine whether or not an array has duplicate
elements is O(N^{2}). In the second case, we had both
*N* and *N*^{2} terms, but *N*^{2}
dominates *N*.

One of the nice things about analysis using big-O is that you
can *immediately* drop low order terms. For example,
it is perfectly valid to say things like:

That inner loop is O(N^{2}), and it executes in an outer loop that is O(N). So, the entire algorithm is O(N^{3}).
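A hypothetical sketch of that situation: a piece of work that is O(N^{2}) on its own, executed N times by an outer loop, performs N^{3} basic steps in total.

```java
public class CubicDemo {
    static long steps;

    // O(N^2) "inner" work: two nested loops of n iterations each.
    static void quadraticWork(int n) {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                steps++;
    }

    public static void main(String[] args) {
        int n = 50;
        for (int k = 0; k < n; k++)           // O(N) outer loop
            quadraticWork(n);
        System.out.println(steps);            // prints 125000, i.e. n^3
    }
}
```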