CS 201 - Assignment 2

Due: Friday, March 7th by 11:59 PM

Benchmarking InfiniteArray

In Lab 5, we investigated the issues involved in implementing a generic collection class using an array as the underlying storage mechanism.  An instance of the InfiniteArray class is like a "plain" array, but it expands its internal storage array as needed to store references to all of the elements that have been added to the collection.

When an element is added to the collection, but the internal array is currently full (the number of elements stored in the collection is equal to the length of the array), the class allocates a new, larger array, copies the existing elements in the collection into the new array, and then updates the storage field to point to the new array.
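The expansion step described above can be sketched in Java as follows.  This is only a minimal sketch, not the Lab 5 solution: the class name GrowableArraySketch, its fields, the newLength helper, and the starting capacity of 1 are all hypothetical choices made for illustration.

```java
// Hypothetical sketch of an array-backed collection; the real InfiniteArray
// from Lab 5 may use different names and details.
class GrowableArraySketch<E> {
    private Object[] storage = new Object[1]; // internal storage array (assumed initial length 1)
    private int numElements = 0;              // number of elements actually stored

    // Adds a value to the end of the collection, expanding the storage if it is full.
    void add(E value) {
        if (numElements == storage.length) {
            // Allocate a larger array, copy the existing elements,
            // and update the storage field to point to the new array.
            Object[] larger = new Object[newLength(storage.length)];
            System.arraycopy(storage, 0, larger, 0, numElements);
            storage = larger;
        }
        storage[numElements++] = value;
    }

    // Any length greater than oldLength works here; how much larger
    // the new array should be is a separate policy choice.
    int newLength(int oldLength) {
        return oldLength + 1;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= numElements) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (E) storage[index];
    }

    int size() {
        return numElements;
    }

    int capacity() {
        return storage.length;
    }
}
```

Keeping the length calculation in its own method means the rest of the add logic does not change when the new length is chosen differently.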

When we allocate the new array, we must address the question of how large the new storage array should be.  Since the add method always adds element values to the end of the current sequence of elements, it is sufficient to allocate a new array whose length is one greater than the length of the old array.  We can call this the grow-by-one policy.  Here are before and after pictures showing an element being added to an InfiniteArray object, causing the storage array to expand using the grow-by-one policy:


As we saw in class, adding N elements to a grow-by-one InfiniteArray requires O(N^2) operations to copy elements when the storage array is re-allocated.  Once the array is full, every add copies all of the existing elements, so the total copying work grows quadratically: doubling N roughly quadruples the number of copies.

A better policy is to double the length of the array every time we expand the storage: we'll call this the grow-by-doubling policy.  Here is the after picture of adding an element using the grow-by-doubling policy (the before picture is the same as above):


The grow-by-doubling policy is better than grow-by-one because adding N elements requires only O(N) operations to copy elements.
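To make the difference between the two bounds concrete, the following sketch counts how many element copies each policy performs while adding N elements.  It simulates only the sizes, not a real collection, and it assumes the storage array starts with length 1; the class and method names are hypothetical.

```java
// Hypothetical helper that counts element copies without storing any elements.
class CopyCount {
    // Returns the number of element copies made while adding n elements,
    // assuming the storage array starts with length 1.
    static long countCopies(int n, boolean doubling) {
        long copies = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size; // every existing element is copied into the new array
                capacity = doubling ? capacity * 2 : capacity + 1;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1000, 2000, 4000}) {
            System.out.println(n + ": grow-by-one=" + countCopies(n, false)
                    + ", grow-by-doubling=" + countCopies(n, true));
        }
    }
}
```

Under these assumptions, grow-by-one performs 1 + 2 + ... + (N-1) = N(N-1)/2 copies (499500 for N = 1000), while grow-by-doubling performs 1 + 2 + 4 + ... over the powers of two below N, which is always less than 2N (1023 for N = 1000).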

Benchmarking the two policies

Your task is to benchmark the performance of InfiniteArray implementations using the two policies.

You can use the solution to Lab 5 (infiniteArray.zip) as a starting point.  This code implements the grow-by-one policy.

You should write a benchmark program that creates instances of InfiniteArray that use the grow-by-one and grow-by-doubling policies, measuring the amount of time required to add N elements for increasing values of N.

Here is the output of my benchmark program:

============== Grow by one ==============
1000,14321
3000,61104
5000,231162
7000,707827
9000,1522535
11000,2822073
13000,4595909
15000,8066294
17000,12658356
19000,17875463
21000,25622846
23000,33794965
25000,44646621
27000,57466095
29000,74513623
31000,90897147
33000,112152969
35000,134877941
37000,163893087
39000,198098826
============== Grow by doubling ==============
1000,449
3000,10150
5000,548
7000,1921
9000,23827
11000,402
13000,1122
15000,325
17000,371
19000,1771
21000,463
23000,490
25000,3017
27000,580
29000,641
31000,658
33000,699
35000,761
37000,7039
39000,844

In each pair of numbers, the first number is the number of elements added to the InfiniteArray, and the second number is the number of microseconds required to add all of the elements.

Here is a graph of the above data:

Note that once we reach a sufficiently large number of elements, the grow-by-one implementation is much slower than the grow-by-doubling implementation.  In fact, the running time of grow-by-doubling is too small to actually see in the graph.

Here is how to measure the amount of time required to execute a snippet of Java code:

long begin, end;

System.gc(); // request a garbage collection now so it is less likely to run during the measurement

begin = System.nanoTime();

// ...code snippet that you want to benchmark...

end = System.nanoTime();

long microSec = (end - begin) / 1000L; // convert nanoseconds to microseconds

Note that in my benchmark results, the times for the grow-by-doubling implementation varied erratically.  This is normal; it is hard to get repeatable measurements for computations that are very quick.
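Putting the timing code together with a loop over increasing values of N, a benchmark driver might be organized roughly as follows.  This is a sketch under stated assumptions rather than the program that produced the results above: SketchArray is a hypothetical stand-in for InfiniteArray, the doubling flag is just one possible way to select the policy, and the values of N simply mirror the sample output.

```java
// Hypothetical benchmark driver; names and structure are illustrative only.
class BenchmarkSketch {
    // Hypothetical stand-in for InfiniteArray; the real Lab 5 class may differ.
    static class SketchArray {
        private Object[] storage = new Object[1];
        private int numElements = 0;
        private final boolean doubling;

        SketchArray(boolean doubling) {
            this.doubling = doubling;
        }

        void add(Object value) {
            if (numElements == storage.length) {
                int newLength = doubling ? storage.length * 2 : storage.length + 1;
                Object[] larger = new Object[newLength];
                System.arraycopy(storage, 0, larger, 0, numElements);
                storage = larger;
            }
            storage[numElements++] = value;
        }

        int size() {
            return numElements;
        }
    }

    // Returns the number of microseconds taken to add n elements to a fresh array.
    static long timeAdds(int n, boolean doubling) {
        SketchArray array = new SketchArray(doubling);
        Object value = "x"; // the same reference is added every time

        System.gc(); // request a collection before timing starts
        long begin = System.nanoTime();
        for (int i = 0; i < n; i++) {
            array.add(value);
        }
        long end = System.nanoTime();

        if (array.size() != n) {
            throw new IllegalStateException("expected " + n + " elements");
        }
        return (end - begin) / 1000L;
    }

    // Prints one "N,microseconds" line for each N from 1000 up to maxN in steps of 2000.
    static void runSeries(String label, boolean doubling, int maxN) {
        System.out.println("============== " + label + " ==============");
        for (int n = 1000; n <= maxN; n += 2000) {
            System.out.println(n + "," + timeAdds(n, doubling));
        }
    }

    public static void main(String[] args) {
        runSeries("Grow by one", false, 39000);
        runSeries("Grow by doubling", true, 39000);
    }
}
```

Each measurement uses a fresh array so that the time for one value of N is not affected by storage left over from a previous run.  The comma-separated output can be pasted into a spreadsheet to produce the graph.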

What to Submit

Submit a text report (MS Word .DOC, PDF, or OpenOffice) containing the following:

A brief description of how you implemented and measured the running time of the two InfiniteArray implementations.  Include the source code for both implementations.  (Using the Lab 5 solution as a starting point is fine.)

Your raw data.  It should look something like my raw data.

A graph showing the running time of the two implementations as the problem size increases.  It should look something like the graph above.  Don't worry if the grow-by-doubling data points are not visible.

Please don't submit an Office 2007 (.DOCX) file: I can't read them.

Upload your report to the Marmoset server as Project 2:

https://camel.ycp.edu:8443/