22C:21 Computer Science II: Data Structures: Lab 2

Week 2: Lab Document, 9/6

Arrays are wonderful data structures because they allow random access, that is, it takes a constant amount of time to access any slot of the array, independent of the size of the array. However, arrays have one big drawback: they are not dynamic. For example, the following code fragment defines an int array and allocates 20 slots of memory to the array.

	int[] numberArray;
	numberArray = new int[20];

Later, if we want to store 21 integers in the array, there is no cheap way of adding one extra slot to the array. This means that if we intend to use arrays, we should, at the time of programming, have an idea of the maximum number of slots needed.

While there is no cheap way to enlarge or shrink an array, there is a somewhat costly way to do it, illustrated below.

	int[] numberArray;
        numberArray = new int[20];
	...
	...
	...
	// Here, the need to store 21 integers in the array
	// arises; so the array has to expand to size at least 21

	// Declare a temporary int array
	int[] temp;

	// Allocate enough space for it
	temp = new int[21];

	// Copy contents of numberArray into temp
	// This is a costly step.
	for(int i = 0; i < 20; i++)
		temp[i] = numberArray[i];

	// Make numberArray point to temp
	numberArray = temp;

	// Place new element in slot 20

This approach to changing the size of the array is extremely costly because the entire contents of the original array has to be copied into the new array. But, in Java, there is no choice but to do this. Given this situation, it makes sense to try and minimize the number of times the size of the array has to be changed. Consider the code fragment shown above. Here the array size is increased just enough to accomodate the new element (i.e., the 21st element). But, note that since the array size was increased by just one slot, the array is full at the end of the above code fragment. Therefore as soon as another new element shows up, we would need to redo all the work. A better approach to deal with this situation is to increase the size of the array significantly, each time there is an "overflow." For example, in the above code fragment, when the 21st element shows up, one might increase the array size to 20,000 and then not have to worry about "overflow" for a long time. Of course, we do not want to increase the array size arbitrarily because that might lead to a lot of wasted space. For example, our application may end up using 100 slots, which is more than 20 but far less than 20,000; in this case allocating memory for a 20,000-slot array is a waste of memory. The approach to this problem that works quite nicely is to double the size of the array whenever it needs to be enlarged. For example, if we start with an array of size 20 and the 21st element shows up, then we enlarge the array to size 40, and when the 41st element shows up, we enlarge the array to size 80, etc.

Problem:

The class RecordDB maintains a database of employee Record objects in a array sorted by social security number. The provided implementation is static and in fact assumes that there will be at most 200 employees in the database at any one time. Using the above ideas, modify the implementation to make it dynamic; specifically, create a DynamicRecordDB class that is similar to the RecordDB class, except that whenever the insert method detects an overflow, it doubles the size of the array. For the DynamicRecordDB class, it makes more sense to have two constructors, as described below.
```
	// The default constructor; allocates an array of size 1.
	DynamicRecordDB();	

	// Allocates an array of size n.
	// If the user has an estimate of the maximum size of the employee
	// database, they might prefer to use this constructor. The database
	// would behave correctly even if the number of records exceeded this
	// maximum
	DynamicRecordDB(int n);
```
To test the efficiency of the above scheme, perform an experiment in which
1. a DynamicRecordDB object is constructed by calling the default (0-argument) constructor and
2. n records are inserted into the object.
Measure the total time, of all n insertions. Repeat the experiment for n = 10,000, 10,500, 11,000, 11,500, ..., 20,000. Here is an example of how to time a Java program. Plot the per operation running time as a function of n. The "per operation running time" is just the total running time divided by the number of operations, which is n.