One naive approach for using an array to implement a positional list is to store the list items in elements 0..n-1 of the array, where n is the current length of the list. We would also need to keep track of the current position, as well as the number of items currently in the list, and so we would introduce instance variables for each of these in our class, which might begin like this:
public class ListViaNaiveAry {
private Object[] contents; // array holding the list items
private int numItems; // # of items in the list
private int crrntPos; // points to array element holding current item
Recall that a position in a list corresponds to one of the items in it, except for the rear position, which follows the last item. The purpose of the instance variable crrntPos is to indicate the array location holding the current item. As a special case, its value will be equal to numItems when the current position is the rear.
To explore this representation scheme, let's try to implement several of the operations that we included in our description of the ADT list.
Observers:
lengthOf() : return numItems;
isEmpty()  : return lengthOf() == 0;
atFront()  : return crrntPos == 0;
atRear()   : return crrntPos == numItems;
getObj()   : return contents[crrntPos];
Recall that getObj() has as its precondition !atRear(). If we had chosen to implement getObj() in accord with the defensive programming style (in which a method assumes responsibility for verifying that its precondition is met), we could have written it as follows:
getObj() : if (atRear())
{ throw an exception }
else
{ return contents[crrntPos]; }
So we see that this scheme gives us simple and efficient (constant-time)
ways of implementing the observer operations. Let's turn to mutators.
Navigation Mutators:
toFront(): crrntPos = 0;
toRear() : crrntPos = numItems;
toNext() : crrntPos = crrntPos + 1;
toPrev() : crrntPos = crrntPos - 1;
Employing the defensive style in implementing toNext(), we get
toNext(): if (atRear())
{ throw an exception }
else
{ crrntPos = crrntPos + 1; }
The defensive version of toPrev() is similar.
Content Mutators:
replace(x): contents[crrntPos] = x;
We leave it to the reader to devise the defensive version of replace().
So far, everything looks fine. But we will see now that the insert() and remove() operations are problematic insofar as there appears to be no way to make them run any faster than in linear time (i.e., time proportional to the length of the list).
remove() : // shift contents[crrntPos+1..numItems-1] one place to the left
for (int i = crrntPos; i != numItems-1; i = i+1)
{ contents[i] = contents[i+1]; }
numItems = numItems - 1;
insert(x) : // shift contents[crrntPos..numItems-1] one place to the right
for (int i = numItems; i != crrntPos; i = i-1)
{ contents[i] = contents[i-1]; }
numItems = numItems + 1;
contents[crrntPos] = x;
We see that, to carry out remove(), the values in the segment contents[crrntPos+1..numItems-1] are shifted to the left one place. On average, then, we can expect that about half of the items in the list will be shifted, and, in the worst case (removal at the front), all but the removed item will be shifted. It follows that this is a linear-time operation. The same reasoning applies to insert(), which shifts the segment contents[crrntPos..numItems-1] to the right.
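To make the cost of shifting concrete, here is a minimal runnable sketch of the naive scheme. The class name NaiveListSketch, the fixed capacity of 16, the shifts counter, and the toArray() helper are assumptions introduced only for illustration; the two loops follow the pseudocode above.

```java
import java.util.Arrays;

// Hypothetical sketch of the naive representation; not a complete ADT implementation.
class NaiveListSketch {
    private final Object[] contents = new Object[16]; // fixed capacity, for brevity (assumption)
    private int numItems = 0;
    private int crrntPos = 0;
    int shifts = 0; // counts element moves, to make the linear cost visible (assumption)

    void toFront() { crrntPos = 0; }
    void toRear()  { crrntPos = numItems; }

    // Stores x at the current position after shifting contents[crrntPos..numItems-1] right.
    void insert(Object x) {
        if (numItems == contents.length) { throw new IllegalStateException("list is full"); }
        for (int i = numItems; i != crrntPos; i = i - 1) {
            contents[i] = contents[i - 1];
            shifts++;
        }
        contents[crrntPos] = x;
        numItems = numItems + 1;
    }

    // Removes the current item by shifting contents[crrntPos+1..numItems-1] left.
    void remove() {
        if (crrntPos == numItems) { throw new IllegalStateException("at rear"); }
        for (int i = crrntPos; i != numItems - 1; i = i + 1) {
            contents[i] = contents[i + 1];
            shifts++;
        }
        numItems = numItems - 1;
        contents[numItems] = null; // drop the stale reference
    }

    Object[] toArray() { return Arrays.copyOf(contents, numItems); }
}
```

Inserting at the rear costs no shifts, whereas inserting or removing at the front of a five-item list costs five shifts each, which is the linear behavior described above.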
Given the constraints of the representation scheme that we have chosen, there seems to be no way to avoid having the insert() and remove() operations take time linear in the number of elements in the list. This leads us to wonder whether there exists a different array-based representation scheme that allows for more efficient versions of these operations.
Recall that, in devising an efficient array-based representation for queues, the key idea was to allow the element at the front of the queue to be stored at any of the locations in the array, followed by all the others (in a "wrap-around" fashion). Taking a similar approach here will help. But we need to go a step further. Because insertions and deletions occur only at the ends of a queue, keeping a queue's items stored in a contiguous array segment never requires that a sub-segment be shifted in order to plug a hole caused by a deletion or to create a hole needed for an insertion. Such is not the case for lists, in which insertion and deletion may occur anywhere. To make array segment shifting unnecessary, we relax the condition that the list items must be stored in a contiguous segment. Indeed, we allow all of the list items (not only the first) to be stored at arbitrary locations in the array. But then how do we keep track of which item is first, which is second, and so on? The answer is that, for each item, we also store the locations of its predecessor and successor.
To illustrate such a scheme, consider again this list of animals:
+---+ +---+ +---+ +---+ +---+ +---+ +---+
| C | | D | | B | | A | | O | | Y | | C |
| A |----| O |----| U |----| N |----| W |----| A |----| O |----x
| T | | G | | G | | T | | L | | K | | W | ^
+---+ +---+ +---+ +---+ +---+ +---+ +---+ |
^ |
| |
current rear
position
Following the suggestions above, one possible representation would be the following, which uses three arrays, contents[], pred[], and succ[], as well as two int variables, frontPos and crrntPos.
pred contents succ frontPos crrntPos
+----+ +--------+ +----+ +----------+ +--------+
0 | 8 | | null | | -1 | | 5 | | 1 |
+----+ +--------+ +----+ +----------+ +--------+
1 | 5 | | DOG | | 10 |
+----+ +--------+ +----+
2 | 3 | | OWL | | 7 |
+----+ +--------+ +----+
3 | 10 | | ANT | | 2 |
+----+ +--------+ +----+
4 | | | | | |
+----+ +--------+ +----+
5 | -1 | | CAT | | 1 |
+----+ +--------+ +----+
6 | | | | | |
+----+ +--------+ +----+
7 | 2 | | YAK | | 8 |
+----+ +--------+ +----+
8 | 7 | | COW | | 0 |
+----+ +--------+ +----+
9 | | | | | |
+----+ +--------+ +----+
10 | 1 | | BUG | | 3 |
+----+ +--------+ +----+
11 | | | | | |
+----+ +--------+ +----+
12 | | | | | |
+----+ +--------+ +----+
Notice that, for each item in the list, (a reference to) it is contained in one of the elements of contents[] and the corresponding elements of pred[] and succ[] "point to" that item's predecessor and successor, respectively. (We use -1 to denote a null pointer.) The values stored in array elements whose contents are not shown are irrelevant. The values of frontPos and crrntPos allow us to locate quickly the front and current positions, respectively, of the list. We don't need rearPos, because we will always use the zero-th element of each array to store information regarding the rear of the list.
Notice that it doesn't matter in which element of contents[] a particular list item is stored, as long as the corresponding elements of pred[] and succ[] correctly indicate the locations of that item's predecessor and successor, respectively, and as long as frontPos and crrntPos point to the right places. Indeed, within this scheme, the same list has many different possible representations (12!/5!, in fact, that being the number of ways to assign the seven items to distinct locations among 1 through 12), of which the above is only one example.
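As a check on the diagram, the following sketch copies its three arrays and recovers the list order by starting at frontPos and following succ pointers until reaching the rear at location 0. The class name LinkedArraysDemo and the method itemsInOrder() are assumptions made for illustration, and the locations left blank in the diagram are filled with 0 or null here, since their values are irrelevant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the array values are taken from the diagram above.
class LinkedArraysDemo {
    static final String[] contents =
        {null, "DOG", "OWL", "ANT", null, "CAT", null, "YAK", "COW", null, "BUG", null, null};
    static final int[] pred = {8, 5, 3, 10, 0, -1, 0, 2, 7, 0, 1, 0, 0};
    static final int[] succ = {-1, 10, 7, 2, 0, 1, 0, 8, 0, 0, 3, 0, 0};
    static final int frontPos = 5;

    // Follows succ pointers from the front until arriving at the rear (location 0).
    static List<String> itemsInOrder() {
        List<String> result = new ArrayList<>();
        for (int loc = frontPos; loc != 0; loc = succ[loc]) {
            result.add(contents[loc]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(itemsInOrder()); // prints [CAT, DOG, BUG, ANT, OWL, YAK, COW]
    }
}
```

The physical order of the items in contents[] (DOG, OWL, ANT, ...) bears no relation to their logical order, which is determined entirely by the pointers.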
For a hint as to why this representation scheme is superior to the naive one that was explored earlier, consider the modifications needing to be made in order to get the result of applying remove(). The item at the current position is DOG. After removing it, the list looks like this:
+---+ +---+ +---+ +---+ +---+ +---+
| C | | B | | A | | O | | Y | | C |
| A |----| U |----| N |----| W |----| A |----| O |----x
| T | | G | | T | | L | | K | | W | ^
+---+ +---+ +---+ +---+ +---+ +---+ |
^ |
| |
current rear
position
With respect to our representation, then, the element of pred[] corresponding to BUG (which had been DOG's successor but is now CAT's) should be changed to point to CAT. Similarly, the element of succ[] corresponding to CAT should point to BUG. Also, crrntPos should be changed to point to BUG, which was the removed node's successor. That's all! Removing an item from a list is accomplished simply by modifying three int variables, which can be done in constant time!
Note: If the removed item were at the front of the list, we would have changed frontPos rather than the element of succ[] corresponding to that item's (non-existent) predecessor.
The updated representation is as follows:
pred contents succ frontPos crrntPos
+----+ +--------+ +----+ +----------+ +--------+
0 | 8 | | null | | -1 | | 5 | | 10 |
+----+ +--------+ +----+ +----------+ +--------+
1 | | | | | |
+----+ +--------+ +----+
2 | 3 | | OWL | | 7 |
+----+ +--------+ +----+
3 | 10 | | ANT | | 2 |
+----+ +--------+ +----+
4 | | | | | |
+----+ +--------+ +----+
5 | -1 | | CAT | | 10 |
+----+ +--------+ +----+
6 | | | | | |
+----+ +--------+ +----+
7 | 2 | | YAK | | 8 |
+----+ +--------+ +----+
8 | 7 | | COW | | 0 |
+----+ +--------+ +----+
9 | | | | | |
+----+ +--------+ +----+
10 | 5 | | BUG | | 3 |
+----+ +--------+ +----+
11 | | | | | |
+----+ +--------+ +----+
12 | | | | | |
+----+ +--------+ +----+
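The update just described can be reproduced with a few assignments. The sketch below starts from the state in the first diagram and removes the current item (DOG); the class name RemoveDogSketch and the method removeCurrent() are assumptions, and the vacated location 1 is simply left with stale values, which the scheme treats as irrelevant.

```java
// Hypothetical sketch; initial state copied from the first diagram
// (locations shown blank there hold 0 or null here).
class RemoveDogSketch {
    String[] contents =
        {null, "DOG", "OWL", "ANT", null, "CAT", null, "YAK", "COW", null, "BUG", null, null};
    int[] pred = {8, 5, 3, 10, 0, -1, 0, 2, 7, 0, 1, 0, 0};
    int[] succ = {-1, 10, 7, 2, 0, 1, 0, 8, 0, 0, 3, 0, 0};
    int frontPos = 5;
    int crrntPos = 1; // DOG

    // Unlinks the current item by redirecting its neighbors' pointers.
    void removeCurrent() {
        int predLoc = pred[crrntPos];
        int succLoc = succ[crrntPos];
        if (crrntPos == frontPos) {
            frontPos = succLoc;          // removed item was first: no predecessor to update
        } else {
            succ[predLoc] = succLoc;     // CAT's successor becomes BUG
        }
        pred[succLoc] = predLoc;         // BUG's predecessor becomes CAT
        crrntPos = succLoc;              // BUG becomes the current item
    }

    public static void main(String[] args) {
        RemoveDogSketch list = new RemoveDogSketch();
        list.removeCurrent();
        System.out.println(list.succ[5] + " " + list.pred[10] + " " + list.crrntPos); // prints 10 5 10
    }
}
```

The three changed values (succ[5], pred[10], and crrntPos) match the updated diagram, and no loop is involved.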
A list class based upon this representation scheme might begin like this:
public class PosListViaAry<T> implements PosList<T> {
private T[] contents; // array in which list items are stored
private int[] pred; // predecessor pointers
private int[] succ; // successor pointers
private int frontPos; // points to array element holding first item
private int crrntPos; // points to current position
Let's attempt to code some of the observer and navigation operations, just to verify that they can be done in constant time.
Observers:
isEmpty()  : return frontPos == 0;
lengthOf() : ?
atFront()  : return crrntPos == frontPos;
atRear()   : return crrntPos == 0;
getObj()   : return contents[crrntPos];
Navigational Mutators:
toFront() : crrntPos = frontPos;
toRear()  : crrntPos = 0;
toNext()  : crrntPos = succ[crrntPos];
toPrev()  : crrntPos = pred[crrntPos];
Our implementation of isEmpty() is based upon the observation that a list is empty if and only if its front and rear positions coincide. As we always store information about the rear at location zero, it suffices to compare frontPos to zero. Our implementations of atFront(), atRear(), and getObj() are similarly obvious.
We run into a problem with lengthOf(), however. One way to compute the length of a list is to traverse the entire list (by starting at the front and following the succ pointers until arriving at the rear, counting along the way). But this would take time proportional to the length of the list! This is what we are trying to avoid!
Another approach is to examine each element of contents[] to determine whether or not it is "occupied". (The latter assumes that there is some way to make the distinction between occupied and unoccupied array elements. If we stipulate that an element of the succ[] array that is logically unoccupied is to contain the value -2, then such a distinction can be made.) However, this method requires time proportional to contents.length, which may be significantly larger than the length of the list!
If the representation scheme we have in mind does not admit a constant-time algorithm for lengthOf(), perhaps we can adjust the scheme so that it does! What if we simply introduce a new instance variable of type int, called numItems, to our class? As suggested by its name, the purpose of this variable is to store the number of items in the list, i.e., its length. This gives us a very simple (and constant-time) algorithm for lengthOf():
lengthOf() : return numItems;
We could also rewrite isEmpty() to utilize the new variable, as follows:
isEmpty() : return numItems == 0;
Thus, we have solved the problem, right? Well, maybe! By introducing a new variable with the associated invariant numItems == length of list, we have imposed an extra computational burden upon all the operations that cause a list's length to change. It is conceivable that, in order to maintain this invariant, some operation that otherwise could have been accomplished in constant time will now require greater-than-constant time. A few moments' reflection, however, dispels this notion. The only operations that could change the length of a list are remove() and the various insert()'s. In each case, the modification needing to be made to numItems is trivial: it is decremented or incremented by one.
Now let's consider the remove() operation. As noted above, it requires changes to only three variables. However, it is critical that the changes be made in the correct order. (You will see that in working with pointers, it is often the case that you must be very careful about the order in which values are changed.)
remove():
   if (atRear())
      { throw an exception }
   else {
      int predLoc = pred[crrntPos];   // location of predecessor of item being removed
      int succLoc = succ[crrntPos];   // location of successor of item being removed
      if (frontPos == crrntPos)       // if item being removed is at front,
         { frontPos = succLoc; }      // its successor becomes the front
      else
         { succ[predLoc] = succLoc; } // connect predecessor to successor
      pred[succLoc] = predLoc;        // connect successor to predecessor
      crrntPos = succLoc;             // successor becomes current position
      numItems = numItems - 1;
   }
Notice that the above takes constant time!
What about insertions? To perform one, it suffices to
Steps (2) through (6) are easy. Taking the case of the insert(newObj) operation, which inserts the new object immediately before the current position, we get
The difficulty lies in step (1), finding an unoccupied array element quickly. One solution is to maintain the following representation invariant: Elements 0 through numItems of the three arrays are occupied and the rest are not. (Recall that element zero is used for storing the rear, so we need numItems + 1 locations in total.) Thus, the first unoccupied element is always at location numItems+1. This makes finding an unoccupied element very easy, which makes step (1) in the informal insertion algorithm simple. But it complicates the implementation of remove(), because it must be modified so as to leave the arrays in a state satisfying the new representation invariant. For example, suppose that an application of remove() is to have the effect of removing, from a 57-item list, the one that happens to be stored at location 24 of the arrays. Upon completion, locations 0 through 56 of the arrays should be occupied and the rest unoccupied. But the straightforward implementation of remove() given above would leave location 57 occupied and location 24 unoccupied. How can it be fixed? One way is to transfer, in each of the three arrays, the value in location 57 to location 24. In addition, any pointer to location 57 should be changed to point to 24. But the only such pointers are succ[pred[57]] (or else frontPos, if the item at location 57 happened to be at the front of the list), pred[succ[57]], and possibly crrntPos.
Let's rewrite remove() to take account of this (as well as the numItems variable that we added since discussing it):
remove():
   if (atRear())
      { throw an exception }
   else {
      int predLoc = pred[crrntPos];   // loc. of predecessor of item to be removed
      int succLoc = succ[crrntPos];   // loc. of successor of item to be removed
      if (frontPos == crrntPos)       // if item being removed is first,
         { frontPos = succLoc; }      // its successor becomes the front
      else
         { succ[predLoc] = succLoc; } // connect predecessor to successor
      pred[succLoc] = predLoc;        // connect successor to predecessor
      int vacantLoc = crrntPos;
      crrntPos = succLoc;             // successor is now current pos
      if (vacantLoc == numItems) {
         // vacated element was the last occupied one; no action needed
      }
      else {
         // now adjust arrays by moving contents of last occupied element
         // into the element that was just vacated
         contents[vacantLoc] = contents[numItems];
         pred[vacantLoc] = pred[numItems];
         succ[vacantLoc] = succ[numItems];
         // now make all pointers to the last occupied element point to
         // the element that had been vacated
         if (pred[vacantLoc] == -1)
            { frontPos = vacantLoc; }
         else
            { succ[pred[vacantLoc]] = vacantLoc; }
         pred[succ[vacantLoc]] = vacantLoc;
         if (crrntPos == numItems)
            { crrntPos = vacantLoc; }
      }
      numItems = numItems - 1;
   }
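For readers who want to experiment, here is a runnable sketch of a list that maintains the invariant "elements 0 through numItems are occupied". Its remove() follows the pseudocode above. Its insert() is an assumption: it places the new item at location numItems+1, links it in immediately before the current position, and leaves the current position unchanged. The class name CompactPosListSketch, the fixed capacity, and the items() helper are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// A sketch only: names other than those used in the text are assumptions.
class CompactPosListSketch {
    private final Object[] contents;
    private final int[] pred;
    private final int[] succ;
    private int frontPos = 0; // an empty list's front coincides with its rear (location 0)
    private int crrntPos = 0;
    private int numItems = 0;

    CompactPosListSketch(int capacity) {
        contents = new Object[capacity + 1]; // one extra location for the rear
        pred = new int[capacity + 1];
        succ = new int[capacity + 1];
        pred[0] = -1;
        succ[0] = -1;
    }

    int lengthOf() { return numItems; }
    Object getObj() { return contents[crrntPos]; }
    void toFront() { crrntPos = frontPos; }
    void toRear() { crrntPos = 0; }
    void toNext() {
        if (crrntPos == 0) { throw new IllegalStateException("at rear"); }
        crrntPos = succ[crrntPos];
    }

    // Inserts x immediately before the current position (assumed behavior).
    void insert(Object x) {
        if (numItems + 1 == contents.length) { throw new IllegalStateException("list is full"); }
        int newLoc = numItems + 1;        // first unoccupied location, by the invariant
        int predLoc = pred[crrntPos];
        contents[newLoc] = x;
        pred[newLoc] = predLoc;
        succ[newLoc] = crrntPos;
        if (crrntPos == frontPos) { frontPos = newLoc; } else { succ[predLoc] = newLoc; }
        pred[crrntPos] = newLoc;
        numItems = numItems + 1;
    }

    // Removes the current item, then refills the vacated location from the last occupied one.
    void remove() {
        if (crrntPos == 0) { throw new IllegalStateException("at rear"); }
        int predLoc = pred[crrntPos];
        int succLoc = succ[crrntPos];
        if (frontPos == crrntPos) { frontPos = succLoc; } else { succ[predLoc] = succLoc; }
        pred[succLoc] = predLoc;
        int vacantLoc = crrntPos;
        crrntPos = succLoc;
        if (vacantLoc != numItems) {
            contents[vacantLoc] = contents[numItems];
            pred[vacantLoc] = pred[numItems];
            succ[vacantLoc] = succ[numItems];
            if (pred[vacantLoc] == -1) { frontPos = vacantLoc; }
            else { succ[pred[vacantLoc]] = vacantLoc; }
            pred[succ[vacantLoc]] = vacantLoc;
            if (crrntPos == numItems) { crrntPos = vacantLoc; }
        }
        contents[numItems] = null;        // drop the stale reference
        numItems = numItems - 1;
    }

    // Lists the items in logical order by following succ pointers (linear time; for testing).
    List<Object> items() {
        List<Object> result = new ArrayList<>();
        for (int loc = frontPos; loc != 0; loc = succ[loc]) { result.add(contents[loc]); }
        return result;
    }
}
```

For instance, after appending A, B, C, and D and then removing B, the item D is relocated from location 4 to location 2, yet the logical order A, C, D is preserved because every pointer to location 4 has been redirected.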
A different approach for solving the problem of quickly finding an unoccupied array element is to use an avail chain. Add a new instance variable avail to the class; its purpose is to point to the "first" unoccupied array element. For each unoccupied location i, succ[i] points to the "next" one (or has the null pointer value -1 if there is none). When an unoccupied element is needed (for an insertion), avail provides the location of one in constant time. To update avail, make it point to the "second" unoccupied element, which is given by succ[avail]. When an item is removed, the location i of the vacated element is placed at the beginning of the avail chain by setting succ[i] to avail and setting avail to i.
What is troubling about this approach is that construction of a brand new empty list takes time proportional to contents.length, as the initial avail chain must include every location other than location 0 (which is reserved for the rear), and thus each of the corresponding elements of succ[] must be initialized to point to some other element. (One way to achieve this is to set avail to contents.length - 1, to set succ[i] = i-1 for each i greater than 1, and to set succ[1] = -1 to mark the end of the chain.)
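The avail chain can be sketched in isolation from the rest of the list class. In the following, the class name AvailChainSketch and the method names allocate() and release() are assumptions, as is the decision to keep location 0 off the chain because it is reserved for the rear; only the succ[] array and the avail variable come from the description above.

```java
// Hypothetical sketch of avail-chain bookkeeping only (no list items are stored).
class AvailChainSketch {
    final int[] succ;
    int avail; // location of the "first" unoccupied element, or -1 if none

    // Building the initial chain takes time proportional to the array length.
    AvailChainSketch(int length) {
        succ = new int[length];
        succ[0] = -1;                            // location 0 is the rear, never on the chain
        for (int i = 1; i < length; i++) {
            succ[i] = (i == 1) ? -1 : i - 1;     // chain: length-1 -> ... -> 2 -> 1 -> null
        }
        avail = (length > 1) ? length - 1 : -1;
    }

    // Hands out an unoccupied location in constant time.
    int allocate() {
        if (avail == -1) { throw new IllegalStateException("no unoccupied element"); }
        int loc = avail;
        avail = succ[avail];                     // "second" unoccupied element becomes first
        return loc;
    }

    // Returns a vacated location to the front of the chain in constant time.
    void release(int loc) {
        succ[loc] = avail;
        avail = loc;
    }
}
```

An insertion would call allocate() to obtain a location and then overwrite succ[] at that location with the new item's successor pointer; a removal would call release() on the vacated location after unlinking it.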
It seems rather ironic that our representation scheme is such that its most expensive operation is the construction of an empty list!
One slight advantage that the avail chain approach has over the earlier one is that, under it, remove() is much simpler.
In order to get the advantages of both approaches, a hybrid approach is possible. Here, an avail chain is maintained, but it includes only unoccupied array elements that had been occupied at some point in the past. In doing an insertion, if the avail chain is empty, then every location that has ever been occupied is occupied now, so the first never-occupied array element (which will be the one at location numItems + 1) is used for storing the new item.
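A minimal sketch of the hybrid bookkeeping follows, under the same caveats as before: the class name HybridAllocatorSketch and the methods allocate() and release() are hypothetical, and only the location-management logic is shown. The chain starts out empty, so the constructor needs no loop to build it.

```java
// Hypothetical sketch: an avail chain holding only previously occupied locations.
class HybridAllocatorSketch {
    final int[] succ;
    int avail = -1;   // chain is initially empty; no chain-building loop is needed
    int numItems = 0; // number of currently occupied locations (excluding the rear)

    HybridAllocatorSketch(int length) {
        succ = new int[length];
    }

    // Prefers a recycled location; otherwise takes the first never-occupied one.
    int allocate() {
        int loc;
        if (avail != -1) {
            loc = avail;
            avail = succ[avail];
        } else {
            if (numItems + 1 >= succ.length) { throw new IllegalStateException("array is full"); }
            loc = numItems + 1; // chain empty, so locations 1..numItems are exactly the occupied ones
        }
        numItems = numItems + 1;
        return loc;
    }

    // Places a vacated location at the front of the avail chain.
    void release(int loc) {
        succ[loc] = avail;
        avail = loc;
        numItems = numItems - 1;
    }
}
```

With this arrangement remove() needs no relocation of items, since a vacated location simply joins the chain until a later insertion reuses it.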