22C:21: Computer Science II: Data Structures

Problem Set 4. Posted 2/8. Due on 2/15.


Notes: Problem 1 is your homework and it is due in your discussion section meetings on Thursday, 2/15. Of the remaining two problems, one will be solved by your TA in the discussion section meetings on Thursday 2/15. The remaining problem will be your quiz; this will occur at the end of the discussion section meeting (on 2/15), you will have 20 minutes for the quiz and you are allowed to consult your notes and textbook.

  1. One of the bottlenecks in the current implementation of the Graph class is the getIndex method that is slow because it performs a linear scan and is also used very often. Change the current implementation of the Graph class by changing how the names of vertices are stored. Instead of using a simple array of strings for the names, use the Dictionary class to store the vertex names. Note that the Dictionary class provides the insert, search, and delete methods, all of which will be quite useful to the Graph class.

    Here are more details. Suppose we have a graph with 5 vertices and a new vertex Boston comes along. Adjacency information pertaining to Boston would be stored in row 5 and column 5 of the adjacency matrix. Therefore, we would insert the pair (Boston, 5) into a Dictionary object that keeps track of vertex names and their indices. Later, if we want to get the index of the vertex Boston we would search the Dictionary object for Boston. Not only would we find Boston, we would also find the corresponding index 5.

    Therefore the tasks you need to complete are:

    1. Change the Dictionary class so that it can store a String-int pair.
    2. Change the Graph class so that vertex names are no longer stored in a String array, but are stored in the above mentioned Dictionary object.

    What to submit: The Link class, LinkList class, the Dictionary class, and the myGraph class. The names of these files should be Link.java, LinkList.java, the Dictionary.java, and myGraph.java. These should all be placed in a directory called homework4 and that is what you should submit. We do not want any other files, we do not want files with other names, and we do not want these files to be submitted separately. We will test your implementation of the myGraph class with the Ladders program from the previous homework.

    Some clarifications:

    1. You are only required to implement the following public methods in the myGraph class: (i) the constructor, (ii) addVertex, (iii) addEdge, and (iv) getNeighbors(String). Of course, you also have to implement the getIndex method since speeding up that method is the whole purpose of using a hash table. You do not have to implement any of the other methods and in particular do no have to worry about resizing.
    2. To store a (String, int) pair in a Link object, simply make the Link class have a String data member and an int data member. Note that you are expected to submit your modified Link class.
    3. Even though a (String, int) pair is being stored in a Link object, you should only use the String to hash. This means that you can use the current implementation in the Dictionary class as it is.
    4. Finally, even though vertex names are baing stored in a Dictionary object, you should still retain the names array. The reason is this: the Dictionary object allows you to go quickly from a vertex name to an index, but you also need to go quickly from an index to a vertex name and the names array will help with that.

  2. The trie data structure ("trie" is pronounced like "try") is another interesting data structure for maintaining collections of words. A trie is particularly suited for applications such as spelling checkers in which when an incorrectly spelled word is found, possible correct replacements need to be suggested. A trie would work nicely for the Ladders program also.

    Wikipedia has a nice entry for the trie data structure. To explain what a trie is, let me consider the same example that is presented at wikipedia. You should follow the picture there, as you read this explanation. Suppose that we want to store the words {to, tea, ten, in, inn}. We have one node representing our entire collection, called the root node. This root node has an array of size 26 with one slot in the array for each lower letter. The slot in the array corresponding to character 'a' would point to the collection of all words that begin with 'a'. The slot in the array corresponding to character 'b' would point to the collection of all words that begin with 'b', and so on. In our example, we have no words starting with 'a' and therefore slot in the array corresponding to character 'a' would be null. However, the slots corresponding to characters 'i' and 't' would point to other nodes, that respectively represent all words that start with 'i' and all words that start with 't'. Let us call these, node i and node t respectively. Let me now explain what is stored in node t. Just like the root node, node t also has an array of size 26. Slot 'a' in this array points to all words that start with 'ta', slot 'b' points to all words that start with 'tb', and so on. Since there are no words in our collection that start with 'ta', slot 'a' in node t will have a null. However, slot 'e' in node t will point to another node, called node te, and representing all words in the collection that start with 'te'.

  3. Implement the insert and search methods for the trie class. Each of these methods will take a single String argument. There is no need for error checking - you may assume that the given strings only contain lower case letters.