22C:21: Computer Science II: Data Structures

Project 1 Notes. Started on 2/22


  1. 2/22: I have finished turning the myListGraph into a generic class and testing it. This involved some amount of "trial and error" because I have never used generics before and there were some unexpected surprises. Here are the things I did to get the generic myListGraph working.

    Here are the updated versions of GenericLink.java and GenericLinkList.java.

  2. 2/22: I first tested the class on a small test program that our TA Uday Verma had created for testing his implementation of the myListGraph class. Here is the test program: MyListGraphTester1.java and here is the output it produces.

  3. 2/22: In order to test the generic myListGraph class more substantially, I used the Ladders program from Problem Set 3. But, the myListGraph class was missing methods such as getNeighbors and degree that were in the original myGraph class. So I added methods the following signatures to the generic version of the myListGraph class.
    // returns the list of neighbors of a given vertex
    public GenericLinkList<E> getNeighbors(E vertex)
    
    // returns the degree of a given vertex
    public int degree(E vertex)
    
    // returns the degree of a vertex with a given index
    public int degree(int index)
    

    While adding the above methods, I realized that I needed some more methods in the GenericLinkList class. So I added a data member called numLinks that keeps track of the number of links in the linked list and then I added methods, deleteFirst, copy, copyIntoArray, isEmpty, and size. These are all part of the updated GenericLinkList class.

    Now I was ready to test the generic myListGraph class with the Ladders program. Here is the Ladders program, I used: Ladders.java and here is the output it produces.

  4. 2/22: Now I wanted to test the generic myListGraph class with vertex names having a type that was not String. So I created a program called SmallWorld.java that distributes n points uniformly at random in a d by d square and connects pairs of points that are distance at most one. Note that this is Step (1) of the two steps described in the Project 1 handout.

    Here is SmallWorldGraph.java. Notice how it creates of graph whose vertex names are of type Integer. This program makes use of the Point class, discussed in class. Here is the output from a few runs of this program. Each output consists of three numbers: average degree, maximum degree, and minimum degree. Notice how as d, the dimension of the square gets smaller, the degree statistics increase, indicating more connections between vertices. To make it easy to produce these statistics, I added three methods to the generic myListGraph class:

    public double averageDegree()
    public int maximumDegree()
    public int minimumDegree()
    
    If you want to run SmallWorldGraph.java, you'll have to add these methods to the generic myListGraph class.

  5. 2/23: Now I implemented the clusteringCoefficient methods in the generic myListGraph class, with the following signatures.
    public double clusteringCoefficient(E v)
    public double clusteringCoefficient()
    
    To help implement these methods, I added a method to the generic myListGraph class that determines whether a given pair of vertices are connected by an edge or not. The signature of this method is:
    public boolean isEdge(E vertex1, E vertex2)
    
    I ran the following simple testing program: MyListGraphTester2.java and got this outputCC. It is easy to check by hand that the output is correct. One point to note is that the clustering coefficient of a vertex of degree 0 or 1 is defined to be 1.

  6. 2/23: The next step is to test the clusteringCoefficient methods more substantially. Here is a modification of the Ladders program that computes the clustering coefficient of a specified word in the ladders graph: LaddersCC.java. The program also calculates the clustering coefficient of the entire graph (which is quite high!). Here is the outputLaddersCC from a few runs. The output attempts to explain how the clustering coefficient was calculated and it should be easy to verify the answers by hand.

  7. 2/27: I implemented a breadth-first traversal method in the generic myListGraph class. I have posted the code I used: bfs. To test my implementation, I added some code to the Ladders program. Here is the Ladders program that, after constructing the ladders graph, performs a breadth-first search from the word "house" and prints out all 5-letter words organized by distance from "house:" LaddersBFS.java and outputLaddersBFS.

  8. 3/4: I added methods isConnected and averagePathLength to the generic myListGraph class. If a graph is disconnected, then the average path length is infinity. This is because a disconnected graph has pairs of vertices that are unreachable from each other and therefore the distance between such pairs is infinity. The averagePathLength function first checks if the graph is connected; if it is not connected, the function returns -1 (representing infinity); otherwise it computes the average path length by repeatedly calling breadth-first search and taking the average of all the computed pairwise distances.

  9. 3/4: As some of you have already noticed, breadth-first search runs slowly and it takes many hours to comupute the average path length of a 10,000 vertex graph. The problem, however, is not with breadth-first search, but with the "helper" function getIndex, that is repeatedly called. The function getIndex takes linear time (i.e., time that is proportional to the number of vertices in the graph). This is too expensive for a 10,000 vertex graph, given how often getIndex is called. The fix is quite simple because for the unit disk graphs and small world graphs we construct, vertex names are identical to their indices. Thus for such graphs vertex names can be used as indices! However, I don't want to change the current implementation of getIndex because for other kinds of graphs, vertex names and indices may not be identical (e.g., the Ladders graph). So I made a new graph class that is a subclass of the generic myListGraph class of type Integer. This subclass inherits all the methodsof the myListGraph class, but it has a fast (i.e., constant time) implementation of the getIndex function and a constructor of its own. I am posting part of my solution in unitDiskGraph.java where a subclass called unitDiskGraph is constructed. This program also contains a main method that tests the unitDiskGraph class. This is the output from running the main method and you will see that 9 average path length computations were performed on 10,000 vertex unit disk graphs with average degree between 12 and 20, in about 25 minutes. When d = 49, the generated graph is not connected and the average path length is returned as -1 after just one BFS call.