CS:2820 Notes, Lecture 5: Constructors

Putting everything together from the previous lecture, we have the following class definitions for building our road network:

import java.io.File;
import java.util.LinkedList;
import java.util.Scanner;

/** Roads are one-way streets linking intersections
 *  @see Intersection
 */
class Road {
    float travelTime;         //measured in seconds
    Intersection destination; //where the road goes
    Intersection source;      //where the road comes from
    // name of road is source-destination
}

/** Intersections join roads
 *  @see Road
 */
class Intersection {
    String name;
    LinkedList <Road> outgoing = new LinkedList <Road> ();
    LinkedList <Road> incoming = new LinkedList <Road> ();
    // Bug: multiple types of intersections (uncontrolled, stoplight)
}

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {
    public static void main(String[] args) {
        // Bug:  Must add code to see if there is a file name
        Scanner sc = new Scanner( new File( args[0] ) );
        // Bug:  What if the file doesn't exist?
    }
}

Recall also that we have decided to use a file containing a list of intersections and roads to describe the road network. Ignoring all details of how roads and intersections are described, the file might look something like this:

intersection ...
intersection ...
intersection ...
road ...
road ...
road ...
road ...
road ...

The skeletal code given above contains bugs, but it also contains a potential problem for programmers familiar with C or C++: The first command line argument after the program name is args[0]. This will seem strange to programmers accustomed to C or C++, where argv[0] is the program name and argv[1] is the first parameter after the program name.

Another difference that trips up C or C++ programmers is that Java's main program has no separate count of command line arguments. A C or C++ programmer would use an additional parameter to the main program, argc, to learn the number of arguments, because arrays in C and C++ do not have a length attribute. In Java, every array knows its own length, so we ask the array args to tell us its length by using args.length.
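
To make the contrast with C concrete, here is a minimal sketch. The class name ArgsDemo and the method describe are made up for this illustration and are not part of the road network program.

```java
// Hypothetical demo class; not part of the road network program.
class ArgsDemo {
    /** Summarize the command line arguments the way main receives them. */
    static String describe(String[] args) {
        if (args.length < 1) {
            return "no arguments";
        }
        // args[0] is the first argument, not the program name
        return args.length + " argument(s), first is " + args[0];
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Running this with a single argument such as x should print "1 argument(s), first is x", where a C program would have found its own name in argv[0].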

Knowing the above, we can fix one bug in the program, but only at the cost of introducing another:

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {
    public static void main(String[] args) {
        if (args.length < 1) {
            // Bug:  Complain about a missing argument
        } else {
            Scanner sc = new Scanner( new File( args[0] ) );
            // Bug:  What if the file doesn't exist?
        }
    }
}

We'll put off the question of what to do if no input file is specified, but whatever it is, it will be very similar to what we do if the input file is specified but doesn't exist. In that case, the attempt to open the file within the scanner will throw a FileNotFoundException. This is a checked exception, and Java won't allow us to write code that could throw a checked exception without either providing a handler or declaring that the method throws it. So, the skeleton of our main program code will look like this:

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {
    public static void main(String[] args) {
        if (args.length < 1) {
            // Bug:  Complain about a missing argument
        } else try {
            Scanner sc = new Scanner( new File( args[0] ) );
            // Bug:  Now we can process the file here
        } catch (FileNotFoundException e) {
            // Bug:  Complain that the file doesn't exist
        }
    }
}
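
The comments marked Bug leave open what it means to complain. One plausible reading, offered only as a sketch, is to print a message to System.err and give up. The class name OpenDemo, the method openInput, the message wording and the use of a null return are all assumptions made for this illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

// Hypothetical helper class; the names and messages are illustrative only.
class OpenDemo {
    /** Try to open the file named by args[0].
     *  @return a scanner on that file, or null after reporting a problem
     */
    static Scanner openInput(String[] args) {
        if (args.length < 1) {
            System.err.println("missing argument: input file name");
            return null;
        }
        try {
            return new Scanner(new File(args[0]));
        } catch (FileNotFoundException e) {
            System.err.println("cannot open input file: " + args[0]);
            return null;
        }
    }

    public static void main(String[] args) {
        Scanner sc = openInput(args);
        if (sc == null) {
            System.exit(1); // nonzero exit status signals failure
        }
        System.out.println("opened " + args[0]);
        sc.close();
    }
}
```

Both failure cases end up in nearly the same code, which is the similarity the text points out. Writing to System.err rather than System.out keeps error messages separate from normal output.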

Processing the text file

What does a scanner do? We can ask the scanner whether there is more input with sc.hasNext(). We can ask if the next input is a number with sc.hasNextInt() or sc.hasNextFloat(). We can ask for the next string from the input with sc.next() or the next integer from the input with sc.nextInt().
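
A small experiment shows these methods at work. This is a sketch only: the class name ScanDemo and the method scanLine are invented here, and the scanner reads from a String instead of a file so that the example stands alone.

```java
import java.util.Scanner;

// Hypothetical demo; the input text is made up for illustration.
class ScanDemo {
    /** Scan the given text and label each item as an integer or a word. */
    static String scanLine(String text) {
        Scanner sc = new Scanner(text);   // a scanner can read from a String
        StringBuilder out = new StringBuilder();
        while (sc.hasNext()) {            // is there any more input?
            if (sc.hasNextInt()) {        // is the next item an integer?
                out.append("int:").append(sc.nextInt());
            } else {
                out.append("word:").append(sc.next());
            }
            if (sc.hasNext()) out.append(' ');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: word:road word:a word:b int:30
        System.out.println(scanLine("road a b 30"));
    }
}
```

Note that the hasNext methods only look ahead. Nothing is consumed from the input until one of the next methods is called.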

The outermost loop of the road network initializer is pretty obvious: Read lines from the text file and process them. We could do all the processing in the outer loop, but that means that the outer loop needs to know about every detail of describing roads and intersections. One of the principal ideas behind object-oriented programming is that all the aspects of each class should be encapsulated inside that class. How to read the description of a road, for example, is an issue that only matters to class Road.

Encapsulating everything about roads in class Road does have a potential downside. It means that the code to process the input language of our highway simulator will be scattered through the simulator. This makes it easy to modify details of roads, for example, but difficult to find out what the entire input language is. These kinds of design tradeoffs are unavoidable.

If we accept the decision to put all details of how roads are described in class Road, and to handle class Intersection similarly, we get code like this:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.LinkedList;
import java.util.Scanner;

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {

    /* the sets of all roads and all intersections */
    static LinkedList <Road> roads = new LinkedList <Road> ();
    static LinkedList <Intersection> inters = new LinkedList <Intersection> ();

    public static void main(String[] args) {
        if (args.length < 1) {
            // Bug:  Complain about a missing argument
        } else try {
            Scanner sc = new Scanner( new File( args[0] ) );
            while (sc.hasNext()) {
                // until the input file is finished
                String command = sc.next();
                if (command.equals( "intersection" )) {
                    inters.add( new Intersection( sc ) );
                } else if (command.equals( "road" )) {
                    roads.add( new Road( sc ) );
                } else {
                    // Bug: Complain about unknown command
                }
            }
        } catch (FileNotFoundException e) {
            // Bug:  Complain that the file doesn't exist
        }
    }
}

The central tool used above is the next() method of class Scanner. You should look up class Scanner to see all of its next methods, but the simplest of these is simply called next(). All that next() does is return the next string from the input stream. By default, successive strings in the input are delimited by things like spaces, tabs and newlines. Other next methods get the next integer, the next boolean, or the next float. We will use some of these later.
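
When the string returned by next() is compared against a command name, keep in mind that in Java the equals method compares the contents of two strings, while == compares object identity. The sketch below runs the command loop over a String and merely counts the commands. The class name CommandDemo and the method countCommands are invented for this illustration.

```java
import java.util.Scanner;

// Hypothetical demo; it counts commands rather than building objects.
class CommandDemo {
    /** Count the "intersection" and "road" commands in the given text.
     *  @return a two-element array: {intersections, roads}
     */
    static int[] countCommands(String text) {
        int[] counts = new int[2];
        Scanner sc = new Scanner(text);
        while (sc.hasNext()) {
            String command = sc.next();   // next whitespace-delimited string
            // equals compares contents; == would compare object identity
            if (command.equals("intersection")) {
                counts[0]++;
            } else if (command.equals("road")) {
                counts[1]++;
            }
            // skip the rest of the line, if there is one
            if (sc.hasNextLine()) sc.nextLine();
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = countCommands("intersection a\nintersection b\nroad a b 30\n");
        System.out.println(c[0] + " intersections, " + c[1] + " roads");
    }
}
```

The hasNextLine() guard is there because nextLine() throws an exception when no input remains, which can happen if the last line of the input holds only a command with no newline after it.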

The above code assumes that we can use initializers for classes Road and Intersection to create new instances of those classes, where the initializer is responsible for scanning the description of the new object from the source file. The code also assumes that we want to keep a list of all the roads we have scanned and all the intersections. At this point, we are not committing ourselves to do anything with these lists, but when the time comes to connect two intersections with a road, we'll have to look up those intersections somewhere.

Regardless of the number of spaces used for each indenting level, the above code is indented at close to the limit that can be easily understood. Psychologists say that the human mind can only handle about 7 plus or minus 2 different things in short-term memory, so once the number of levels exceeds 5, regardless of the visual presentation, a program will be hard to understand. We can resolve this by putting the loop outside the try block, but that makes it possible that sc could be null. Alternatively, we can move the bulk of the code into a second method:

    /** Initialize this road network by scanning its description
     */
    static void readNetwork( Scanner sc ) {
        while (sc.hasNext()) {
            // until the input file is finished
            String command = sc.next();
            if (command.equals( "intersection" )) {
                inters.add( new Intersection( sc, inters ) );
            } else if (command.equals( "road" )) {
                roads.add( new Road( sc, inters ) );
            } else {
                // Bug: Complain about unknown command
            }
        }
    }

    /** Main program
     * @see #readNetwork
     */
    public static void main(String[] args) {
        if (args.length < 1) {
            // Bug:  Complain about a missing argument
        } else try {
            readNetwork( new Scanner(new File(args[0])) );
        } catch (FileNotFoundException e) {
            // Bug:  Complain that the file doesn't exist
        }
    }
}

As we've already indicated, the above code assumes that the initializers for classes Road and Intersection exist. In order to write this code, we need to start fleshing out some details of the source file describing the road network. Here, we will assume that the first item in each line is the name of the road or intersection. Intersections have arbitrary names, while road names consist of a pair of intersection names, separated by a space or tab. For roads, we'll assume that the next attribute is the travel time.

This is an inadequate way to describe roads. Later, we'll discover that some intersections have stoplights that alternately allow east-west travel and north-south travel. When we get to that point, we'll have to extend our naming convention so that a road can connect, for example, outgoing north from intersection A and incoming east to intersection B. For now, we'll ignore this, but we'll do so with the knowledge that our initial design is inadequate. The initial design gives us something like this:

intersection a
intersection b
intersection c
road a b 30
road b a 30
road a c 12
road c a 12
road b c 22

In the description of the data structures, we already decided that travel times are given in seconds. For long roads, it would be useful to give times in other time units such as minutes or hours. For now, we'll assume that times are given in seconds, with the full knowledge that this is an inadequate design.

Here is class Road with a preliminary version of a reasonable initializer to read from the above file:

/** Roads are one-way streets linking intersections
 *  @see Intersection
 */
class Road {
    float travelTime;         //measured in seconds
    Intersection destination; //where the road goes
    Intersection source;      //where the road comes from
    // name of road is source-destination

    // initializer
    public Road( Scanner sc, LinkedList <Intersection> inters ) {
        // code here must scan & process the road definition

        String sourceName = sc.next();
        String dstName = sc.next();
        // Bug:  Must look up sourceName to find source
        // Bug:  Must look up dstName to find destination

        // Bug:  What if the next item isn't a float?
        travelTime = sc.nextFloat();
        String skip = sc.nextLine();
    }
}

There are, of course, some bugs here! We need to do quite a bit of work, and, of course, we need to write the corresponding code for class Intersection.
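
As a hint of where this is going, here is one way the lookup and the float check might be approached. This is a sketch under stated assumptions, not the final design: the class LookupDemo, its stripped-down nested Intersection class, the methods find and readTime, and the choice of null and NaN to signal failure are all invented for illustration.

```java
import java.util.LinkedList;
import java.util.Scanner;

// Hypothetical sketch; class and method names here are illustrative.
class LookupDemo {
    /** A stripped-down stand-in for the real class Intersection. */
    static class Intersection {
        String name;
        Intersection(String name) { this.name = name; }
    }

    /** Linear search for an intersection by name.
     *  @return the matching intersection, or null if there is none
     */
    static Intersection find(LinkedList<Intersection> inters, String name) {
        for (Intersection i : inters) {
            if (i.name.equals(name)) return i;
        }
        return null;
    }

    /** Read a travel time if the next item is a number.
     *  @return the time, or NaN to signal that no number was found
     */
    static float readTime(Scanner sc) {
        if (sc.hasNextFloat()) return sc.nextFloat();
        return Float.NaN;
    }

    public static void main(String[] args) {
        LinkedList<Intersection> inters = new LinkedList<Intersection>();
        inters.add(new Intersection("a"));
        inters.add(new Intersection("b"));
        Scanner sc = new Scanner("a b 30");  // the text after "road"
        Intersection source = find(inters, sc.next());
        Intersection destination = find(inters, sc.next());
        float travelTime = readTime(sc);
        System.out.println(source.name + "-" + destination.name
                           + " " + travelTime);
    }
}
```

A linear search is slow for large networks, but it is simple, and the list of intersections is the only data structure we have committed to so far. The caller still has to decide what to do when find returns null or readTime returns NaN.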

Extreme Programming

Long before you have a useful program, you should be running it and testing it. Extreme programming, or XP as some advocates abbreviate it, recommends that any large programming project be developed incrementally. The classic XP development model organizes the development process in one-week steps. During each week:

Monday: Develop specifications for the end product for the week.
Tuesday: Develop tests that the product must meet by the week's end.
Wednesday: Develop code to meet the specifications.
Thursday: Run tests and debug the code. The code should pass all tests including those for previous steps in the development cycle, excepting those tests that have been made obsolete by changes in the specifications.
Friday: Assess progress for the week.

At the end of the week, either you have reached the week's goals, with a completely tested body of code, or you have failed. If you fail, the XP methodology demands that you completely discard the week's work and start over. The reason for this extreme recommendation is that debugging is hard. If you did not get it right within a week, starting over may well get you to working code faster than grinding away at the bugs that just won't go away.

Consider a piece of code that took only a few hours to write. If you have a bug in that code, is it better to spend days trying to find and fix that bug, or is it better to discard the code and spend a few hours writing a replacement? If you try to write code to the same specification as before, you might make the same mistake, so the methodology here suggests you go back and toss not only the code, but also the specification, and re-specify, perhaps making the total work for this step smaller, so that you only reach the final goal after a larger number of smaller steps.

The important thing is that each weeklong step in the development cycle starts with working code from the previous successful step and (if successful) produces working code on which the next step can be built. The only thing you carry over from a failed development step is what you learned from that failure.

Note that this is a vast oversimplification of the XP methodology. The full methodology speaks about team structure, coding standards, and many other details. In my opinion, rigid adherence to the full methodology is likely to be foolish, particularly since the full XP methodology seems particularly oriented toward development that is driven by the GUI structure, or more generally, by input-output specifications. For software products that aren't GUI based, the methodology must be modified, and there is no reason to believe that one week is the right step size.

Regardless of that criticism, incremental development is extremely powerful and the basic framework above is very useful. The basic take-away from the XP model is: Take little steps. Don't try to get a whole large program working in one bang, but start with working code (for example, Hello World) and then augment it, one step at a time, until it does everything you want.

The Waterfall Model

At the opposite end of the spectrum from extreme programming, you will find what has come to be described as the waterfall model of software development:

Requirements: First, work out the requirements for the entire project. Do not do any design work until the requirements are worked out in detail.
Design: Then, design the solution at a high level, working out all the internal components and their relationship. Do not do any implementation until a detailed design is completed.
Implementation: Write the code to implement the design. Complete writing the code before you do any testing.
Verification: Verify that the code meets the requirements. This involves extensive testing.
Maintenance: Invariably, the requirements will change with time, and customers will find bugs.

Many software purchase contracts have been written assuming the waterfall model, where the requirements are written into the contract, and where the contractor is required to submit a complete design before implementation begins. If you look at the way bridges are designed and built, the waterfall model comes very close to describing a working system. First, you work out what the bridge should do, then you hire an engineering firm to complete the design, then you let bids for a contractor to build the bridge. Frequently, another engineering firm is hired to inspect the finished bridge, then you open it for traffic and take responsibility for maintenance.

In the world of software, the waterfall model has led to huge cost overruns and project failures, yet it seems to be a natural application of a methodology with a long proven track record. What's wrong?

The problem with the waterfall model is that it assumes that you can check your specification and have some confidence in it before you start coding. Unfortunately, most software projects are too complex for that. Until you have working code, you can't be sure that the specifications are right. Similarly, it assumes that you can look at the code and verify that it meets the specifications without trying it, and it assumes that your testing will determine that the code meets the specifications. In the real world of software, the end user will invariably try things that weren't covered in the tests, finding bugs, and many of those bugs will be due to flaws in the initial design.
