# Exam 1: Midterm

## Solutions and Commentary

### Midterm Exam

[Histogram of midterm exam scores on a scale of 0 to 15. Median = 7.4, mean = 6.88. Letter-grade bands marked along the axis, from low to high: F, D, C, B, A.]

If I had to offer letter grades based just on this exam, they might be something like the above.

### Machine Problems 1 and 2

[Histogram of combined scores for Machine Problems 1 and 2 on a scale of 0 to 10. Median = 8.0, mean = 7.58.]

### Homeworks 1 to 7

[Histogram of combined scores for Homeworks 1 to 7; the axis shown runs from 5 to 20. Median = 12.6, mean = 12.67.]

### Overall Scores

[Histogram of overall scores; the axis shown runs from 10 to 40. Median = 28.5, mean = 27.13. Letter-grade bands marked along the axis, from low to high: D, C, B, A.]

The above grade scale was used for assigning midterm grades. The emphasis in midterm grading is on giving adequate warning to low performers so that they have time to get their act together. Because Machine Problem 3 has not been graded yet, these scores may not truly represent the quality of work turned in to this point.

## Solutions and Commentary

1. Background: The launch() routine for the posted solution to MP2 contained something like the following code:
```
void launch() {
        if (fork() == 0) { /*child*/
                int i;
                execve( argv[0], argv, environ );
                getparsepath();
                for (i = 0; patha[i] != NULL; i++) {
                        execve( gluepath( patha[i], argv[0] ), argv, environ );
                }
                printf( "no such command\n" );
                exit( EXIT_FAILURE );
        } else { /*parent*/
                wait( NULL );
        }
}
```

The getparsepath routine gets the value of \$PATH and separates it into its components, in the array patha[]. The gluepath() routine returns a pointer to the strings it is passed, concatenated with a / between them.

a) Give a brief example command line input to mush that would cause the first execve() above to succeed.

/bin/echo hello world

1/2 got this right, 4/5 earned no credit. Echoing hello world was the most popular answer, but there was nothing wrong with ls or other shell commands. Those who left off the leading slash (/) earned partial credit, but giving just echo without the leading /bin/ earned no credit, since this cannot cause the first execve() command to succeed.

b) There is an exit() above that only works when the user types nonsense into the shell. Yet, for the wait() in the parent branch to be satisfied, the code in the child branch must terminate with an exit(). Where is the exit() in the normal case, for example, when the user types in echo hello world?

The exit() at the end of the launched application (echo in this case) does the job.

1/4 got this right. 1/3 earned no credit. Some partial credit was offered to those who declared that the exit was part of execve(), as if execve() waited for the launched application to terminate.
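To make the point concrete, here is a minimal sketch in which the parent recovers the exit status of whatever program was launched. The helper name run_and_wait() is hypothetical and is not part of the posted MP2 solution; the sketch assumes a POSIX system.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: launch argv[0] (which must be a path name) and
 * return the status it passed to exit(), or -1 if it did not exit
 * normally. */
int run_and_wait(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;               /* fork failed */
    if (pid == 0) {                       /* child */
        execve(argv[0], argv, environ);
        _exit(127);                       /* reached only if execve() failed */
    }
    int status;                           /* parent */
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the launched program calls exit( 7 ), the parent sees 7. That value came from code inside the launched program, because after a successful execve() none of the caller's code remains in the child.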

c) The user's search path is: /bin::. (including the final dot). If the user types /bin/echo hello the output is the same as the user gets from echo hello. What happens when the user types bin/echo hello -- and why?

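The first execve() fails unless the current directory happens to contain bin/echo, and the first path element, /bin, gives "/bin/bin/echo", which also fails. The second element of the path, the empty string, is then used: concatenating "" with "/" and "bin/echo" gives "/bin/echo", which works, so the output is the same as for /bin/echo hello.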

Only 1/7 got this right, while 3/4 earned no credit. Apparently, many did not notice or understand the "::" (empty component) in the search path (or perhaps they thought it was a typo).
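The effect of the empty path component can be checked with a small sketch. Both routines below are hypothetical stand-ins written for this illustration (the posted gluepath() and getparsepath() may differ in detail); the only assumption is the behavior described in the background: split on colons, keep empty components, and glue with a / in between.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for gluepath(): returns a newly allocated string holding
 * dir, then "/", then name. The caller frees the result. */
char *gluepath(const char *dir, const char *name) {
    size_t len = strlen(dir) + 1 + strlen(name) + 1;
    char *out = malloc(len);
    if (out != NULL) snprintf(out, len, "%s/%s", dir, name);
    return out;
}

/* Split a search path on ':' in place, keeping empty components.
 * Stores at most max pointers in parts[] and returns how many. */
int split_path(char *path, char *parts[], int max) {
    int n = 0;
    char *start = path;
    while (n < max) {
        char *colon = strchr(start, ':');
        parts[n++] = start;
        if (colon == NULL) break;
        *colon = '\0';
        start = colon + 1;
    }
    return n;
}
```

Splitting "/bin::." this way gives the three components "/bin", "" and ".". Gluing each to "bin/echo" gives "/bin/bin/echo", "/bin/echo" and "./bin/echo" in that order. Note that a splitter built on strtok() would silently skip the empty component.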

2. Background: When you launch an application using execve( ..., argv, ... ), the main program of the application is eventually called, main( int argc, char ** argv ). In both cases, argv is documented as being a pointer to the first of an array of pointers to null-terminated character strings. Recall that execve() abandons the program that called it as it loads the new program.

a) How did the system compute the value of argc? Notice that it is not passed as a parameter to execve(), yet it is passed to the launched application.

The code of execve() had to copy the arguments. As it did so, it could easily count them in order to compute argc.

1/2 got this right, 2/5 earned no credit. Typical wrong answers involved discussion of how the shell parses the arguments, all of which happens before execve() gets them, or asserted that argc is a global variable and therefore need not be passed, or confused argv and argc.
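The counting itself is trivial because the argument array ends with a NULL pointer. A minimal sketch follows; count_args() is a hypothetical name, and real kernels do this while copying the arguments rather than in a separate pass.

```c
#include <stddef.h>

/* Count the entries of a NULL-terminated argument vector. */
int count_args(char *const argv[]) {
    int argc = 0;
    while (argv[argc] != NULL) argc++;
    return argc;
}
```

For the vector { "echo", "hello", "world", NULL } this returns 3, the value main() would receive as argc under the usual C convention.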

b) Is the value of the pointer argv passed to execve() the same as the value of the pointer argv received by the launched application? If not, why not?

No, the value of the pointer is different because the arguments must be copied from the address space of the caller to the address space of the newly launched application.

2/5 got this right, 1/3 earned no credit. Partial credit was given for just saying no without giving a reason or for garbled reasons. A significant number of those who said yes seem to have confused the value of the pointer (a different memory address) with the value of the string pointed to (the same, because it is a copy). Confusion about the difference between pointers and the values pointed to is common among beginning programmers, but by the level of this course, it is hard to excuse.
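The distinction between the pointer and the string it points to can be seen in a user-level sketch. Here copy_args() is a hypothetical routine that copies an argument vector into fresh storage, standing in for the copy the system makes into the new address space; it is an illustration under that assumption, not the kernel's actual code.

```c
#include <stdlib.h>
#include <string.h>

/* Release a vector produced by copy_args(). */
void free_args(char **argv) {
    if (argv == NULL) return;
    for (size_t i = 0; argv[i] != NULL; i++) free(argv[i]);
    free(argv);
}

/* Copy a NULL-terminated argument vector into newly allocated storage.
 * Returns NULL if any allocation fails. */
char **copy_args(char *const argv[]) {
    size_t n = 0;
    while (argv[n] != NULL) n++;
    char **copy = malloc((n + 1) * sizeof *copy);
    if (copy == NULL) return NULL;
    for (size_t i = 0; i <= n; i++) copy[i] = NULL;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(argv[i]) + 1;
        copy[i] = malloc(len);
        if (copy[i] == NULL) { free_args(copy); return NULL; }
        memcpy(copy[i], argv[i], len);
    }
    return copy;
}
```

After the copy, each copy[i] holds a different address from argv[i], yet strcmp( copy[i], argv[i] ) reports the strings as equal.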

3. Background: Consider this C code based on the posted solution to MP3 to implement the setenv command:
```
a  } else if (!strcmp( argv[0], "setenv" )) {
b          if (argc < 1) {
c                  printf( "setenv with missing argument\n" );
d          } else if (argc == 1) {
e                  int ret = setenv( argv[1], "", 1 );
f                  if (ret != 0) printf( "bad variable name\n" );
g          } else if (argc == 2) {
h                  int ret = setenv( argv[1], argv[2], 1 );
j                  if (ret != 0) printf( "bad variable name\n" );
k          } else {
m                  printf( "setenv with extra arguments\n" );
n          }
p  } else ...
```

The following is an un-edited transcript of the result of compiling and testing the above code:

```
 1  >echo $gribble
 2  no such variable: $gribble
 3  >setenv gribble
 4  >echo $gribble
 5
 6  >setenv $gribble toast
 7  bad variable name
 8  >setenv gribble limnoria
 9  >echo $gribble
10  limnoria
```

a) What was the value of argc as a result of line 3 of the above test transcript?

The value is 1.

4/5 got this right, 1/10 earned no credit. Partial credit was offered to those who said 2.

The correct answer could be inferred either from an understanding of how argc is counted, from the manual pages and the C language documentation, or from an examination of the C code given. In that code, only the argc values 1 and 2 lead to setenv() being called, so the value must have been 1 or 2. In the case of the value 2, the value set is argv[2], a real value, while in the case of the value 1, the value set is the empty string -- consistent with the blank output from the echo command on line 4.

b) What is the difference between the use of \$gribble on lines 1 and 4?

On line 1, \$gribble is undefined. On line 4, \$gribble has been defined with the empty string as its value.

4/7 got this right. 1/17 got no credit. Partial credit was given for giving the answer only for line 1 or only for line 4, leaving the other half unstated, or for saying that it is defined on line 4 without giving the value (or giving the incorrect value, for example, the null pointer -- a value quite different from an empty string).
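The difference between an undefined variable and one defined as the empty string is visible through getenv(). The classifier below is a hypothetical helper written for this commentary; it relies only on the documented behavior of getenv(), which returns NULL when the name is not in the environment.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() and unsetenv() in callers */
#endif
#include <stdlib.h>

/* Returns 0 if name is undefined, 1 if it is defined as the empty
 * string, and 2 if it is defined with a non-empty value. */
int var_state(const char *name) {
    const char *value = getenv(name);
    if (value == NULL) return 0;       /* like line 1 of the transcript */
    if (value[0] == '\0') return 1;    /* like line 4 of the transcript */
    return 2;                          /* like line 9 of the transcript */
}
```

A null pointer means there is no string at all, while the empty string is a real string whose first character is the terminating NUL.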

c) Which line of the test caused line e of the code to execute?

Line 3 of the test caused line e of the code to execute.

4/5 got this right, 1/7 earned no credit.

d) What was wrong with the variable name offered on line 6?

setenv() returned an error indication because the empty string "" is not a legal variable name.

1/5 got this right, 1/5 earned no credit. The majority said that \$gribble was not a legal variable name because it contained a dollar sign, ignoring the fact that the shell did dollar-sign substitution before it ever noticed that the command name was setenv. This may indeed be the programmer's error, but there are legitimate reasons for the programmer to type setenv \$var, for example, when the value of \$var is the actual name of the variable to be changed. In effect, \$var is being used, conceptually, as a pointer to the variable to be changed.
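The failure can be reproduced outside the shell. In this sketch, set_through() is a hypothetical helper that mimics what line 6 of the transcript asks for: it takes the current value of one variable and uses that value as the name of the variable to set. POSIX specifies that setenv() fails when the name is an empty string.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Use the value of the variable named by pointer as the name of the
 * variable to set. Returns what setenv() returned, or -1 if pointer
 * itself is undefined. */
int set_through(const char *pointer, const char *value) {
    const char *name = getenv(pointer);   /* dollar-sign substitution */
    if (name == NULL) return -1;
    int ret = setenv(name, value, 1);
    if (ret != 0) printf("bad variable name\n");
    return ret;
}
```

With gribble defined as the empty string, set_through( "gribble", "toast" ) hands setenv() an empty name and fails. With gribble defined as, say, "bread", the same call succeeds and sets bread to toast, which is the pointer-like use described above.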

4. Background: Consider the C and C++ stream model, used with variables of type FILE * (pointer to file). In homework 5, the type FILE was declared as follows:
```
typedef struct file {
        int fd;
        int count;
        int size;
        char * buf;
} FILE;
```

Here is incomplete code for fopen() using this declaration:

```
FILE * fopen( char * filename, ... ) {
        FILE * f = malloc( sizeof( FILE ) );
        f->fd = open( filename, ... );
        f->count = 0;
        f->size = 512;
        f->buf = malloc( 512 );
        return( f );
}
```

a) Suppose the user does ch = fgetc( s ); fputc( ch, s ); on a newly opened stream file s. The goal is that if the file used to read "ab" it should now read "aa". Something important is missing from the declaration of the FILE type that is needed to support the above behavior. What?

Several things are missing. First, there is no direction indicator to keep track of whether the stream was most recently used for input or for output. Second, if count is the count of characters in the buffer (as in Homework 5, problem 3, on which this problem was based), there is no file position indicator. (Or, if count is the file position, there is no indication of the position in the buffer.)

Only 1 got both issues, while 3/10 got no credit. 1/2 focused on the second issue above, earning half credit.

b) Suppose the user does fputc( ch, s ); c = fgetc( s ) on a newly opened stream file s. The goal is to read the character in the file right after the one just written. What disk I/O is required (calls to the lower level read and write routines), and when would these operations be done?

The turn-around from writing to reading must trigger a write and then a read. That is, both the write and read must happen as a result of the call to fgetc(). This is simplest if we rely on the blocking and deblocking mechanism of the underlying operating system; then we can write just the part of the buffer that changed and then read the remainder.

Only 2 got this right, while 1/2 earned no credit. 3/10 suggested (for partial credit) that the call to fputc() would do the write. This only makes sense if the buffer size is one byte, which eliminates the performance benefits of having a middleware layer such as the C stdio package. Those who suggested read and write operations without indicating what triggered them also earned partial credit.
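The two parts of this problem can be tied together in a small simulation. Everything below is a sketch under stated assumptions: the names STREAM, sgetc(), sputc(), sflush(), disk_read() and disk_write() are invented for the illustration, the "disk" is an in-memory array standing in for a file reached through read() and write(), and the stream structure adds a direction indicator and a file position to the fields of the homework's FILE type.

```c
#include <string.h>

#define DISK_SIZE 64
#define BUF_SIZE  8

/* Simulated file, plus counters for the low-level transfers. */
char disk[DISK_SIZE];
int  disk_len = 0;
int  n_reads = 0, n_writes = 0;

/* Stand-in for read(): copy up to n bytes starting at offset off. */
int disk_read(long off, char *dst, int n) {
    if (off >= disk_len) return 0;                  /* end of file */
    if (n > disk_len - off) n = (int)(disk_len - off);
    memcpy(dst, disk + off, (size_t)n);
    n_reads++;
    return n;
}

/* Stand-in for write(): copy n bytes to offset off (no bounds check). */
void disk_write(long off, const char *src, int n) {
    memcpy(disk + off, src, (size_t)n);
    if (off + n > disk_len) disk_len = (int)(off + n);
    n_writes++;
}

enum direction { IDLE, READING, WRITING };

typedef struct stream {
    long base;            /* file offset corresponding to buf[0] */
    int pos;              /* next index in buf to read or write */
    int count;            /* valid characters in buf when reading */
    enum direction dir;   /* most recent use of the stream */
    char buf[BUF_SIZE];
} STREAM;

/* Write any pending output, then step past what the buffer covered. */
void sflush(STREAM *s) {
    if (s->dir == WRITING && s->pos > 0) disk_write(s->base, s->buf, s->pos);
    s->base += s->pos;
    s->pos = 0;
    s->count = 0;
}

void sputc(int ch, STREAM *s) {
    if (s->dir != WRITING) {            /* turn-around: drop read-ahead */
        sflush(s);
        s->dir = WRITING;
    } else if (s->pos == BUF_SIZE) {    /* buffer full */
        sflush(s);
    }
    s->buf[s->pos++] = (char)ch;
}

int sgetc(STREAM *s) {
    if (s->dir != READING) {            /* turn-around: write, then read */
        sflush(s);
        s->dir = READING;
    }
    if (s->pos == s->count) {           /* buffer exhausted: refill */
        s->base += s->count;
        s->count = disk_read(s->base, s->buf, BUF_SIZE);
        s->pos = 0;
        if (s->count == 0) return -1;   /* end of file */
    }
    return (unsigned char)s->buf[s->pos++];
}
```

In this model, sputc() on a fresh stream does no disk I/O at all. The following sgetc() performs one write of just the changed byte and then one read of what follows it. Going the other way, sgetc() followed by sputc() and a flush turns "ab" into "aa", which only works because the stream remembers both its direction and its file position.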

5. Background: The user side routine for an input device driver might look like this when written in C:
```
 1  char stream_get( stream_device * s ){
 2          char ch;
 3          int r;
 4          while (char_queue_empty( s->queue )) /* do nothing */;
 5          ch = char_queue_dequeue( s->queue );
 6          r = inp( s->control_register );
 7          r = r | s->enable_bit;
 8          outp( s->control_register, r );
 9          return ch;
10  }
```
a) Why not use constants for the identity of the control register and the enable bit instead of using fields of the stream device data structure?

By taking the device register numbers from the device data structure, we allow one device driver to be used for multiple similar devices on the same machine.

1/9 got this correct, 4/5 earned no credit. Partial credit was offered to those who suggested that it improved portability (it does, but named symbolic constants allow equal portability). No credit was offered to those who thought that, somehow, the identity of the device registers for one device might change dynamically while a system is running, or that, somehow, making something a variable makes it easier to debug or more reliable than having that thing be constant (invariably, making things variables makes debugging harder and increases the likelihood of bugs).
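A sketch of the idea, using a simulated port space so that it can run anywhere. The array ports[], these versions of inp() and outp(), the cut-down stream_device structure and the routine enable_input() are all assumptions made for the illustration; the port numbers and bit masks are invented.

```c
#define NPORTS 16

/* Simulated I/O port space standing in for real device registers. */
unsigned char ports[NPORTS];

int  inp(int port)             { return ports[port]; }
void outp(int port, int value) { ports[port] = (unsigned char)value; }

/* Only the fields needed here; a real driver would also hold the queue. */
typedef struct stream_device {
    int control_register;   /* port number of this device's control register */
    int enable_bit;         /* mask for this device's interrupt enable bit */
} stream_device;

/* One copy of the driver code serves every device described this way. */
void enable_input(const stream_device *s) {
    int r = inp(s->control_register);
    outp(s->control_register, r | s->enable_bit);
}
```

Two serial ports with control registers at different port numbers, or even with the enable bit in a different position, need two stream_device records but only one copy of enable_input(). Had the register number been a constant, each device would have needed its own copy of the driver.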

b) Why does this code set the enable bit in the device control register?

If the input queue is full, the input interrupt service routine returns leaving the enable bit reset. Because stream_get() does a dequeue, it guarantees that there will be space in the queue, so it sets the enable bit just in case it had been reset for a full queue.

1/9 got this right, 4/5 earned no credit. Partial credit was offered to those who failed to mention that the interrupt might have been disabled or failed to mention that it was re-enabled to indicate that there was space in the queue. Many wrong answers got input and output turned around, or confused the stream-device support inside the device driver with the middleware layer discussed in the previous problem.

c) The code to set the enable bit first reads the control register and then writes it. Why not just write a constant to the control register to set the bit?

Writing a constant to the control register sets the values of all the bits in the register instead of just setting the enable bit. The other bits might contain something that matters and must not be changed.

1/2 got this right, 2/5 earned no credit. There were some answers that were vague but contained enough to offer partial credit.
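The difference is easy to see with a concrete register value. In this sketch the mask ENABLE_BIT and both function names are hypothetical, and the register is modeled as a plain byte rather than a real device.

```c
#include <stdint.h>

#define ENABLE_BIT 0x01u   /* assumed position of the enable bit */

/* Writing a constant: every other bit of the old value is lost. */
uint8_t set_enable_constant(uint8_t old) {
    (void)old;                      /* the old contents are ignored */
    return (uint8_t)ENABLE_BIT;
}

/* Read-modify-write: only the enable bit changes. */
uint8_t set_enable_rmw(uint8_t old) {
    return (uint8_t)(old | ENABLE_BIT);
}
```

Starting from 0xA0, the constant write leaves 0x01 in the register, wiping out the two high bits, while the read-modify-write leaves 0xA1.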

d) There is a critical section in this code where interrupts must be disabled. Identify it.

The critical section runs from line 4 to 8. In detail, we might protect it as follows, but nobody was expected to give this much detail:

```
 1  char stream_get( stream_device * s ){
 2          char ch;
 3          int r;
            disable_interrupts();
 4          while (char_queue_empty( s->queue )) {
                    enable_interrupts();
                    /* nothing */
                    disable_interrupts();
            }
 5          ch = char_queue_dequeue( s->queue );
 6          r = inp( s->control_register );
 7          r = r | s->enable_bit;
 8          outp( s->control_register, r );
            enable_interrupts();
 9          return ch;
10  }
```

Just 1 got this right, while 1/2 earned no credit. Good partial credit was given to the 1/10 who saw lines 5-8 as critical. Half credit was offered to the 1/5 who saw lines 6 to 8 as a critical section, or to the 1/17 who saw lines 4 and 5 as critical. Smaller partial credit was given to those who specified smaller parts of the critical sections.

There are actually 3 different issues here:

1. An interrupt between lines 4 and 5 might somehow let an item be removed from the queue, perhaps because someone else called stream_get() to swipe the character from the queue.
2. An interrupt between lines 5 and 6 might put data in the queue, filling it and therefore making it improper for the get routine to re-enable interrupts.
3. An interrupt between lines 6 and 8 might change some other bit in the control register that will be undone by line 8.

Of the above, the last was most obvious to those taking this exam.