Exam 2: Final

Solutions and Commentary

Part of the homework for 22C:112, Spring 2011
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Final Exam

Mean   = 11.91        X
Median = 11.1         X   X
                      X X X X                 X
 _________X___X_X_____X_X_X_X_____X___X_X_____X________
   0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24

Final plus Midterm

Mean   = 19.11
Median = 18.5         X       X   X     X X
 _____________________X_____X_X_X_X___X_X_X_X_________X_X_____X_X_____X____
   0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34

Top 10 of 11 Homeworks

Mean   = 24.07
Median = 25.4                                       X     X X
                                                X   X   X X X
 _________X_____________________________X___X_X_X___X_X_X_X_X_X____
   0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Machine Problems 1 to 5

Mean   = 14.26
Median = 14.5
                                          X     
                              X X X       X     
              X               X X X       X     
 _____________X_X_______X_____X_X_X_____X_X_X_____
   0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22

Overall (excluding those who did not take the final exam)

Mean   = 57.44
Median = 55.2
                              X   X   X         X     X X
 _______X_____________X_______X_X_X_X_X_X_______X_X___X_X_______X______
   20. 24. 28. 32. 36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80. 84
          - -   D   + + - -   C   + + - -   B   + + - -   A   + +

Note in the above that the statistics change significantly if only those who turned in 10 of the 11 homeworks and all 5 machine problems are counted. In that case, the low score was 50, the mean score was 64.74, and the median was 64.4.

Solutions

  1. Background: Consider this shell script:
    1    #!/bin/tcsh
    2    #  examscript -- a shell script for the exam
    3    set files = `ls`
    4    @ index = 1
    5    @ limit = $#files
    6    while   ( $index <= $limit )
    7            ls -dl $files[$index] | cat
    8            @ index = $index + 1
    9    end
    

    This script behaves almost the same as the ls -l command. Note that when the shell encounters a string quoted with back quotes (grave accents), such as `ls`, it treats the quoted text as a shell command, runs that command, and substitutes the output of that command for the string, with newlines replaced by blanks. Note also that blanks delimit array elements when interpreting shell variables as arrays, and that the $# construct gets the number of elements in the named array.

    In the following, assume that the current directory contains the files A, B C, and D. Note that the file name B C contains a blank.

    a) What is the value of $files set on line 3 and what is the value of $limit set on line 5?

    $files = "A B C D"
    $limit = 4

    10 of 19 got full credit, 3 earned no credit.

    b) Why does examscript sometimes behave differently from ls -l? The correct answer to part a) demonstrates this.

    File names containing blanks are split into two by the script. In this example context, the file B C will not be listed, and instead, the script will attempt to list two non-existent files, B and C.

    8 of 19 got full credit, 8 earned no credit.

    c) Concisely explain why the implementation of line 3 requires use of both the fork() and pipe() system calls.

    The script must use fork() in order to save its context and wait for the ls command to finish, and it must use pipe() to capture the output from ls and save it to a shell variable.

    9 of 19 got full credit, 6 earned no credit.

    d) What is the minimum number of fork() system calls needed to execute line 7? Concisely explain why each is needed.

    Two forks. One is required to save the shell's context while it launches the ls command, and a second is needed to save the context while it launches the (entirely unnecessary) cat command. A third fork is not required to make ls run in parallel with cat because an optimized implementation can get this parallelism by not waiting for the ls command to finish while it sets to work on cat.

    2 of 19 got full credit, 4 of 19 earned no credit. The most common reasons for granting partial credit were a focus on application launch without mention of parallelism, or a focus on parallelism without mention of application launch.

  2. Background: Programs running on Unix-like systems typically have three segments: a read-execute only code segment, a read-write static segment and a read-write stack segment.

    a) When execve() is used to run a program, which segment or segments can be loaded directly from the object file (presuming that the MMU eliminates the need for load-time relocation)?

    The code segment and the static segment can be loaded from the object file. The latter is less obvious, but loading it gets the initial values of static variables.

    6 of 19 got full credit, only one earned no credit. The most common error involved forgetting the static segment.

    b) When a process forks, which segments must be copied and which segments can be shared?

    The (read-write) stack and static segments must be copied, while the (read-execute) code segment can be shared.

    9 of 19 got full credit, everyone else had partial credit. Forgetting to copy the static segment was the most common error.

  3. Background: Consider the path taken by data input from a keyboard.
    1. A C program calls fgetc( stdin )
    2. which calls read( 0, &ch, 1 )
    3. which calls dequeue( keyboardqueue_q, ch )
    4. which might return after a call to in( ch, keyboard_data_reg )

    a) Which of the above are called inside the I/O driver and which of the above are called inside the user program?

    The driver includes 3 and 4, the calls to dequeue() and in().

    The user includes 1, the call to fgetc(). Middleware that the system considers to be user code includes 2, the call to read().

    10 of 19 got this right, everyone else got partial credit. The most common error involved item 3, the call to dequeue().

    b) Which of the above is likely to be followed immediately by a signal operation on a semaphore and which of the above is likely to involve a wait operation on the same semaphore?

    4 is followed by a signal to 3, which is to say, after getting a character from the hardware's input register by calling in(), the driver would enqueue it and signal that there is something in the queue. The dequeue() operation would either include a wait or would be called immediately after waiting for data in the queue.

    5 of 18 got this right, 10 earned no credit, one more left it blank.

  4. Background: A capability is defined as a combination of access rights with the handle on an object. Each entry in the open file table of a Unix process is a capability for an open file, and each entry in the page table used by the MMU is a capability for a particular page frame. Imagine a system where these ideas were unified, so that entries in the page table could refer either to pages, to open files, or to other objects. This would be a pure capability-based operating system, comparable in many ways to Demos and Amoeba, except that it would operate on a uniprocessor or shared-memory multiprocessor (multicore machine).

    In such a system, the MMU would probably only handle operations on page objects currently stored in page frames. All other operations on objects would cause page faults (as far as the MMU was concerned) so that operations on other types of objects would actually be implemented by the page-fault handler.

    a) What access rights would be set on the page-table entry for an open file object (the only access rights that matter for this purpose are those understood by the MMU)?

    In order to force a fault on any attempt to access a non-page object such as an open file, the access rights would have to be set to "no access."

    1 of 17 earned full credit, 12 earned no credit, 2 more left the problem blank.

    b) When the user calls some kind of open service to open a file, the user gets some kind of handle on the open file that can later be used to read that file. What data type could this handle have if the user were writing code in a language such as C?

    The type would be FILE * (pointer to FILE), where the details of the type FILE would depend on how the user accesses the methods of the abstraction.

    2 of 18 got full credit, 6 earned no credit, one more left it blank. The most popular answers earning partial credit involved odd pointers, for example, void *, int *, or some kind of pointer to a string.

    c) The operations on an open file object might include read, write and seek. Since this operating system is thoroughly object oriented, these will be performed by doing something with the object handle that will cause a trap that the system can interpret as an operation on the associated object. Suggest appropriate machine instructions the user could use to force such a trap, and identify the trap handler that would initiate (but not necessarily complete) the associated service.

    An obvious way to do this would be to force a trap by a memory reference to an address in the page of the address space occupied by the open file. For example, a call to word 0 of the page could be interpreted as a close, a call to word 1 of the page could be interpreted as a read, a call to word 2 of the page could be interpreted as a write, and a call to word 3 of the page could be interpreted as a seek.

    None of the 17 students who tried to answer this did perfectly, and 8 earned no credit. An additional 2 left it blank. 4 got over half credit by giving answers that might work but were not particularly in the spirit of an object-oriented interface.

    d) Most system services could sensibly be handled as suggested by the model given in parts b) and c), but some system services are not sensible to encode as methods applied to some object. Identify some system services that do not fit the object oriented model.

    Unix examples that are not object oriented include: exit() and fork(), as well as many others.

    3 of 16 got full credit, 10 earned no credit, and 3 more left it blank.

  5. Background: Consider a Unix process running on a machine with a paged MMU and demand-paged virtual memory. The naive way to implement the fork() system call is to duplicate the memory map of the forking process, so that all pages are shared between both address spaces, and then go through the address space and make duplicate copies of all the read-write pages. This can require copying each read-write page into RAM to make the copy. For a process with hundreds of pages of stack and static variables, this might make the fork() operation take hundreds of disk accesses. In general, this would not be an acceptable way to do it!

    The developers of Berkeley Unix found an alternative way to implement fork(). They called it lazy forking. It requires that the page table entries be able to hold two copies of the access rights field: one copy is understood by the MMU hardware, while the other is the "real" access rights. Normally, these are identical.

    A lazy fork is implemented by marking the pages in the stack and static segments as read-only in the hardware access rights, while they are marked as read-write in the real access rights. After the fork is done, the page tables of the original and the new process are identical. The actual copying of any pages that must be duplicated is initiated by page faults and done by the page-fault service routine.

    a) How does the page-fault service routine determine that it should respond to a page fault by making a copy of a page?

    If the page is marked read-only to the hardware, but the real rights are read-write, it should be copied.

    9 of 19 got full credit. 4 earned no credit. The most common answer earning half credit was read-only, with no distinction given between the real access rights of the page and the rights the hardware is told about.

    b) After making a copy, what access rights does the page-fault service routine set in the page-table of the process that caused the copy to be made?

    The page is marked read-write (so the real and hardware rights are now the same).

    12 of 18 got full credit, 4 earned no credit, one more left it blank.

    c) What pages actually get copied and what pages remain shared? Relate your answer to the behavior of the processes that are involved.

    The pages that get copied are the ones that were changed after the fork. All pages that were not actually changed remain shared.

    8 of 18 got full credit, 5 earned no credit, one more left it blank.

    d) Note that a large fraction of the forking done in Unix systems is done by shells, and note that the child process usually does an exec() shortly after the call to fork() while the parent usually does a wait() shortly after. Under these circumstances, how much copying would you expect to be required in a system using lazy forking?

    Very little copying would be needed because one of the forked processes immediately waits for the other to terminate, while the other branch immediately launches a new application, replacing the current data segments with new ones.

    11 of 17 got full credit, 2 earned no credit, and 2 more left it blank.

  6. Background: Flash memory devices are typically only warrantied for a few tens of millions of "flash-erase" cycles per sector. (First-generation flash memories in the 1990s frequently had warranties of around 100,000 flash-erase cycles.) Thus, there is a limit on the number of times that each hardware sector of the flash memory can be erased and rewritten. After that, the reliability of the sector begins to decline. The solution to this is to use wear-leveling algorithms when deciding what sector of the memory to use next. These algorithms attempt to level out the "wear" on the memory sectors.

    On a disk drive, disk scheduling algorithms attempt to minimize seek times, and the disk space allocation routines also play a role by trying to allocate space for each file and index sector in a way that minimizes seek times. With flash memory, there is nothing analogous to seek time, since accessing one sector of flash memory takes just as long as accessing any other sector. (Barring anything the firmware in the flash drive might do to delay things).

    a) Wear leveling on a flash memory could be done by the operating system. What part of the file system software would be able to do a large amount of wear leveling work?

    The routine that allocates space for files (and directories) can easily spread out the wear on the flash memory. On a magnetic disk, this routine typically tries to improve access times by minimizing head movement. In a flash memory, using the free space on a FIFO basis would level out the wear.

    3 of 18 did well here. 10 earned no credit, and one more left this blank. The most common answer earning partial credit was to say something like "the open file table", which indeed is related to the space allocation mechanism, but focuses on the data structure instead of the algorithm.

    b) One particular disk address used by a Unix like file system would pose special difficulties to the wear-leveling algorithm suggested by part a). What address?

    The root of the file system poses particular problems because it typically resides in a fixed address. In Unix, this is called the superblock.

    6 of 18 got full credit. 7 earned no credit and one more left it blank. The most common answer earning partial credit was something like "the first address." The problem with this is that it is ambiguous and could refer to many different things including the root.

    c) Most operating systems make no effort to help with wear leveling, so the problem is solved by the firmware in the flash drive itself. A typical solution is to store the logical sector number with the data for that sector and when the user tries to read a sector, do a search for that sector. Hashing can be used to make this search fast. What is the consequence of this algorithm with respect to performance?

    This scheme requires a search for each access to flash memory. As long as the memory has not been used much and is not very full, most sectors will be found very quickly. As the memory fills and ages, the searches take longer, so the memory will appear to slow down with use.

    3 of 18 earned full credit. One earned no credit, and one more left it blank. Partial credit was granted for a variety of answers that showed misunderstanding of the role of hashing or searching. Simply saying "it makes it slower" earned partial credit.

    d) What is the consequence of the approach taken in part c) for the problematic disk sector identified in part b)?

    The root of the file system will no longer be stored in a fixed location in the flash memory, so wear leveling will apply uniformly to the root and to all other sectors, but opening the file system on the drive might slow down because of the need to search for the root.

    5 of 17 got full credit, 10 earned no credit, and 2 more left this blank.

  7. Background: Garbage collection can slow down program execution by introducing annoying pauses while the garbage is collected. Suppose you were worried about this effect and decided to add a reference-counting scheme to a Java implementation so that most of the garbage would be automatically collected whenever the reference count (the count of pointers to an object) hit zero.

    To help think about the consequences of this, consider adding, to each object, a hidden field called refcount. This field is initially 1 when the object is created (because creating the object leaves a handle to that object in the hands of the routine that created it). The object will be freed (using a call to free()) as soon as the reference count is decremented to zero.

    Consider the assignment statement a = b in Java. Beware that the handles a and b could refer to the same object before the assignment. Also note that either of a and b could be null prior to the assignment.

    a) Write equivalent C (or C-ish) code that does the assignment and correctly manages the reference counts on a and b. Don't worry about edge cases such as null pointers or object deallocation yet.

            a->refcount--;   /* solution 1 */
            a = b;
            a->refcount++;
    
            a->refcount--;   /* solution 2 */
            b->refcount++;
            a = b;
    

    2 of 19 got full credit, 6 left it blank. There were a variety of small programming errors leading to partial credit. Forgetting to decrement a reference count, forgetting to increment a count, or forgetting to actually assign the pointers were equally common. Many students made this very complicated, with the decision to increment or decrement depending on the values of the pointers.

    b) Rewrite the code from part a) to account for the edge cases.

            /* based on solution 2 above */
            if (b != NULL) {
                    b->refcount++;
            }
            if (a != NULL) {
                    a->refcount--;
                    if (a->refcount <= 0) free( a );
            }
            a = b;
    

    2 of 19 got full credit, 7 earned no credit. Failing to check for null pointers, confusing the variables a and b, forgetting to free when a count goes to zero, and a variety of bizarre extra code were all common.

    c) What is the likely impact of this on the speed of your program? Take into account the frequency of garbage collection, the length of the delay when garbage collection is required, and the total load your program places on the CPU.

    Pauses for garbage collection will be less frequent (it could still be needed to collect circular lists or similar data structures that become garbage), but the program will run significantly slower because of the extra overhead of every pointer assignment.

    Nobody got full credit, 4 earned no credit, and 2 left the problem blank. By far the most common error was the assertion that reference counting completely eliminates the need for garbage collection. The problem statement clearly said that reference counting was added so that "most of the garbage would be automatically collected," not all of the garbage! A common wrong answer was that reference counting makes the program run faster. Nothing given here suggests this possibility.

  8. Background: Consider adding a third semaphore operation, so the total set of operations is:
    -- wait( semaphore s ) -- defined as before
    -- signal( semaphore s ) -- defined as before
    -- multiwait( list of semaphores s ) returns integer -- waits for a signal to any of the semaphores in the list s, and returns the index of the semaphore that was signalled. If you think of semaphores as simple non-negative integers, it waits until the sum of the listed semaphores is nonzero and then returns the identity of the semaphore it actually decremented.

    Note that the wait operation on a conventional semaphore implementation first checks to see if the count is nonzero. In that case, it decrements the count and returns. If the count is zero, the caller is enqueued on that semaphore's queue, some other process is scheduled, and the caller will only become ready when the semaphore is signalled.

    a) Explain the relationship between this service and the Unix select() system call.

    Both the proposed multiwait() and the Unix select() service block a process until one of a set of independent conditions occurs.

    6 of 16 got full credit, 6 earned no credit, and three more left this blank. Partial credit was given to those who gave a correct definition of select() without relating it to multiwait(), as well as to those who gave garbled but partially correct discussions of the relationship.

    b) Given a conventional implementation of semaphores, there are two cases the multiwait() primitive must address: The case where the count on one of the semaphores is nonzero, and the case where all of the counts are zero. Which of these cases is particularly difficult to deal with, and which could you solve in the time available for this exam -- do not solve it, just identify which is the hard case and explain what makes it hard.

    The difficult case is the one where the waiting process must block. On what queue or queues does it enqueue itself? When a signal on one of the associated semaphores occurs, how is the multiwaiting process awakened, and how are entries in other queues deleted? In comparison, it is quite easy to enter a critical section to check all the semaphores involved to see if any have nonzero counts.

    5 of 19 got full credit, and 7 earned no credit. Partial credit was offered if the answer correctly identified the hard case and then gave an incorrect reason, either a false reason or one that did not justify the conclusion.