Exam 2: Final

Solutions and Commentary

Part of the homework for 22C:112, Fall 2012
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Exam Scores

Final Exam

Mean   = 8.35                        X   X
Median = 8.7                   X     X X X
                               X     X X X X
       X           X         X X X   X X X X X         X
_______X___X___X_X_X___X_X_X_X_X_X_X_X_X_X_X_X_X___X___X_X_X_______X_____
  0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15. 16. 17

Total Midterm plus Final

                                   X
                                   X
Mean   = 15.74                     X
Median = 16.3                      X
                               X   X
                 X             X X X X X X X
               X X   X     X X X X X X X X X   X
_______________X_X___X_X_X_X_X_X_X_X_X_X_X_X_X_X_X_________X______
  0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Machine Problems 1 to 6

                                                           X
                                                           X
Mean   = 22.62                                             X
Median = 24.5                                  X   X   X X X
                                               X   X   X X X
                                 X             X   X   X X X
                                 X   X X   X X X   X   X X X
___________X_______X___________X_X_X_X_X___X_X_X_X_X_X_X_X_X______
  0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Homework (top 10 of 13 scores)

Mean   = 21.41                                         X
Median = 22.7                            X             X
                                     X   X       X   X X X   X   X
           X                         X   X     X X   X X X X X X X
_______X___X_X_____________X_X_____X_X_X_X___X_X_X_X_X_X_X_X_X_X_X_X_X___
  10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27

Total Scores

Mean   = 59.77                         X X
Median = 64.6                          X X
                                       X X X
               X                       X X X
               X   X X X   X     X X X E E E E   E
_________X_X___X_X_X_E_X_X_E_X___E_E_X_E_E_E_E_E_E_X___E__________
  28. 32. 36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80. 84. 88
 -   D D D   +|-   C C C   +|-   B B B   +|-   A A A   +  

E indicates engineering students.

Exam Solutions

Background: Consider a memory management unit for a 32-bit computer with byte-addressable main memory. The virtual address has the following structure:
```
 _______________________________________ _______________________
|_______________________________________|_______________________|
|                 page                  |     byte in page      |
               (20 bits)                        (12 bits)
```
Each entry in the translation lookaside buffer in the MMU has the following format:
```
 _______________________________________ _______________________
|_______________________________________|_____________|m|v|r|w|x|
|                 page                  |    unused   |  rights |
               (20 bits)                                 
 _______________________________________________________________
|___________________________________________________|___________|
|                        frame                      |   unused  |
                       (26 bits)
```
The TLB is accessed through two 32-bit device registers, the page/rights register and the frame register. When there is a TLB fault, the page field of the virtual address that caused the fault is loaded into the page field of the page/rights register, and the access rights field is set to indicate the right that was violated:
- m - cache miss, no TLB entry matched the page field.
- v - cache hit, but TLB entry marked as invalid.
- r - cache hit, an attempt to read from a non-readable page.
- w - cache hit, an attempt to write to a non-writable page.
- x - cache hit, an attempt to execute from a non-executable page.
To load a TLB entry that maps page p with rights r to frame f, load p and r into the page/rights register and then load f into the frame register. Since the TLB is finite, some previous TLB entry will be forgotten when you do this. You are guaranteed that the TLB replacement policy is no worse than random. When stored in the TLB, the m bit is ignored, while the v, r, w and x bits must be set to permit the corresponding operations.
a) How many bytes are in a page on this machine?

2¹² = 4096
4/5 got this right, 4/5 earned no credit. The remainder made clerical errors or gave gross approximations for partial credit.
This problem ought to have been trivial. There is no excuse for a student who has completed an elementary computer architecture course to do anything short of perfect work on this problem.

b) How many bits are in the physical address?

frame field + byte in page field = 26 + 12 = 38
4/9 got this right, 3/9 earned no credit. 1/9 earned partial credit giving just 26 (forgetting that byte in page is part of the address). A comparable number exponentiated, giving the size of the address space instead of the number of bits.
While this problem was harder than part a), it would also have been an entirely fair question in a final exam for an elementary computer architecture course.

c) If a flat page table was used, with no segmenting, how many entries would there be in the page table? (no calculator required. Estimation and simple algebraic expressions are preferred here).

2²⁰ = approximately 1 million
4/9 got this right, 3/9 earned no credit. 1/9 gave 2²⁶, perhaps confusing the page table with the largest possible frame table.

d) The organization of the page table is entirely up to the software, with the TLB fault service routine responsible for building TLB entries from the page table. Which fields in the TLB entry would the TLB fault service routine extract from the page table entry?

The frame number and access rights (vrwx).
None got this right, 4/9 earned no credit. 3/9 said just "access rights", while 1.9 just said "frame number".
The details of the fault software were irrelevant. What matters is that, in a conventional paged virtual memory system, the page table entries contain just two fields, page number and access rights. None of the extra detail given here changes that.

e) What TLB replacement policy would lead to the best possible performance. This is a hardware question, but the answer is the same for page replacement policies.

LRU - Least Recently Used
2/9 got full credit, 3/9 earned no credit. 2/9 earned most of the credit for suggesting clock replacement. In fact, the clock algorithm is a bad choice for cache, but more important, the question asked for the best possible performance, not merely very good performance. 1/9 suggested random replacement, which is the worst acceptable policy (you have to work to do worse than random, and many better policies are possible). A few suggestede Belady's optimal policy, a policy that is not possible in practice.
Background: Consider writing an I/O driver for a machine with the MMU described in the previous question. The MMU fault handler provides the rest of the system with a paged-segmented virtual addressing model where each page table contains 1024 entries describing page frames, and each segment table contains 1024 entries describing the page tables for each segment. User programs run entirely in virtual memory, while the kernel runs with the MMU turned off. All DMA input-output uses real addresses, not virtual addresses.
A user calls read( fd, buf, len ) on this system.
a) Suppose len just happens to be the same as the page size. How many page-faults might need to be handled before the kernel can fill the user's buffer.

2 page faults
2/9 earned full credit, 4/9 earned no credit. 3/9 earned partial credit for saying just 1 fault (a very rare occurance requiring that the buffer be precisely aligned on a page boundary).

b) Is the MMU involved in any way with these page faults? Explain.

No. The MMU is turned off within the code of kernel services such as read(). Therefore, detection of these page faults must be done by kernel software and not by the MMU.
2/9 got this right. 6/9 earned no credit (many giving elaborate and entirely wrong or irrelevant explanations of how the MMU does this). 1/9 earned partial credit giving garbled or wrong explanations for the right conclusion.
Background: Consider the problems faced by the operating system on the machine described in the previous questions. To perform I/O to a user's buffer, it must be able to address that buffer, but the CPU is evidently only able to use 32-bit addresses, while the physical address size (problem 1b) is different. If the MMU is simply turned off when the operating system is running, this makes it difficult for the system to access pages in the user's address space, for example, pages of the user's I/O buffer.
an old solution to this problem is to treat the most significant bit of the 32-bit virtual address as the MMU enable bit. When this bit is zero, the MMU is off. In addition, the most significant bit of the program counter is the privilege bit. When this bit is zero, the system is privileged. A call or jump that turns on this bit (if it was off) causes a trap.
As above, assume a user calls read( fd, buf, len ) on this system.
a) If the system wants to access the user's I/O buffer at virtual address buf, what is the minimum work it needs to do to use this address?

Nothing, if the current page table is that of the caller. (Otherwise, load that page table.)
1/9 got this, 7/9 earned no credit, many giving elaborate descriptions of complex processes.
What many apparently did not understand was that this scheme, using the high bit of each virtual address as the MMU-enable bit means that the MMU turns on or off each time an address is translated, with no need for the software to take any action to independently control the MMU.

b) Suppose the user passes the address 0x00001000 as buf; by coincidence, this is the address of the page fault handler. What check should the system apply to buf in order to protect itself from damage by careless or malicious users.

Check that the high bit buf is 1.
1/9 got this right, 3/9 earned no credit. 2/9 earned partial credit for the inadequate check that buf was not equal to 0x00001000. 3/9 earned a little credit for mention of the word privilege.
Background: Consider a system where each of several user process has its own page table, where some pages are potentially shared by multiple processes. Page table entries give the frame number for pages in RAM, along with the access rights for those pages.
a) What information must be in the frame-table entry for a page, assuming that page is not shared.

the page-table address (or process id) and page number in that frame.
Only 1 student got this right. 5/9 earned no credit. 3/9 earned partial credit for just saying "page-table address", without any mention of the page number in that table.
The important thing here is, the frame table is an entirely classical frame table if there is just one process and no shared pages. Adding multiple processes without sharing requires adding the page-table ID or process ID.

b) What information must be in (or accessible from) the frame-table entry for a shared page?

For each page table that shares the page, the above must be duplicated.
2/9 got this right, 6/9 earned no credit. For some reason, many wrong answers discussed such things as page ownership, an issue that was never mentioned in the problem statement and only adds complexity to any possible solution.

c) If the clock page replacement algorithm is used, the clock hand is an index into one of the tables discussed here. Which table?

The frame table.
4/9 got this, 5/9 earned no credit.
The answer to this question does not depend on any of the complexity of this problem, but applies to all implementations of the clock replacement algorithm. The algorithm always and only works on the frame table. Failure to grasp this indicates failure to understand clock replacement.
Background: Consider the problem of creating a shared heap. Each user of the shared heap calls the mmap() kernel call to map that heap into their virtual address space. Once the heap is mapped, each process may independently examine the heap data structures to allocate or deallocate items -- with appropriate care to use mutual exclusion mechanisms to protect any critical sections in the code.
One of the parameters to mmap() is addr. Here is what the man page says about it: "The starting address for the new mapping is specified in addr. ... If addr is NULL, then the kernel chooses the address at which to create the mapping ... If addr is not NULL, then the kernel takes it as a hint about where to place the mapping; on Linux, the mapping will be created at a nearby page boundary." In either case, mmap() returns the virtual address of newly mapped shared segment.
a) Suppose your process tries to map the shared heap into your address space, and it maps it into a different virtual address than where it mapped the heap for some other process. What problem does this pose for shared data structures in the heap?

Pointers in that data structure to other locations in that data structure will point to different places depending on which process is looking.
2/9 got this, 5/9 earned no credit. 1/9 earned partial credit for vague statements about data corruption without explaining the problem.

b) Pages in a mapped memory region may map to a shared file, or they may map to an anonymous block of memory. Which type of mapping would you use to access the shared universe of a multi-player game where users independently launch the game after signing in to the host computer?

Use a shared file, because players must be able to share the data when they independently enter the game by launching the game application.
1/2 got this, 1/2 earned no credit. A remarkable number of the latter suggested using anonymous shared regions, which are impossible to share by processes that are launched independently.

c) Suppose you use an anonymous shared region in an applicaiton. How can two different processes possibly share access to this region?

Create the anonymous shared region and then fork; all processes descended from the creator will share that region.
1/9 got this, 7/9 earned no credit.
Background: Consider a boundary tag heap manager, where all blocks are aligned on a word boundary (assume 32-bit words and 32-bit pointers here). The boundary tag structure is declared as follows in C:
```
struct tag {
        struct tag * back; /* pointer to previous tag */
        struct tag * next; /* pointer to next tag */
};
```
The status of the block is stored in the least significant bit of the back field. Zero implies a free block, one implies a used block.
a) Given b, a pointer to a block, write a boolean expression that is true if that block is in use.

((intptr_t)(b->back)) & 1
Just 1 got this right, 4/9 earned no credit. 4/9 forgot to cast the pointer to an integer type.

b) Given b, a pointer to a block, give code to compute p, the previous block before b. The assignment b=b->back; fails because of the status bit. Write the correct code.

b = (struct tag *)(((intptr_t)(b->back)) & ~1);
None earned full credit, 7/9 earned no credit, a shocking number of whom did things like shifting the pointer. Most of the errors among those earning partial credit involved casting.

c) Given b, a pointer to a block, give code to compute the size of that block.

size = ((intptr_t)b->next - (intptr_t)b) - sizeof(struct tag);
None earned full credit, 6/9 earned no credit. Casting errors were very common.
Background: Assume that your machine allows some kind of spin-lock (for example, using a test-and-set instruciton, or exchange memory and register, or load-locked and store-conditional instructions). The operating system also supports kernel semaphores. Also assume that you have a multicore or multiprocessor machine, so that independent processes have a significant probability of running in parallel.
a) If you are writing a heap manager, would you be more likely to use spin locks or kernel semaphores for the mutual exclusion required by your code. Very briefly, Why?

Spin locks, because the critical sections in the heap manager would be extremely short.
1/9 got this right, 4/9 earned no credit. Those earning partial credit generally selected the right type of lock and then gave extraordinarily poor or outright wrong reasons for the selection.

b) If you are writing a multi-player dungeon game, where users compete to pass through doors in a maze, what kind of mutual exclusion locks would you use to prevent two users from simultaneously passing through the same door.

Use kernel semaphores.
5/9 gave good answers, 4/9 earned no credit.
No reason was requested here, and the reasons given were ignored (many were wrong). The key is, game users contending for game actions will contend at human speed, taking actions slowly, so the high overhead of kernel semaphores is acceptable. The number of CPUs is not relevant, nor is the particular spin-locking protocol used (test-and-set versus any of several others).
Background: Suppose you wanted to build a Demos or Amoeba-like operating system where An open file is a capability or link to a server that accepts read, write, seek or close operations.
a) If you create a process, the process manager returns a capability or link that allows you to manipulate the process. Initially, the process is stopped with an uninitialized data segment, but the manager for the process allows you to start and stop the process. Should processes be viewed as a subclass (in the object-oriented sense) of files? Why?

Yes, the process's memory image is file like, so read, write, and seek make perfect sense for manipulating it. closeing a process is a way to tell the process manager that you are giving up your management rights to it.
2/9 got this, 4/9 earned no credit. Partial credit generally involved poor or garbled reasons. In a few cases, credit was docked for reasons that were simply illegible.

b) An open directory is a link or capability that accepts deposit, open, delete and close operations. The deposit operation associates a link or capability with a name in that directory, the delete operation deletes the association with a given name. What does the open operation do?

Open, applied to a directory, returns the link associated with a particular name in that directory.
2/9 got this, 6/9 earned no credit.

c) What kinds of directory structures does this permit that are forbidden under Unix? This is not a question about directory contents, but structures.

This permits arbitrary graph structures.
1/9 earned full credit, 6/9 earned no credit. Many earned partial credit for mentioning some specific example, multiple root directories, for example.

d) If you create your own server process, how can you make that server avialable to other processes on the system?

You need to advertise it by putting a capability or link to your server in a public place, for example, entering it in a directory known to your potential client processes.
2 did well, 8/9 earned no credit. Only a few vague answers earned partial credit. Many suggested methods of advertising that were not part of the system described in the problem statement or preceeding parts. Others basically suggested using magic to assure that all clients had the links they needed.

e) You want to create a multplayer role-playing game, but this type of system does not support a shared memory model for the universe your players inhabit. How would your players interact under the model of such an operating system?

Use a client-server architecture where the game server maintains the state of the shared universe for the clients.
3/9 got this, 4/9 earned no credit. Partial credit was given for a variety of vague answers and for a few alternative architectures, mostly ones that offer very poor scaling as the number of players grows (worse even than a centralized game server).