Final Solutions

22C:116, Spring 1997

Douglas W. Jones

Grade Distribution:

Mean = Median = 20.6
                         X X     X X
 ______X___________X_____X_X___X_X_X_X_X_______________X___X_____
  13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28

Solutions:

  1. Part A: (2.7 points) Outline the response of the file server to a touch operation directed to one of it's objects.
    When the server receives a touch for a directory it maintains, the server must touch, in turn, every item under that directory. Since this is a distributed system, we will use message passing to do this:
            touch(o)
               -- local object o is to be touched
               if o already marked,
                  exit
               else -- o not yet marked
                  mark o
                  if o is a file,
                     exit
                  else -- o is a directory
                     for each link x in o
                        send a touch message to x
                     exit
                  endif
               endif
            end touch
    	
    Most students had good solutions to this part. Many presented an RPC based solution that would be expected to be slower than the above.

    Part B: (2.7 points) Present a convincing (short) argument that the marking phase of our distributed garbage collector terminates after a finite time.

    There are a finite number of directories, each with a finite number of links. Because the above code marks each directory prior to sending touch messages to each link in that directory, and because touch messages to already marked items are ignored, the code only sends a touch message once per link. Therefore, the finite number of links implies a finite number of touch messages, and that implies termination.

    Most students understood the termination of their algorithm. The common use of RPC based solutions made the termination arguments a bit more intuitive than the solution given above.

    Part C: (2.7 points) Discuss (briefly) the problem of detecting termination of the marking phase.

    The solution above does not detect termination. To detect termination, either a reply must be sent to each touch message, or a completely independent termination detection computation must be run. The former is easier to describe:
            touch(o,r)
               -- local object o is to be touched
               -- r is the sender's reply address
               if o already marked,
                  send reply to r
                  exit
               else -- o not yet marked
                  mark o
                  if o is a file,
                     exit
                     send reply to r
                  else -- o is a directory
                     for each link x in o
                        send a touch message to x
                     for each link in o
                        await a reply message
                     send reply to r
                     exit
                  endif
               endif
            end touch
            
    Note above that the only thing this code checks is that the number of replies received matches the number touch messages sent. The identity of the reply's sender is not checked. This code allows all touches made from one directory to be processed in parallel, and relies on the system to queue reply messages. A slower serialized version may be written using RPC calls.

    Better performance will typically result if each system notes which directory links are to remote objects and which to local objects. Marking of local objects referenced from a directory can be done recursively, with the only message traffic being for marking remote objects.

    This was a bit harder than the first two parts. Some students didn't seem to understand the distribution of their algorithms presented in part A or B, and talked of recursion and procedure return as if the network wasn't present.

  2. Part A: (2.8 points) What sequence of messages would the fault handler typically exchange with the other machine in the network in response to a page fault.
    1. If there is no free frame locally, export the contents of one page to the remote machine.

    2. Request the needed page from the remote machine. A protocol could be designed (for the 2 machine case) where the request is contained in the same message holding the page to export.

    3. Import the required page.

    A surprising number of students discussed disks in this problem, even though the problem statement specifically indicated that there was no secondary storage.

    Part B: (2.8 points) Given multiple machines, propose a decentralized protocol for finding the machine that has a particular page and obtaining a copy of that page.

    This is essentially a distributed broadcast problem on a point-to-point network, so our solution is a variant on the distributed broadcast algorithm, which is itself a spanning tree algorithm:
            find page(p,r,i)
               -- a request for page p has been received
               -- if p is found, send it to r 
               -- this message has unique serial number i
               if i is in find-page cache
                  -- a loop has been found in the network
                  exit
               else -- this system has not yet been checked
                  add i to local find-page cache
                  if page p is availabl locally,
                     invalidate p in local virtual memory maps
                     send page p to r
                  endif
                  exit
               endif
            end find page
    	
    (Note for the curious: The same scheme can be used to find all free pages on the system and report them to the machine that had the fault. That's likely to cause too many reply messages, so we need to avoid doing this. For that matter, the broadcast initiated by find-page is probably also not something we want to do with every page fault, so it would be a good idea to keep a local cache of the last known locations of pages and free frames, and only initiate broadcasts if this cache does not yield useful results. Alternately, and somewhat equivalently, a name-server system, possibly hierarchic, could be introduced to track locations of pages.)

    Some students presented really bad algorithms were here, taking no apparent advantage of what they should have learned about broadcast algorithms. Others simply said "do a broadcast" with no reference to how this should be done in the network context given.

    Part C: (2.8 points) Assuming that the distributed virtual memory software maintains some bits of extra state information per page frame, what is the most effective page replacement policy that can be implemented on each machine.

    This system does not allow clean/dirty detection because there is no support for read-only pages! With only two states, read-write and unavailable, the only thing we can do with soft page faults is detect that a page is being used. This allows the clock algorithm to be used, with soft page faults used to mark pages. The states of a page are thus:
    • Marked. This becomes Unmarked when the clock hand sweeps by this page's frame.
    • Unmarked. This becomes Marked on soft page fault. This becomes Invalid when the clock hand sweeps by; signalling that this page has been replaced, freeing its frame.
    • Invalid. This becomes Marked on hard page fault. No frame is a associated with this page
    A two bit code can distinguish these 3 states. Marked pages are the only ones marked as read-write by hardware. A marked bit could also be maintained by software. Unmarked and invalid pages would both be marked as inaccessible by hardware; a software bit is needed to distinguish between these.

    Too many students talked about clean-dirty policies, even though the given hardware clearly does not allow for support of such a policy.

  3. Part A: (2.7 points) Suppose you have a file object f with the method read, where the message to f.read contains the sector number of the file you wish to read. What information goes in this message? How would you structure a program to read and process all sectors of a file?
    Presumably, sending a message to f.read causes the file object to send the contents of one sector of f and a return code in a message to somewhere. Therefore, the parameters to f.read are: 1) what sector to read, and 2) where to send that sector.

    Our system has no real concept of processes, only nonblockable threads, each of which runs to completion before another thread runs on that cpu; therefore, a "process" must be viewed as a string of method invocations run in sequence, where control flow from method to method is accomplished by messages.

    This leads to the following solution:

            object fp -- the file processor
               local file f -- the file to process
               local int i -- the sector number
    
               method start(f')
                  i = 0 -- assume sectors are sequential
                  f = f' -- assume sectors are sequential
                  send( , f.read )
                  exit()
               end method start
    
               method process(s,r)
                  -- s is a sector of data, unless r=EOF
                  if r /= EOF
                     process s as needed
                     i = i + 1
                     send( , f.read )
                  endif
                  exit()
               end method process
             
            end object fp
    	
    The most common problem here involved a failure to notice that conventional notions of process cannot be supported by this class of programming environments. Typical answers involving this class of oversight involved busy waiting and flag setting to compensate for the lack of blocking primitives, even though the problem statement said that there is no context switching, which implies that each thread runs to completion before the next starts.

    Part B: (2.7 points) Discuss how you would structure a semaphore object under Charm or Amulet.

    I'll repeat the observation that our system has no real concept of processes, only nonblockable threads, each of which runs to completion before another thread runs on that cpu; therefore, a "process" must be viewed as a string of method invocations run in sequence, where control flow from method to method is accomplished by messages.

    The semaphore would have an integer count and a queue. The parameter to P would be the destination to which a message should be sent when the semaphore permits the "calling process" to continue. If the count is positive and nonzero, P decrements the count and sends this reply immediately. If the count is zero, P enqueues the reply destination.

    The V operation, with no parameters, either removes an item from the queue and sends the deferred reply to it, or, in the case the queue is empty, increments the count. As described above, reply messages contain no data; they merely start a parameterless method.

    It is possible to add an extra parameter to P, the message to be sent with the reply message, but so long as the caller of P and the method to which the reply is sent are methods of the same object, this extra parameter is not needed (local data of the object can be used for this). Furthermore, any program can be rewritten to meet this constraint and thus avoid the need for this extra parameter and thus, the need for the queue in the semaphore to hold anything other than the destinations of deferred reply messages.

    The same problem discussed above persisted here, with students suggesting that the semaphore queue should contain process ID's, so that their answer ended up being almost exactly the textbook implementation of semaphores instead of the somewhat quirky implementation required by this "process free" version of the object oriented worldview.

  4. Part A: (2.7 points) How would demand driven virtual memory on the 80286 differ from traditional demand paged virtual memory?
    The key thing to note is that the 80286 deals with variable sized segments (up to 64K bytes), so there is no such thing as a fixed size page frame, making page placement and replacement more complex. On fault, the size of the referenced segment must be found, then a free block that size must be found. If there is no free block that size, a block must be made by evicting one or more segments.

    If faults are detected when a segment description register is loaded, the instruction that caused the fault must have referenced some particular slot in the process base to load the register. The process base corresponds to the page-table in a paged system, so this makes the analog of page-table updates required inside the fault handler fairly easy.

    On the other hand, if faults are only detected when a segment description register is actually used in memory addressing, it may be difficult to determine, in the fault handler, what entry in the process base must be updated.

    Some students seem to have taken too much advantage of their knowledge of later members of the Intel 80x86 family and made assumptions that the immensely more complex memory models of those machines applied here.

    Part B: (2.7 points) Consider two kernel services, insert(cap,p), which puts cap, a capability, into slot p of the process base segment, and extract(p), which returns the capability found in slot p of the process base segment. The capability representation used by these is a bit pattern; how can this be protected?

    The Amoeba server-side authentication trick can be used! Here, we view the kernel as the server and the caller of insert and extract as the client. If the kernel maintains, for each segment, a "magic number" and grinds this together with the rights for the segment through a one-way or trap-door function to create a check code added to the capability, then cracking the system can be made arbitrarily difficult.

    Most of the class did fairly well here!

  5. Explain how the use of a cache or caches in virtual memory address translation, network name service and directory servers can speed name to address translation, and explain what operations are required, in each of these contexts, to account for a cache miss.
    All of these contexts involve name-to-address translation: Virtual address to physical address, file name to file number, and network resource name to network address are all examples of such translations. In each case, a cache allows this translation to be faster for commonly used names.

    With virtual memory, the cache allows us to avoid the memory cycles needed to access a large page table -- in the case of paged segmented memories, this can involve multiple memory cycles per cache miss.

    With file servers, a cache can allow us to avoid reading a directory from disk and then searching that directory; in the case of hierarchical file names, this can involve multiple disk acceses.

    With network name service, a cache can allow us to avoid messages to a central name server or broadcast messages to peer systems in the net in order to locate the network address to which a message should be sent. In large networks, the cost of a cache miss can be large because of the expense of broadcast and the delays involved in a sequence of queries passed up the name-server hierarchy.

    Perhaps half did well here; common problems included too much focus on one of the three subjects given or excessive abstraction -- speaking about the virtues of caches without connecting the discussion to the examples given in the problem statement.