Assignment 9, due Nov. 1

Part of the homework for 22C:112, Fall 2013
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

On all assignments, your name must be legible as it appears on your University ID card! Assignments are due at the start of class on the day indicated (usually Friday). Exceptions will be by advance arrangement unless there is what lawyers call "an act of God" (something outside your control). Homework must be turned in on paper, either in class or in the teaching assistant's mailbox. Never push late work under someone's door!

  1. Background: Consider an MMU for a computer with a 32-bit word and a 32-bit virtual address, with the following format:
     |_ _ _ _._ _ _ _|_ _ _ _._ _ _ _|_ _ _ _._ _ _ _|_ _ _ _._ _ _ _|
     |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
     |      segment      |  page in segment  |     byte in page      |
    

    In this case, the MMU has no support for a mark bit or a dirty bit, and in fact, not even a valid bit. Page-table entries have the following format (in binary):

     |_ _ _ _._ _ _ _|_ _ _ _._ _ _ _|_ _ _ _._ _ _ _|_ _ _ _._ _ _ _|
     |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
     |                 frame  number                 |  unused |R W X|
    
       R - read permission bit
       W - write permission bit
       X - execute permission bit
    

    The unused bits in each page-table entry are ignored by the MMU but may be used by software for any purpose.

    a) Give the physical address format. (0.4 points)

    b) How many gigabytes of RAM can be attached to this computer? (0.4 points)

  2. Background: Consider the following outline of the MMU-trap handler for the MMU described in the above sectionn:
    void MMUtraphandler() {
            /* assume entire CPU state has been saved */
    
            intptr_t va = get_virtual_address_from_mmu();
            intptr_t op = get_memory_operation_from_mmu();
            /* va is the address that caused the trap */
            /* op is the memory operation (rwx) that caused the trap */
    
            uint32_t page = va >> 12;
            uint32_t frame = pagetable[ page ] >> 8;
            uint32_t rights = pagetable[ page ] & 0xFF;
    
            uint32_t hardr = rights & 7;
            uint32_t realr = (rights >> 3) & 7;
            uint32_t ondisk = rights >> 6;
    
            if ((op & realr) == 0) {
                    throw_user_exception_in_saved_state( op );
            } else if (ondisk) {
                    frame = handle_page_fault( page );
                    ondisk = 0;
                    hardr = realr & op;
            } else {
                    hardr = hardr | (realr & op);
            }
            pagetable[ page ] == (frame  << 8)
                               | (ondisk << 6)
                               | (realr  << 3)
                               | (hardr            );
    
            /* assume entire CPU state will be restored */
    }
    

    a) Give the layout of the rights field of a page table entry, showing the interpretation, if any, of each of the 8 bits. Keep it brief -- for example, use the access rights r, w and x, with the qualifiers real and hardware, and do not use English sentences. (0.4 points)

    b) Which combinations of access rights correspond to unmarked pages, that is which combinations indicate that the page is in RAM but has not been recently used. (0.4 points)

    c) Give the appropriate code to apply to the rights field of a page table entry to clear the marked-state of that page when the clock hand sweeps by the frame holding that page. (Don't bother packing or unpacking the fields of the page-table entry, since the code for that can be swiped from the code given above and is therefore trivial. The correct answer is just one line of code!) (0.4 points)

    d) Which combinations of access rights correspond to clean pages, that is, pages that, if replaced, need not be copied back to disk. (0.4 points)

  3. Background: If a computer system's virtual memory mechanism has no connection to the file system, then the sector number of a page within the backing storage (probably a disk partition) can be a simple function of the page number.

    Look up the mmap() system call on Linux.

    A problem: The easiest way to support MMAP would be to enlarge each page table entry to contain more data. Following this lead, what are the obvious fields to add to each page table entry to make MMAP work. (0.6 points)

     

    Machine Problem V

    Due Monday, Nov. 11

    The solution to Machine Problem IV involved using the following routines from <stdio.h>:

    Implicitly, as part of the initialization of <stdio.h>, one other routine is called:

    Your job is to write new versions of the above routines, replacing the routines found in <stdio.h> with your alternative code. You need not worry about output. This is an input-only assignment. Your routines should be 100% compatible with the following header file:

    /* myio.h -- include file for MP5 */
    
    typedef struct myfile MYFILE;
    
    MYFILE * myfdopen( int fd, const char * mode );
      /* identical specification to fdopen() */
    
    int mygetc( MYFILE * f );
      /* identical specification to fgetc */
    
    long int myseek( MYFILE * f, long int pos );
      /* equivalent to fseek( ... SEEK_SET ) */
    
    long int mytell( MYFILE * f );
      /* identical specification to ftell */
    

    A solution to MP4 will be distributed shortly that is modified to work using this interface for input while continuing to use <stdio.h> for output. You will be able to compile and link this solution with your code for test purposes.

    Your code should be separately compiled from the main program, so that it's use of your routines is made possible only because of the use of a linker. Your code may not use <stdio.h> for anything at the time you turn it in. (You may find it useful to include <stdio.h> and calls to its routines during debugging, but you must remove these calls before you turn in any code.) Your code must use the Unix/Linux kernel calls read() and lseek().

    Your code, as turned in, may not use read() with a buffer size of 1, as in some of the discussion earlier in the semester. (You might find this useful during early stages of debugging, but by the time you turn in the code, the buffer must be enlarged.) A buffer size of 32 bytes is recommended for this problem. (32, not because it is efficient, but because correct handling of end-of-buffer is tested more frequently when you use small buffers).

    The file position known to the operating system must always be a multiple of the buffer size, so that if the buffer size is matched to the disk sector size, all reads() into the buffer can be doen directly without relying on the blocking and deblocking facilities of the operating system. This maximizes the efficiency of the disk I/O.

    Additional requirements:

    As usual, your solution should be in a file called mp5.c. Yes, we know that the "natural" file name would be myio.c but please give it the standard name for a machine problem solution. As usual, your code should be submitted using the submit program, this time in the directory mp5.

    Commentary

    A version of MP4 modified to test MP5 has been distributed on line. This, in turn, has been tested with a minimal implementation of the MP5 user interface. The latter is functional, but it uses a buffer size of 1 and therefore is not a solution to this machine problem.

    The original include file myio.h given above has been edited to fix a bug in the declaration of the type MYFILE. This required a change to mp5demo. The changes were made Thursday, Nov. 6 at 5 PM. If you started work before that time and had problems with type errors, you should update your code.

    The original include file myio.h has been edited again. There was a bug! The type of mygetc() was given as char, but the real getc() returns an int, because the end-of-file value is indicated by -1, which is not a legal char. Changes made Saturday Nov. 9, at 10 PM.

    How do I compile it? Suppose you have these file in your current directory: myio.h -- the header file, mp5.c -- your solution, and mp5test.c -- the modified version of mush distributed to test your code. You can compile it all (producng an executable called a.out) with the following shell command:

    cc mp5.c mp5test.c
    

    The above shell command recompiles both of them each time you compile. It creates object files, links them, and then deletes them. You can also build a makefile for the project if you want. Given how small the programs are, this is hardly worth the effort.

    A student asked: "Why do we need to change the code for fseek() distributed in the minimal implementation of the MP5 user interface from the file mp5demo?"

    The answer lies in the motivation for reading by the buffer instead of one character at a time, as in mp5demo. If each buffer contains exactly one disk sector of the file, reads can be very efficient. If a buffer contains part of one sector and part of the next, the burden of blocking and deblocking is placed on the operating system, and performance will drop.

    A student asked: "Will the length of the input always be multiple of 32?"

    Read the specification of read(). You tell it the buffer size when you call read(), and it returns the number of characters it actually read. When reading from disk files, it always reads as much as you asked for unless you try to read past the end of file. When reading from the keyboard, it might just read to the end of line.

    Of course, you should write your code so that the buffer size can be changed easily. If I were silly enough to demand that you change your code to work with a 51 byte buffer, it should be trivial to change.