Midterm Study Questions, Fall 2002

Part of the homework for 22C:116, fall 2002
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Warning: These questions are not intended to exhaustively cover everything that might be on the exam. Their only purpose is to focus some of your study time on specific problems in order to allow some of the exam questions to be more focused than would be possible on a purely in-class exam. This allows the exam to have some of the character of a take-home exam.

Background: You have been hired to improve the memory management components of ECOS, the Excessively Complex Operating System. Your boss proposes that a new feature be added to the ECOS virtual memory mechanism; his idea is as follows: Instead of moving pages between disk and main memory, let's use data compression. When a page would normally be copied to disk, compress that page and store it elsewhere in main memory. When a page would normally be copied from disk to main memory, uncompress that page.
Naturally, the compression algorithms used must be lossless. All lossless compression algorithms have the property that the size of the compressed output depends on the nature of the input. Some inputs compress very well -- those with highly predictable content, while othes compress poorly -- those with content that appears random. Because of this, we must use something like a heap manager to store compressed page images.
A problem: The heap manager that stores compressed page images will suffer from fragmentation, and for any compression algorithm, we can speak of the average amount by which it compresses pages encountered in practice. Can you state in quantitative terms the relationship between these that must hold for this idea to perform reasonably well?
A problem: The MMU hardware's page table layout allows f bits for the frame number in each page-table entry, where f is smaller than the size of a physical address. What data structure problems might this pose when compressed pages are stored in a heap in some part of main memory.
A problem: If we know the size of the working set of a program, the size of the memory needed by that program, and the amount by which we can expect typical pages to be compressed, and the fragmentation of the heap, how much main memory should be buy to run that program, and how should this be divided between page frames and the heap?
Background: The ECOS process manager supports a generalization on semaphores where both wait and signal take extra parameters. These are described as follows in the official documentation.

wait(s,n)
Wait until the count in the semaphore s is greater than n and then atomically decrement the count by n.
signal(s,n)
Atomically increment the count in the semaphore s by n.

A Problem: How is the ECOS wait() operation different from: ewait(s,n): repeat n times conventional_wait(s); return; Can you construct an example where the difference between the ECOS and conventional versions would make a difference?
A Problem: Suggest a reasonable implementation of these ECOS synchronization operations.
Background: All ECOS I/O operations are nonblocking. To request an operation on a file or device, users use the following kernel calls:

read(d,buf,len,sem)
write(d,buf,len,sem)
These initiate read or write operations on the file or device indicated by descriptor d, using the buffer buf and transferring len bytes (eventually). Each time a byte or group of bytes is transferred, the semaphore sem may be signalled, incrementing its count by the number of bytes already transferred; if the count has been incremented by n, this always means that the first n bytes of the buffer have been transferred. Users may block on the semaphore to wait for the completion of the entire transfer or for the completion of some part of the transfer.

Like Unix, ECOS allows any size buffer, with no requirement that the buffer size match the sector size of the device. Like most Unix implementations, the ECOS system will try to do the I/O directly from the user's buffer if the buffer is aligned appropriately.
A problem: When reading from a disk file, under what circumstances would the count in sem be incremented by less than len? When reading from the keyboard or an asynchronous communications line, under what circumstances would the count be incremented by more than one?
A problem: Can this nonblocking version of read be used to create a thread-safe read operation? If it is insufficient, suggest an augmentation to some part of the ECOS specification that would make it sufficient.
A problem: Consider a disk driver for ECOS, where that driver has no disk scheduler. How does the interrupt service routine differ from that of, for example, a Unix like system. How does adding a disk scheduler change your answer?