Assignment 6, due Mar. 4

Part of the homework for 22C:112, Spring 2011
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

On every assignment, write your name and the course number legibly. Exceptions will be by advance arrangement unless there is what insurance companies call "an act of God" (something outside your control). Homework must be turned in on paper, either in class or in the teaching assistant's mailbox. Never push late work under someone's door!

Background: The simplest adequate implementation of a disk scheduler consists of a linear array of pointers to queues, one per cylinder of the disk. When a disk request is enqueued, it is added to the queue for the cylinder of that request. When the disk interrupt service routine begins to serve a cylinder, it sets that queue aside, replacing it with an empty queue, and then works on the set-aside queue. When successive interrupts empty the set-aside queue, the interrupt service routine moves to a new cylinder.
a) The need to set aside the queue before serving it is the result of what could otherwise happen if disk requests were enqueued in the queue for the cylinder currently being served. What is the problem and why is it solved by setting the queue aside before beginning to serve it? (0.5 points)
b) Consider this alternative approach: The queue is maintained as a binary search tree, sorted by linearized disk address (of the various BST organizations, a splay-tree works best, but this is not relevant to the problem). The left child is for all requests less than or equal to a node. A unidirectional elevator algorithm is used, traversing the tree from left to right and deleting each node as it is served. Does this approach suffer from the problem identified above? Explain why or why not. (0.5 points)
Background: Consider the following organization for a file system: Each file in the file system is given a unique 32-bit file ID number. Each sector of the file has, of course, a sector number in that file; assume this is also a 32-bit number. Therefore, each sector in the file system can be given a unique number that is simply the 64-bit concatenation of the file number with the sector number.
One way to organize the sectors of a file on disk is to use this 64-bit sector number as the key in a search tree, where the leaves are the disk sectors of files, and the internal nodes of the tree are index sectors used to locate the correct leaf. The B-tree data structure is particularly appealing for this purpose (see the Wikipedia entry on B-trees).
Assume that disk sectors are 512 bytes, and that linearized sector numbers are 64 bits long.
a) If the aggregate content of all data in all disk files occupies 1 gigabyte (that is 2³⁰ bytes), how many disk sectors would be occupied by the B-tree needed to provide access to this data. (0.5 points)
b) If the aggregate content of all data in all disk files occupies 1 gigabyte, How many disk accesses would be required to find the location of a randomly selected disk sector. (0.5 points)
Background: Consider the problem of allocating space to growing files. If data sectors were the only concern, and there were only one file, the optimum way to allocate space for one file would be to allocate every second or third sector along one track to the file (catching the interleaved sectors on successive revolutions), filling successive tracks of one cylinder until the cylinder is full, and then filling the adjacent cylinders successively. Interleaving is needed because the interrupt service routine needs time to work and as a result cannot schedule operations on the immediately succeeding sector.
a) If the file system uses some kind of index sectors to locate the data sectors of a file, how would you modify the above scheme to allow files to be read as quickly as possible? (0.5 points)
b) Suppose your application has multiple files open for output and it occasionally adds sectors to each of them. Suggest an allocation policy that tries to appoximate an optimal allocation pattern without knowing in advance which files will end up being large files and which will stay small. (0.5 points)

A Machine Problem

Due Monday, March 21

Modify the code for Machine Problem 1 to support double-quotes around arguments, used to group those arguments into a single argument, and add the following built-in commands to your shell:

do -- a command with no arguments marking the top of a loop. Loops must be allowed to be nested at least three deep. How you support nesting is up to you, but you might want to begin development without supporting nesting.
while args -- a command that marks the end of a loop. The additional arguments are interpreted as a shell command. If the command terminated with EXIT_SUCCESS and if standard input is being read from a disk file, your shell should go back to the point immediately after the corresponding do command. Otherwise, it should continue.

See man 3 exit and man 2 wait for documentation on how to set and test the exit status of a program.

See man 2 lseek and man 3 fseek for tools that allow you to mark the current position in a file or change the position to the marked position. These pages also document the return values that result from attempts to seek on devices that do not allow seeking.

See man 1 test for documentation of a command that will be the most popular command to execute as the argument of a while command.

tcsh -c "tcsh command" will call tcsh to execute a single command from its command line argument. If you create a directory with some files in it, you can make a loop that successively deletes files from the directory using the following shell command as a loop body:

        tcsh -c "rm `ls | tail -1`"

All of the above assumes that you have the search path mechanism assigned for machine problem 1 debugged.

Submission instructions, as with machine problem 1, but call your shell mp2.c and submit it in the mp2 directory.