Assignment 6 Solutions

Part of the homework for 22C:116, fall 2002
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

  1. Background: Classical operating systems texts cover the relationship between files, file systems and disks at great length, but say little about window managers.

    The Problem: In what sense is the relationship between windows, window managers and graphical display devices similar to that between files, file systems and disks? What are the key conceptual differences, aside from the obvious difference that a file is a one-dimensional string of bytes while a window is a two-dimensional array of pixels, and aside from the differences that follow as consequences of this?

    A window can be thought of as a virtual display device, created from an area of a real display device by the window manager, in very much the same way that a disk file can be thought of as a virtual disk, created from the space on a real disk by the file system. There are differences. For example, in allocating space on the display screen, windows may overlap, while this is never the case with files on disk; and, when fully exposed, a window is always contiguous, while the sectors of a file on disk need not be contiguous on the underlying medium.

    Another family of differences is a consequence of the fact that files are an information storage abstraction, while windows are a user interaction abstraction. Windows do not necessarily have associated memory -- so the application may be required to refresh the contents of a window when that window is exposed; there is no corresponding operation on files. Windows frequently have the concept of keyboard focus, so that the keyboard input device is multiplexed between multiple virtual keyboards, one associated with each active window; there is no corresponding auxiliary virtual device associated with disk files.

  2. Background: An open disk file can be considered a virtual disk, where the interface between the upper levels and the file system supports exactly the same operations as the interface of the disk I/O driver. Assume these operations are:
    disk->post_read(buffer,sector,done)
    disk->post_write(buffer,sector,done)
    given a pointer to a buffer holding one sector, given a sector number, and given a pointer to a semaphore, read or write that sector and signal the semaphore when the transfer is completed.
    disk->get_info(buffer)
    given a pointer to a buffer, return, in that buffer, the number of sectors on this disk or in this file, the number of bytes per sector, the number of sectors per track, the number of tracks per cylinder (all useful for optimizing data layout), and perhaps other information that doesn't matter for our purposes.
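    The interface described above can be sketched in C as a record of function pointers; the type and field names below are assumptions made for illustration, not taken from the assignment.

```c
#include <assert.h>
#include <stdint.h>

/* A sketch of the driver interface described above.  All names here
   are illustrative assumptions. */
typedef struct { int count; } semaphore;   /* signaled when a transfer completes */

typedef struct {
    uint32_t sectors;              /* sectors on this disk or in this file */
    uint32_t bytes_per_sector;
    uint32_t sectors_per_track;
    uint32_t tracks_per_cylinder;
} disk_info;

typedef struct disk {
    void (*post_read)(struct disk *d, void *buffer, uint32_t sector,
                      semaphore *done);
    void (*post_write)(struct disk *d, void *buffer, uint32_t sector,
                       semaphore *done);
    void (*get_info)(struct disk *d, disk_info *buffer);
} disk;
```

    With this layout, disk->post_read(buffer,sector,done) in the text corresponds to d->post_read(d, buffer, sector, done) here; the point of the exercise is that a file system can hand upper levels a disk record whose function pointers implement an open file rather than the raw device.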

    Usually, we think of disk files as being smaller than the corresponding physical disk, but this need not be the case.

    Assume your machine has a disk interface bus that allows several tens of disks to be attached, and your disk I/O driver is able to direct input or output to any of these disks.

    The problem: Your customer wants to develop software for editing motion pictures, with an expected uncompressed file size of 2 terabytes (2 hours times 24 frames per second times 4 million pixels per frame times 3 bytes per pixel). You can store such a file only on an array of disks. Describe the internal representation of an open file of this type and outline any special code you would need in the read and write routines.

    Assume that we haven't got any special RAID hardware. All we have is, for example, 16 conventional 120 gigabyte disks with conventional drivers for each. An open "superfile" would therefore be described by an array of references to the disks being used to support that superfile, and the "superfile system" might allocate entire disks to superfiles instead of allocating sectors. The virtual sector number translation code in the read and write routines for superfiles could divide the virtual sector number by the number of sectors per disk to get the disk number, using the remainder as the sector number within the selected disk.
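    The translation described above is just a division and a remainder. A minimal sketch follows; the disk geometry constant and all names are illustrative assumptions (e.g. 120 gigabyte disks with 1K-byte sectors), not part of the assignment.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative geometry: 120 GB per disk / 1 KB per sector.
   The constant is an assumption for this sketch. */
#define SECTORS_PER_DISK (120u * 1024u * 1024u)

/* An open superfile is described by the disks that back it. */
typedef struct {
    int num_disks;
    int disk_id[16];   /* physical disk numbers on the interface bus */
} superfile;

/* Translate a virtual sector number into (disk index, sector on that
   disk): the quotient selects the disk, the remainder the sector. */
static void translate(uint64_t vsector, int *disk, uint32_t *sector)
{
    *disk   = (int)(vsector / SECTORS_PER_DISK);
    *sector = (uint32_t)(vsector % SECTORS_PER_DISK);
}
```

    The superfile read and write routines would then use the computed disk index to select an entry of the superfile's disk array and post the transfer to that drive's driver.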

  3. Background: The Xerox Distributed File System was based on the idea that each sector and file would be completely self-describing. On an XDFS disk, two words in the prolog of each disk sector were used to hold the sector number of the sector in the file and the file number of the file in the file system (analogous to an I-number). In theory, to read sector i of file j, all that was needed was to scan the entire disk for a sector with <i,j> in its prolog; in fact, the tuple <i,j> was used as a search key in a B-tree data structure that related each key to a particular sector number on disk.
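    The "in theory" lookup can be made concrete with a toy scan over sector prologs. The structure and names below are illustrative assumptions, not the actual XDFS layout.

```c
#include <assert.h>
#include <stdint.h>

#define DISK_SECTORS 8   /* a toy disk for illustration */

/* Each sector's prolog holds <sector-in-file, file-number>; the field
   names are assumptions for this sketch. */
typedef struct { uint32_t sector_in_file, file_number; } prolog;

static prolog disk[DISK_SECTORS];

/* The "in theory" lookup: scan every prolog for the key <i,j> and
   return the physical sector number, or -1 if no such sector exists. */
static int find_sector(uint32_t i, uint32_t j)
{
    for (int s = 0; s < DISK_SECTORS; s++)
        if (disk[s].sector_in_file == i && disk[s].file_number == j)
            return s;
    return -1;
}
```

    The B-tree mentioned above replaces this linear scan with a logarithmic search keyed on the same <i,j> tuple.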

    All XDFS files were self-describing, with sector -1 of each file holding the file name of the file and the file number of the directory where that file was indexed. In theory, therefore, no data needed to be stored in each directory, since these back pointers encode the entire directory tree; in fact, of course, the directory stored <textual-name,file-number> tuples.

    Assume 32-bit file numbers, 32-bit sector numbers in the file, 32-bit sector numbers on disk. Assume 1K bytes per sector, and assume a disk that holds 4 Gigabytes. (All these are modern numbers; the real system had smaller numbers!)

    Part A: Given that each B-tree node is 1 disk sector, and given that the file system is half full, how many disk accesses would be expected to find the disk address of sector i of file j, assuming that sector exists?

    1K bytes per index node implies 256 32-bit words per index node. Each index entry in the B-tree stores one sector number (32 bits) and one search key (32 bits), so an index node holds 128 entries. An average B-tree index node is half full, so each one has, on average, 64 children.

    If our disk is 4 gigabytes, with 1K sectors, there are 4 megasectors on the disk. If only half of these are in use, our B-tree must have 2 million leaves. So, the height of the tree is log base 64 of 2 million. Note that 2 million is approximately 2^21 and 64 is 2^6, so the tree must be ceil(21/6) = 4 levels deep. (64^3 is 2^18, while 64^4 is 2^24.)
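    The same height can be checked by repeated multiplication instead of logarithms; a small sketch, where the fan-out of 64 and the 2 million leaves come from the argument above.

```c
#include <assert.h>

/* Smallest h such that fanout^h >= leaves, found by repeated
   multiplication to avoid floating point. */
static int tree_height(unsigned long fanout, unsigned long leaves)
{
    int h = 0;
    unsigned long reach = 1;   /* leaves reachable from one node at depth h */
    while (reach < leaves) {
        reach *= fanout;
        h++;
    }
    return h;
}
```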

    So, for this file system, we expect to traverse 4 levels of index nodes to reach each leaf, for a total of 5 disk accesses per data-sector access. (Of course, we'd maintain a cache of index nodes in main memory, so most accesses to a file after the first would not require these extra disk accesses!)

    Part B: A power failure during a write operation corrupts the contents of a random sector on disk. Describe the computation required to recover from this loss prior to the start of normal operation. Also, what is the maximum information loss due to this system failure? (Unix systems have a similar recovery program called fsck, but it cannot recover as well as its cousin on XDFS.)

    First, scan the disk. For each sector on disk, verify that the corresponding B-tree entry is correct (and if not, add it to the tree!). It may be faster to simply delete the B-tree and then rebuild it. This is equivalent to rebuilding all the I-nodes of a Unix file system after the I-table has been deleted -- an impossible job under the classical Unix file system.
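    This first pass might be sketched with a toy model in which in-memory arrays stand in for the sector prologs and the B-tree entries; every name here is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* prolog of each physical sector: <sector-in-file i, file-number j> */
typedef struct { uint32_t i, j; } prolog;

/* toy "B-tree": the key the index believes each physical sector holds */
typedef struct { uint32_t i, j; } entry;

/* Pass 1 of recovery: for every sector on disk, check that the index
   entry agrees with the prolog, and repair it from the prolog if not.
   Returns the number of entries repaired. */
static int repair_index(const prolog disk[], entry index[], int n)
{
    int repaired = 0;
    for (int s = 0; s < n; s++) {
        if (index[s].i != disk[s].i || index[s].j != disk[s].j) {
            index[s] = (entry){ disk[s].i, disk[s].j };
            repaired++;
        }
    }
    return repaired;
}
```

    Because the prologs on disk are the authoritative copy, the index can always be reconstructed from them, which is the property the answer relies on.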

    Second, look at sector -1 of each file on the disk, and verify that the directory entry for that file is contained in the file's parent directory; if it is not, then create the directory entry. It may be faster to simply delete the entire directory tree and rebuild it. Unix cannot do this for data files, but it can restore the directory tree because each directory contains a link to its parent (..); the Unix fsck checks the correctness of these links and restores any that are incorrect.

    As a result, under the XDFS file system, we have a guarantee that the destruction of one sector of the B-tree or of any file will never cause loss of access to the sectors of that file. The most we can lose is one sector of data in one data file.

  4. The problem: What are the objects in the domain of a Unix process, from the point of view of the operating system? There are several different classes represented in this domain!

    The process domain contains memory segments (code, static, stack, shared) that are typically further subdivided into pages. The domain also contains open files, a real-time clock, and the set of files the process may open, as defined by the access control lists of the file system.