Homework 5 Solutions

22C:116, Spring 1995

Douglas W. Jones

  1. The problem: As described in the notes (any errors included!), what is the exact maximum size, in bytes, of a file under the classic UNIX file system? This number is stated casually as about 8 megabytes.

    The notes said that an I-node contains the addresses of 8 sectors, plus a pointer to an indirect sector containing 128 addresses, plus a pointer to a doubly indirect sector containing the addresses of 128 sectors that each contain 128 addresses. Each sector is 512 bytes, so the maximum file size is (8 + 128 + 128x128) x 512 = 16520 x 512 = 8458240 bytes.
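
    As a sanity check on the arithmetic, here is a minimal C fragment that recomputes the figure; the constants come from the notes, not from any real UNIX header:

      #include <stdio.h>

      int main(void) {
          const long direct   = 8;    /* sector addresses held directly in the I-node */
          const long indirect = 128;  /* addresses held in one indirect sector        */
          const long sector   = 512;  /* bytes per sector                             */

          /* direct + singly indirect + doubly indirect sectors, times sector size */
          long max = (direct + indirect + indirect * indirect) * sector;

          printf("%ld bytes\n", max); /* prints 8458240 */
          return 0;
      }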

  2. The problem: Design a disk-resident data structure appropriate for tracking the free sectors on such a disk. Show how this structure can be used to support a reasonably efficient algorithm for allocating a new disk sector to a file, given the disk address of its predecessor.

    For each cylinder, allocate one sector to hold a bit vector indicating which sectors in that cylinder are free. This is an inefficient use of storage: with 512 sectors per cylinder, the bit vector needs only 512 bits of the sector's 4K bytes, so one sector of the hypothetical disk could hold the free-space information for as many as 64 cylinders. Still, the wasted space is one sector per cylinder, under 0.2 percent of the available disk capacity.
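
    A minimal C sketch of the bit-vector record follows, assuming the hypothetical disk's 4K-byte sectors and 512 sectors per cylinder; the names are illustrative:

      #include <stdint.h>

      #define SECTOR_BYTES    4096   /* one free-space sector per cylinder */
      #define SECTORS_PER_CYL  512   /* so only 512 of its bits are needed */

      typedef struct {
          uint8_t bits[SECTOR_BYTES]; /* bit i set means sector i is free */
      } free_map;

      static int is_free(const free_map *m, int s) {
          return (m->bits[s / 8] >> (s % 8)) & 1;
      }

      static void mark_used(free_map *m, int s) {
          m->bits[s / 8] &= (uint8_t)~(1u << (s % 8));
      }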

    To allocate a sector to follow sector x, inspect the free-space record for the cylinder containing sector x, and only if this cylinder is full, inspect the free-space records for adjacent cylinders, working outward; a sketch appears below. The seek needed to fetch the free-space record should be short, since a transfer to sector x has presumably just completed, and the seek to the newly allocated sector should also be short, since it lies in the same or a nearby cylinder.
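
    The allocation algorithm can then be sketched as follows, reusing the free_map helpers above. For brevity the free-space sectors are simulated by an in-memory array; a real implementation would read and write them through the disk driver, and the cylinder count is illustrative:

      #define NUM_CYLINDERS 800          /* illustrative geometry */

      /* Simulated free-space sectors; these start all-zero (all used) here,
         but a real system would initialize them when the file system is made. */
      static free_map maps[NUM_CYLINDERS];

      /* Allocate a free sector as near as possible to cylinder pred_cyl,
         searching outward: pred_cyl, +1, -1, +2, -2, ...  Returns the
         cylinder chosen, with the sector number in *sec, or -1 if full. */
      int allocate_near(int pred_cyl, int *sec) {
          for (int d = 0; d < 2 * NUM_CYLINDERS; d++) {
              int cyl = (d % 2) ? pred_cyl + (d + 1) / 2 : pred_cyl - d / 2;
              if (cyl < 0 || cyl >= NUM_CYLINDERS)
                  continue;
              for (int s = 0; s < SECTORS_PER_CYL; s++) {
                  if (is_free(&maps[cyl], s)) {
                      mark_used(&maps[cyl], s);
                      *sec = s;
                      return cyl;
                  }
              }
          }
          return -1;  /* no free sector anywhere on the disk */
      }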

  3. The problem: Give a high-level description of a reasonable policy for anticipatory fetches in such a cache, and describe what information from the file system must be available to the policy in order to correctly anticipate demands posed by the file system.

    If sequential reads and writes actually dominate file system use, the disk cache should treat a read or write of a particular sector as a hint that the sector will not soon be referenced again, making its buffer a good candidate for replacement.

    A reasonable anticipatory fetch policy would be to enqueue read requests for the next n successive sectors each time a program requests successive sectors of the same file. If we're lucky, each such demand read will find that the desired sector has already been fetched. If the desired sector has not yet been read, the value of n was too small, so increment n. If the program does a random-access operation within a file, set n to zero (stop trying to anticipate sequential reads). Periodically decrementing n for each open file will keep the system in balance. Note that this policy requires the file system to tell the cache which open file each request belongs to and where the request falls within that file; without this, the cache cannot distinguish sequential from random access. A sketch of the adjustment logic follows.
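
    The adaptive part of the policy might look like this in C; prefetch() and cached() are hypothetical stand-ins for the cache's internal operations:

      /* Per-open-file read-ahead state; all names are illustrative. */
      typedef struct {
          int next_expected;  /* sector a sequential reader would ask for next */
          int n;              /* how many sectors to prefetch ahead            */
      } readahead;

      extern void prefetch(int sector);  /* enqueue an anticipatory read */
      extern int  cached(int sector);    /* already in the buffer cache? */

      void note_read(readahead *ra, int sector) {
          if (sector == ra->next_expected) {   /* sequential access */
              if (!cached(sector))
                  ra->n++;                     /* n was too small */
              for (int i = 1; i <= ra->n; i++)
                  prefetch(sector + i);
          } else {
              ra->n = 0;                       /* random access: stop anticipating */
          }
          ra->next_expected = sector + 1;
      }

      void periodic_decay(readahead *ra) {     /* run occasionally per open file */
          if (ra->n > 0)
              ra->n--;
      }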

    For writes, just use the disk request queue so that write operations can be queued while the user program continues with its computation. The number of queued writes allowed could be adjusted adaptively, as suggested above for reads, but if additional buffer space is available, the best performance will result from buffering as many writes as possible.
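
    A corresponding write-behind sketch, again with a hypothetical driver call; the disk's interrupt handler would decrement pending as each queued transfer completes:

      #include <string.h>

      #define QUEUE_SLOTS   8     /* illustrative cap on queued writes */
      #define SECTOR_BYTES  4096

      typedef struct { int sector; char data[SECTOR_BYTES]; } pending_write;

      static pending_write queue[QUEUE_SLOTS];
      static int pending;         /* queued but not yet written to disk */

      extern void start_disk_write(const pending_write *w);  /* asynchronous */

      /* Copy the data into a queue slot and return at once, so the program
         overlaps its computation with the disk transfer.  Returns -1, and
         the caller must block, only when every slot is busy. */
      int write_behind(int sector, const void *buf) {
          if (pending == QUEUE_SLOTS)
              return -1;
          pending_write *w = &queue[pending++];
          w->sector = sector;
          memcpy(w->data, buf, SECTOR_BYTES);
          start_disk_write(w);
          return 0;
      }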

  4. The problem: Why did the UNIX designers forbid arbitrary graph structures in the links between directories? Some experimental file systems have allowed arbitrary linkage between directories. What special problem must be solved when such links are allowed?

    Garbage collection. UNIX uses reference counts to decide when an I-node is no longer accessible. The rmdir command assumes that it is safe to remove the . and .. links of a directory, and no link to a directory may be removed by any other command. If circular or other arbitrary links to directories are allowed, removing a link to a directory must be legal at any time, and removing such a link need not imply that the directory's . and .. links should be removed. Reference counts then no longer suffice, since directories in a cycle keep each other's counts positive even after the cycle becomes unreachable; thus, something akin to garbage collection must be used to determine when a directory may be reclaimed.
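
    The failure of reference counts in the presence of cycles is easy to demonstrate with a toy model in C; the structures here are purely illustrative:

      #include <stdio.h>

      typedef struct dir { int refs; } dir;   /* toy directory "I-node" */

      static void link_dir(dir *d)   { d->refs++; }
      static void unlink_dir(dir *d) { d->refs--; }  /* reclaim at zero */

      int main(void) {
          dir a = {0}, b = {0};
          link_dir(&a);                 /* root -> a               */
          link_dir(&b);                 /* a -> b                  */
          link_dir(&a);                 /* b -> a, forming a cycle */
          unlink_dir(&a);               /* remove root -> a        */
          /* a and b are now unreachable, but neither count is zero,
             so reference counting alone never reclaims them. */
          printf("a.refs = %d, b.refs = %d\n", a.refs, b.refs);  /* 1 and 1 */
          return 0;
      }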