22C:18, Lecture 37, Fall 1996

Douglas W. Jones
University of Iowa Department of Computer Science

Disk Interfaces

The following information is needed to address a block of data on a typical moving-head disk drive, whether that disk drive was made in the mid 1960's or today:

  1. Cylinder number -- what position should the heads be moved to?
  2. Surface number -- which head should be selected?
  3. Sector number -- which sector is being referenced?
Typical disks spin very quickly; consider a disk from the early 1970's, with 20 sectors per track and 512 data bytes per sector, spinning at 20 revolutions per second. In each second, 204,800 bytes of data fly past the heads on this disk (512 bytes times 20 sectors per revolution times 20 revolutions per second). This was in an era when a typical small computer could only execute perhaps 1,000,000 instructions per second, so it was nearly impossible to imagine the input/output software dealing with data to and from disk on a byte-by-byte basis.
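
The arithmetic above can be checked directly. The following short Python sketch uses only the figures quoted in the text; the variable names are ours, chosen for illustration.

```python
# Data rate of the early-1970s disk described in the text.
BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 20
REVOLUTIONS_PER_SECOND = 20
INSTRUCTIONS_PER_SECOND = 1_000_000  # "perhaps 1,000,000", per the text

bytes_per_second = BYTES_PER_SECTOR * SECTORS_PER_TRACK * REVOLUTIONS_PER_SECOND
instructions_per_byte = INSTRUCTIONS_PER_SECOND / bytes_per_second

print(bytes_per_second)                 # 204800
print(round(instructions_per_byte, 2))  # 4.88
```

Fewer than five instructions per byte leaves no room for a software loop that must test the device status, move the byte, and update both a pointer and a count.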

Modern CPUs are much faster than the computers of 1970; typical modern main memory allows access in only 70 nanoseconds, but our disks are proportionally faster, and the result is that, as in 1970, we cannot afford to transfer data to and from disk by reading bytes or words individually under program control. Instead, as in 1970, modern disk interfaces incorporate direct-memory-access hardware.

A direct-memory-access (DMA) device is one that can read and write main memory without the need for the CPU to transfer individual bytes and words. The video interface on a typical computer is one example of such a device; this interface constantly reads the contents of a designated area of memory, the video RAM, and uses what it reads to generate an image on the display screen.

With a disk interface, we don't designate a fixed area of memory for direct memory access; instead, before starting a disk input or output operation, we specify the main memory address to or from which data is to be transferred by the disk interface. Thus, in addition to telling the interface what sector, cylinder and surface we wish to access, we also tell the interface what memory address to use. A typical crude disk interface for the Hawk might have the following registers:

FF101000 -- the disk address register
Contains fields specifying the drive, cylinder, surface and sector.

FF101004 -- the DMA register
Contains the memory address of the next word to be read or written; incremented by 4 with each word read or written.

FF101008 -- the command-status register
Used to start a data transfer and indicate the progress of a transfer.
This register set is typical of disk interfaces designed in the late 1960's and early 1970's for minicomputers. Mainframe disk interfaces of that era and microcomputer disk interfaces of today are considerably more complex, for reasons that will be explained. It is far easier, however, to outline how the disk interface works in terms of this simple interface than in terms of the more complex interfaces typical of real systems today.

Even on low-end systems of 1970, the disk address register typically allowed multiple disk drives to be connected to one controller. As a result, it was (and is) typical to specify a disk address with something like the following format:

Disk Address Register
 _______________ ___________________________ ___________________
|___________|___|___________________|_______|___________________|
|31       26| 24|23               14|13   10|9                 0|
    unused  drive   cylinder         surface        sector
Typically, only 4 drives would be supported. The format shown allows up to 1024 cylinders, with 16 surfaces per cylinder and up to 1024 sectors per track. We will assume that the Hawk machine uses sectors of 4096 bytes each, so this allows for a huge disk drive! Any particular drive would typically support fewer cylinders, surfaces and sectors.
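
To make the field layout concrete, here is a minimal Python sketch that packs and unpacks a disk address using the bit positions read off the diagram (drive in bits 25..24, cylinder in 23..14, surface in 13..10, sector in 9..0). The function and constant names are ours, and the range check is an addition for illustration; it is not something the hardware description promises.

```python
# (shift, width) for each field, taken from the Disk Address Register diagram.
DRIVE = (24, 2)
CYLINDER = (14, 10)
SURFACE = (10, 4)
SECTOR = (0, 10)
BYTES_PER_SECTOR = 4096  # the sector size assumed for the Hawk in the text


def pack_disk_address(drive, cylinder, surface, sector):
    """Combine the four fields into one 32-bit register value."""
    value = 0
    for field, (shift, width) in zip(
        (drive, cylinder, surface, sector), (DRIVE, CYLINDER, SURFACE, SECTOR)
    ):
        if not 0 <= field < (1 << width):
            raise ValueError("field value does not fit in its bits")
        value |= field << shift
    return value


def unpack_disk_address(value):
    """Split a register value into (drive, cylinder, surface, sector)."""
    return tuple(
        (value >> shift) & ((1 << width) - 1)
        for shift, width in (DRIVE, CYLINDER, SURFACE, SECTOR)
    )


# Largest drive this format can describe: 2**10 * 2**4 * 2**10 sectors.
max_bytes = (1 << 10) * (1 << 4) * (1 << 10) * BYTES_PER_SECTOR

print(hex(pack_disk_address(1, 2, 3, 4)))  # 0x1008c04
print(unpack_disk_address(0x01008C04))     # (1, 2, 3, 4)
print(max_bytes)                           # 68719476736
```

The last figure is 2 to the 36th power, or 64 gigabytes per drive, which is what makes this format "huge" for its day.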

The command and status register must indicate, at minimum, when the disk drive is idle or busy, whether it is to request an interrupt when it completes an operation, and what operation it is to do. Typically, additional information is returned to the CPU that is useful for diagnostic purposes:

Disk Command and Status Register
 _______________________________________________________ _______
|_______________________________|_|_______|_____|_______|_______|
|31                           16|15     11|10  8|7     4|3     0|
           word count            I unused command  code  E C T R
The fields of the command and status register are as follows:
R - Ready
Indicates that the disk interface is idle

T - Transfer
Indicates that the disk interface is actually transferring data

C - Cylinder
Indicates that the disk heads are over the addressed cylinder

E - Error
Indicates an error on the most recent operation

code - Error code
Indicates the nature of the error that was detected
  1. No error was detected.
  2. Requested drive not present.
  3. Illegal cylinder number.
  4. Illegal surface number.
  5. Illegal sector number.
  6. Drive not up to speed.
  7. Cannot read any headers at requested cylinder.
  8. Cannot read any headers on requested surface.
  9. Cannot read header for indicated sector.
  10. Parity error on transfer to or from memory.
  11. Illegal memory address on transfer to or from memory.
  12. Checksum error on read from disk.

Command
Indicates what operation the disk controller is to perform.
  1. No operation. Ready will be set immediately. Primarily useful for diagnostic purposes.

  2. Seek indicated track. Ready will be set as soon as the disk heads of the requested drive are over the indicated track. Useful for diagnostic purposes, and also useful for pre-positioning the heads while I/O is in progress on other devices.

  3. Read. Ready will be set after one sector of data has been read from the indicated sector.

  4. Write. Ready will be set after one sector of data has been written to the indicated sector.

  5. Write format. Ready will be set after a full track worth of sectors has been recorded on the addressed surface and track.

I - Interrupt Enable
Indicates that the disk will request an interrupt when ready. This interrupt will be requested whenever the ready bit is set and interrupts are enabled.

Word Count
Indicates the number of words remaining to be transferred to or from the disk. Set to the size of the sector by the read and write commands and decremented with each word read or written.
This interface is quite complex, but every detail of it follows naturally from the function of the disk controller. The long list of error indications includes all of the different ways that a disk operation could fail; the set of operations includes all of the operations a user might need, plus operations that are useful in diagnosing failures, and the status bits include all the status the user needs plus bits that are useful in monitoring the operation of the drive.
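
The fields of the command and status register can be picked apart with shifts and masks. The sketch below is one way to do it in Python, assuming the layout in the diagram (word count in bits 31..16, I in bit 15, command in bits 10..8, error code in bits 7..4, and E, C, T, R in bits 3 down to 0) and assuming commands and error codes are numbered as in the lists above. The function names are hypothetical.

```python
def decode_status(value):
    """Return the fields of a command and status register value as a dict."""
    return {
        "word_count": (value >> 16) & 0xFFFF,
        "interrupt_enable": bool((value >> 15) & 1),
        "command": (value >> 8) & 0x7,
        "error_code": (value >> 4) & 0xF,
        "error": bool(value & 0x8),      # E
        "cylinder": bool(value & 0x4),   # C
        "transfer": bool(value & 0x2),   # T
        "ready": bool(value & 0x1),      # R
    }


def encode_command(command, interrupt_enable=False):
    """Build the value the CPU writes to start an operation."""
    return (command << 8) | (int(interrupt_enable) << 15)


# A read (command 3) that finished with a checksum error (code 12):
# E, C and R are set, T is clear.
fields = decode_status((3 << 8) | (12 << 4) | 0x8 | 0x4 | 0x1)
print(fields["command"], fields["error_code"], fields["ready"])  # 3 12 True
print(hex(encode_command(3, interrupt_enable=True)))             # 0x8300
```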

The hardware of a typical disk interface can be viewed as executing a fixed program in response to the setting of these interface registers:

	Repeat
	    await the CPU writing the command register
	    if not no-op
		while heads not over requested cylinder
		    move heads toward requested cylinder
	    endif
	    if command = format
		select requested surface
		for sector number = 0 to sectors-per-track - 1
		    for word-count = 0 to words-per-header - 1
		        write a word of the header
		    for word-count = 0 to words-per-sector - 1
		        write 0
		    write checksum
		endloop
	    else if command = read or write
		select requested surface
		repeat
		    await start of header
		    read header
		until header for desired sector found
		set word-count to words-per-sector
		set checksum to zero
		repeat
		    if command = write
			write m[DMA-address] to disk
		    else if command = read
			read from disk to m[DMA-address]
		    endif
		    include transferred data in checksum
		    increment DMA-address
		    decrement word-count
		until word-count = 0
		if command = read
		    read trailer
		    set error if trailer and checksum unequal
		else if command = write
		    write checksum as trailer
		endif
	    endif
	    set ready and request interrupt if enabled
	endloop
This fixed program for the disk controller is complex enough that it is easy to compare the complexity of a disk controller to that of a CPU; in fact, most disk controllers built since the mid 1970's have incorporated standard microprocessor or programmable interface controller chips, with the above algorithm stored as a program on a ROM chip that is part of the controller.
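
The data-transfer part of the controller's program can be mimicked in software. The following Python sketch models only the inner loop for read and write; seeking, headers and formatting are left out. The text does not say how the checksum is computed, so a 32-bit sum of the words is assumed here purely for illustration, and the disk is modeled as a dictionary (`platter`) of our own invention.

```python
WORDS_PER_SECTOR = 1024   # 4096-byte sectors, 4 bytes per word
MASK32 = 0xFFFFFFFF


def write_sector(memory, dma_address, platter, sector):
    """Copy one sector from memory to the simulated disk, then the trailer."""
    checksum = 0
    words = []
    word_count = WORDS_PER_SECTOR
    while word_count > 0:
        word = memory[dma_address]
        words.append(word)
        checksum = (checksum + word) & MASK32  # assumed checksum: 32-bit sum
        dma_address += 1   # memory is indexed by word here; hardware adds 4
        word_count -= 1
    platter[sector] = (words, checksum)        # checksum written as trailer


def read_sector(memory, dma_address, platter, sector):
    """Copy one sector from the simulated disk to memory.

    Returns True if the recomputed checksum matches the stored trailer.
    """
    words, trailer = platter[sector]
    checksum = 0
    for word in words:
        memory[dma_address] = word
        checksum = (checksum + word) & MASK32
        dma_address += 1
    return checksum == trailer


memory = list(range(2 * WORDS_PER_SECTOR))
platter = {}
write_sector(memory, 0, platter, sector=7)
ok = read_sector(memory, WORDS_PER_SECTOR, platter, sector=7)
print(ok)  # True
```

If any stored word is altered between the write and the read, the recomputed sum no longer matches the trailer, which corresponds to the checksum error in the list of error codes.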

Disk Input/Output Software

At the lowest level, disk input/output software can be fairly simple. Consider the following disk-read routine, shown in pseudocode form:

diskread( disk, cylinder, surface, sector, buffer )

	disk-address register =
		    (disk << 24)
		  + (cylinder << 14)
		  + (surface << 10)
		  + sector
	DMA register =
		    buffer
	command and status register =
		    (3 << 8) /* a read command */
		  + (0 << 15) /* interrupts disabled */
	repeat
	    /* do nothing */
	until odd ( command and status register )
	return
This simple version does nothing if an error is detected, and most of the code is devoted to formatting the contents of the various device control registers. Furthermore, this code makes no use of interrupts and therefore prevents effective overlap of disk activity with computation.
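
A version that does report errors is only slightly longer. The sketch below is written in Python against a stand-in for the device, since we cannot touch real registers from a portable example. The class `FakeInterface` and the exception `DiskError` are hypothetical, the stand-in completes every command instantly, and field positions are taken from the register diagrams.

```python
READ = 3      # command code, numbered as in the command list
READY = 0x1   # R bit
ERROR = 0x8   # E bit


class DiskError(Exception):
    """Raised with the 4-bit error code from the status register."""


class FakeInterface:
    """Hypothetical stand-in for the three device registers."""

    def __init__(self, fail_code=0):
        self.disk_address = 0
        self.dma = 0
        self.fail_code = fail_code
        self._status = READY

    @property
    def command_status(self):
        return self._status

    @command_status.setter
    def command_status(self, value):
        # A real controller would clear Ready, transfer data, then set Ready.
        status = (value & 0x8700) | READY   # keep the I and command fields
        if self.fail_code:
            status |= ERROR | (self.fail_code << 4)
        self._status = status


def diskread(dev, disk, cylinder, surface, sector, buffer_address):
    """Start a read, busy-wait for Ready, and raise DiskError on failure."""
    dev.disk_address = (disk << 24) | (cylinder << 14) | (surface << 10) | sector
    dev.dma = buffer_address
    dev.command_status = READ << 8          # interrupt enable left at zero
    while not dev.command_status & READY:   # polling loop, as in the pseudocode
        pass
    status = dev.command_status
    if status & ERROR:
        raise DiskError((status >> 4) & 0xF)


dev = FakeInterface()
diskread(dev, 1, 2, 3, 4, 0x1000)
print(hex(dev.disk_address))  # 0x1008c04
```

This version still polls, so it does nothing to overlap disk activity with computation; it only shows where the error check belongs.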

Users would generally find programming with disks using a user interface such as the above to be quite uncomfortable! The primary problem with this user interface is that it requires all disk input/output to be requested in terms of the physical disk address of the sector. Users are generally far more interested in abstract storage units such as files!

Translating file-I/O to disk I/O is most easily done in at least two stages. First, we build a simplified disk addressing scheme on top of the hardware, and then we build a file system on top of that. It is inconvenient to view a disk address as having fields for cylinder, surface and sector, because for many purposes, we would like to talk of sequential disk addresses. Thus, we define a linear disk address as follows:

    Given:
	sectors-per-track
	tracks-per-cylinder
	cylinders-on-disk

	sectors-per-cylinder =
		  sectors-per-track
		* tracks-per-cylinder

    Define:
	linear-address =
	      sector-number
	    + surface-number * sectors-per-track
	    + cylinder-number * sectors-per-cylinder
Linear addresses run from zero to sectors-1, where sectors is the total number of sectors on the disk; all intermediate linear addresses are defined, so we can speak of consecutive sectors even when crossing from one surface to the next and when crossing from one cylinder to the next.
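
The definition above, along with its inverse, can be sketched in a few lines of Python. The function names are ours; the geometry in the example (20 sectors per track, 4 tracks per cylinder) is made up for illustration.

```python
def to_linear(cylinder, surface, sector, sectors_per_track, tracks_per_cylinder):
    """Map (cylinder, surface, sector) to a linear disk address."""
    sectors_per_cylinder = sectors_per_track * tracks_per_cylinder
    return sector + surface * sectors_per_track + cylinder * sectors_per_cylinder


def from_linear(linear, sectors_per_track, tracks_per_cylinder):
    """Recover (cylinder, surface, sector) from a linear disk address."""
    sectors_per_cylinder = sectors_per_track * tracks_per_cylinder
    cylinder, rest = divmod(linear, sectors_per_cylinder)
    surface, sector = divmod(rest, sectors_per_track)
    return cylinder, surface, sector


# The last sector of cylinder 2 and its successor, the first of cylinder 3.
print(to_linear(2, 3, 19, 20, 4))  # 239
print(from_linear(240, 20, 4))     # (3, 0, 0)
```

The example shows consecutive linear addresses crossing a cylinder boundary without any special case.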

Files are easily defined above this level. At an abstract level, a file is an ordered set of disk sectors, and a directory is a mapping from names to such sets. There are many ways of organizing sets of disk sectors, and having seen one, a good programmer should be able to propose many others!

The solution we will consider here is perhaps the simplest: we will view a file as a set of consecutive disk sectors, where we define consecutive sectors in terms of the linear addressing scheme outlined above. In this scheme, a file can be described by the address of its base sector and by the length of the file, in sectors; a directory is just a table of names of files, each with a base and length.

	Minimalist Directory Structure

	 Name                  Base  Length
	 ____________________ ______ ______
	|____________________|______|______|
	|____________________|______|______|
	|____________________|______|______|
	|____________________|______|______|
	|____________________|______|______|
	|____________________|______|______|
	|____________________|______|______|
Of course, we need some way of indicating that a directory entry is unused; we might, for example, use a name that begins with a null character as an indicator that the corresponding region of disk is available for use.

When we open a file, we search the directory for a file with the given name and return a file descriptor -- in its simplest form, such a descriptor would be the index of that file's entry in the directory data structure. Before reading or writing a sector of the file, we use this descriptor to translate the user's logical sector number, relative to the file, to a physical sector number. For our simple-minded file structure, we do this by adding the base for the file to the sector number given by the user.
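
The open and translate steps can be sketched as follows in Python. The names `Entry`, `open_file` and `physical_sector` are hypothetical, the directory is an ordinary in-memory list, and the bounds check against the file length is our addition; the text only requires adding the base.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str    # a name beginning with a null character marks an unused entry
    base: int    # linear address of the file's first sector
    length: int  # length of the file, in sectors


def open_file(directory, name):
    """Return a descriptor: the index of the named file's directory entry."""
    for index, entry in enumerate(directory):
        if not entry.name.startswith("\0") and entry.name == name:
            return index
    raise FileNotFoundError(name)


def physical_sector(directory, descriptor, logical_sector):
    """Translate a sector number relative to the file to a linear address."""
    entry = directory[descriptor]
    if not 0 <= logical_sector < entry.length:
        raise IndexError("sector number outside the file")
    return entry.base + logical_sector


directory = [
    Entry("notes", 2, 5),
    Entry("\0", 7, 3),      # free region of 3 sectors
    Entry("data", 10, 4),
]
fd = open_file(directory, "data")
print(fd)                                 # 2
print(physical_sector(directory, fd, 3))  # 13
```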

Typically, we reserve at least one sector on the disk to hold the directory for that disk, and it is common to reserve a second sector as a boot sector; the latter typically contains code that is read into memory by code in ROM when the system is restarted. The code in the boot sector is then responsible for reading in the whole operating system from disk, or at least, for starting to do so -- code it reads from disk may be used to read other code from disk.

Following this typical layout, we might designate logical sector number zero as the boot sector -- and write our ROM code so that, on receiving a restart interrupt, it attempts to read this sector from each disk device on the system. Given this convention, logical sector 1 of each device could be used to hold the directory for that device.

The very first file system constructed on each generation of computers frequently follows an outline such as that given above, but there is little reason to prefer such a system. Most users would prefer a file system that allowed hierarchically structured directories -- that is, where each directory entry could refer to either a file or another directory. Most users would prefer a file system that could patch together available free space into useful storage space even when it isn't contiguous, and most users would prefer a file system that posed no fixed limit on the number of entries in each directory.

Real Disk Interfaces

The disk interfaces found on today's computers rarely look like the interface described above. The same registers are present, but they are partitioned between two different components, and software access to the interface registers is far more complex than is outlined above.

Why this complexity? A primary reason is that disk manufacturers don't want to have to build disk controllers for each CPU on the market when they come out with a new disk model. Instead, the manufacturers support a very limited set of controller interfaces, and computer manufacturers build their machines with interfaces to these.

One common interface standard is the SCSI standard -- the Small Computer System Interface standard, usually pronounced scuzzy. Most computer manufacturers offer SCSI interfaces, and most disk manufacturers build disk drives with SCSI interfaces. This allows a manufacturer to sell the same disk drive for use on supercomputers, mainframes, scientific workstations and home computers. Note that the word Small in the acronym SCSI is only of historical interest today -- SCSI interfaces are not limited to small systems! The following illustrates how a SCSI interface is arranged:

 _____
|     |           Main Bus
| CPU |=================================|
|_____|  ____|____   ________|________
        |         | |                 |
        |  Memory | | SCSI controller |
        |_________| |_________________|
                             |           SCSI Bus
                   |==================================|
                   _____|_____      _____|_____
                  |           |    |           |
                  | SCSI Disk |    | SCSI Tape |
                  |___________|    |___________|
The SCSI controller has three basic functions: read data from the SCSI bus into memory, write data from memory to the SCSI bus, and send commands to devices on the SCSI bus.

With this interface structure, the SCSI controller holds the DMA address register, while the disk address registers are in the disk controller. The SCSI controller and disk controllers must both count the block length; data transfers on the SCSI bus are 8 bits at a time (or 16 bits on an extended SCSI bus), so the SCSI controller also packs and unpacks words on their way to and from memory.

To read or write a disk sector, the software must first tell the SCSI controller what device on the SCSI bus is being used, so the SCSI controller contains a SCSI device address register. Then, the software must pass disk address and disk command information to the SCSI controller, which passes it on to the disk interface. Finally, the software must give the SCSI controller the DMA address and the start-transfer command. From that point on, the controller and disk interface work together until the read or write is finished, at which point the SCSI interface signals that the job is done by setting the done bit in its status register and possibly requesting an interrupt.

SCSI interfaces for low performance machines frequently have no DMA interface. Instead, the SCSI controller has a buffer built into the controller that can hold one sector. Transfer of data between this buffer and main memory is done by software, one byte or word at a time, either before a write or after a read.