| Minimal Operating SystemsCS:3620 Notes, Spring 2018
    
     Part of 
      
      the CS:3620, Operating Systems Notes
      
     
      | 
Since the 1960's, customers have demanded a minimal suite of system software when they buy even the smallest computer systems. The bare minimum consists of a text editor, an assembler, a linker, a loader, a set of subroutines for doing input-output, and a command language for directing the operation of the various programs. Of course, high level languages are even better, and by the start of the 1970s, even the smallest computers came with a FORTRAN compiler. If the subroutine library included support for a disk-based file system, it was common to refer to the system as having a a disk operating system or DOS. There have been many DOS systems, all unrelated to each other.
These software components are sometimes lumped together as the operating system, but we can distinguish between the system support utilities such as editors and language processors, on the one hand, and the run-time services offered by the operating system and command language on the other hand. The former are used in preparing a program to run, while the latter are used to run programs once they have been prepared. Here, we will use the term operating system only to refer to the latter.
Occasionally, a vendor will emphasize that they have integrated everything into their operating system. Microsoft famously claimed, for example, that Microsoft's web browser was an integral part of Microsoft Windows and could not be unbundled from the system in response to antitrust actions. Other vendors have made similar claims about FORTRAN compilers and various text editors. Generally, these claims have been false. If true, such claims would be evidence of very bad system design, since there is no good reason to entangle things like that.
The minimal operating system discussed here is typical of the single-user systems found on most early machines. These include the mainframes of the 1950's, the minicomputers of the 1960's, and the microcomputers of the 1970's. Few general purpose computers in common use today have such primitive operating systems, but most of today's operating systems can be traced back to such systems.
The term general purpose is, of course, essential to the above paragraph. Special purpose computer systems are frequently sold with no operating system at all. Embedded computer systems, those that become part of some large product, whether a TV remote control, a home appliance, an airplane cockpit instrument or a voltmeter frequently execute fixed programs where the line between system and application is completely blurred. The programs for such systems are frequently developed on general purpose computers and burned into read-only memory as part of the manufacturing process.
Although the particular set of peripheral devices attached to the computer is not of critical importance in this discussion, it is useful to consider a system with the following peripherals: A keyboard for user input, a display screen for output, a printer, and a number of storage devices for sequential files. On a modern system, these sequential files would usually be stored on one or more disk drives, but disk drives are complicated. In order to build a minimal system very quickly, we are better off talking of extremely simple I/O devices like the cassette tapes that were widely used with the first generation of personal computers in the 1970's or the open-reel magnetic tape drives of first-generation mainframes. Here is a picture of such a system.
                                   ________
 __________                     --| tape 1 |
|          |                   |  |________|
| display  |      __________   |   ________
|  screen  |-----|          |--   | tape 2 |
|__________|     | computer |-----|________|
               --|          |--
 __________   |  |__________|  |   _________ 
| keyboard |--                  --| printer |
|__________|                      |_________|
Figure 1: A dismayingly simple computer system.
In the early 1950s, such a system would have cost millions and occupied a large machine room, while by the mid 1980s, such a system would only cost a few hundred dollars. The operating system requirements for such a simple computer system do not depend on the tape technology being used, and they do not depend on whether the computer has a microprocessor or vacuum tubes in its heart.
The tape drives used by computers of the 1950s and 1960s used half-inch wide magnetic tape, wound on reels that were about a foot in diameter. The drives themselves were the size of a refrigerator, and large computer centers frequently had tape drives lined up along an entire wall of the center, dwarfing everything else. Cartoonists drawing computers from that era frequently focused on these gigantic tape drives as a visual symbol of computers and computer technology. Of course, it helped that the reels could be drawn as eyes and the loop of tape hanging between them could be drawn as a nose. an example
The run-time system services provided by a single-user operating system for a machine such as that illustrated in Figure 1 would typically allow for the following operations: input from any device other than the display screen, output to any device other than the keyboard, rewinding any tape drive, and loading a program from any tape drive. These services can be packaged in any of a number of ways, but since the loader service itself requires use of the input service, it is reasonable to think in terms of the hierarchic design shown below.
_____________________________________________ | | | command language interpreter | | | | __________\calls/________ | | | | | | | application | | | | programs | | | | | | |_\calls/_|___\calls/___ | | | | | | | loader | | | | | | | |________\calls/________|__\calls/__|_\calls/_| | | | input-output library | |_____________________________________________|Figure 2: A hierarchic design for a simple operating system.
The hierarchy shown in Figure 2 can be thought of in terms of subroutine call relationships, with the calling system component on top and the called component on the bottom. The picture also shows how the command language interpreter came to be thought of as a shell, as it wraps around the application programs, callint them and also calling on the services below.
The input-output library provides the services that all of the rest of the system relies on. The loader reads from files, so it needs the input output library, and it loads the application programs from those files in memory. The command language interpreter calls on the loader, but application programs may also load other applications, so long as there is enough free memory.
When a computer system has no protection mechanisms, that is, mechanisms to prevent application code from corrupting the operating system, calls from application code to system services and calls from the command language interpreter to user code are all generally done using the same subroutine call instructions that one part of an appliction might use to call another part of the same application.
On large computers with protection mechanisms, normal subroutine calls are only used to call procedures or functions within the same protection domain. Calls that cross domain boundaries frequently use entirely different mechanisms. This change of mechanism involves no change in the user's intent, but is only required in order to prevent unauthorized accesses from either calling or called routine to code or data belonging to the other. Because of this, we will begin by avoiding mention of protection and treating all system services as if they were simple subroutines that can be called by the same mechanisms used to call parts of user programs.
In many primitive systems, a different subroutine was provided for each system service. As a result, a user wishing to change a program to read from tape drive A to tape drive B would be forced to rewrite the program. In a remarkable number of cases, different devices actually used different character sets, so it was necessary to change the code that processed the input or generated the output when the change was made from outputting to the printer instead of outputting to the display screen or when a change was made from keyboard input to input from tape. Even in the mid 1950's, these extreme cases were the subject of ridicule, and programmers strove to create uniform interface standards.
For an example of such ridicule, see "How to Design a Kluge" by J. W. Granholm. Datamation, Feb. 1962. 149-155.
The FORTRAN language, developed in the mid 1950's by John Baccus at IBM, provides for a uniform input-output interface by addressing all input-output operations to abstract input-output units. All a programmer needs to do in order to change the unit being operated on is change the unit number. On a very crude system, unit 1 might be the keyboard, unit 2 might be the display, unit 3 might be the printer, and units 4 and up might be tape drives.
Baccus did not know anything about abstract data types; that concept was only formalized in the late 1960's, but despite this, it is clear that each input-output unit in Baccus's model is an instance of an abstract data type. To use modern terminology, each input-output unit is an instance of a polymorphic class. Instances of this class support the methods read and write, or perhaps put and get (plus some others), and the class is polymorphic because the implementations of these methods differs for each device. For example, the read from a keyboard is done by entirely different code than the read from a tape drive.
Baccus's use of numerical unit numbers suggests that, in the abstract, he required the programmer to view the available input-output units as being arranged in an array. Named devices would have been nice, but only after the mid 1960's did operating systems routinly support symbolic device names, and as file systems became common, symbolic file names. Naming devices and files requires that the system include a symbol table to relate names to devices or files; this might be called the device table, or on later systems, the directory of the file system.
If cost was no object, we might be tempted to invent a system input-output service along the lines of put(d,c) where d is a symbolic device name and c is a character to be output to that device. The problem with this idea is that it suggests that we have to look up the device by its symbolic name once for each character output to that device, and we do not generally want to pay the price of so many lookup operations.
In order to avoid excessive device-table lookup operations, we typically introduce a new operation, open that takes a symbolic file or device name and returns a descriptor for that file or device. In crude systems, the descriptor is nothing more than an old FORTRAN-style unit number, but in an abstract sense, an I/O descriptor is always the handle on an object that represents an I/O device that is capable of performing the basic I/O operations, usually given names such as read, write and rewind (the classic FORTRAN names) or put, get and seek (in the C tradition).
The opened file descriptor can be described as an abstract data type, as in the following figure:
    S -- the set of symbolic file names
F -- the set of file descriptors
C -- the set of characters
Functions on the sets
    open: S → F  (given a symbolic file name, returns an open file)
rewind: F → F
read: F → F × C
write: F × C → F
Informal descriptions
Figure 3: The file variable abstract data type.
In the above, two different concrete notations are shown for the calls to the read, write and rewind operations. The lefthand notation is a purely procedural notation, where read, write and rewind operate through side effects on their parameters; the righthand notation is an object-oriented notation where the operations are the methods of the class, and a device descriptor is a handle for objects in this class. Aside from the question of programming language syntax, these views are equivalent.
The input-output operations suggested in Figure 3 are sufficient for most purposes, but they are hardly convenient! If these are the only available services, programmers would be required to do all input-output one character at a time, and this poses both conceptual and practical problems. Most systems therefore provide additional services such as those suggested below:
Figure 4: Additional input-output routines.
In an object-oriented world, we can view the additional services shown in here as being provided by a class-extension to the basic set of services, but this presumes that all devices support the basic character sequential services as primitives, and this is not true! Devices such as the keyboard do indeed support character sequential input, and low-end printers support character sequential output. Most tape formats, however, are implemented as block sequential formats, with hardware interfaces that directly support operations analogous to readblock and writeblock. On these, the character oriented services would be logically an extension of the block oriented services!
Even when our underlying devices are character sequential, it is frequently a good idea to provide block-oriented I/O services as primitives on high performance devices. The reason for this is twofold. First, the overhead of one subroutine call per character being output can be significant, and second, some devices only perform well if there is a guarantee that data will arrive at a uniform rate. For example, consider a high-speed tape reader. To read one character, the device must start the tape moving, read, and then stop the tape. To read a block of characters, the tape can be started and left moving for the entire length of the block. As a result, using character oriented I/O to read a block of 100 characters might end up starting and stopping the tape 100 times, while using a block-oriented approach would only have one start and one stop. Mechanical start and stop operations can be very slow compared to the electronic operations of actually reading or writing a byte of data!
The Unix kernel does not provide character-at-a-time primitives for a different reason. Unix, like all modern operating systems, has a strong protection model that separates user programs from the system. System calls are significantly slower than calls to regular user subroutines, so if a program does I/O one character at a time, the overhead is considerable. The Unix kernel's I/O model forces the user to operate a block at a time. If you really need character-at-a-time I/O, you can do it by setting the block size to one byte, but the design of the Unix read() and write() services discourages this.
Note that the typical input-output primitives presented in here are significantly different from the primitives of even such ancient high level languages such as FORTRAN, not to mention Pascal, C, and more modern languages, in that no provisions are made for data format conversions. Most programming languages include an intermediate software layer between the operating system's input-output routines and the user for format conversion between textual representations of values and their internal representations. Thus, FORTRAN has its FORMAT statement, C has fprintf() and fscanf(), and object oriented languages tend to include standard methods for each predefined class to perform such conversions. However format conversion is presented to users, format conversion can be dismayingly complicated. Do you want to output this numeric value left or right justified in the field. Do you want leading zeros suppressed? How much precision should be shown after the point? What radix should be used? This is not an operating system issue.
Protection mechanisms can complicate the way in which a user views a file variable or open file descriptor, since the contents of this variable are, in a sense, the private property of the file system. In an unprotected environment, file variables can be records (or pointers to records, or objects) containing whatever information is necessary to access a particular device. In a protected environment, the system must be able to guarantee that no user can misuse the information in the file variable, even if the user is working in a language like C or assembly language where the language does not enforce type integrety.
The protection mechanisms of operating systems frequently protect the open file variables or file descriptors by removing this information from the address space of the user program. Instead of giving the user a pointer to the data structure describing an open file, the user is given a small integer that is the index into a small array of open file descriptions, where there is one array entry per file that user has open. This exactly matches the old Fortran model, with open files described by unit numbers, and it is, in fact, the model that is used in Unix.
On some systems, notably the various flavors of OS on the IBM 360/370/390 family of mainframes (now called enterprise servers) slots in this table have symbolic names; typical (and still widely used) names are SYSIN, SYSPRINT, and SYSPUNCH (nowdays, rarely applied to actual printers and card punches, but named after those historic devices). Since the table for any user is small, the cost of a linear search for a table entry is small enough that the open file table can be searched each time the user manipulates a file.
Some other older systems provide a radically different set of primitives; an extreme alternative is to have a single input-output system service which accepts a file control block as a parameter. This block is a record listing such things as the buffer address and size, the direction of the transfer, the type of the transfer (line or block), whether the file should be rewound before the transfer, the name of the file to be used, and space for the system to fill in additional information it may need for its own purposes. Typically, on the first call to this general input-output routine, the system looks up the file name in the directory and the system space in the block is initialized to contain the information which would otherwise reside in an opened file descriptor. All subsequent calls will be fast because they do not need to look up the file name. The risk in such systems is that the user might accidentally corrupt some of the system information in the file control block.
The smallest of systems sometimes offer only an absolute (non-relocating) loader, thus forcing all application code to be written with knowledge of where in memory it will eventually be loaded. Since relocating loaders are only slightly more complex than absolute loaders, most systems provide a relocating loader that can load applications in any available memory. Some larger systems provide, as a system service, a linking loader, but linking loaders are usually not system services, but must first be loaded themselves before they are run.
A typical loader must be passed the name of the file to be loaded or an already open file variable from which the object code should be read. Even when the loader can't handle relocation, it is useful to pass it the maximum and minimum addresses of the region into which it can load code without damaging the rest of the system. If it handles relocation, the minimum address is typically also the relocation base.
There are two models for loader use; in one, the load-and-go model, the loader itself transfers control to the loaded routine immediately as it finishes loading, and then returns to the caller only when the loaded routine returns. The alternative is for the loader to load the program and then return to its caller; in that case, it must return the starting address of the loaded code so that the code that called the loader may then call the loaded program. It is also useful for the loader to return the highest address it actually used, since this is likely to be the start of the memory region available for dynamic storage allocation. The load-and-go model suffices for the most common usage pattern, but the more general alternative allows for other usage patterns.
For our fictional example system, the loader system service described below will be assumed. This does not use the load-and-go model, but it is easy to construct a load-and-go service using this service as a foundation.
Figure 5: A loader system service.
The loader service described here relies on its caller to allocate memory for the loaded program, and it relies on its caller to handle any transfer of control to the loaded program. Typically, there will be some system wide convention about how the caller can transfer control to the program; a typical convention would require that the caller perform a procedure call to the starting address of the program, thus allowing the loaded program to return to the caller when it is done.
In systems where control is transferred to the loaded program by a goto instead of a call, the system must include a facility allowing the user to transfer control back to the caller when it is done. This is frequently done with a system service named end or exit; In systems that allocate activation records on a stack, this service also typically restores the stack to its state prior to entering the loaded program, and because of this, it is a useful way to terminate application programs when there is an exceptional condition.
Program termination by a call to a system service such as end or exit was especially important on systems which had very little memory. On such a system, it is common for the loader to load the new program directly into the memory which had been used by the caller, thus requiring the caller to be re-loaded when the loaded program finally terminates. The end or exit service could do this.
Small operating systems typically offered no storage management services at all. Such systems included the original MS-DOS in the early 1980s (the microprocessor era), or minicomputer operating systems of the late 1960s, or mainframe operating systems of the 1950s. On such systems, user programs had to include their own storage management. A common storage management discipline on such a system establishes all of memory as a stack; whenever a program wants to load another program, it loads it immediately beyond itself in memory, thus effectively pushing one program on the stack. If procedure calls are done using a stack, this may require a second stack, as is shown in the map of the memory usage on such a small system given below.
 -----------------------------------------------------------------------
| code for |   command   |  user   |  user   |//////////////| procedure |
|  system  |  language   | program | program |//// free ////|  calling  |
| services | interpreter |    1    |    2    |//////////////|   stack   |
-----------------------------------------------------------------------
                             push (load) new ---->      <---- push activation
                             programs here                    records here
Figure 6: Map of the memory use in a typical small system.
The figure shows the state of the system while user program 2 is running, where user program 2 was loaded and then called by user program 1, and user program 1 was loaded and called by the command language interpreter. In this kind of simple environment, it is useful for a program to be able to find the highest address it occupies so that it can tell where memory is available to load any programs it needs. To simplify this, we will assume that the linker implicitly defines the relocatable symbol codetop as the address beyond the end of each loadable file.
On larger operating systems, services are usually provided for allocating and deallocating memory, and the loader frequently makes direct use of these services to obtain memory for the relocatable parts of loaded programs. The form that these services take is frequently strongly dependent on the structure of the system hardware; thus, these services are frequently quite different from the storage management services offered to user programs in high level languages (eg. Pascal new and dispose, C malloc and free, or C++ and Java object creation).
The use of loader system services by one user program which loads and calls another is very similar to the use of the same services when the main segment of a user program calls a procedure in one of its overlays. As a result, the explicit use of loader system services is sometimes described as a form of overlay management. An alternate view of this is that a user program which loads another and then calls it is making use of run-time linkage or dynamic binding, as opposed to pre-run-time linkage such as typically done with a linkage editor.
The simple system design discussed above explains why the designers of the PDP-11 computer opted to make the push operation on that CPU's system stack decrement the stack pointer. The Intel 8080 copied this and many other design decisions of the PDP-11, and from this, the x86 family inherited this.
When the command language interpreter loads a user program, or when one program loads another, a new problem arises: How can the newly loaded program be told why it was loaded and what it is to do? Some early systems did little to help solve this problem, so a program which was loaded was forced to open a communications channel with the user and ask why it was run and what it was to do. This made it very hard to write general utility programs that could be run interactively by a user one time, and run automatically by another program at a different time.
Since the 1960's, a common solution to this problem has been to introduce the concept of parameters to an executable program. This solution implicitly underlies the design of the Pascal language, where each program has a heading that lists, as parameters, the files 'input', 'output', and optionally others. Thus, it is reasonable to think of a Pascal compiler translating the heading of a program as illustrated below.
program p( input, output, another );Equivalent C function heading:
#include <stdio.h> void p( FILE *input, FILE *output, FILE *another )
Figure 7: Program parameters.
This approach allows the user of a program to set up and open the files used by that program before running it, whether that user is another program or a person running the program via the command language interpreter.
Another approach to solving this problem is sometimes used on systems which maintain a table of opened files on behalf of each user. Under the classic OS operating system on the IBM 360/370/390 series of machines, where the opened file table has symbolic names for its slots, the slots named SYSIN and SYSPRINT are usually used for standard input and output by most programs, so a program which runs another program can first reset these files appropriately in order to change where input-output is directed when the new program is run. Unix uses a very similar scheme; in Unix and the many systems that have copied ideas from it, the integer unit numbers 0, 1 and 2 are used as the indices of the file table entries for standard input, standard output, and error messages, respectively. A program which wishes to run another program on particular input-output files can first close these units and then re-open them on the new set of files.
The C programming language, based on the conventions of Unix, assumes that open files are passed using an open file table. In addition, the structure of the exec() service in the earliest 16-bit version of Unix passed just one array of strings, argv. As C developed, the parameters to the main() function of a C program were confined to argv and environ, with the residual and redundent argc thrown in for compatability with earlier versions.
The Windows API copies a number of features of Unix, and many programming languages designed after C retain compatability with the C worldview. As a result, for example, Java still retains the notion of a main() method that takes an array of strings conventinally called args as its parameter, with the standard files in, out and err passed through entirely different mechanisms to package System. There is hardly any innovation from the era of FORTRAN unit numbers to the file numbers of Unix and the conventions for launching a modern Java program.
#include <stdio.h>
#include <math.h>
int main( int argc, char * argv[] )
{
    int i;
    printf( "argc = %d\n", argc );
    for (i = 0; i < argc; i++)
        printf( "argv[%d] = %s\n", i, argv[i] );
    return 0;
}
The command line used to execute this program
a.out runs the programThe output when the program is run
argc = 4 argv[0] = a.out argv[1] = runs argv[2] = the argv[3] = program
Figure 8: Textual parameters to a C program
The example program above simply tests the passing of textual parameters to the main program by printing the parameter count and the parameters. The parameter count is passed as the first parameter, conventionally called argc, while the second parameter is an array of strings, conventionally called argv (the argument vector). The convention is that the first parameter, argv[0], is always the name of the program, and as a result, argc always seems to be one more than the number of parameters.
The implementation of this is straightforward. Prior to calling the function named main in a C program, the loader (or the program that called the loader) pushes each of the textual arguments on the stack, and after all of the arguments, it pushes the array of pointers to the arguments. Finally, after these, it pushes the count of arguments and a pointer to this array, as illustrated below:
                                               array of pointers   actual
                                                  to strings     parameters
    | <-- stack top after return from program|-------------------|-------|
 ________________________________________________________________________
    | a.out\0 | runs\0 | the \0 | program \0 | o | o | o | o | \0| 4 | o |
 ___|_^_______|_^______|_^______|_^__________|_|_|_|_|_|_|_|_|___|___|_|_|
      |         |        |        |____________|___|___|___|           |
      |         |        |_____________________|___|___|               |
      |         |______________________________|___|                   |
      |________________________________________|                       |
                                               ^                       |
                                               |_______________________|
Figure 9: The Unix/C model of how the call to a main program is implemented.
The example here also illustrates a feature of the C and Unix environments that many programmers find to be very obscure. The main program in C is not a procedure, but rather, it is an integer function. By convention, returning zero indicates a normal exit, while a nonzero value such as -1 indicates an error. C (or C++) programs may also terminate by calling exit(v) to return the value v without executing all the way to the end of the main program; this is like using the return command to escape prematurely from a function.
In a point-and-click windowing environment, programs also have parameters! In many ways, such environments are just another kind of command language. In such an environment, clicking on the icon for an object code file is a request to load and run the program in that file with no parameters, while dragging and dropping the icon for one file onto the icon for another is a request to load and run the program in the second file, passing the first file as a parameter. Typically, the part of the window manager that supports drag-and-drop execution passes the textual name of the file that was dragged and dropped as a parameter to the application.
If we were designing a new system without reference to history, the natural thing to do would be to allow arbitrary parameters to be passed to the programs that any other program elects to launch. Once you load a program into memory, why not allow any parameters to be passed. Programs designed to be launched from the command language interpreter (shell) would have to be written to accept parameters in the form that the command language passes, but if you write a program that intended to be loaded and launched by another of your programs, you should be able to pass any parameters you want, open files, strings, floating point numbers, or objects of any class that both the caller and called program agree on.
The reader should be familiar with the following terms after reading this section:
run-time services storage management command languages dynamic linkage supervisor call instructions run-time binding opened file descriptors program parameters file variables command language procedures input-output primitives Unix shell file control blocks IBM job control language
In addition, the reader should be familiar with the following system services of the example operating system:
open writeblock rewind readline read writeline write loader readblock codetop
a) readblock, as defined in Figure 4.
b) writeblock, as defined in Figure 4.
c) readline, as defined in Figure 4.
d) writeline, as defined in Figure 4.
You should probably write these in an informal pseudocode and not in a formal high level language.
a) When you call the loader, what values would you use for the loader parameters b and t.
b) How would you detect that the system is out of memory, and when would you make checks for this situation? Assume that the stack has one entry pushed on it each time a procedure or function is called, and that this entry holds all local variables and any other memory needed by that procedure.
The subjects discussed in this chapter are covered to a remarkably uneven extent in the literature. Many authors seem to consider system command languages to vary so extremely that there is little that unifies the command languages of different systems, while others consider the subject to be so elementary that it does not require discussion. The discussion of command languages in Section 4.4 of
may be helpful. Section 7.4 of
is also a reasonable source.
For a discussion of the classical IBM 360/370/390 job control language, see
This discussion starts in Section 2.3.2, and continues in Section 6.4.1. Chapter 7 contains a discussion of system services, although the term is used slightly differently than it is here; this chapter includes a discussion of the command language interpreter.
For information on Unix, see
Chapter 5, contains an extensive discussion of the Unix shell. The official description of the Unix shell is included in section sh(1) of the
Sections open(2), read(2), write(2), and lseek(2) provided much of the inspiration for the basic input-output services discussed in this chapter. Section exec(2) documents the way in which the Unix loader is called; this was not used as a pattern for the loader service described here, but is an interesting contrast.
Any Unix system should support the man command, and should have, on line, a complete copy of the Unix Programmers' Manual, updated appropriately for that system. Any command that is documented as command(x) can be looked up using the command man x command.
The classic introduction to the Unix system and a basic introduction to the system services and shell is included in the paper
For a look at the kind of effort required to retrofit old operating systems with small memory models to support a reasonably modern view of the kind of dynamic linking suggested by the example in Figure 8.10, see
This paper also includes a small example showing the different programs in a real user's environment with an indication of which programs dynamically load and call which others.
The Xerox corporation undertook an interesting venture in the 1970's, investigating the automated office of the future. They decided to abandon all financial good sense and imagine an era when computers were inexpensive enough that each office worker could have their own computer on their desk. Cost was no object, so they imagined each office worker having a machine with a full 64K bytes of main memory, a disk drive, and a graphics display screen, and they imagined all these machines networked, with a laser printer on the net.
The prototype, called the Alto, after the Xerox Palo Alto Research Center, was a very expensive machine. They added a mouse to it, the idea came from Stanford Research Institute, and they developed a new network protocol for the network, called Ethernet. The keyboard, display screen and mouse sat on the desktop, while the electronics filled a small 19-inch wide relay rack next to the desk.
The next question is, what to do with this hardware. Several tens of these machines were distributed to workers at PARC and eventually also at Xerox Webster Research Labs in New York. Several different uses were found for the machine. One branch of development pursued the Smalltalk language, while another looked at how to do document formatting on these machines.
One result of this work was the idea of the WIMP model of user interface, that is, the Window Mouse Icon Pointer model. This model now dominates all desktop computers, and it has its roots in earlier graphics systems, but it reached its first maturity at Xerox PARC on the Alto. Steve Jobs took a tour of PARC and responded by sending Apple down the same road with the Apple Lisa and then the Apple Mac. When Bill Gates saw these, he sent Microsoft along the same road with the development of Microsoft Windows. The user experience on all of these modern systems echoes what first emerged on the Xerox Alto.
As anyone who has used a Mac or Windows PC knows, the basic idea of the WIMP user interface is that the screen is a desktop, with icons on the screen representing objects you are able to use or manipulate. Clicking on an object on this desktop typically opens a window allowing you to manipulate the object.
The application launched depends on the type of the object. If you click on a word-processing document, you launch the word processor. If you click on a directory object, you launch the directory manager. If you click on a JPG object, you launch an image viewer, and so on.
This basic WIMP model suggests something of an object-oriented world view, where each file is an object, in the object-oriented programming sense of the word. All file objects are therefore required to have a "click on me" method that opens the window to manipulate that object.
This viewpoint has its limits, but it immediately asks: What about other methods of the object? Object-oriented programming where each class has just one method is a very limited view. There are several ways WIMP user interfaces have dealt with this:
More mouse buttons: Left click and right click can call two different methods, or single click and double click. So long as there are only two or three things anyone might want to do with an object, this works.
Method select menus: Clicking on an icon causes a popup or pulldown menu to appear listing the available operations on the object.
Combinations of the above: Clicking the left button invokes the most common method, while clicking the right button opens the menu of other methods.
Drag and drop: Instead of clicking on an icon, drag it to another icon and drop it there. This launches a method of the destination icon passing the dragged object to it as a parameter.
In a conventional file system, each file has two attributes: A name and a value, where the value is the entire contents of the file.
In systems like Linux, file names are entirely uninterpreted by the system, but users quickly adopted the convention of putting suffixes on file names to indicate their use.
Additional headings emerged on other systems, where suffixes were mandatory:
On some systems, some of these suffixes have rigidly determined meanings, while on other systems such as Linux, they remain advisory.
This system of adding attributes to files as extensions of the file name is very limited. In the context of the WIMP world view, it is entirely inadequate.
In the WIMP world, each file needs the following attributes:
This list led Apple and some later WIMP GUI developers to decide that each file in the MacOS file system should have two parts, a resource fork and a data fork. The resource fork of a file is for listing all these attributes, while the data fork is the actual content of the file. In theory, the resource fork should be short, but it can be significant.
There is another issue. Where on the desktop should each icon be displayed. It is fair to ask, is this an attribute of the file, or is it an attribute of the desktop. Does moving a file change it, or does it merely change a record of the desktop. The usual conclusion is that the location belongs to the desktop, not to the file.
One option is to augment every directory entry with coordinate information, so that when that directory is viewed, the objects in that directory are displayed with coordinates taken from the directory. In this case, the desktop itself is naturally a kind of directory.
Given directory entries containing coordinates, and given that each file has a resource fork, we can construct the core of a desktop manager as follows:
        /* first plot, or replot the screen */
        for each directory entry e {
                file f = e.file
                at e.coordinates
                display f.resources.icon
        }
        /* then process mouse clicks */
        loop {
                await mouse click c
                for each directory entry e {
                        if c.coordinates is near enough to e.coordinates {
                                file f = e.file
                                launch application f.resources.click
                                passing f to it as a parameter
                        }
                }
        }
The above algorithm is, at heart, no different from a shell. The only difference is that instead of reading lines of shell commands, it responds to sequences of mouse clicks.