Lecture notes for 22C:50 Introduction to System Software, by Douglas W. Jones, University of Iowa Department of Computer Science
If it were not for system software, all programming would be done in machine code, and application programs would directly use hardware resources such as input-output devices and physical memory. In such an environment, much of a programmer's time would be spent on the relatively clerical problems of program preparation and translation, and on the interesting but unproductive job of reinventing effective ways to use the hardware. System software exists to relieve programmers of these jobs, freeing their time for more productive activities. As such, system software can be viewed as establishing a programming environment which makes more productive use of the programmer's time than that provided by the hardware alone.
The term programming environment is sometimes reserved for environments containing language specific editors and source level debugging facilities; in this text, the term will be used in its broader sense to refer to all of the hardware and software in the environment used by the programmer. All programming can therefore be properly described as taking place in a programming environment.
Programming environments may vary considerably in complexity. An example of a simple environment might consist of a text editor for program preparation, an assembler for translating programs to machine language, and a simple operating system consisting of input-output drivers and a file system. Although card input and non-interactive operation characterized most early computer systems, such simple environments were supported on early experimental time-sharing systems by 1963.
See: J. McCarthy, et al. A Time-Sharing Debugging System for a Small Computer. Proceedings of the 1963 Spring Joint Computer Conference, AFIPS Conference Proceedings 23. Pages 51 to 57.
Although such simple programming environments are a great improvement over the bare hardware, tremendous improvements are possible. The first improvement which comes to mind is the use of a high level language instead of an assembly language, but this implies other changes. Most high level languages require more complicated run-time support than just input-output drivers and a file system. For example, most require an extensive library of predefined procedures and functions, many require some kind of automatic storage management, and some require support for concurrent execution of multiple parts of a program.
Many applications require additional features, such as window managers or elaborate file access methods. When multiple applications coexist, perhaps written by different programmers, there is frequently a need to share files, windows or memory segments between applications. This is typical of today's electronic mail, database, and spreadsheet applications, and the programming environments that support such applications can be extremely complex, particularly if they attempt to protect users from malicious or accidental damage caused by program developers or other users.
A programming environment may include a number of additional features which simplify the programmer's job. For example, library management facilities are frequently included, allowing programmers to extend the set of predefined procedures and functions with their own routines. Source level debugging facilities, when available, allow run-time errors to be interpreted in terms of the source program instead of the machine language actually run by the hardware. As a final example, the text editor may be language specific, with commands which operate in terms of the syntax of the language being used, and mechanisms which allow syntax errors to be detected without leaving the editor to compile the program.
See, for example, T. Teitelbaum and T. Reps. The Cornell Program Synthesizer: A Syntax-Directed Programming Environment. Communications of the ACM 24, 9 (September 1981) 563-573.
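As a rough illustration of the source level debugging facilities mentioned above, the following sketch shows one way a debugger might interpret the machine address of a run-time error in terms of the source program. It assumes the compiler has left behind a table giving the starting address of the code generated for each source line. The table contents, the addresses, and the names LINE_TABLE, CODE_END and source_line are all invented for this example; real debuggers rely on considerably richer information.

```python
from bisect import bisect_right

# Hypothetical line table, as a compiler might emit it: each entry gives the
# machine address of the first instruction generated for a source line.
# Entries are sorted by address; all values are invented for illustration.
LINE_TABLE = [
    (0x1000, 10),
    (0x1008, 11),
    (0x1014, 12),
    (0x1020, 14),
]
CODE_END = 0x1030  # assumed first address past the end of the compiled code


def source_line(address):
    """Return the source line whose code contains address, or None."""
    if address < LINE_TABLE[0][0] or address >= CODE_END:
        return None  # the address is outside the compiled program
    starts = [start for start, _ in LINE_TABLE]
    # Find the last table entry whose starting address is <= address.
    index = bisect_right(starts, address) - 1
    return LINE_TABLE[index][1]


if __name__ == "__main__":
    fault = 0x1018  # hypothetical address of a run-time error
    print(f"fault at {fault:#x} is in source line {source_line(fault)}")
```

With such a table, an error reported by the hardware at address 0x1018 can be presented to the programmer as an error in line 12 of the source program.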
In all programming environments, from the most rudimentary to the most advanced, it is possible to identify two distinct components: the program preparation component and the program execution component. On a bare machine, the program preparation component consists of the switches or push buttons by which programs and data may be entered into the memory of the machine; more advanced systems supplement this with text editors, compilers, assemblers, object library managers, linkers, and loaders. On a bare machine, the program execution component consists of the hardware of the machine: the central processors, any peripheral processors, and the various memory resources; more advanced systems supplement this with operating system services, libraries of predefined procedures, functions and objects, and interpreters of various kinds.
Within the program execution component of a programming environment, it is possible to distinguish between those facilities needed to support a single user process, and those which are introduced when resources are shared between processes. Among the facilities which may be used to support a single process environment are command language interpreters, input-output, file systems, storage allocation, and virtual memory. In a multiple process environment, processor allocation, interprocess communication, and resource protection may be needed. Figure 1.1 lists and classifies these components.
        Editors
        Compilers
        Assemblers                      Program Preparation
        Linkers
        Loaders
    ========================================================
        Command Languages
        Sequential Input/Output
        Random Access Input/Output
        File Systems                    Used by a Single Process
        Window Managers
        Storage Allocation
        Virtual Memory
    ------------------------------      Program Execution Support
        Process Scheduling
        Interprocess Communication
        Resource Sharing                Used by Multiple Processes
        Protection Mechanisms

Figure 1.1. Components of a programming environment.
This text is divided into three basic parts based on the distinctions illustrated in Figure 1.1. The distinction between preparation and execution is the basis of the division between the first and second parts, while the distinction between single process and multiple process systems is the basis of the division between the second and third parts.
Historically, system software has been viewed in a number of different ways since the invention of computers. The original computers were so expensive that their use for such clerical jobs as language translation was viewed as a dangerous waste of scarce resources. Early system developers seem to have consistently underestimated the difficulty of producing working programs, but it did not take long for them to realize that letting the computer spend a few minutes on the clerical job of assembling a user program was less expensive than having the programmer hand assemble it and then spend hours of computer time debugging it. As a result, by 1960, assembly language was widely accepted, the new high level language, FORTRAN, was attracting a growing user community, and there was widespread interest in the development of new languages such as Algol, COBOL, and LISP.
Early operating systems were viewed primarily as tools for efficiently allocating the scarce and expensive resources of large central computers among numerous competing users. Since compilers and other program preparation tools frequently consumed a large fraction of an early machine's resources, it was common to integrate these into the operating system. With the emergence of large scale general purpose operating systems in the mid 1960's, however, the resource management tools available became powerful enough that they could efficiently treat the resource demands of program preparation the same as any other application.
The separation of program preparation from program execution came to pervade the computer market by the early 1970's, when it became common for computer users to obtain editors, compilers, and operating systems from different vendors. By the mid 1970's, however, programming language research and operating system development had begun to converge. New operating systems began to incorporate programming language concepts such as data types, and new languages began to incorporate traditional operating system features such as concurrent processes. Thus, although a programming language must have a textual representation, and although an operating system must manage physical resources, both have, as their fundamental purpose, the support of user programs, and both must solve a number of the same problems.
The minicomputer and microcomputer revolutions of the mid 1960's and the mid 1970's involved, to a large extent, a repetition of the earlier history of mainframe based work. Thus, early programming environments for these new hardware generations were very primitive; these were followed by integrated systems supporting a single simple language (typically some variant of BASIC on each generation of minicomputer and microcomputer), followed by general purpose operating systems for which many language implementations and editors are available, from many different sources.
The world of system software has varied from the wildly competitive to domination by large monopolistic vendors and pervasive standards. In the 1950's and early 1960's, there was no clear leader and there were a huge number of wildly divergent experiments. In the late 1960's, however, IBM's mainframe family, the 360, running IBM's operating system, OS/360, emerged as a monopolistic force that persists to the present in the corporate data processing world (the IBM 390 is the current flagship of this line, running the VM operating system).
The influence of IBM's near monopoly of the mainframe marketplace should not be underestimated, but it was not total, and in the emerging world of minicomputers, there was wild competition in the late 1960's and early 1970's. The Digital Equipment Corporation PDP-11 was dominant in the 1970's, but never threatened to monopolize the market, and there were a variety of different operating systems for the 11. In the 1980's, however, variations on the UNIX operating system originally developed at Bell Labs began to emerge as a standard development environment, running on a wide variety of computers ranging from minicomputers to supercomputers, and featuring the new programming language C and its descendant C++.
The microcomputer marketplace that emerged in the mid 1970's was quite diverse, but for a decade, most microcomputer operating systems were rudimentary, at best. Early versions of Mac OS and Microsoft Windows presented sophisticated user interfaces, but on versions prior to about 1995 these user interfaces were built on remarkably crude underpinnings.
The marketplace of the late 1990's, like the marketplace of the late 1960's, seems to be dominated by a monopoly, this time in the form of Microsoft Windows. The chief rivals are Mac OS and Linux, but there is yet another monopolistic force hidden behind all three operating systems, the pervasive influence of UNIX and C. Mac OS X will be fully UNIX compatible. Windows NT offers full UNIX compatibility, and so, of course, does Linux. Much of the serious development work under all three systems is done in C++, and new languages such as Java seem to be simple variants on the theme of C++. It is interesting to ask: when will we have a new creative period in which genuinely new programming environments are developed, the way they were on the mainframes of the early 1960's or the minicomputers of the mid 1970's?
The goal of this text is to provide the reader with a general framework for understanding all of the components of the programming environment. These include all of the components listed in Figure 1.1. A secondary goal of this text is to illustrate the design alternatives which must be faced by the developer of such system software. The discussion of these design alternatives precludes an in-depth examination of more than one or two alternatives for solving any one problem, but it should provide a sound foundation for the reader to move on to advanced study of any components of the programming environment.