Chapter 15, Procedures, Coroutines and Processes

Lecture notes for 22C:116 Introduction to System Software, by Douglas W. Jones, University of Iowa Department of Computer Science

The Problem

Although some simple systems support only one process, most operating systems support multiple processes. The term process is used to refer to an independent execution path. Thus, for example, each user of a multi-user system will typically have a process responding to activity at their terminal. The terms task and thread of control are sometimes used as synonyms for process. Multiple processes are essential when a machine supports multiple processors, but they are useful even on single processor machines. This chapter introduces the process concept with no attention to how processes are implemented. An understanding of processes must rest on the distinction between code and an activation of that code, and this is most easily introduced in a more familiar context, that of procedures and functions.

If storage is statically allocated, that is, if the code of a program includes the actual address of each variable being referenced, then each piece of code can have only one activation at a time. If the initial values of these statically allocated variables are established at load time, then the code must be re-loaded prior to each use. If, on the other hand, the code itself initializes its variables, the code can be re-used and is said to be serially reusable.

When the variables used by a block of code are not allocated as part of that block, but are separately allocated, the code may be reentrant or recursive. In this case, some index register must typically be reserved to point to the block holding the variables, and this register must be used with each reference to a variable in the block.

A new block must be allocated with each call to a recursive procedure. As a result, there may be many blocks for such a procedure, one for each activation of the code which has not yet returned to its caller. If two processes in a multiple process system both wish to use the same procedure, they may use it independently if each allocates a different block for the variables in the code; in this case, the code is described as being reentrant.
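The idea of reentrant code can be sketched in Python (an illustrative analogy, not part of the notes' machine model): the code is shared, but each activation supplies its own separately allocated block of variables, here a dict standing in for the block addressed through a reserved index register.

```python
# Sketch: reentrant code keeps no state of its own; each caller
# supplies a separate "variable block" (here, a dict), analogous to
# the block addressed through a reserved index register.

def make_block():
    # allocate a fresh block of variables for one activation
    return {"count": 0}

def counter_step(block):
    # shared, reentrant code: it touches only the caller's block
    block["count"] += 1
    return block["count"]

a = make_block()   # variable block for user A
b = make_block()   # variable block for user B
counter_step(a)
counter_step(a)
counter_step(b)
print(a["count"], b["count"])  # → 2 1: the activations are independent
```

Because the shared code never names a fixed storage location, two users stepping their own blocks cannot interfere, which is exactly the property that makes code reentrant.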

Procedure Calling Sequences and Activation Records

A block of code is said to be active while it is being executed. An activation record is a data structure which describes a single activation of a procedure, function or other block of code. The activation record of a block holds the values of all variables local to the block, and it holds sufficient information to establish an environment for interpreting references global to that block. In the context of procedures and functions, the times at which the activation records are allocated and deallocated determine the lifetimes of the local variables and whether or not the procedures or functions are allowed to be recursive.

Older languages such as FORTRAN had ill-specified local variable lifetimes, and most implementations of such languages use static allocation of activation records. As a result, classic FORTRAN implementations do not allow recursion, and local variables retain their values from one call to a procedure to the next. In contrast, the languages descended from Algol, including Pascal, C, Ada and Java, all allocate a new activation record with each call to a procedure and deallocate it on return. Thus, local variables in such languages do not retain their values from one call to the next but recursion is allowed.
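The contrast in local-variable lifetimes can be imitated in Python (a sketch; the mutable-default trick is only an analogy, not a FORTRAN implementation): a mutable default argument is allocated once, like a statically allocated activation record, while an ordinary local is reallocated on every call.

```python
# Statically allocated local: the default list is created once, so its
# contents survive from one call to the next (FORTRAN-style lifetime).
def static_style(history=[]):
    history.append(len(history))
    return list(history)

# Dynamically allocated local: a fresh list on every call, discarded
# on return (Algol-style lifetime).
def dynamic_style():
    history = []
    history.append(len(history))
    return history

print(static_style(), static_style())    # state accumulates across calls
print(dynamic_style(), dynamic_style())  # state is lost on each return
```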

When one procedure or function calls another, the activation of that procedure is not considered to be terminated; instead, it is considered to be suspended until the called procedure or function returns. As a result, when a procedure calls itself recursively, there will be a number of coexisting activations of that procedure, each with its own activation record. All but one of these activations will be suspended awaiting the termination of others, and the one which is not suspended will be the one which was called most recently.

The sequence of machine instructions which is used to transfer control to a procedure is called the calling sequence for that procedure. The code at the head of the called procedure to which the call transfers control is called the receiving sequence.

Other terms, such as register save sequence and entry sequence, are sometimes used. Finally, the code at the end of the procedure which returns control to the caller is called the return sequence. These sequences of instructions are strongly dependent on each other, so it is common to use the term calling sequences in a more generic sense to refer to all of them. The remainder of this section illustrates the concept of activation record management with a discussion of the calling sequences which might be used for procedures on a typical computer.

Most computers have a special procedure calling instruction which will be at the heart of any calling sequence. For the examples used here, the JSR instruction will be used, a mnemonic for jump to subroutine. The basic resources of the example machine and the semantics of the JSR, JMP and MOV instructions are given in Figure 15.1.

type { basic resources }
     word = integer;
     address = integer;

var  { registers in the central processor }
     R: array [0...] of word;
     PC: address;

     { main memory }
     M: array [address] of word;

{ definitions of instructions }

procedure JSR( var link: word; destination: address );
begin { jump to subroutine }
     link := PC;
     PC := destination;
end { JSR };

procedure JMP( destination: address );
begin { jump }
     PC := destination;
end { JMP };

procedure MOV( source: word; var destination: word );
begin { move data }
     destination := source;
end { MOV };

Figure 15.1. Formal definition of a partial machine architecture.

It will be assumed that the example machine has multiple general registers (the particular number of registers is unimportant), and it will be assumed that the basic process performed by the central processor is to repeatedly fetch instructions from M[PC], increment PC, and then execute the instruction as suggested by the procedural descriptions of the instruction semantics given in Figure 15.1.

As an example of a simple instruction, MOV(R[1],R[2]) will move the contents of the register R[1] into the register R[2]. Note that, in the examples given here, it will be assumed that the assembler uses a Pascal or C like syntax to specify operand addressing modes.

The JSR instruction first moves the old value of the program counter register PC into a linkage register (or memory location) specified by the programmer and then it transfers control to the destination address. Because of the convention that the program counter is incremented after fetching the instruction and prior to execution, the saved value of the program counter refers to the instruction immediately following the JSR instruction; that is, it is the return address.
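The semantics of Figure 15.1 can be checked with a small simulation, given here in Python (the class name, register count and memory size are assumptions for illustration); note how the convention that PC is incremented before execution makes the saved link value the return address.

```python
class Machine:
    """Toy model of the machine of Figure 15.1 (a sketch; the register
    count and memory size are arbitrary assumptions)."""
    def __init__(self):
        self.R = [0] * 8       # general registers
        self.PC = 0            # program counter
        self.M = [0] * 256     # main memory

def JSR(m, link, destination):
    # jump to subroutine: link names a register here (a memory cell
    # works the same way); PC was already incremented past the JSR
    m.R[link] = m.PC
    m.PC = destination

def JMP(m, destination):
    # jump: transfer control unconditionally
    m.PC = destination

m = Machine()
m.PC = 101                     # the word after a JSR fetched from 100
JSR(m, 3, 200)                 # call the subroutine at 200, link in R[3]
print(m.R[3], m.PC)            # → 101 200: the saved link is the return address
JMP(m, m.R[3])                 # the return sequence of Figure 15.2
print(m.PC)                    # → 101: control resumes after the call
```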

The JSR instruction described here is very similar to the corresponding instructions on the IBM 360/370/390, the PowerPC, the MIPS R series microprocessors used by Silicon Graphics, and other modern RISC machines. Many older architectures, notably the classic Intel microprocessors, the widely studied DEC PDP-11 and VAX architectures, and the old Motorola 680x0 microprocessors, have procedure call instructions which are far more complex.

All calling sequences on the example machine will have a structure which is similar to that shown in Figure 15.2.

; calling sequence
  -- save any important register values
  -- place parameters in agreed upon registers
  JSR( R[link], <destination> )
  -- get results from agreed upon registers
  -- restore any important register values

; receiving sequence
  MOV( R[link], M[linksave] )
  -- save parameters (if needed)

; return sequence
  -- get parameters (if needed)
  JMP( M[linksave] )

Figure 15.2. A skeletal calling sequence.

Lines starting with -- in Figure 15.2 indicate sequences of additional instructions which might have to be added when this skeletal calling sequence is used as part of a real program. The particular register used for linkage is of no importance, and the memory location in which the linkage register is saved depends on the procedure being called. Some calling sequences allow the use of a linkage register to be bypassed entirely by having the procedure call instruction save the return address directly in memory, for example, on a stack.

The skeletal calling sequence shown in Figure 15.2 is typical of the sequences used for hand-coded assembly language procedures with small numbers of parameters and statically allocated activation records. If large numbers of parameters are to be passed, data structures must be constructed in memory to hold them, and if activation records are dynamically allocated, code must be added to the calling, receiving and return sequences to take care of this.

The division of responsibility between the calling, receiving, and return sequences may vary considerably. Activation records may be allocated by either the calling sequence or the receiving sequence, and they may be deallocated by either the return sequence or the part of the calling sequence after the JSR instruction.

Activation records for languages such as Pascal, C, C++ and Java can be allocated on a stack, since they are allocated in a strictly last-in first-out order. Although most language implementations use a stack, there is no requirement that a stack be used for this. It would be perfectly practical to allocate activation records on a heap, using general purpose allocate and deallocate routines, but calls to these routines must use a special calling sequence and their activation records must be allocated specially, since we cannot call the allocate routine in order to allocate an activation record for itself.

In addition to the local variables of the called procedure, each activation record must contain at least two other items: the identity of the activation record of the caller, and information about how to return control to the caller. These fields have many different common names; the former is typically called the dynamic link or return link, while the latter is called either a return address or a continuation point. Unfortunately, the latter two terms are not interchangeable. Both refer to the saved value of the linkage register, that is, the address in the calling routine where execution is to resume when the called routine returns. If the linkage register is saved in the called routine's activation record, it is called the return address; if it is saved in the calling routine's activation record, it is called the continuation point.

A typical activation record format using continuation points is shown in Figure 15.3.

       |        |
       |        |
       |========|  <------ bottom of activation record
       |  link  |  the return link
       |--------|
       |  cont  |  the continuation point
       |--------|
       |   .       parameters (if any)
           .
           .    |
       |--------|
       |   .       local variables (if any)
           .
           .    |
       |--------|
       |   .       temporary variables (if any)
           .
           .    |
       |========|  <------ top of activation record
       |        |

Figure 15.3. The structure of an activation record.

During the execution of any procedure, some register must be reserved to point to the current activation record. This will be called R[local] in the examples used here, since this register is used as the index register for the local variables of the current procedure. Other names for this register abound, among them, FP, the frame pointer register; this name comes from the common use of the term stack frame to refer to an activation record allocated on a stack. The constants link and cont will be used for the offsets of the return link and continuation point fields of the activation record, so M[R[local]+link] and M[R[local]+cont] will be the memory locations holding the return link and continuation point of the current activation record.
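The addressing arithmetic can be made concrete with a short Python sketch (the offsets and addresses are invented for illustration): with memory as a flat array, the link and cont fields are reached by fixed displacements from the record base held in R[local].

```python
# Memory as a flat array of words, with an activation record laid out
# as in Figure 15.3: the return link and continuation point sit at
# fixed offsets from the record's base address.
LINK, CONT = 0, 1              # assumed field offsets, for illustration

M = [0] * 64
caller, callee = 8, 16         # base addresses of two activation records
M[callee + LINK] = caller      # dynamic link back to the caller
M[caller + CONT] = 42          # the caller will resume at address 42

# the heart of a return sequence: follow the link, then jump
local = callee                 # R[local] addresses the current record
local = M[local + LINK]        # MOV( M[R[local]+link], R[local] )
PC = M[local + CONT]           # JMP( M[R[local]+cont] )
print(local, PC)               # → 8 42: back in the caller, at its cont
```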

The activation record illustrated in Figure 15.3 includes no assumptions about the use of a stack. While most modern languages push activation records onto a stack, nothing requires this! The calling sequence given in Figure 15.4, for example, makes no assumptions about the use of a stack.

; special calling sequence for allocate
  -- put size of desired block in R[temp]
  JSR( M[alloclink], allocate )
  -- pointer to allocated block is returned in R[temp]

; special calling sequence for deallocate
  -- put pointer to deallocated block in R[temp]
  JSR( M[alloclink], deallocate )

; regular calling sequence
  -- save any important register values in caller's activation record
  MOV( <destination activation record size>, R[temp] )
  JSR( M[alloclink], allocate )
  -- insert parameter values in new activation record
  MOV( R[local], M[R[temp]+link] )
  JSR( M[R[local]+cont], <destination> )
  JSR( M[alloclink], deallocate )

; receiving sequence
  MOV( R[temp], R[local] )

; return sequence
  MOV( R[local], R[temp] )
  MOV( M[R[temp]+link], R[local] )
  JMP( M[R[local]+cont] )

Figure 15.4. A calling sequence with activation records on the heap.

The calling sequence given in Figure 15.4 uses calls to allocate and deallocate to manage activation records on the heap; as noted previously, the calling sequences for these routines must be special. As coded here, they use a special dedicated location, M[alloclink] to store their return address, and they presumably have statically allocated activation records. This calling sequence requires that the caller do most of the work! The result is receiving and return sequences which are relatively simple, and the code to move parameter values into place can be simple because the activation record of the caller is available for computing parameter values at the same time that the activation record of the called routine is available for storing them.

One disadvantage of the calling sequence given in Figure 15.4 is that the caller must know the activation record size needed by the called routine in order to allocate it. Simple programming conventions can overcome this problem: for example, each procedure declaration can be required to provide a symbolic definition of its activation record size, so that the procedure proc has its activation record size given by procARS.
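The discipline of Figure 15.4 can be imitated in Python (a sketch; the dictionary record format, the helper names, and the use of Python's own allocator in place of allocate and deallocate are all assumptions for illustration): the caller allocates the callee's record, plants the parameters and the dynamic link, and deallocates on return.

```python
# Activation records on a heap, after Figure 15.4.  Each record is a
# dict with the link and cont bookkeeping fields of Figure 15.3 plus
# named parameter slots; Python's allocator stands in for allocate.

def call(caller_record, params):
    # calling sequence: allocate the callee's record, insert the
    # parameter values and the dynamic link, then "transfer control"
    callee_record = {"link": caller_record, "cont": None, **params}
    result = body(callee_record)      # the JSR of Figure 15.4
    # here the calling sequence deallocates the record (in Python it
    # simply becomes garbage when the last reference is dropped)
    return result

def body(record):
    # the receiving sequence has made `record` the current activation
    # record; locals of the called routine live inside it
    record["total"] = record["a"] + record["b"]
    return record["total"]

main_record = {"link": None, "cont": None}
print(call(main_record, {"a": 2, "b": 3}))   # → 5
```

Note how the caller can still address its own record while filling in the callee's, which is why parameter passing is simple in this scheme.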

Many systems have calling sequences which are carefully designed to put most of the work in the receiving and return sequences. If this is done right, the result is a 'universal calling sequence' which allows programs written in any language to call procedures written in any other language. One way in which operating system designers encourage the use of a system wide standard calling sequence is by providing a set of macros, giving them names such as CALL, PROC and RETURN, which make the use of the standard sequences so convenient that few programmers invent other calling sequences. Where such macros are not provided, assembly language programmers are well advised to write their own in order to avoid problems with calling sequences. Programmers should be aware that the standard calling sequences provided by some machines are not particularly efficient, so it is sometimes profitable to design and use special purpose calling sequences.

The calling sequences used by languages such as Algol, Pascal, PL/I and Ada are more complex than that given above because of the problems of dealing with nested blocks. C and C++ have far more restricted nesting because they forbid functions from being declared local to other functions. The possibility of local procedures and functions introduces a need for a static link in each activation record which is used to identify the activation record of the block that statically encloses the block associated with the current activation record. Whenever a variable declared in one of the enclosing blocks is referenced, the chain of static links is followed out to the appropriate activation record. There is an alternative to the static link called a display, which is a global array of pointers to the current activation records of all accessible blocks, but this also requires a new entry in each activation record to save the previous display entry for each block. These issues are beyond the scope of this work.

Calling sequences using a stack are not difficult to design; space can be allocated by pushing the initial values of the allocated storage on the stack, and deallocation can be done by popping the stack. Figure 15.5 illustrates a typical structure for an activation record on a stack.

          |        |
          |        |  previous activation record
          |        |
          |========|
R[local]->|  link  |  the return link        |
          |--------|                          - a stack mark
          |  cont  |  the continuation point |
          |--------|
          |   .       parameters, local variables, and temporaries (if any)
              .
              .    |  top word on stack
          |========|
R[stack]->|        |  first free word beyond stack top
          |        |  

Figure 15.5. A typical activation record on the stack top.

In Figure 15.5, it is assumed that the stack pointer register R[stack] normally points to the first free word beyond the top item on the stack. We will also assume that there are instructions PUSH and POP for operating on the stack, and that these increment and decrement the stack pointer, respectively. The basic structure of the activation record shown here is the same as that shown in Figure 15.3 except for the fact that activation records are embedded in the stack.

When activation records are pushed on a stack, the pair of locations containing the return link and the continuation point are sometimes called a stack mark. Consecutive activation records on the stack are separated by the stack marks, so these locations mark the point in the stack which determines how far the stack must be popped when deallocating an activation record. Some machines have special instructions for pushing a stack mark onto the stack.

A calling sequence using the stack based activation record format shown in Figure 15.5 is given in Figure 15.6.

; regular calling sequence
  -- save any important register values in caller's activation record
  MOV( R[stack], R[temp] )   |
  PUSH( R[local] )            - push the stack mark
  PUSH( 0 )                  |
  -- PUSH parameter values on stack
  JSR( M[R[local]+cont], <destination> )

; receiving sequence
  MOV( R[temp], R[local] )
  -- PUSH initial values of local variables

; return sequence
  MOV( R[local], R[stack] )  
  MOV( M[R[local]+link], R[local] )
  JMP( M[R[local]+cont] )

Figure 15.6. A stack based calling sequence.

The calling sequence in Figure 15.6 is a straightforward modification of that shown in Figure 15.4. It differs from the original in the use of PUSH instructions to build the stack mark and allocate storage for the new activation record. As a result of this, the caller need only push the stack mark and the required number of parameters on the stack; the called routine is responsible for pushing the local variables and any temporary storage. In fact, if the programming language does not advertise any particular initial value for local variables, the called routine may simply increment the stack pointer by the number of local variables needed in order to avoid the need to push default initial values one by one, only to have the programmer ignore the default initial values.

The return sequence shown in Figure 15.6 deallocates the activation record by resetting the stack pointer to point to the start of that activation record; note that the base register for the activation record was initially derived from the value of the stack pointer prior to the call, so this simple assignment to the stack pointer undoes the effects of any pushes since that time. It should be noted that the use of a stack pointer register is not always required when pushing activation records on a stack! Simpler, although less intuitive, calling sequences make use of the fact that it is usually possible to statically determine the size of the stack within each activation record from an inspection of the code, so the stack top at any point in the code can be determined by a constant displacement from the base of the current activation record.
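The deallocation-by-assignment trick can be seen in a small Python sketch of Figures 15.5 and 15.6 (list indices stand in for the stack and frame pointers; the helper names are invented): truncating the stack at the record's base discards the mark, parameters and locals in one step.

```python
# Stack-based activation records after Figures 15.5 and 15.6.  The
# stack is a Python list; its length plays the role of R[stack], and
# `local` indexes the base of the current record (the stack mark).

M = []                 # the stack grows by appending
local = None           # R[local]: base of the current activation record

def call_push(old_local, params):
    # calling sequence: remember where the new record begins, then
    # push the stack mark (link, cont) and the parameter values
    new_local = len(M)             # MOV( R[stack], R[temp] )
    M.append(old_local)            # PUSH( R[local] )  -- the return link
    M.append(None)                 # PUSH( 0 )         -- continuation point
    M.extend(params)               # push parameter values
    return new_local

def return_pop(cur_local):
    # return sequence: resetting the stack pointer to the record's
    # base deallocates everything pushed since the call
    old_local = M[cur_local]       # follow the return link
    del M[cur_local:]              # MOV( R[local], R[stack] )
    return old_local

local = call_push(local, [10, 20])
M.append(99)                       # a local variable pushed by the callee
local = return_pop(local)          # mark, params and locals all vanish
print(M, local)                    # → [] None: the stack is empty again
```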

Some machines designed in the early 1970s include stack mechanisms which are particularly inconvenient from the point of view of activation record management. These include the DEC PDP-11, the 6502, and the Intel 8080 and its descendants up through the Pentium. The problem on these machines is that the procedure call instruction itself pushes the return address on the stack and the return instruction simply pops the stack into the program counter. As a result, the return address is typically above any parameters which were pushed on the stack, instead of being incorporated into a stack mark.

The basic return instructions on these machines do nothing to solve the problem of activation record deallocation. As a result, the return sequence must typically pop all local variables off of the stack prior to the return instruction, and the calling sequence, after control returns, must finish deallocating the activation record by popping the parameters. Late in the history of these architectures, special instructions, for example the MARK instruction on the PDP-11, were introduced to try to solve these problems, but very few programmers ever understood how to use such instructions, and as a result, they were rarely used.

Coroutines

A coroutine is a routine which does not enforce the last-in first-out ordering on control transfers that we associate with procedures and functions. As a result, activation records for coroutines cannot usually be allocated on a stack, but must be allocated from a heap. Like a procedure, a coroutine is a body of code which must be associated with an activation record before it is run. Unlike procedures, however, the activation records for coroutines are not automatically allocated as a result of control transfers. Instead, the user must explicitly allocate coroutine activation records prior to a transfer of control, and the user may transfer control to a given coroutine, running in a given activation record, a number of times before that coroutine terminates.

The transfer of control to a coroutine is accomplished by a resume operation. This causes the computation which initiated the control transfer to be suspended, as a coroutine, and it causes the resumption of the computation to which control is transferred. As a result, two or more coroutine activations may easily transfer control back and forth, where each time a coroutine regains control, it picks up at the point it was most recently executing. Note the distinction between the terms coroutine and coroutine activation. A coroutine is a block of code which may be activated, producing an activation to which control may be transferred. The alternate term coroutine instance is sometimes used as a synonym for coroutine activation, because the activation record for a coroutine can be viewed as an instance of a special kind of class.

When procedures were the only concern, there was little reason to choose between the use of continuation points and return addresses in designing activation records and calling sequences. With coroutines, on the other hand, the choice is obvious: Continuation points should be used. This is because, when control is transferred to a coroutine, it must be possible to find the correct value of the program counter for that coroutine without knowledge of how control was most recently transferred away from that coroutine. Recall that continuation points are stored in the activation record of the routine as control is passed to some other routine, while return addresses tend to be stored in the activation record of the routine to which control is being transferred.

As motivation for the development of coroutines, consider the problem which was originally presented in Figure 10.3 through Figure 10.5. The problem was to write a program which processes data arriving at some rate at an input port, deletes all occurrences of 'z' from the input, doubles all occurrences of 'x', and passes the data on to an output port. Input/output at these ports happens at varying rates, so queues must be used to buffer the transfer of data between the input/output ports and the applications program.

A coroutine-based solution to this problem is given in Figure 15.7, using a Pascal-like language supporting coroutines.

{ global variables }
var poll, appl: activation;

     coroutine pollingloop;
     begin
          repeat
               if inputstatus = ready and not full( inputqueue )
                 then enqueue( inputqueue, inputdata );

               if outputstatus = ready and not empty( outputqueue )
                 then dequeue( outputqueue, outputdata );

               resume appl;
          forever;
     end { pollingloop };

     coroutine application;
     var ch: char;
     begin
          repeat
               while empty( inputqueue ) do resume poll;
               dequeue( inputqueue, ch );
               if ch <> 'z' then begin
                    if ch = 'x' then begin
                         while full( outputqueue ) do resume poll;
                         enqueue( outputqueue, ch );
                    end;
                    while full( outputqueue ) do resume poll;
                    enqueue( outputqueue, ch );
               end;
          forever;
     end { application };

begin { main program }
     poll := activate pollingloop;
     appl := activate application;
     resume poll;
end { main program }.

Figure 15.7. An example using coroutines.

In Figure 15.7, variables of type activation hold pointers to activation records, and the special operator activate takes the name of a coroutine as an operand and returns a pointer to an activation record for that coroutine. The activation record allocated by activate has a size sufficient for the coroutine in question and has its continuation point set to the first instruction of the body of that coroutine. The resume statement takes a pointer to an activation record as an argument and transfers control to that activation record.

The important thing to note about Figure 15.7 is that it allowed the problem originally posed in Chapter 10 to be solved with what look like two main programs, each of which is a coroutine. The first of these, pollingloop, emphasizes the structure of the problem from the point of view of performing input/output; thus, the control structure is the same as that of the main polling loop solution illustrated in Figure 10.2. The second coroutine, application, has a structure which emphasizes the problem which is to be solved; thus, the control structure is the same as that of the polling procedure solution shown in Figure 10.5.

The code for the application coroutine shown in Figure 15.7 could have been improved by the use of read and write procedures which would encapsulate the references to the input and output queues and hide the small polling loops which await status changes in these queues. Unfortunately, this change introduces the problem of how a procedure or function can be called from within a coroutine and resume another. This problem has been solved, but the solutions are beyond the scope of this discussion.
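For comparison, the application half of Figure 15.7 maps directly onto the generators of a modern language such as Python (a sketch, not the notation used in these notes): each yield plays the role of resume poll, surrendering control until the consumer asks for more data.

```python
# The application coroutine of Figure 15.7 as a Python generator:
# delete every 'z', double every 'x', and pass everything else on.
# Pulling a character from `source` stands for a dequeue from the
# input queue; each yield stands for an enqueue on the output queue.

def application(source):
    for ch in source:
        if ch != 'z':
            if ch == 'x':
                yield ch        # first copy of the doubled 'x'
            yield ch            # the character itself

print(''.join(application("xylophone_jazz")))   # → xxylophone_ja
```

The generator's local state (its position in the loop) persists across transfers of control, exactly as the coroutine's activation record does.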

As with calling sequences, there are many possible coroutine resume sequences. The resume sequences discussed here use the activation record structure shown in Figure 15.3. This structure includes a link field which is not really needed in the context of "pure" coroutines, but which is convenient when a coroutine needs to resume its caller. The structure also includes provisions for parameters. Parameters were not used or needed in the example given in Figure 15.7, but many coroutine systems allow parameters to be passed to a coroutine at the time of activation record allocation.

Two coroutine resume sequences are shown in Figure 15.8.

; resume sequence 1
-- save any important register values
MOV( R[local], R[temp] )
MOV( <destination activation record>, R[local] )
JSR( M[R[temp]+cont], M[R[local]+cont] )
-- restore saved register values

; resume sequence 2
-- save any important register values
MOV( <destination activation record>, R[temp] )
JSR( M[R[local]+cont], M[R[temp]+cont] )
MOV( R[temp], R[local] )
-- restore saved register values

Figure 15.8. Two coroutine resume sequences.

The first resume sequence shown in Figure 15.8 does all of the work before transferring control, while the second transfers control as soon as possible, requiring that the destination coroutine finish the job. These resume sequences are equally good, but it is important to note that they are not compatible; a coroutine which uses one of these resume sequences cannot resume one which uses the other. The second resume sequence may be somewhat confusing because the first two instructions are executed prior to a transfer of control to a different coroutine, while the third instruction is executed as a result of some other coroutine resuming the original one.

Generators

A generator is a special kind of coroutine which is similar to a procedure in many ways. As with coroutines, the allocation of an activation record for a generator must be done explicitly prior to any transfer of control to that generator, but unlike coroutines, the transfer of control to a generator activation resembles a procedure call. The difference between a call to a generator activation and a call to a procedure is that each call to a generator activation causes it to begin execution immediately after the point where it most recently returned.

The term 'generator' stems from the use of functions which generate sequences of values. Conventionally, such a function would be written using global variables to remember previous values to be used in generating the next. Generators allow these to be replaced with local variables which persist from one call to the generator to the next. The function and generator approaches are contrasted in Figure 15.9.

{ global variables needed to generate the Fibonacci numbers }
var i: integer { initially 0 };
    j: integer { initially 1 };

function fibonacci: integer;
var k: integer;
begin
     fibonacci := i;
     k := i + j;
     i := j;
     j := k;
end { fibonacci };

generator function fibonacci: integer;
var i,j,k: integer;
begin
     i := 0;
     j := 1;
     repeat
          return i;
          k := i + j;
          i := j;
          j := k;
     forever;
end { fibonacci };

Figure 15.9. Generators and functions contrasted.

Note that a 'generator function' is a generator which returns a value each time it returns to its caller.

Assuming that the global variables used by the function fibonacci in Figure 15.9 are properly initialized, and assuming that they are not accidentally used by other code, successive calls to this function will return successive values of the Fibonacci series. The generator returns these same values, but it has the advantage of having complete control over its local variables and their initialization. Note that, unlike the function, the generator cannot be called until an activation record has been allocated, for example, by an activate operation, and that the call to the generator must be done using something like the resume statement used with coroutines.
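Modern Python builds this mechanism into the language; the generator function of Figure 15.9 can be transcribed almost line for line (a sketch in Python rather than the Pascal-like notation used above):

```python
def fibonacci():
    # the locals i and j persist across yields, exactly as the
    # generator's activation record persists between calls in Fig 15.9
    i, j = 0, 1
    while True:
        yield i                 # "return i" in the generator function
        i, j = j, i + j

gen = fibonacci()               # the "activate" step: allocate the state
print([next(gen) for _ in range(7)])   # → [0, 1, 1, 2, 3, 5, 8]
```

Here the call to fibonacci() corresponds to the activate operation, and each next(gen) corresponds to a resume of the generator activation.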

In an object oriented language, generators would be modeled as objects, where the local variables of the generator are private components of the object, and the generator function itself is a method of the object. This object oriented solution is intermediate between the simple functional model and the coroutine model for achieving the same effect.
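That object oriented middle ground can be sketched as a small Python class (illustrative; the names are invented): the generator's persistent locals become private attributes, and its body becomes a method that advances the state on each call.

```python
class Fibonacci:
    """Object oriented middle ground: the generator's persistent
    locals become attributes of the object, and each call to next()
    advances the sequence by one step."""
    def __init__(self):
        self._i, self._j = 0, 1     # the initialization before the loop

    def next(self):
        value = self._i             # "return i"
        self._i, self._j = self._j, self._i + self._j
        return value

fib = Fibonacci()
print([fib.next() for _ in range(7)])   # → [0, 1, 1, 2, 3, 5, 8]
```

Unlike the global-variable function of Figure 15.9, two Fibonacci objects keep fully independent state, just as two activations of the generator would.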

The transfer of control to a generator is accomplished by a generator resume sequence, and the return of control from a generator to its caller is done by a generator return sequence. These strongly resemble hybrids of the sequences used with procedures and coroutines. As was the case with coroutines, the activation record structure shown in Figure 15.3 is also sufficient for generators. Two example sets of sequences for use with generators are shown in Figure 15.10.

; generator resume sequence 1
-- save any needed registers
MOV( R[local], R[temp] )
MOV( <destination activation record>, R[local] )
MOV( R[temp], M[R[local]+link] )
JSR( M[R[temp]+cont], M[R[local]+cont] )
-- restore any needed registers

; generator return sequence 1
-- save any needed registers
MOV( R[local], R[temp] )
MOV( M[R[temp]+link], R[local] )
JSR( M[R[temp]+cont], M[R[local]+cont] )
-- restore any needed registers

; generator resume sequence 2
-- save any needed registers
MOV( <destination activation record>, R[temp] )
JSR( M[R[local]+cont], M[R[temp]+cont] )
-- restore any needed registers

; generator return sequence 2
-- save any needed registers
MOV( R[local], R[temp] )
MOV( M[R[temp]+link], R[local] )
JSR( M[R[temp]+cont], M[R[local]+cont] )
MOV( R[local], M[R[temp]+link] )
MOV( R[temp], R[local] )
-- restore any needed registers

Figure 15.10. Two sets of generator calling and return sequences.

Both sets of sequences given in Figure 15.10 accomplish the same thing, but one sequence puts an equal burden on the caller and the generator itself, while the other puts most of the work in the generator, with very little work in the caller.

Note that the execution of the first half of a resume sequence will transfer control to the second half of some return sequence, and the execution of the first half of a return sequence will transfer control to the second half of a resume sequence. This symmetry illustrates the close relationship between generators and coroutines, but it also allows the relationship between generators and procedures to be clarified: A generator resume sequence is similar to a procedure calling sequence, and a generator return sequence is similar to a procedure return sequence followed immediately by a procedure receiving sequence, with the additional constraint that the procedure return sequence is modified to save the continuation point of the procedure so that the next call will transfer control to the following receiving sequence.

Processes

A process, in the context of a computer system, is an independently executed stream of instructions, where the instructions in the stream are executed in sequence. A process is sometimes called a task. Each process can be viewed as being executed on a different central processing unit, although there are many other ways of achieving the same effect. Each user of a multiuser computer system is usually serviced by a different process which runs programs on behalf of that user; under some operating systems, each window under a window manager may provide an interface to a different process. Programmers with access to multiple processor machines may construct programs consisting of many parallel processes which cooperate to solve a single problem, running far faster than a program to solve that problem with just a single processor.

There are many parallels between processes and coroutines. In many cases, the control structure of a user program constructed from coroutines will be the same as the control structure of an equivalent program constructed from processes, with one big exception. The transfer of control between coroutines is accomplished with explicit operations such as resume statements. In contrast, there is no explicit transfer of control between the processes making up a parallel program; in a sense, they all have control at the same time.

In the world of personal computers, Microsoft's old MS-DOS operating system allowed only one process, although some applications used application provided interrupt service routines to achieve the effect of multiple processes. Older versions of Microsoft's Windows operating system and of Apple's MacOS system used a coroutine-based approach frequently described as cooperative multitasking. Microsoft's Windows NT and Apple's MacOS X support fully developed independent processes, as do Linux and essentially all other operating systems descended from post 1965 mainframe and minicomputer environments.

An alternate definition of a process describes a process as a locus of control. The term locus means place or locality. In geometry, many curves are described as the locus of points satisfying some condition, with the basic rule being that all points on the locus are, in an abstract sense, adjacent. Similarly, the points at which a process executes code, that is, the sequence of addresses held in the program counter, are adjacent in time, each one leading naturally to the next. In most cases, these addresses are also consecutive in memory, but they are not consecutive when a branch instruction is executed.

Each process in a system must have its own activation record, just as each coroutine instance or procedure instance must have an activation record. Just as there may be many different activations of a procedure or coroutine, distinguished by their activation records, there may be many different activations of the code for a process, distinguished by their activation records. When the code for a process includes calls to procedures or when that code resumes coroutines, the activations of those procedures or coroutines are considered to be logically part of the process; this is most clear when each process activation record contains a stack on which activation records for procedures called by that process are allocated.

When multiple processes cooperate to solve a single problem, it is frequently the case that one or more of the processes must wait for some other process to finish some computation. For example, one process might produce a sequence of data which another process consumes. This sequence can easily be buffered in a first-in first-out queue; if this queue is ever empty, the consumer process must wait for data, and if it is ever full, the producer process must wait for space in the queue. This problem can be solved by polling loops to wait for the desired condition, but such polling loops are not always desirable. As a result, it is common to introduce special process synchronizing operations such as wait(x) and signal(x) which can be used to wait for some condition x or signal that this condition has occurred. There are a number of different formalisms for the variables on which the wait and signal operations act. These differ in the detailed semantics of the operations.

The introduction of wait and signal operations, whatever the details of their semantics, introduces a new notion of process state, above and beyond the basic notion of process state that refers to such things as the values of the variables and registers of a process, including the program counter. The normal state of a process is running, but if a wait operation on some condition is performed, and that condition is not met, the state of the process is changed to waiting. The process will return to the running state when that condition is met.
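As a concrete sketch of this running-to-waiting transition, using Python threads to stand in for processes (one formalism among the many the text alludes to; real wait and signal primitives differ in their details):

```python
import threading

ready = threading.Event()    # stands in for the condition x
results = []

def consumer():
    # wait(x): this thread blocks (state: waiting) until signalled
    ready.wait()
    results.append("ran after signal")

t = threading.Thread(target=consumer)
t.start()
# signal(x): moves the waiting thread back to the running state
ready.set()
t.join()
print(results)  # ['ran after signal']
```

The consumer thread cannot append its result until the main thread performs the signal, so the append always happens after the set.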

When a process terminates, we speak of a state transition from running to dead, and when a process is created, we sometimes speak of this as a state transition from dead to running. Most systems also include operations which allow an errant process to be killed, forcing a state transition to dead from outside that process. Figure 15.11 summarizes these process states and the operations that cause transitions between them.

              create                       wait 
        ---------------->           ---------------->
   dead                    running                     waiting
        <----------------           <----------------
             terminate                    signal
                kill

Figure 15.11. The states of a process.

The semaphore abstract data type is perhaps the simplest universally useful implementation of the variables used as arguments to the wait and signal operations. Although there are many implementations of semaphores, they can be intuitively thought of as integers, where wait(x) or the object-oriented x.wait waits until x is greater than zero and then decrements it, and signal(x) or x.signal increments x. An incorrect but intuitively useful implementation of these operations using polling loops is shown in Figure 15.12.

type semaphore = integer;

procedure wait( var s: semaphore ) { P };
begin
     while s <= 0 do { nothing };
     s := s - 1;
end { wait };

procedure signal( var s: semaphore ) { V };
begin
     s := s + 1;
end { signal };

Figure 15.12. The basic operations on semaphores.

It should be noted that the operations on semaphores are frequently called P and V instead of wait and signal. These abbreviations were originally used in the THE operating system developed at the Technische Hogeschool Eindhoven in the Netherlands by E. W. Dijkstra; P stands for proberen (to test or probe) and V stands for verhogen (to increment). The use of the names P and V explicitly indicates the use of semaphores, while terms such as wait and signal have been used to refer to related process synchronization primitives that differ in many details.
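Most modern thread libraries provide semaphores directly. For example, Python's threading.Semaphore implements the counting behaviour described above, with acquire playing the role of wait (P) and release the role of signal (V):

```python
import threading

s = threading.Semaphore(2)   # semaphore with initial count 2

s.acquire()                  # wait(s) / P(s): count 2 -> 1
s.acquire()                  # count 1 -> 0
# A further blocking acquire would now make this thread wait;
# a non-blocking attempt reports failure instead:
got_it = s.acquire(blocking=False)
print(got_it)                # False: the count is zero

s.release()                  # signal(s) / V(s): count 0 -> 1
got_it = s.acquire(blocking=False)
print(got_it)                # True: the count was positive again
```

Unlike the polling loop of Figure 15.12, a library semaphore blocks the waiting thread without consuming processor time, and the decrement is done atomically.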

An example of the use of the semaphore primitives is provided by the code shown in Figure 15.13.

var inputnotfull: semaphore { initial value = the queue capacity };
    inputnotempty: semaphore { initial value = zero, implying empty };
    outputnotfull: semaphore { initial value = the queue capacity };
    outputnotempty: semaphore { initial value = zero, implying empty };
         .
         .
         .

process pollinput;
begin
     repeat
          if inputstatus = ready then begin
               wait( inputnotfull );
               enqueue( inputqueue, inputdata );
               signal( inputnotempty );
          end;
     forever;
end { pollinput };

process polloutput;
begin
     repeat
          if outputstatus = ready then begin
               wait( outputnotempty );
               dequeue( outputqueue, outputdata );
               signal( outputnotfull );
          end;
     forever;
end { polloutput };

process application;
var ch: char;
begin
     repeat
          wait( inputnotempty );
          dequeue( inputqueue, ch );
          signal( inputnotfull );
          if ch <> 'z' then begin
               if ch = 'x' then begin
                    wait( outputnotfull );
                    enqueue( outputqueue, ch );
                    signal( outputnotempty );
               end;
               wait( outputnotfull );
               enqueue( outputqueue, ch );
               signal( outputnotempty );
          end;
     forever;
end { application };

Figure 15.13. An example using processes.

This code solves exactly the same problem as was solved by the code shown in Figure 15.7 using coroutines. As with that code, this code consists of independent loops for the application code and the input/output service. In this case, the input/output service has been coded as two loops, one for input and one for output. This allows either one to wait for queue space without blocking the other. Unlike the coroutine solution, this solution includes no explicit transfer of control between the loops; instead, each loop runs as a process. The semaphores are used to stop the consumer processes when the queue from which they consume is empty and to stop the producers when the queue into which they write is full.

From an inspection of the code in Figure 15.13, it can be shown that the count stored in the notfull semaphore associated with each queue always tells how much free space there is in that queue, while the count in the notempty semaphore tells how much data is queued. This use of semaphores as counters allows the code to be written without any explicit calls to routines to test to see if the queue is full or empty.
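The counting discipline of Figure 15.13 can be exercised with real threads. This Python sketch (the queue capacity and item count are arbitrary choices for illustration) uses the same notfull/notempty pairing around a single shared queue:

```python
import threading
from collections import deque

CAPACITY = 4
queue = deque()
notfull = threading.Semaphore(CAPACITY)   # counts free space in the queue
notempty = threading.Semaphore(0)         # counts items in the queue
results = []

def producer():
    for item in range(10):
        notfull.acquire()        # wait(notfull): block if the queue is full
        queue.append(item)
        notempty.release()       # signal(notempty)

def consumer():
    for _ in range(10):
        notempty.acquire()       # wait(notempty): block if the queue is empty
        results.append(queue.popleft())
        notfull.release()        # signal(notfull)

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()
print(results)   # 0 through 9, in order; never more than 4 items buffered
```

Because the semaphore counts guarantee the consumer never dequeues from an empty queue and the producer never enqueues into a full one, no explicit full or empty tests appear in either loop, just as in Figure 15.13.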

Terminology

The reader should be familiar with the following general terms after reading this section

process                         task
thread of control               locus of control
serially reusable code          reentrant code
recursive code                  lifetime of a variable
procedure call                  procedure call instructions
calling sequence                continuation point
receiving sequence              return address
return sequence                 return link
stack frames                    activation records
stack mark                      static link
coroutine                       coroutine activation
resume sequence                 generators
generator calling sequence      generator resume sequence
generator return sequence       process states
wait                            signal
semaphores                      P and V

Exercises

  1. Rewrite the sequences shown in Figure 15.4 to save a return address in the called procedure's activation record instead of saving a continuation point in the caller's activation record.

  2. Rewrite the sequences shown in Figure 15.4 so that the called routine is responsible for allocation and deallocation of its own activation record. (Ignore the problem of parameter passing.)

  3. Rewrite the sequences shown in Figure 15.6 to save a return address in the called procedure's activation record instead of saving a continuation point in the caller's activation record.

  4. On some machines, the procedure call instruction is very similar to the JSR instruction used here with one major exception: The instruction can only save the program counter in a linkage register, it cannot save it directly in the return address or continuation point fields of an activation record. Rewrite the sequences shown in Figure 15.6 to conform to this constraint.

  5. Write code to create an activation of the generator function shown in Figure 15.9 and then write out the first 3 values in the sequence it returns. Assume that the keyword "resume" can be used as a unary operator in an expression to resume a generator function instance and get the value it returns.

  6. Write a paragraph comparing the way that global variable use is reduced by the use of coroutines or generators, for example, as shown in Figure 15.9, with the way that Ada packages can be used to accomplish the same end, as illustrated in Figure 3.16.

  7. Which of the state changes shown in Figure 15.11 are initiated by the process to which they apply and which are initiated by some other process?

Readings

Procedure calling sequences are discussed in many assembly language programming texts, although this discussion is frequently at a very elementary level. A more advanced discussion is contained in Section 7.3 of

Compilers: Principles, Techniques, and Tools, by Aho, Sethi and Ullman (Addison Wesley, 1986).

For a good discussion of procedure and function activation records, but with no detail on calling sequences, see Sections 3.4 to 3.6 of

Programming Language Concepts, by Ghezzi and Jazayeri (Wiley, 1982)

Calling sequences and activation records are covered in Section 8.4 of the classic

Systems Programming, by Donovan (McGraw Hill, 1972).
but the discussion is somewhat dated and very specific to the classic IBM 360 architecture (now equally applicable to the 370 and 390).

The best reference on coroutines is almost certainly

Coroutines - Lecture Notes in Computer Science Vol. 95 by C. D. Marlin (Springer Verlag, 1980).
This goes into considerable detail about the relationship between coroutines, generators, and procedures, and presents an integrated language design allowing free intermixing of these control structures.

For an introduction to concurrent processes at about the same level of detail as is presented here, see Section 3.1 of

Operating System Principles by Per Brinch Hansen (Prentice Hall, 1973)
A similar level of coverage is provided by Section 4.2 of
Software Systems Principles by Peter Freeman (SRA, 1975).