Processes and Threads

Part of 22C:112, Operating Systems Notes
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Origins

In the early 1960's, three different ideas emerged, all of which led in the same direction:

First, timesharing systems such as the Dartmouth Basic system (the origin of the Basic programming language) pioneered the idea of providing each user with a virtual personal computer. From the user's perspective, each user could edit and run Basic programs. In fact, the computer system held all of the users' work in its memory (RAM and slower drum or disk memory), taking turns executing instructions on behalf of different users. In effect, each user had a share in the large, expensive mainframe computer system.

In fact, the programming language was originally named BASIC, an acronym for the Beginner's All-purpose Symbolic Instruction Code. More properly, this is a backronym. The developers decided to call their language Basic, and then struggled to figure out what that stood for as an acronym. Because the name came first and was only then re-interpreted as an acronym, calling it Basic instead of BASIC is not a major injustice.

By the later 1960's, there were timesharing service bureaus that commercially sold timesharing services. Instead of confining users to a single language, these systems allowed many users to run programs on a single "mainframe" computer, giving each user a "virtual personal computer" in an era before people imagined that genuinely personal computers would soon become economically viable. Each user on timesharing systems of the late 1960's had the freedom to run programs written in languages as diverse as Fortran, LISP, BASIC and assembly language.

Second, developers of batch systems found that multiprogramming could be used to improve system performance. In a batch system, users submitted jobs, typically on punched cards, to a clerk who formed the jobs into batches for processing on the mainframe at the computer center. On a uniprogrammed system, each job was run to completion before the next job began. This meant that the CPU was idle while waiting for I/O to finish, and I/O devices were idle while the job was doing CPU-bound computation.

In a multiprogrammed system, multiple jobs were run at once, so that, when one job had to stop to wait for I/O to finish, another job could start running, and while a job had no I/O activity, there was a good chance that another job was waiting for I/O to finish. This kept the very expensive CPU and I/O devices of the mainframe computer system well utilized, lowering the average price per job. By the mid 1960's, multiprogrammed batch systems were commonplace.

Third, multiprocessors were developed. These are the direct predecessors of today's multicore computer systems. The first solid proposals for such systems came in papers connected with the Burroughs Corporation, notably M. E. Conway, A multiprocessor system design, in Proc. 1963 Fall Joint Computer Conference (FJCC), pages 139-146, AFIPS Press, 1963. The idea was to connect two or more processors to two or more main memory modules through a fast switch. This permitted what we now call multithreaded applications to run.

Burroughs had significant success with this idea in the 1960's while the rest of the industry largely ignored it. IBM reported poor results with multiprocessors in this era, finding that adding an extra CPU doubled the cost of a machine while increasing performance by a much smaller amount.

In the 1960's, Gordon Bell of Carnegie-Mellon University noted that minicomputers had a much better cost/performance ratio than mainframes, and he asked: Why not use multiple minicomputers to do the job instead of one mainframe? This led him to develop a system called C.mmp, with 16 minicomputer CPUs sharing access to 16 memory modules through a special memory switch. The switch cost as much as the processors, but the result worked well enough to inspire followups.

By the 1980's, commercial vendors such as Sequent and Encore sold multi-microprocessors with 16 to 64 CPUs sharing access to one large memory. On these machines, each CPU was a separate microprocessor chip, but it was common to mount multiple CPU chips on one circuit board. These machines worked well -- the University of Iowa had two of them. Today, the software developed for those machines provides a solid foundation for supporting the multicore processors available from companies like Intel, AMD and ARM. While today's multicore processors are single-chip integrated circuits, many of the architectural ideas they embody were originally developed in the 1960's and 1970's.

A User Perspective

Conway recognized, from the start, that the division of an application into parallel threads depended on the application, not on the number of processors available. He therefore proposed a model of process execution that allowed the system to handle the assignment of threads to processors. From the user perspective, it is useful to consider a very simple case, the set of operations A, B, C and D -- where each operation may be a complex block of code or even a subroutine call. Consider the problem of writing a program where A must be executed first, followed by B and C, which may run in parallel, and D, which may not start until B and C have completed. We might write this as a flow chart as follows:

          A
         / \
        B   C
         \ /
          D

Flow-chart notation is not very useful in a textual programming language, so people proposed notations such as:

          A;                         A;
          cobegin                    fork
             B;                         B;
             C;                         C;
          coend                      join
          D;                         D;

Here, the cobegin and coend keywords indicated that the statements between them could be executed in parallel. The terms fork and join referred to the forking and joining of lines representing flow of control in the flowchart.

Conway recognized, from the start, that the symmetry inherent in the above presentations of parallelism is misleading. In a practical implementation, one thread of control will flow from A to D through one of the statements (arbitrarily, say B), while the other statement, C, will be run by a second thread. This led Conway to suggest the following basic interface:

          A;
          fork child;                child:
          B;                            C;
          wait X;                       exit X;
          D;

Conway's fork statement took a label, the address of the first instruction of a new thread or process. That child process was launched to run in parallel with the parent, running until it executed an exit statement. The exit statement used the variable X to signal that the child had exited. The wait statement waited until this variable indicated that the child had exited.

Others proposed that the fork operation should call a subroutine, running it in a different thread. Yet others suggested other models. The Unix process manager interface is one variant. Here is the Unix variant of the above:

         A;
         if (fork()) { /* parent */
            B;
            wait();
         } else { /* child */
            C;
            exit();
         }
         D;

In Unix, fork(), wait() and exit() are all defined as system calls (documented in section 2 of the programmer's reference manual). The fork() function causes the calling process to be duplicated -- the code segment is shared, so really, this just means that the stack and static segments are copied. The return value of fork() is different in the parent (or original copy) and the child (the new copy). In the parent, fork() returns the process ID of the child, an integer. In the child, it returns zero.
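
To make the return values concrete, here is a minimal sketch of a complete program that tests the value fork() returns; the messages it prints are only illustrative:

        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/types.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main() {
                pid_t pid = fork();
                if (pid > 0) {          /* parent: fork() returned the child's process ID */
                        printf( "parent %d created child %d\n", (int)getpid(), (int)pid );
                        wait( NULL );   /* wait for the child to finish */
                } else if (pid == 0) {  /* child: fork() returned zero */
                        printf( "child %d running\n", (int)getpid() );
                        exit( 0 );
                } else {                /* fork() returned -1; no child was created */
                        perror( "fork" );
                }
                return 0;
        }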

The wait() kernel call causes its caller to stop and wait for any child of the caller to terminate. It returns the process ID of the child that terminated, so if the parent wants to wait for a particular child, it must repeatedly call wait() until the right ID is returned. (There is also a waitpid() service that allows a process to wait until the child with a particular process ID terminates.)
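
For example, a parent that must wait for one particular child could loop as sketched below; the helper name wait_for() and the parameter target are hypothetical, used only for illustration:

        #include <sys/types.h>
        #include <sys/wait.h>

        /* hypothetical helper: block until the child identified by target has terminated */
        void wait_for( pid_t target ) {
                pid_t done;
                do {            /* wait() reports children in whatever order they terminate */
                        done = wait( NULL );
                } while ((done != target) && (done != -1));

                /* waitpid( target, NULL, 0 ); would accomplish the same thing directly */
        }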

The exit() call terminates a process. Its parameter is passed to the parent's wait(), but the way this is encoded is awful -- look it up in the manual for details.
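
In outline, the exit code is recovered from the status word that wait() fills in, using the WIFEXITED() and WEXITSTATUS() macros; in the following sketch, the child's exit code of 42 is arbitrary:

        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main() {
                int status;
                if (fork() == 0) {
                        exit( 42 );     /* the child's exit code, chosen arbitrarily */
                }
                wait( &status );        /* status packs the exit code with other information */
                if (WIFEXITED( status )) {
                        printf( "child exited with code %d\n", WEXITSTATUS( status ) );
                }
                return 0;
        }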

It is noteworthy that all files that were open at the time of the fork remain open in both the parent and the child. If, before the fork, the parent creates a pipe, the parent and child can communicate through the pipe. A pipe is a first-in-first-out buffer, with a file descriptor for the write end of the pipe and a file descriptor for the read end of the pipe.

Most modern Unix variants support shared memory segments. The system service shmat() attaches a shared segment to a process. Once such a segment is attached, it will stay attached through a fork, so both parent and child can use it for access to shared variables. Files that are opened with mmap() before a process forks will also be shared as a result of the fork.
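
As a sketch of the System V shared memory calls (shmget() to create a segment, shmat() to attach it), the following program lets the child store into a shared variable that the parent examines after the child exits; the segment size and the value 42 are arbitrary:

        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/ipc.h>
        #include <sys/shm.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main() {
                /* create and attach a one-integer shared segment */
                int id = shmget( IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600 );
                int *shared = (int *)shmat( id, NULL, 0 );

                *shared = 0;
                if (fork() == 0) {      /* the attachment survives the fork */
                        *shared = 42;   /* the child writes the shared variable */
                        exit( 0 );
                }
                wait( NULL );           /* after wait(), the child's store is visible */
                printf( "shared value is %d\n", *shared );

                shmdt( shared );                /* detach, then delete the segment */
                shmctl( id, IPC_RMID, NULL );
                return 0;
        }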

When the Unix shell is used to launch an application, what it actually does is fork:

        if (fork()) { /* parent waits for child to terminate */
                wait();
        } else { /* child launches application */
                execve( commandname, argv, envp );
                /* code to handle execve failure goes here */
                exit();
        }

This relies on the executed program terminating with a call to exit(), which allows the shell's wait() to return so that the shell can continue.
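
As an illustration of the child branch above, the argument and environment vectors passed to execve() might be built as in the following sketch; the path /bin/ls and the empty environment are assumptions, chosen only to make the example concrete:

        char *argv[] = { "ls", "-l", NULL };    /* argv[0] is the command name, NULL ends the list */
        char *envp[] = { NULL };                /* an empty environment, for illustration only */
        execve( "/bin/ls", argv, envp );
        /* control reaches this point only if execve() failed */
        perror( "execve" );
        exit( 1 );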

The pipe() kernel call creates a pipe. As already mentioned, a pipe is a FIFO data structure with two file descriptors attached to it. The system call takes one parameter, a pointer to a two-element integer array. After calling pipe(a), write(a[1],buf,len) will send characters to be received by read(a[0],buf,len). After a fork, it is good programming practice for each process to close the end of the pipe it will not use. The following example program illustrates this by setting up a pipe that allows the parent process to send a character to the child.

        int a[2];
        pipe(a); /* set up the pipe */
        if (fork()) { /* parent */
                char ch = 'x';
                close(a[0]);      /* parent will never read the pipe */
                write(a[1],&ch,1); /* send one character to the child */
                close(a[1]);      /* parent is done with the pipe */
                wait();
        } else { /* child */
                char ch;
                close(a[1]);      /* child will never write the pipe */
                read(a[0],&ch,1); /* get one character from the parent */
                close(a[0]);      /* child is done with the pipe */
                /* do something with the character? */
                exit();
        }

This relies on the child terminating with a call to exit(), allowing the parent's wait() to return.