Assignment 11, due Apr. 13

Solutions

Part of the homework for CS:3620, Spring 2018
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

On all assignments, your name must be legible as it appears on your University ID card! Assignments are due at the start of class on the day indicated (usually Friday). Exceptions will be by advance arrangement unless there is what lawyers call "an act of God" (something outside your control). Homework must be turned in on paper, either in class or in the teaching assistant's mailbox. Never push late work under someone's door!

Background: Look up the Unix/Linux pipe() kernel call. Your goal in this exercise is to suggest how pipes are implemented inside the kernel. Recall that a user file descriptor returned to a user when you open a file or, in this case, create a pipe, is an integer index into an array of open-file objects that the kernel maintains on behalf of each process. The entries in this array are pointers to open-file data structures (objects). The class "open-file" has many subclasses, things like disk file, terminal and pipe are all, logically speaking, subclasses of open file.
Also recall that when a process forks, the newly created process gets a copy of the parent's open file table. This means copies of the pointers in that table, not new objects. Both parent and child have all the same access rights to each file the parent had open, but if either process opens more files after the fork, those are not shared.
a) The pipe() kernel call returns two file descriptors. Do these descriptors refer to the same object or different ones. If different, do the objects members of the same subclass of open file or different subclasses? If different, what is the difference? (0.3 points)

Different ones, logically members of different subclasses, one end of the pipe is a write-only file (it is meaningless to try to read from it), the other end is a read-only file (it is meaningless to try to write to it).

b) Ultimately, the pipe is probably implemented with a bounded-buffer implementation of a FIFO queue. How is this related to the object or objects discussed in part a? (0.3 points)

Each open file object uses the same bounded buffer to actually implement the stream from write-only producer to read-only consumer, so the objects that represent the two ends of the pipe must each include pointers to (handles for) the FIFO queue object that connects them.

c) What other objects would you probably include as part of the implementation of the pipe? (Hint, pipes are only really useful if they are used for interprocess communication.) (0.4 points)

We need something like a mutual exclusion semaphore to guard access to the queue representation plus something like counting semaphores used to count the data and free space in the bounded buffer.
Background: Look up the Unix/Linux pipe() kernel call again, and also the dup2 and fork() kernel calls. Recall that the file descriptors 0, 1 and 2 are used for standard input, standard output and standard error.
Our goal here is to think about how these kernel calls are used by the Unix/Linux shells to implement shell pipes. In the standard shells, if you type commanda|commandb, where the two commands may have any number of command-line arguments, commanda is launched with its standard output directed into a pipe, and commandb is launched in parallel with its standard input taken from that pipe.
a) Is the pipe created before or after the forking involved in launching the commands? Why? (0.3 points)

The pipe must be created before the fork. After the fork, the parent and child process each have access to the pipe.

b) How many fork operations must be involved. You know how the shell works to launch one command, this question asks how to launch two commands running in parallel. (0.3 points)

It should be possible to launch each command in the string of piped-together commands with a single fork, and then wait for all of them to terminate at the very end.
It may be that there are simpler solutions that use more forks.

c) Where does the call (where do the calls) to dup2() go in relation to the pipe(), fork() and execve() calls and what are their parameters? A complete answer will explain how the file descriptors returned by fork() end up becoming standard input and standard output for the launched commands. (0.4 points)
- Step 1: pipe( pipefd )
- Step 2: fork() to create processes a and b for the left and right commands on the command line that are to be separated by a pipe.
- Side a step 1: dup2( pipefd[1], stdout )
- Side a step 2: execve() for left command on command line
- Side b step 1: dup2( pipefd[0], stdin )
- Side b step 2: execve() for right command on command line
The above answer is deliberately simplified in that it does not discuss, at minimum, one additional fork() required to allow the shell to regain control after both the left and right commands are complete.
Background: A final question among what may be too many questions about the pipe() system call. It is good practice for a program to dispose of any resources it does not need as soon as it learns that it no longer needs them. For file descriptors, the close() operation deletes the pointer (sets it to null) in the open file table of a process.
a) Look at your answer to question 2 and indicate which file descriptors should be closed when and where in the code between the time pipe() is called and the two execve() calls that complete the launching of commanda and commandb. (0.5 points)

Immediately after the fork, in the parent and child process, the file descriptor returned by the fork that will not be used by that process should be closed.
Immediately after each call to dup2, the file descriptor that was duplicated should be closed.

b) Look at your answer to part a and to question 1 and suggest a way that the kernel can know when to deallocate the objects associated with a file descriptor. Some call to close() will do this, but which one and how can the system track when to actually delete the object? (0.5 points)

We need to keep a count of the number of open file descriptors that reference the same open file object and only delete that object when no descriptors remain.
In more detail, when you open a file or create a pipe, the count on each object is one; fork() increments the reference counts on each file that is open in the process that forked. All of the various dup() operations increment the reference count as they duplicate a file descriptor; close() decrements the reference count and deletes the object if the count reaches zero.