Assignment 11, Solutions

Part of the homework for 22C:112, Fall 2012
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

  1. Background: As originally developed, Unix (and Linux, for that matter) had no semaphore services. These were eventually added, as an afterthought, in two semi-incompatible varieties (Posix versus System V). As with many such afterthoughts, it looks like the designers tried to throw in every idea anyone ever had about the subject; in addition, they avoided using the standard names for the semaphore operations.

    Use man -k semaphore to find all man pages relating to Linux semaphores. One of the pages is an overview; read it first.

    Now, suppose you have a program that forks into several processes that share a memory segment attached using shmat() prior to forking. The objects allocated in that segment need to incorporate semaphores, perhaps several in some objects.

    a) How would you declare one such semaphore as a component named semf of a C structure, and once an instance obj of that structure is allocated, how would you initialize the obj.semf semaphore to have an initial count of zero? (0.4 points)

    Declare the semaphore:

    #include <semaphore.h>
    
    struct whatever {
            ... other fields ...
            sem_t semf;
            ... other fields ...
    };
    

    Initialize it, after an obj of type struct whatever has been created. The second argument to sem_init() must be nonzero because the semaphore will be shared between processes:

            sem_init( &obj.semf, 1, 0 ); /* pshared = 1, initial count 0 */
    

    Above, we did not include a pointer in the struct, so the semaphore is directly embedded in the object; therefore, we needed the ampersand operator to make a pointer. If the semf field had been declared as a pointer to a semaphore, the ampersand would not be needed, but allocating the struct would then take two separate calls to malloc().

    b) Give Linux code to wait on the obj.semf semaphore. (0.3 points)

            sem_wait( &obj.semf );
    

    c) Give Linux code to signal on the obj.semf semaphore. (0.3 points)

            sem_post( &obj.semf );
    

  2. Background: Linux supports Posix threads. See man pthreads for details.

    A question: Can you figure out whether these are kernel threads or user threads? What information in the man page leads you to this conclusion? (Quoting Dr. Google does not count here; although Google can easily give you an answer, you must locate the reason for that conclusion in the man page.) (0.5 points)

    To quote from man pthreads, regarding both the obsolete LinuxThreads implementation and the NPTL implementation of threads:

    Both of these are so-called 1:1 implementations, meaning that each thread maps to a kernel scheduling entity. Both threading implementations employ the Linux clone(2) system call. In NPTL, thread synchronization primitives (mutexes, thread joining, etc.) are implemented using the Linux futex(2) system call.

    Given that each thread runs as a clone of the launching process, known to the kernel, Posix threads are at least partly kernel threads. The futex synchronization primitive, however, reduces the involvement of the kernel in thread synchronization, as documented in man 7 futex:

    Futex operation is entirely userspace for the noncontended case. The kernel is only involved to arbitrate the contended case.

  3. A question: Suppose you use user-level threads, for example, something like the thread library in the course home page. What happens to your application when one of the threads does a read() from the keyboard? (0.5 points)

    When a thread calls read() and the system call blocks, the kernel suspends the entire process; since all user-level threads live inside that one process, every thread is blocked, not just the one that called read().

  4. Background: Consider the following implementation of a priority queue:
    struct item {
            struct item * next; /* makes items queueable */
            int priority;
    };
    
    struct queue {
            struct item * head; /* the head item in the queue */
            sem_t lock; /* semaphore used for mutual exclusion */  /* addition */
            sem_t data; /* semaphore to count data */              /* addition */
    };
    
    void enqueue( struct queue * q, struct item * i ) {
            sem_wait( &q->lock );                              /* addition */
            if ((q->head == NULL)
            ||  (i->priority > q->head->priority)) {
                    /* enqueue in empty queue or at queue head */
                    i->next = q->head;
                    q->head = i;
            } else { /* linear search for place in queue */
                    struct item * p = q->head;
                    while ((p->next != NULL)
                    &&     (i->priority <= p->next->priority)) {
                            /* march on down the queue */
                            p = p->next;
                    }
                    i->next = p->next;
                    p->next = i;
            }
            sem_post( &q->data );                              /* addition */
            sem_post( &q->lock );                              /* addition */
    }
    
    struct item * dequeue( struct queue * q ) {
            struct item * i;                                       /* modified */
                                                                   /* modified */
            sem_wait( &q->data );                              /* addition */
            sem_wait( &q->lock );                              /* addition */
            i = q->head;                                       /* modified */
            if (i != NULL) {
                    q->head = i->next;
            }
            sem_post( &q->lock );                              /* addition */
            return i;
    }
    

    All is well and good if the above code is used by just one process. Now ask, what if the priority queue is in a shared heap and is used by multiple processes (or threads)?

    a) Add a semaphore or semaphores to the above code so that it will work for queues in a shared heap. Note that there are two solutions. One is quite easy, using only one semaphore per queue for mutual exclusion. Give that solution. (0.5 points)

    See the noted additions and modifications in the code above. The modifications are just stylistic: putting all the declarations before the code in the body of the function required breaking one line of code in two so that the lock could be claimed first.

    Note that there was no need for a semaphore to count free space, because the queueable items are neither allocated nor deallocated inside the queue manager. A semaphore to count the data in the queue is needed, however: even if a single-threaded program never tried to dequeue from an empty queue, with multiple threads one process may empty the queue after another process has concluded that data is available.

    b) That solution has a serious problem: as the queue grows longer, the time spent in the critical section for enqueue grows, limiting potential parallelism. There is a better solution involving multiple semaphores, one per item. Briefly describe it -- do not attempt to write code. (0.5 points)

    Put one mutual exclusion semaphore in each item in the queue, as well as the semaphore guarding the whole queue. During enqueue, only hold each lock until any changes to the locked item are completed. So, as enqueue marches down the queue, it holds locks on at most two items at a time: the item containing the pointer that may need to be changed to make the insertion, and its successor. Dequeue then needs to lock the item being dequeued as well as the head of the queue. The result is much more locking, but also more potential parallelism.