Assignment 10, solutions

Part of the homework for 22C:112, Spring 2009
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

  1. Background:

    Consider the problem of writing a heap manager, equivalent to that behind the C malloc() and free() routines. On some machines, guaranteeing that all blocks in the heap begin on a word boundary can significantly improve performance.

    Note that in c, sizeof(int), sizeof(void *) and sizeof(char) can be used to find out the sizes of these data types on whatever machine you are using. On a machine where there is a significant performance advantage to using aligned data, you can assume that statically allocated arrays of int or void * will be aligned on word boundaries.

    a) If heap is a statically allocated array of char, how can you check to see whether heap[i] is aligned appropriately for storing the first byte of a pointer on a machine where such alignment matters? (0.5 points)

    if ((&heap[i]) & (sizeof(void *) - 1) == 0) ...

    This works so long as pointer sizes are a power of two. If, for example, sizeof(void*) is 4, we mask with 3, which is the same as masking with binary 11. This returns zero if the least significant two bits of the address are zero, which is true if the address is divisible by 4.

    b) If you know that your heap is likely to be used on a machine where alignment matters, and if you know that many of the heap management data structures will involve pointers back and forth in the heap, should you declare the heap so that all of your pointers will be aligned without needing to do any checking of the type discussed above. (0.5 points)

    Yes. But, the word how was missing from the above, so here is how to declare the heap to contain heapsize bytes while helping with alignment:

    void * heap[ heapsize / sizeof(void *) ];

  2. Background: In a boundary tag implementation of the heap, each block in the heap needs a one-bit field indicating whether it is currently allocated or free. If garbage collectio is used, there are additional one-bit fields needed. It seems a pity to allocate an extra byte or word for such information in each boundary tag, so we are looking for places to sneak these bits into some unused bits of an existing fields of the heap data structure. Assume that all blocks in your heap data structures are aligned on word boundaries.

    a) What bits of what fields in the boundary tag are available to store one-bit auxiliary flags? (0.5 points)

    If pointers always point to aligned data, their least significant bits will always be zero. Therefore, these bits can be used for other things.

    b) Given that p points to a boundary tag, and p->f is a field where one bit of auxiliary data can be stored, give code to set the auxiliary bit, reset the auxiliary bit, and test the auxiliary bit. (0.5 points)

    Note: In the following, we ignore any casting needed. It gets messier with casting, but the principle of the code remains the same.

    p |= 1 /* sets the auxiliary bit */

    p &= ~1 /* resets the auxiliary bit */

    (p & 1) /* holds the contents of the auxiliary bit */

  3. Background: Consider the choice between full-scale garbage collection, on the one hand, and maintaining reference counts for automatic deallocation on the other hand.

    a) Under what circumstances will reference counts fail to allow automatic reclamation of storage where garbage collection will succede. (The answer, curiously, is related to the fact that the Unix/Linux file system uses reference counts to determine when regular files can be deallocated, but directories must be explicitly deleted using rmdir.) (0.5 points)

    Reference counts fail when there is a circular pattern in the linkage. If a links to b, and b links to c, and c links to b, the values of the reference counts are: a - unknown, b - 2, and c, 1. If we now delete the link from a to b, the reference counts for b and c are both 1, because each points to the other. Items b and c are now unreachable, because the link from a was the last link from the outside world. Items b and c will not be deleted because their reference counts are nonzero, but they cannot be reached to delete the remaining links. Therefore, they are unreclaimable by using reference counts.

    In Unix, you cannot use the rm command to delete links to directories precisely to avoid this circumstance. You must use rmdir which guarantees to remove the self-link (.) and the back link (..) before removing the link to the indicated directory.

    b) Describe the circumstances under which an application that uses garbage collection could still suffer from a memory leak. Your description should include a discussion of example data structures that could lead to this problem. (0.5 points)

    Suppose I have an items with a previous field because sometimes I need to look at the old item after replacing it with a new item. Here's my code to replace the item with a new item.

            /* forgot this line: item->previous = NULL; */
            temp = newitem();
            temp->previous = item;
            item = temp;
    

    Without the commented out line, this creates an unbounded list of items. The commented out line is purely extra code to ensure that only one old item is retained. It plays no role in the correctness of the code from any other point of view, so it is easy to leave out.