Assignment 10, due Nov 7

Part of the homework for CS:2630, Fall 2019
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

On every assignment, write your name legibly as it appears on your University ID card! Homework is due on paper in discussion section on Thursday. Some parts of assignments may be submitted on-line and completed in discussion section. Exceptions will be made only by advance arrangement (excepting "acts of God"). Late work must be turned in to the TA's mailbox (ask the CS receptionist in 14 MLH for help). Never push homework under someone's door!

  1. Background: Look at the IEEE floating point format documented in chapter 11. Imagine a very similar floating point format for 16-bit floating point values, with the following structure:

    15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
    s  |-- exp --| |------------ mant ------------|

    As with the IEEE format, the maximum exponent value (here, 1111) is reserved for NaN (not a number), and the minimum value (here, 0000) is used for un-normalized values. Otherwise, the exponent is biased so that 0111 means 0.

    As with the IEEE format, there is a hidden bit, so the normalized mantissa 01001010010 represents a mantissa value of 1.01001010010, and as with the IEEE format, the hidden bit is zero when the exponent has its minimum value of 0000.
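
    The field layout above can be sketched as a small C decoder. This is only an illustration, not part of the assignment: the name decode16 is hypothetical, the scaling of un-normalized values (same scale as exponent field 0001) is assumed to follow the IEEE convention, and every word with exponent 1111 is returned as NaN because the text above reserves that exponent for NaN.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical decoder for the 16-bit format described above:
   bit 15 = sign, bits 14-11 = exponent (bias 7), bits 10-0 = mantissa. */
double decode16(uint16_t w) {
    int sign = (w >> 15) & 0x1;
    int exp  = (w >> 11) & 0xF;
    int mant = w & 0x7FF;
    double m;   /* mantissa value, including the hidden bit */
    int e;      /* unbiased exponent */

    if (exp == 0xF) {
        return NAN;                  /* maximum exponent is reserved */
    }
    if (exp == 0) {
        m = mant / 2048.0;           /* hidden bit is 0 */
        e = 1 - 7;                   /* assumption: IEEE-style scaling */
    } else {
        m = 1.0 + mant / 2048.0;     /* hidden bit is 1 */
        e = exp - 7;
    }

    /* scale by 2^e without needing the math library */
    double scale = (e >= 0) ? (double)(1 << e) : 1.0 / (double)(1 << -e);
    double v = m * scale;
    return sign ? -v : v;
}
```

    For example, decode16(0x3800) has exponent field 0111 and a zero mantissa field, so it decodes to 1.0, and decode16(0x3A52) carries the mantissa field 01001010010 used in the example above.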

    a) Give the binary representation of the smallest positive nonzero value in this number system, an algebraic expression in decimal for the value it represents, and a decimal expression in scientific notation for that value, to the appropriate number of significant figures. (0.5 points)

    b) Give the binary representation of the largest positive legitimate value in this number system, an algebraic expression in decimal for the value it represents, and a decimal expression in scientific notation for that value, to the appropriate number of significant figures. (0.5 points)

  2. Background: Consider the problem of adding two fixed point numbers, one in R3 with 4 places right of the point, one in R4 with 7 places right of the point, where we want the sum in R5 with 4 places right of the point. We could do this:
            MOVE    R5,R4		; bug fixed from original version
            SR      R5,--?--
            ADD     R5,R5,R3
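
    As a reminder of what "places right of the point" means, here is a minimal C sketch, assuming the places are binary places. The name fixed_to_double is hypothetical and is not part of the assignment; it only shows how a raw integer is interpreted, not how to align two operands.

```c
#include <stdint.h>

/* Interpret a raw integer as a fixed point number with `places`
   binary places right of the point: value = raw / 2^places.
   Assumes 0 <= places < 31. */
double fixed_to_double(int32_t raw, int places) {
    return (double)raw / (double)(1 << places);
}
```

    Under that assumption, the raw integer 24 with 4 places and the raw integer 192 with 7 places both represent 1.5, which is why two operands must agree on the position of the point before they are added.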
    

    a) Give the shift count that should be used on the SR instruction to replace the --?--. (0.5 points)

    b) Give the instruction and its operand(s) that should be added to the above code (and indicate where this goes relative to the SR instruction) so that the result of the SR is rounded and not truncated. Note: There are two ways to do this, one with an instruction before the SR and one with a different instruction after the SR. (0.5 points)

  3. Background: The vector dot product is the sum of the products of corresponding vector elements. We can write this in C as:

    float dotprod( const float * a, const float * b, int len ) {
        float acc = 0.0;
        while (len > 0) {
            acc = acc + ((*a) * (*b));
            a = a + 1;
            b = b + 1;
            len = len - 1;
        }
        return acc;
    }
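
    A reference value can help when checking a translation of this function. The sketch below repeats the function so that it is self-contained and adds a hypothetical helper, dotprod_example, with made-up test vectors; neither the helper nor the vectors come from the assignment.

```c
/* The dotprod function from the assignment, repeated so that this
   sketch compiles on its own. */
float dotprod( const float * a, const float * b, int len ) {
    float acc = 0.0;
    while (len > 0) {
        acc = acc + ((*a) * (*b));
        a = a + 1;
        b = b + 1;
        len = len - 1;
    }
    return acc;
}

/* Hypothetical check: (1,2,3) . (4,5,6) = 4 + 10 + 18 */
float dotprod_example( void ) {
    const float a[] = { 1.0f, 2.0f, 3.0f };
    const float b[] = { 4.0f, 5.0f, 6.0f };
    return dotprod( a, b, 3 );
}
```

    With these vectors the result is 32.0, and a length of zero returns 0.0 because the loop body never runs.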
    

    A problem: Write the equivalent SMAL Hawk code. (1.0 points)