Assignment 9, due Oct 26

Part of the homework for 22C:60, Fall 2007
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Always, on every assignment, please write your name legibly as it appears on your University ID and on the class list! All assignments will be due at the start of class on the day indicated (usually a Friday). The only exceptions to this rule will be by advance arrangement unless there is what insurance companies call "an act of God" - something outside your control. Homework must be turned in on paper and in class! Late work may be turned in to the teaching assistant's mailbox, but see the late work policy. Never push late work under someone's door!

Homework

Background: The hardware of many computers uses the following multiplciation algorithm:
```
  unsigned long int times( unsigned int multiplier, unsigned int multiplicand)
  {
      unsigned long int mq = multiplier;
      int i;   
      for (i == BITS_PER_INT; i > 0; i--) {
          if ((mq & 1) == 1) {
              mq = (mq + (multiplicand << BITS_PER_INT)) >> 1;
          } else {
              mq = mq >> 1;
          }
      }
      return mq;
  }
```
Here, we assume that type long int offers at least twice the number of bits in a normal int, and that BITS_PER_INT tells how many bits are in an int. On a 16 bit computer, BITS_PER_INT would be 16 and long int variables would be 32-bits long. The variable mq above would be a register.
This multiplication algorithm seems totally different from the algorithm given at the start of chapter 9.
a) Show the values of the variables in this algorithm before the first iteration and after each of the iterations of the loop. Show the values in binary and decimal, starting from a multiplier of ten and a multiplicand of one hundred. (1/2 point)
b) Explain how this algorithm differs from the original and why. The following sub-questions may help you find a short sweet answer to this question: Does it add partial products? If so, how and in what order? What benefits does the change confer? Answers to these questions will earn partial credit, while there is a short sweet answer that will earn full credit using less total ink. (1/2 point)
c) Code this algorithm in Hawk assembly language with BITS_PER_INT equal to 32. (1/2 point)
Hints: You will not need to do any shifts for mq + (multiplicand << BITS_PER_INT) because the shift amount is one full register. However, the final shift of each iteration is a 64-bit right shift. You will need to read into the section on division to see how to do this!
A problem: Find the fastest code to do an unsigned 32-bit multiply by 1000 (decimal) on the Hawk. To do this, you will need to:
a) Find the fastest code to multiply by 1000 by a sequence of shifts and adds to sum the partial products. (1/2 point)
b) Find the fastest code to multiply by 1000 by a sequence of multiplies to multiply by factors of 1000 -- obviously, you want to multiply by the largest factos you can efficiently deal with, so not all factors will be primes. (1/2 point)
c) Compare the number of instructions in the two approaches. (no credit, since it is trivial)
A problem: Much of modern cryptography rests on manipulation of really big numbers. For example, the RSA cryptogsystem requires manipulation of 512-bit numbers. Consider the problem of shifting a 512-bit number one bit to the left. Assume that the number is originally in RAM at the address pointed to by R3, stored least-significant word first. Write code to shift it and put it back into RAM. Speed counts more than size for code like this, so use all the tricks you can to make it fast. (1/2 point)