This exam is worth 1/5 of the final grade (20 points; allocate 2 minutes per point).
variables
    src1, src2, dst: addresses initialized by CPU
    count: operand count initialized by CPU
    op: operation specifier set by CPU (add, subtract, etc.)

repeat
    if count > 0
        M[dst] = f(op, M[src1], M[src2])
        src1 = src1 + 1; src2 = src2 + 1; dst = dst + 1;
        count = count - 1
    endif
forever

Part A: Rewrite the above so it accomplishes one register transfer per assignment while conforming to the restriction on the number of memory accesses per clock cycle. (2.0 points)
Part B: Rewrite the code from part A to indicate which register transfers can be accomplished in parallel (group parallel transfers using brackets), and then indicate how many clock cycles the code requires per iteration. (1.5 points)
Part C: Sketch the register transfer logic diagram that implements your solution to part B. (2.5 points)
Part D: Given the constraint forbidding parallel memory accesses, can pipelining be applied to this coprocessor to speed its operation? Why or why not? (1.0 points)
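For reference, the behavior of the coprocessor loop given at the start of this problem can be modeled in software. The following is only a behavioral sketch, not a register-transfer solution to any part: the function name run_until_idle, the OPS table, and the use of a Python list as word-addressable memory are assumptions made for illustration, and the model returns when count reaches 0 instead of idling forever as the hardware would.

```python
import operator

# Hypothetical operation table; the exam only names "add, subtract, etc."
OPS = {"add": operator.add, "sub": operator.sub}

def run_until_idle(mem, src1, src2, dst, count, op):
    """Model the coprocessor loop until count reaches 0.

    The device described in the exam repeats forever, doing nothing while
    count == 0; this model simply returns at that point.
    """
    f = OPS[op]
    while count > 0:
        # M[dst] = f(op, M[src1], M[src2])
        mem[dst] = f(mem[src1], mem[src2])
        # Advance all three pointers and decrement the operand count.
        src1 += 1
        src2 += 1
        dst += 1
        count -= 1
    return src1, src2, dst, count

# Example: add two 3-word vectors stored at addresses 0 and 3 into address 6.
mem = [1, 2, 3, 10, 20, 30, 0, 0, 0]
final = run_until_idle(mem, src1=0, src2=3, dst=6, count=3, op="add")
```

Each iteration of the model touches memory three times (two reads, one write), which is the property the memory-access restriction in Parts A and D is concerned with.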
Part A: Write instruction-level pseudocode to correctly compute the following on this machine, assuming that the values of the variables I, X and Y are all stored in registers at the start of the computation; I holds an integer, X and Y are memory addresses, and memory is word-addressable:
    I = I + 1; X[I] = Y[I] + 2

Do not attempt to optimize your code! Write it out with no-ops filling all delay slots! (3.0 points)
Part B: Rewrite your answer to part A overlapping as many computations as possible so that use of no-ops is minimized. (2.0 points)
Part C: Which of the following instructions could be implemented using this pipeline model:
Part A: How many memory references could one instruction require in order to execute to completion? From this, can you conclude how many memory ports a pipelined CPU for this instruction set would require? (1.0 point)
Part B: Propose a pipeline structure for this machine, indicating the purpose of each stage: for example, what computations it performs, what memory references it performs, and what registers it accesses. (4.0 points)
Part C: Given that this machine has 16 registers and a 32-bit word, propose a reasonable instruction format for this instruction set. (1.0 points)