1) Consider the problem of interlocking a pipeline to eliminate the need for a user to be aware of operand and branch delay slots. In the lectures, we mostly talked in terms of interlocks that compared one pipeline stage with its immediate predecessor in order to determine whether the predecessor pipeline stages should be stalled.
Give an example of an instruction set feature that requires comparison of non-adjacent pipeline stages, and explain clearly, in the context of a conventional set of pipeline stages, how this changes the interlock logic.
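For reference, the adjacent-stage interlock recalled in problem 1 can be sketched roughly as follows. This is only an illustration of the baseline case, not the answer to the problem. The `Instr` record, its `dest` and `srcs` fields, and the `stall_needed` function are hypothetical names chosen here, and the sketch assumes each instruction writes at most one register.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one optional destination register."""
    dest: Optional[int]          # register written, or None if none is written
    srcs: Tuple[int, ...] = ()   # registers read


def stall_needed(younger: Optional[Instr], older: Optional[Instr]) -> bool:
    """Adjacent-stage interlock check.

    `older` occupies a pipeline stage and `younger` occupies the stage
    immediately behind it. Return True if `younger` reads a register that
    `older` has not yet written, in which case `younger` and every stage
    behind it would be held. An empty stage (a bubble) is passed as None.
    """
    if younger is None or older is None:
        return False
    return older.dest is not None and older.dest in younger.srcs


# R3 := R1 + R2 followed by R5 := R3 - R4: the second reads R3 too early.
add = Instr(dest=3, srcs=(1, 2))
sub = Instr(dest=5, srcs=(3, 4))
print(stall_needed(sub, add))
```

The check looks only at one pair of neighboring stages, which is exactly the restriction problem 1 asks you to reconsider.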
2) Consider what happens to pipelined architectures when memory cycle times are significantly slower than the rate at which registers separated by simple combinational functions can be clocked. How would you change a pipelined architecture to accommodate this? If possible, identify two distinctly different paths a designer can take to deal with such problems (for example, do you reduce the clock speed to match the memory speed, or do you do something else?).
3) In a superscalar machine (one where multiple instructions are issued in each clock cycle), what new interlock problems are introduced, and what kind of logic would you have to add in order to solve them?