Consider using these switch modules to build a system with 4 CPUs and 4 memory modules. There are two ways to do this:

                                -/\-       -/\-
        |_|   |_|              -/  \-     -/  \-
CPU ---|   |-|   |-       CPU --\  /-------\  /--- M
CPU ---|___|-|___|-       CPU ---\/--     --\/---- M
        |_|   |_|                     \ /
CPU ---|   |-|   |-             -/\-   X   -/\-
CPU ---|___|-|___|-            -/  \- / \ -/  \-
        | |   | |         CPU --\  /-     -\  /--- M
        M M   M M         CPU ---\/---------\/---- M

Problem A: What is the difference in switching delay between CPU and M for these two approaches?
Problem B: How do these two approaches scale? Address both switching delay and number of components in your answer, as a function of N, the number of CPUs (equal to the number of Memory modules).
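One way to explore Problem B numerically is to tabulate both networks for a few values of N. The sketch below is not a model answer; it rests on assumptions read off the diagrams: every switch module is a 2x2 switch, N is a power of two, the first approach is an (N/2) by (N/2) array of switches, and the second is a network of log2(N) stages with N/2 switches per stage. The function names `array_network` and `staged_network` are hypothetical, and delay is counted simply as the number of switches a request passes through.

```python
def _check(n):
    # Assumption: N is a power of two and at least 2 (one 2x2 switch).
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two, n >= 2")


def array_network(n):
    """Return (switch count, worst-case switches traversed) for the array layout.

    Assumed path: a request enters a row from the left, moves right to the
    target column, then moves down through the remaining rows to the memory.
    """
    _check(n)
    side = n // 2                  # switches per row and per column
    switches = side * side
    worst_hops = 2 * side - 1      # across the top row, then down the last column
    return switches, worst_hops


def staged_network(n):
    """Return (switch count, switches traversed) for the multistage layout.

    Assumed structure: log2(n) stages, n/2 switches per stage, and every
    path crosses exactly one switch per stage.
    """
    _check(n)
    stages = n.bit_length() - 1    # log2(n) for a power of two
    switches = (n // 2) * stages
    return switches, stages


if __name__ == "__main__":
    for n in (4, 8, 16, 64):
        print(n, array_network(n), staged_network(n))
```

Under these assumptions both layouts use the same number of switches at N = 4, which agrees with the two diagrams; they diverge as N grows.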
Problem C: Both systems require that there be some specialization in each crossbar switch to customize it for its setting in the interconnection system. Clearly identify the aspects of this specialization that differ in the two interconnection schemes!

      ?   |_|   |_|
CPU --c--|   |-|   |-
CPU --a--|___|-|___|-
      c   |_|   |_|
CPU --h--|   |-|   |-
CPU --e--|___|-|___|-
      ?   | |   | |
         ???cache???
          | |   | |
          M M   M M

In either case, we add 4 caches; in one case, one per CPU, and in the other case, one per memory. Contrast these! What problems does each pose for the cache designer, and what are the potential benefits of each approach?