Startlingly enough, my crystal ball indicates that word sizes of 3, 9 and 27 bits will be the next great craze, suggesting that instead of quatrenary trees, we should be using trinary trees for carry propagation. (OK, you don't believe me? The truth is, you can find plenty of worked examples with quatrenary trees, so I can't use that for an exercise!)
Part A: Design the carry-lookahead unit for a fast adder based on trinary trees. This will have inputs Pi, Gi, Pi+1, Gi+1, Pi+2, and Gi+2 from the three subtrees above it, and input Ci from the right; it will have outputs Ci+1 and Ci+2 to the left and center subtree above it, and it will have outputs P and G to its parent in the carry lookahead tree.
This circuit has 7 inputs; as a result, a truth table specification is unlikely to be useful. Specify this with either a logic diagram and a set of boolean equations.
Part B: Give an Iowa Logic Specification Language specification for this part.
Part C: Consider a 27 bit adder using this circuit. A[0..26] are the individual one-bit adders at the leaves of the tree. Draw those parts of the tree connecting A[15..20] to the root, including all relevant carry connections and the logic to compute carry out from the entire adder. Simply use elipsis to indicate connections to the leaves and subtrees that this assignment allowed you to omit.
circuit inc( range word ); -- parts I: inc( 1..4 ) makes I into a 4 bit incrementor -- if I.cin is one, I.in is incremented by one and passed to I.out -- if I.cin is zero, I.in is output unchanged to I.out -- I.cout gives the carry out integer diff = 1 - first(word); circuit build( integer index, integer pin ); -- recursively builds most of the circuit! integer diff = index - pin; -- all inputs and outputs are to points defined globally in inc parts g: and( index ); inv: xor; if pin < last(word) then c: build( index + 1; pin + 1 ); endif; wires in( pin ) to inv.in(1); g.out to inv.in(2); inv.out to out( pin ); for i in first(word) .. (pin - 1) do in(i) to g.in( i + diff ); endfor; cin to g.in(index); end; -- build inputs in(word), cin; outputs out(word); cout; parts inv: xor; g: and( size(word) + 1 ); if size(word) > 1 then body: build( 2, first(word) + 1 ); endif; wires -- first bit data paths in( first(word) ) to inv.in(1); cin to inv.in(2); inv.out to out( first(word) ); -- other bits are connected in build -- carry out logic cin to g.in( size(word) + 1 ); for i in word do in(i) to g.in( i + diff ); endfor; g.out to cout; end;(The above circuit works quite well under the logic simulator!)
A Question: If we assume a constant delay per gate and no cost for interconnecting wires, this circuit would take a constant time to increment a number regardless of the number of bits.
Explain why no physical realization of this circuit will break O(log n). To do this, you will have to identify the component or components that pose special problems, and you will have to explain what those problems are that lead to the indicated performance.
Part A: Assuming that, prior to multiplying, the multiplicand is in M[sp], assume the multiplier is in t, and assume the accumulator is zero. What new data paths and functional operations must be added to the machine to allow multiplication?
Part B: Give the microinstruction sequence you would need for a 16 bit multipy. Note that no loop counter is available, so your code will include 16 repeats of the same microinstructions. You need only show one repeat of your code if you document the repeating pattern properly.