To continue our brief review, we will quickly go over a set of fundamental register transfer components and then review the process of compiling code from a high level programming language to register transfer form.
A sigle bit signal in a digital system is generally represented as a line, the conventional schematic symbol for a wire:
one bit ----------------------When multiple bits are involved, there are a number of equivalent notations:
D ----------------------
1
D ----------------------
2
D ----------------------
3
is the same as
D =========/============ (thickness indicates
1-3 3 multiple wires)
or the same as
D ---------/------------ (careful use of a
1-3 3 count indicates
multiple wires)
Making a distinction between "fat lines" and "thin lines" is helpful,
but it takes more work to draw the pictures this way.
Where there is only one line-thickness, slashes and wire counts
are needed on each line representing a bundle, and this must be
repeated as needed to make it clear that some particular line in
a schematic diagram represents more than one wire.
When "fat line" notation is used for bundles, particularly when
there is a standard word size, no "wire count" or slash is needed
for those bundles where the count is obvious from context.
Where it is necessary to show a bundle being broken into its individual wires, the following notation is common:
_
| _
D ----| | _
1 | |---- D |
D ----|===/=====| 1 |---- D
2 | 3 |==========/====| 2
D ----| |_ 2 |---- D
3 _| |_ 3
In using this notation, particularly when showing a bundle being
broken into sub-bundles, it is important ot make it clear using
subscripts or wire names which wires go in which bundle!
A register is a collection of identical flipflops all sharing a common clock. For example, we migh construct a 3 bit positive pulse-triggered latch from 3 positive pulse-triggered D latches as:
D D D
2 1 0
| _______ | _______ | _______
| | | | | | | | |
--|D Q|-- --|D Q|-- --|D Q|--
| _ | | | _ | | | _ | |
| _| |_ | | | _| |_ | | | _| |_ | |
| | | | | | | | |
--|C | | --|C | | --|C | |
| |_______| | | |_______| | | |_______| |
| | | | | |
C-o-------------|--o-------------|-- |
| | |
Q Q Q
2 1 0
Generally, we will avoid drawing out the gate-level components,
and instead, we will simply abbreviate this register as:
D
2-0
|
/3
________|________
| _ |
C---|>_| |_ R |
|_________________|
|
/3
|
Q
2-0
The commonly used notation clearly marks, on the bundled wires,
the number of conductors, while leaving it to the reader to infer
that the register must have that number of bits. External
connections are generally given names, with subscripts indicating
the number of conductors and their order. Thus, if the input is
labeled D2-0, the most significant bit is D2
and the least significant bit is D0.
It is also convenient to name the register itself. In the above figure, the register is named R, and the label could have been extended with subscripts, for example, R2-0 to indicate that R has 3 bits.
There is no universally accepted ordering for the numbering of the bits in a byte or word. It makes good sense to number them consecutively, but numbering may start at 0 or 1, and numbering may start at the most significant or least significant end. I prefer numbering starting at 0 from the least significant end, so bit i has the binary weight 2i if the word contains an integer. The numbering from the most significant end makes excellent sense for fixed point fractions, where bit i has the binary weight 2-i; this is how the designers of most computers of the 1940's and early 1950's thought, because they were motivated by problems representing the real numbers used in scientific computing and not by integers.
Designers, of course, are free to label things any way they find convenient, but generally, once a labeling system is found, it is best to stick to it! Mixing labeling schemes leads to chaos, so, for example, the label R0-2 could have been used in to name the register in the above figure, leaving the labels on the wires unchanged, but in the context of the labels on the inputs and outputs, this would be in very poor taste because it implies that R0 is the most significant bit, connected to the input D2 and output Q2. This kind of confusion is unavoidable where components from different designers or manufacturers must be interconnected, but it should be avoided whenever possible!
In theory, no digital system designer needs more than one type of edge triggered register, but optimal designs sometimes require more types. So, it is handy to keep at least the following 4 types in mind:
|
________|_______
| _ | positive pulse
---|>_| |_ | triggered
|________________| register
| |
| ________|_______
positive edge | _ |
triggered ---|>_| |
register |________________|
| |
________|_______ |
| _ _ | negative pulse
---|> |_| | triggered
|________________| register
| |
| ________|_______
negative edge | _ |
triggered ---|> |_ |
register |________________|
|
|
The pulse-triggered registers are made with type D latches,
while the edge-triggered registers are made with edge-triggered
or master-slave flipflops. A simple inverter on the clock line
into a positive pulse or edge triggered device will convert it
to a negative pulse or edge triggered device.
It is worth remembering that register transfer level designers rarely make any explicit use of the Qbar outputs of registers, but that these outputs are available at almost no cost. If a register is located, in the physical realization of the circuit, adjacent to some function that requires the inverted outputs, it is quite reasonable to extract them directly from the register without adding additional inverters. On the other hand, if the inverse is needed at some distance from the register, the cost of the added inverters is frequently less than the cost of the extra interconnections needed to move both the Q outputs and the Qbar outputs from the register.
When some logic element must process data from multiple sources, a multiplexor can be used to select the sourde. A 2-input 1-output multiplexor is composed of elements with the following truth table:
A B C | D ---------|--- 0 0 0 | 0 0 0 1 | 0 0 1 0 | 0 0 1 1 | 1 1 0 0 | 1 1 0 1 | 0 1 1 0 | 1 1 1 1 | 1Here, when the control input C is 0, the output D is equal to the input A, while when the control input is 1, the output is equal to input B. The gate-level realization of a multiplexor is:
____
A --------| \
| )O--
-----|____/ | _____
| ---O\ \
| ) )------- D
| ____ ---O/____/
B --|-----| \ |
| | )O--
| --|____/
| |
| o
| /_\
| |
C --o--
In register transfer diagrams, this function is frequently
drawn using one or the other of the following notations:
A B A B
| | | |
_|_____|_ _|_____|_
| 0 1 | \ 0 1 /
C ----| MUX | C ----\ /
|_________| \_____/
| |
| |
C C
This notation may be used for either single bit or multiple
bit multiplexors, with the presence of multiple bits indicated
by "bundle" notation on the interconnections and not by anything
within the box. The slope-sided notation is commonly used without
a label by designers who limit the use of this symbol to multiplexors
and use it for nothing else. Other designers use the slope-sided
notation for any function that combines multiple words to make one
word. When that notation is used, a function label is, of course,
required.
Multiplexors for more than 2 inputs may be made by cascading 2-input multiplexors, as:
A A A A
0 1 2 3
| | | |
_|_____|_ _|_____|_
\ 0 1 / \ 0 1 /
C ---- \ /-----------\ /
0 \_____/ \_____/
| |
| |
------ ------
| |
_|_____|_
\ 0 1 /
C -------------- \ /
1 \_____/
|
|
D
But, this adds delays. A multiplexor is a simple combinarional
circuit, so it is possible to create a multiplexor with any number
of inputs that has a maximum delay of 3 gate delays. An optimal
design for a 4-input multiplexor can be given as follows:
C C
1 0
| |
o-- o--
| _|_ | _|_
| \ / | \ /
| O | O
| | | | ___
-|--o----|--|--| \
-|--|----|--o--| )O----
A ----|--|----|--|--|___/ |
0 | | | | ___ | |
-|--o----|--|--| \ ---|
-|--|----o--|--| )O-- |___
A ----|--|----|--|--|___/ ------| \
1 | | | | ___ | )O---- D
-o--|----|--|--| \ ------|___/
-|--|----|--o--| )O-- |
A ----|--|----|--|--|___/ ---|
2 | | | | ___ | |
-o--|----|--|--| \ |
-|--|----o--|--| )O----
A ----|--|----|--|--|___/
3 | | | |
Both the multiplexor tree and the optimal design are likely
to be drawn as follows in a register-transfer-level logic diagram:
A A A A
0 1 2 3
| | | |
_|_____|_____|_____|_
\ 00 01 10 11/
C ----/----\ /
1-0 2 \_________________/
|
|
D
The dimension on the control input to this multiplexor can be
inferred from the number of inputs to the multiplexor, but the
order of the subscripts cannot. The latter is essential to
the determination of which combination of control inputs will
select what data input. The labels on the inputs can be given
in any system that is obvious, in context. Binary and hexadecimal
labels are common, and symbolic labels are sometimes used when
there is a binding of the bit patterns on the control input to
some set of symbols.
If some combinational logic element is to be replicated for each bit in a word, this can be easily represented as follows:
A B
| |
/ /
___|_________|___
| _ |
| F = A or B |
|_________________|
|
/
|
F
The above example indicates that the identical function is computed
for each pair of corresponding input bits in order to compute
the corresponding output. For a single bit i, the example
computation is as follows:
_____
A ------\ \
i ) OR )------- F
B -----O/____/ i
i
It is quite common to discover that, along some data path within a digital system, some selection must be made between combinational functions. Typically, this can always be done as follows, using a brute-force implementation:
A B
| |
---------o-----|---
| | |
| ---------o---|-----
| | | |
_|_____|_ _|_____|_
| x y | | x y |
| _ | | _ |
| z=x|y | | z=xy |
| | | |
|____z____| |____z____|
| |
------ ------
_|_____|_
\ 0 1 /
C -------------- \ /
\_____/
|
F
We will frequently abbreviate such circuits using the following
notation:
A B
| |
___|_________|___
| x y |
| _ |
| 0 z = x|y |
C --| _ |
| 1 z = xy |
| |
|________z________|
|
|
F
Here, we use the same convention as was used with multiplexors,
control inputs are shown from the left when data inputs are from
the top, or from the bottom when data inputs were from the side.
To increment a binary number, we must compute a value for each bit that depends both on that bit and on the less significant bits. Consider the following worked example, adding 1 to 10112:
0 0 1 1 - the carry bits
N 1 0 1 1
+ 1
-------------
N+1 0 1 1 0 0
Each column of this sum produces a 2-bit result, with values
ranging from 0 to 2. The least significant bit of each single-column
addition is the result bit for that column. The most significant
bit is the carry into the next most significant column. The result
of the increment operation always has the potential to require one
more bit than the input.
A simple truth table can be constructed to describe what is done for each column:
in + out n c | c s i i | i+1 i --------|---------- 0 0 | 0 0 0 1 | 0 1 1 0 | 0 1 1 1 | 1 0Here, we have labeled the carry inputs to bit i of the number as ci and the carry output from bit i as ci+1. It is easy to see the following relations from the truth table:
The full increment function on a 3 bit number can be implemented by a series of these 1-bit operations:
N N N
2 1 0
| | | 0
| __ | __ | ___|
__|_| | __|_| | __|_|
| n c| | | n c| | | n c|
| + | | | + | | | + |
|c_s__| | |c_s__| | |c_s__|
C __| | |__| | |__| |
3 | | |
| | |
S S S
2 1 0
At the register transfer level, we will typically abbrevaite
this as one of the following:
N N N
| | |
/ 3 / 3 / 3 C
_____|_____ _____|_____ _____|___|_
| | | | | |
| +1 | | +1 | | +1 |
|___________| |___________| |___________|
| | | |
/ 3 / 4 C / 3
| | |
S S S
In the leftmost example, the carry out is ignored! In the center
example, the carry out is incorporated into the output as a 4th
bit. In the rightmost example, the carry out is extracted as a
distinct output. The form used depends on what is needed in context!
In the 2 left examples above, the C input to the least significant bit of the adder chain has been ignored. We assume that it is one if it is not mentioned. In the rightmost example, the C input is brought out explicitly so that it may have a value other than 1.
We will also occasionally use the following form:
N
|
______|______
| x |
| 0: f = x |
C --| |
| 1: f = x+1 |
|______f______|
|
|
S
Here, we have treated the C input as a control input for selecting
between two functions, increment and do nothing.
A full adder for two binary numbers is described by the following truth table
in + out a b c | c s i i i | i+1 i ----------|---------- 0 0 0 | 0 0 0 0 1 | 0 1 0 1 0 | 0 1 0 1 1 | 1 0 1 0 0 | 0 1 1 0 1 | 1 0 1 1 0 | 1 0 1 1 1 | 1 1Here, aside from adding a new input, we have labeled things compatably with the increment example. Inspection gives the following implementation:
a b c
i i i
| | |
| __ | _______|
__|_| | __|_|
| n c| | | n c|
| + | | | + |
|c_s__| | |c_s__|
_____ | | | | |
/ /-- | --|-
---( ( | |
| \____\----|-------
| |
c s
i+1 i
Here, we have violated a standard rule of logic schematic diagrams:
Data should always flow from left to right and top to bottom. The
reason is, of course, that the convention from arithmetic, that
carry is propagated from right to left, has been allowed to dominate!
The one-bit increment components are referred to as half adders precisely because it takes two of them to make a full adder as shown above.
We will usually abbreviate adders as follows:
A B
| |
___|_________|___
| x y |
| |
| f = x + y |
| |
|________f________|
|
|
S
The options for treating carry out from such an adder and
the options for treating carry into the least significant bit
can be dealt with in the same way that they were for the incrementor.
If speed is of the essence, as it is in most high performance computer architectures, the "ripple carry" approach to computing the sum outlined above is not very good. The adder outlined above has a speed proportional to O(n) on an n bit word. There are practical ways of building an adder with O(log n) speed, and of course, optimal design methods can always be applied to give something with 3 gate delays, for an apparent speed of O(1). The latter is only illusion, though, because it requires gates with O(n) inputs and O(n) fanout, and the speed of such gates tends to fall of as O(log n)! As a result, O(log n) speed is the realistic goal for designers of fast arithmetic hardware.
Subtraction of binary numbers can be handled by adding the two's complement, and the two's complement is one plus the one's complement. This leads to the following naive design for a subtractor:
B
|
___|___
| |
| not |
|_______|
|
___|___
| |
| +1 |
A |_______|
| |
___|_________|___
| x y |
| |
| f = x + y |
| |
|________f________|
|
|
S
Noting that the Cin input to the adder can also be
used to add one to the sum, we can collapse this to:
B
|
___|___
| |
| not |
A |_______| 1
| | ____|
___|_______|___|_
| x y C |
| in|
| |
| f = x + y + C |
| in|
| C |
|___out__f________|
| |
| |
C S
This is a common way to realize a subtractor, and it has the
consequence that Cout=1 indicates that there was no
borrow and Cout=0 indicates that there was a borrow.
Of course, we can also simply design the truth table for the subtractor from first principles and implement it directly.
While adders and subtractors are useful and sufficient, it is common to design single functional units that can perform both jobs. Such a unit is usually referred to as an Arithmetic Logic Unit. Here is one typical design:
A A A
2 B 1 B 0 B
| 2 | 1 | 0
| | | | | |
F --|---o-|------|---o-|------|--- |
O | |_| | |_| | |_|
| |XOR| | |XOR| | |XOR|
| |___| | |___| | |___|
| __| | __| | __|
| | | | | |
F --|-|-o--------|-|-o--------|-|-
1 | | | __ | | | __ | | | ----F
| | |_| | | | |_| | | | |_| 2
| ||AND| | | ||AND| | | ||AND|
| ||___| | | ||___| | | ||___|
_|_|__| | _|_|__| | _|_|_|_
| a b c | | | a b c | | | a b c |
| + | | | + | | | + |
|c__s___| | |c__s___| | |c__s___|
| | |__| | |__| |
C -- | | |
out | | |
S S S
2 1 0
This example ALU computes 8 possible functions of A and B,
although only 6 are distinct from each other.
These are enumerated below:
F F F | operation 2 1 0 | ------------|------------------------------------ 0 0 0 | S = A xor B 0 0 1 | S = A equ B = A xor (not B) 0 1 0 | S = A + B 0 1 1 | S = (A - B) - 1 = A + (not B) 1 0 0 | S = A xor B 1 0 1 | S = A xor (not B) = A equ B 1 1 0 | S = A + B + 1 1 1 1 | S = A - B = A + (not B) + 1This ALU design is typical of many commercial ALU designs! One typical feature is that many of the control input combinations produce operations of doubtful utility. If some carefully selected and gates in an optimal version of the adder are given additional enable inputs, the result is a few more useful operations and many more useless ones, with the addition of only one or two additional control lines. Because of this, it useful systems almost always hide the ALU control lines from the user, and instead, run the user's operation code through a small ROM (table lookup) to select from among the useful ALU operations without exposing the user to the useless or redundant operations.