11. Hawk Special Instructions

Part of the Hawk Manual
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science


11.1. Special Instruction Format
11.2. Adjust Result of Arithmetic Operation
11.3. Add Without Change to the Condition Codes
11.4. Get and Set Coprocessor Registers
11.5. Get and Set Special CPU Registers (Privileged)

11.1. Format

Special Instruction Format

07 06 05 04   03 02 01 00   15 14 13 12   11 10 09 08
 0  0  0  1       dst        0  -  -  -       src

The special instructions are 16 bits each, specifying one special register (src) and one general purpose register (dst). The names src and dst are misleading in this context, because depending on the instruction, data may be transferred in either direction.

All of the instructions listed here are specialized and will only be needed under certain circumstances. The ADJUST instructions are useful to accelerate certain arithmetic operations, particularly those involved in decimal arithmetic. Similarly, the coprocessor instructions are useful only when the services of specialized coprocessors are needed.

The final instructions here, CPUGET and CPUSET, are privileged instructions. They are only used in bootstrap and operating system code and may be safely ignored by most programmers.

11.2. Adjust Result of Arithmetic Operation

07 06 05 04   03 02 01 00   15 14 13 12   11 10 09 08
 0  0  0  1     dst (nz)     0  1  0  1       src       ADJUST dst,src   r[dst] = r[dst]+adj[src]

ADJUST    NZVC unchanged

The ADJUST instruction adds an adjustment to the destination register without making any change to the processor status word. There are two groups of adjustments. The first group computes the adjustment to the destination as some function of the processor status word, while the second group is used to add small powers of two to the destination.

The following adjustments are available, specified by the src field of the instruction:

src = 0 0 1 0  BCD      Correct result of ADD to produce a BCD sum.
                        (Subtract 6 from each nibble that corresponds to a zero bit in
                        the BCD carry field of the PSW.)
      0 0 1 1  EX3      Correct result of ADD to produce an excess-3 sum.
                        (Add 3 to each nibble corresponding to a one bit in the BCD carry
                        field of the PSW; subtract 3 from the other nibbles.)
      0 1 0 0  CMSB     Add the C bit from the PSW to the most significant bit.
      0 1 0 1  SSQ      Truncate quotient toward zero after SR.
                        (Add (N ∧ V) to the least significant bit.)
      1 0 0 0  PLUS1    Add 1 without change to the condition codes.
      1 0 0 1  PLUS2    Add 2 ...
      1 0 1 0  PLUS4    Add 4
      1 0 1 1  PLUS8    Add 8
      1 1 0 0  PLUS16   Add 16
      1 1 0 1  PLUS32   Add 32
      1 1 1 0  PLUS64   Add 64
      1 1 1 1  PLUS128  Add 128

To add the BCD number in R4 to the BCD number in R3, use the following instruction sequence, assuming that R15 holds the hexadecimal constant 66666666.

	ADD	R3,R3,R15	; add an extra 6 to each BCD digit
	ADD	R3,R3,R4	; -- primary add instruction --
	ADJUST	R3,BCD		; correct the result using the BCD carry bits
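
The effect of this sequence can be checked with a small Python model. This is only a sketch, not Hawk code: the helper names are hypothetical, and the BCD carry field is modeled as one carry-out bit per 4-bit nibble, which is how the adjustment table above describes it.

```python
MASK32 = 0xFFFFFFFF

def add_nibbles(a, b, carry_in=0):
    """Add two 32-bit words nibble by nibble.

    Returns (sum, carry_out, bcd_carries). Bit i of bcd_carries is the
    carry out of nibble i (a model of the BCD carry field of the PSW).
    """
    total, carries, carry = 0, 0, carry_in
    for i in range(8):
        s = ((a >> (4 * i)) & 0xF) + ((b >> (4 * i)) & 0xF) + carry
        carry = s >> 4
        total |= (s & 0xF) << (4 * i)
        carries |= carry << i
    return total, carry, carries

def adjust_bcd(value, carries):
    """Model of ADJUST dst,BCD: subtract 6 from each nibble whose carry bit is 0."""
    for i in range(8):
        if not (carries >> i) & 1:
            nibble = (value >> (4 * i)) & 0xF
            value &= ~(0xF << (4 * i)) & MASK32
            # After the +6 bias a nibble with no carry out is at least 6,
            # so this subtraction never needs a borrow for valid BCD inputs.
            value |= ((nibble - 6) & 0xF) << (4 * i)
    return value

def bcd_add(r3, r4, r15=0x66666666):
    """Model of the three-instruction BCD add; returns (sum, carry_out)."""
    r3, _, _ = add_nibbles(r3, r15)     # add an extra 6 to each BCD digit
    r3, c, bcd = add_nibbles(r3, r4)    # primary add; sets C and the BCD carries
    return adjust_bcd(r3, bcd), c       # correct using the BCD carry bits

print(hex(bcd_add(0x00001234, 0x00005678)[0]))  # 0x6912
```

The extra 6 in each digit makes a decimal carry (digit sum of 10 or more) coincide with a binary carry out of the nibble; the adjustment then removes the 6 from every digit that did not carry.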

If R3 and R4 hold excess-3 decimal numbers, use the following instruction sequence to compute the excess-3 sum in R3:

	ADD	R3,R3,R4	; -- primary add instruction --
	ADJUST	R3,EX3		; correct the result using the BCD carry bits

These do not change the BCD carry or condition codes, so they work for higher-precision BCD or excess-3 arithmetic. For a 64-bit excess-3 sum of <R3,R4> and <R5,R6>, use:

	ADD	R3,R3,R5	; add least significant bits
	ADJUST	R3,EX3		; correct least significant bits
	ADDC	R4,R6		; add most significant bits
	ADJUST	R4,EX3		; correct most significant bits
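
The multi-word case can be modeled in Python to see why leaving C untouched matters. This is a sketch under stated assumptions, not Hawk code: the helper names are hypothetical, the BCD carry field is modeled as one carry-out bit per nibble, and the 64-bit value is taken to have its least significant word in the first register of the pair.

```python
MASK32 = 0xFFFFFFFF

def add_nibbles(a, b, carry_in=0):
    """32-bit add done nibble by nibble; returns (sum, C, bcd_carries)."""
    total, carries, carry = 0, 0, carry_in
    for i in range(8):
        s = ((a >> (4 * i)) & 0xF) + ((b >> (4 * i)) & 0xF) + carry
        carry = s >> 4
        total |= (s & 0xF) << (4 * i)
        carries |= carry << i
    return total, carry, carries

def adjust_ex3(value, carries):
    """Model of ADJUST dst,EX3: +3 where the nibble carried, -3 elsewhere."""
    for i in range(8):
        nibble = (value >> (4 * i)) & 0xF
        delta = 3 if (carries >> i) & 1 else -3
        value &= ~(0xF << (4 * i)) & MASK32
        value |= ((nibble + delta) & 0xF) << (4 * i)
    return value

def ex3_add64(a, b):
    """Excess-3 sum of two 16-digit values held as 64-bit integers."""
    lo, c, bcd = add_nibbles(a & MASK32, b & MASK32)   # ADD on the low words
    lo = adjust_ex3(lo, bcd)                           # ADJUST (EX3), C is kept
    hi, c, bcd = add_nibbles(a >> 32, b >> 32, c)      # ADDC on the high words
    hi = adjust_ex3(hi, bcd)                           # ADJUST (EX3) on the high word
    return (hi << 32) | lo

def to_ex3(n, digits=16):
    """Encode a non-negative decimal integer as excess-3 digits."""
    v = 0
    for i in range(digits):
        v |= ((n % 10) + 3) << (4 * i)
        n //= 10
    return v

def from_ex3(v, digits=16):
    """Decode excess-3 digits back to an integer."""
    n = 0
    for i in reversed(range(digits)):
        n = n * 10 + (((v >> (4 * i)) & 0xF) - 3)
    return n

print(from_ex3(ex3_add64(to_ex3(99999999), to_ex3(1))))  # 100000000
```

In the example, the carry out of the low word survives the first adjustment and is consumed by the add of the high words, which is the property the text relies on.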

To do a one-bit circular right shift to R1, use:

	SRU	R1,1		; shift right
	ADJUST	R1,CMSB		; roll the carry out into the high bit

To do a one-bit right shift to the 64-bit value in <R3,R4>, use:

	SRU	R3,1		; shift bits 31 to 0
	SRU	R4,1		; shift bits 63 to 32 (bit 32 goes into C)
	ADJUST	R3,CMSB		; move bit 32 to top of lower register
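
Both shift idioms can be modeled in a few lines of Python. This is a sketch, not Hawk code: the function names are hypothetical, and it assumes that a one-bit SRU leaves the bit shifted out in C, as the comments in the sequences above suggest.

```python
MASK32 = 0xFFFFFFFF

def sru1(x):
    """One-bit unsigned right shift; returns (result, C), C = bit shifted out."""
    return (x >> 1) & MASK32, x & 1

def adjust_cmsb(x, c):
    """Model of ADJUST dst,CMSB: add the C bit to the most significant bit."""
    return (x + (c << 31)) & MASK32

def rotate_right1(r1):
    """One-bit circular right shift of a 32-bit word."""
    r1, c = sru1(r1)
    return adjust_cmsb(r1, c)

def shift_right64(lo, hi):
    """One-bit right shift of the 64-bit value held as (low word, high word)."""
    lo, _ = sru1(lo)            # shift bits 31 to 0
    hi, c = sru1(hi)            # shift bits 63 to 32; bit 32 lands in C
    lo = adjust_cmsb(lo, c)     # bit 32 becomes the new bit 31
    return lo, hi

print(hex(rotate_right1(0x00000001)))  # 0x80000000
```

Because the shift has just cleared the top bit of the word being adjusted, adding C into that position cannot overflow; it simply sets the bit when C is one.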

If you use an SR instruction to divide, for example, –10 by 8, the quotient will be –2 with a remainder of 6; the remainder has the same sign as the divisor. Naive graduates of elementary school arithmetic expect a quotient of –1 and a remainder of –2; the remainder has the same sign as the dividend. To divide a signed number in R3 by 8, following the naive rules, use:

	SR	R3,3		; divide R3 by 8 with simple truncation
	ADJUST	R3,SSQ		; if N and V, add 1 to truncate toward zero
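
The arithmetic behind this correction can be checked with a short Python model. This is a sketch only: the function names are hypothetical, Python's unbounded integers stand in for 32-bit registers, and it assumes that SR sets N from the sign of the result and V when any one bit was shifted out (that is, when the division was inexact). The definition of V is not given in this section, so treat it as an assumption of the model.

```python
def sr(x, n):
    """Signed right shift by n bits; returns (result, N, V) under the
    assumptions stated above."""
    result = x >> n                        # Python's >> rounds toward minus infinity
    n_flag = result < 0
    v_flag = (x & ((1 << n) - 1)) != 0     # assumed: V = some one bit was shifted out
    return result, n_flag, v_flag

def adjust_ssq(x, n_flag, v_flag):
    """Model of ADJUST dst,SSQ: add (N and V) to the least significant bit."""
    return x + (1 if (n_flag and v_flag) else 0)

def divide_pow2(x, n):
    """Divide x by 2**n, truncating toward zero."""
    q, n_flag, v_flag = sr(x, n)
    return adjust_ssq(q, n_flag, v_flag)

print(sr(-10, 3)[0], divide_pow2(-10, 3))  # -2 -1
```

Exact quotients such as -16 / 8 are left alone because V is clear, and positive dividends are left alone because N is clear.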

To increment an index register without changing the condition codes, as is sometimes needed in address arithmetic, use, for example:

        ADJUST  R3,PLUS4	; R3 now points to the next word

11.3. Add Without Change to the Condition Codes

07 06 05 04   03 02 01 00   15 14 13 12   11 10 09 08
 0  0  0  1     dst (nz)     0  1  0  0    src (pc)     PLUS dst,src     r[dst] = r[dst]+r[src]

PLUS      NZVC unchanged

The PLUS instruction allows register-to-register addition without changing the condition codes. This is occasionally useful, for example, when doing address arithmetic equivalent to the LEA instruction where the constant being added to a register is greater than can be represented in 16 bits. It is also useful if, for some reason, long 32-bit instructions must be avoided.

11.4. Get and Set Coprocessor Registers

07 06 05 04   03 02 01 00   15 14 13 12   11 10 09 08
 0  0  0  1     dst (x)      0  0  1  1       src       COGET dst,src    r[dst] = co[src]
 0  0  0  1     srcx (0)     0  0  1  0       src       COSET dst,src    co[src] = r[srcx]

COGET     NZVC depend on coprocessor currently selected
COSET     NZVC unchanged

Hawk (but not Sparrowhawk) processors include special coprocessors. Each Hawk coprocessor may have 15 internal registers. The COSET (coprocessor set) instruction is used to set the value of a coprocessor register from a general purpose register, and COGET (coprocessor get) is used to get the value of a coprocessor register into a general purpose register. Coprocessors may also use the src field for operation selection; how this is done is up to that coprocessor. On the Sparrowhawk, COGET and COSET are unimplemented instructions, although operating systems may provide virtual coprocessor support.

Coprocessors operate asynchronously from the CPU. The src field of these instructions designates the coprocessor register, while the dst field designates the CPU register, with the opcode determining the direction of the data transfer. Typically, data transfers to the coprocessor via COSET initiate expensive computations that may proceed in parallel with execution of code on the CPU, while transfers from the coprocessor via COGET get results, possibly forcing the CPU to wait in the event that the coprocessor is not yet ready, and possibly initiating additional computation in the coprocessor. COGET also sets the condition codes as determined by the coprocessor being used.

Hawk systems may include up to seven coprocessors. Coprocessor register zero (COSTAT) is shared by all of them. Bits in COSTAT indicate which coprocessors are enabled and which is active. COGET r,COSTAT sets Z if the result is zero (unimplemented bits in COSTAT always read as zero). The other condition codes are reset. To test if a coprocessor is present, attempt to enable it and then see if the enable bit was successfully set. For example, to see if the floating point coprocessor is available, use:

        LIS     R3, FPENAB
        COSET   R3, COSTAT      ; attempt to enable floating point
        COGET   R3, COSTAT      ; read the status back; Z is set if it is zero
        BZS     NOFLOAT	        ; branch if it was not enabled

Note that enabling a coprocessor does not select it for use. Disabling unneeded coprocessors may save power but may cause loss of data from the coprocessor's registers.

If the coprocessor selected by COSTAT is not present or is currently disabled, use of the COGET or COSET instructions to operate on that coprocessor will cause a coprocessor trap. It is always legal to reference COSTAT (coprocessor register zero); thus, it is always legal to deselect a coprocessor in the event that a nonexistent coprocessor was accidentally selected.

11.5. Get and Set Special CPU Registers (Privileged)

07 06 05 04   03 02 01 00   15 14 13 12   11 10 09 08
 0  0  0  1     dst (pc)     0  0  0  1       src       CPUGET dst,src   r[dst] = cpu[src]
 0  0  0  1     srcx (0)     0  0  0  0       src       CPUSET dst,src   cpu[src] = r[srcx]

NZVC unchanged

CPUGET and CPUSET are privileged instructions. If they are used when the level field of the PSW is 1111, they will cause a privilege violation trap. These instructions manipulate the set of up to 16 special registers inside the CPU. These are:

src = 0 0 0 0  PSW   the processor status word
      0 0 0 1  TPC   the trap program counter
      0 0 1 0  TMA   the trap memory address
      0 0 1 1  TSV   the trap save register
      1 0 0 0  CYC   the cycle-count register

Some CPUs may include more registers. Special registers 4 through 7 are reserved for the MMU interface.

The PSW is more fully described elsewhere. When there is a trap, the program counter is saved in the TPC register. After a trap caused by memory addressing, the TMA register holds the virtual memory address that caused the trap.

The TSV register has no hardware-defined use. It may be loaded and stored by software. It is reserved for use by trap-service routines, where it is needed in order to save and restore registers after a trap.

The CYC register increments with every memory reference (fetch, load or store), making it useful for performance measurement. A counter may be added to count CPU clock cycles. Other registers may give access to CPU internals for hardware diagnostics.

If CPUGET is used with dst = 0, the program counter is loaded from the given CPU register, and as a side effect, the level field of the PSW is set to the old-level field. As a result, CPUGET R0,TPC serves as the return from trap. Assemblers should provide the mnemonic RTT (return from trap) for this:

07 06 05 04   03 02 01 00   15 14 13 12   11 10 09 08
 0  0  0  1    0  0  0  0    0  0  0  1    0  0  0  1   RTT              pc = tpc; level = prior
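
The encodings tabulated in this chapter can be reproduced with a small Python sketch that packs the three fields into a 16-bit halfword. The function and constant names are hypothetical; the field positions and values are taken from the tables above. Reading the bit-number headings (bits 07 to 00 listed before bits 15 to 08) as meaning that the low byte comes first in memory is an assumption.

```python
def encode_special(opcode, dst, src):
    """Pack a special instruction: bits 15-12 opcode (top bit always 0),
    bits 11-8 src, bits 7-4 fixed at 0001, bits 3-0 dst."""
    if not (0 <= opcode < 8 and 0 <= dst < 16 and 0 <= src < 16):
        raise ValueError("field out of range")
    return (opcode << 12) | (src << 8) | (0b0001 << 4) | dst

# Field values copied from the tables in this chapter.
CPUGET, ADJUST = 0b0001, 0b0101
TPC, PLUS4 = 0b0001, 0b1010

rtt = encode_special(CPUGET, 0, TPC)              # CPUGET R0,TPC, that is, RTT
adjust_r3_plus4 = encode_special(ADJUST, 3, PLUS4)

print(hex(rtt), hex(adjust_r3_plus4))             # 0x1110 0x5a13
print(rtt.to_bytes(2, "little").hex())            # 1011 (assumed low byte first)
```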