11. Hawk Special Instructions
11.1. Special Instruction Format
11.2. Adjust Result of Arithmetic Operation
11.3. Add Without Change to the Condition Codes
11.4. Get and Set Coprocessor Registers
11.5. Get and Set Special CPU Registers (Privileged)
|0 0 0 1||dst||0 - - -||src|
The special instructions are 16 bits each, specifying one special register (src) and one general purpose register (dst). The names src and dst are misleading in this context, because depending on the instruction, data may be transferred in either direction.
All of the instructions listed here are specialized and will only be needed under certain circumstances. The ADJUST instructions are useful to accelerate certain arithmetic operations, particularly those involved in decimal arithmetic. Similarly, the coprocessor instructions are useful only when the services of specialized coprocessors are needed.
The final instructions here, CPUGET and CPUSET, are privileged instructions. They are used only in bootstrap and operating system code and may be safely ignored by most programmers.
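As a rough illustration of the format, the following Python sketch decodes a special instruction from its four nibbles. It assumes the nibbles are supplied left to right exactly as drawn in the format diagram (how they are packed into a 16-bit halfword in memory is not modeled), and it ignores the per-instruction restrictions on the dst field. The name `decode_special` is invented for this example; the opcode values are those given in the instruction tables of this chapter.

```python
# Opcode nibbles copied from the instruction tables in this chapter.
SPECIAL_OPCODES = {
    0b0000: "CPUSET",
    0b0001: "CPUGET",
    0b0010: "COSET",
    0b0011: "COGET",
    0b0100: "PLUS",
    0b0101: "ADJUST",
}


def decode_special(nibbles):
    """Split four nibbles, given left to right as in the format diagram,
    into (mnemonic, dst, src); return None for anything else."""
    lead, dst, op, src = nibbles
    if lead != 0b0001 or op not in SPECIAL_OPCODES:
        return None
    return SPECIAL_OPCODES[op], dst, src


print(decode_special((0b0001, 0b0000, 0b0001, 0b0001)))  # the RTT encoding
print(decode_special((0b0001, 0b0011, 0b0101, 0b1010)))  # ADJUST R3,PLUS4
```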
|0 0 0 1||dst (nz)||0 1 0 1||src||ADJUST||dst,src||r[dst] = r[dst]+adj[src]|
The ADJUST instruction adds an adjustment to the destination register without making any change to the processor status word. There are two groups of adjustments. The first group computes the adjustment to the destination as some function of the processor status word, while the second group is used to add small powers of two to the destination.
The following adjustments are available, specified by the src field of the instruction:
|src =||0 0 1 0||—||BCD||Correct result of ADD to produce a BCD sum.|
|(Subtract 6 from each nibble that corresponds to a zero bit in the BCD carry field of the PSW.)|
|0 0 1 1||—||EX3||Correct result of ADD to produce an excess-3 sum.|
|(Add 3 to each nibble corresponding to a one bit in the BCD carry field of the PSW; subtract 3 from the other nibbles.)|
|0 1 0 0||—||CMSB||Add the C bit from PSW to most significant bit.|
|0 1 0 1||—||SSQ||Truncate quotient toward zero after SR|
|(Add (N ∧ V) to least significant bit.)|
|1 0 0 0||—||PLUS1||Add 1 without change to the condition codes.|
|1 0 0 1||—||PLUS2||Add 2 ...|
|1 0 1 0||—||PLUS4||Add 4|
|1 0 1 1||—||PLUS8||Add 8|
|1 1 0 0||—||PLUS16||Add 16|
|1 1 0 1||—||PLUS32||Add 32|
|1 1 1 0||—||PLUS64||Add 64|
|1 1 1 1||—||PLUS128||Add 128|
To add the BCD number in R4 to the BCD number in R3, use the following instruction sequence, assuming that R15 holds the hexadecimal constant 66666666:
    ADD     R3,R3,R15   ; add an extra 6 to each BCD digit
    ADD     R3,R3,R4    ; -- primary add instruction --
    ADJUST  R3,BCD      ; correct the result using the BCD carry bits
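To see why the extra 6 and the final correction produce a decimal sum, the sequence can be modeled in Python. This is only a sketch, not a description of the hardware: it assumes that bit i of the BCD carry field records the carry out of nibble i, that both operands are valid 8-digit BCD values, and that the correction is applied as a single 32-bit add of the adjustment. The function names are invented for the example.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """Return (sum, carry_out, bcd_carry) for a 32-bit add.

    Bit i of bcd_carry is the carry out of nibble i (assumed layout)."""
    full = a + b + carry_in
    carry_into = a ^ b ^ full          # bit k = carry into bit position k
    bcd_carry = 0
    for i in range(8):
        bcd_carry |= ((carry_into >> (4 * (i + 1))) & 1) << i
    return full & MASK32, full >> 32, bcd_carry


def adjust_bcd(value, bcd_carry):
    """Subtract 6 from each nibble whose BCD carry bit is zero."""
    adjustment = 0
    for i in range(8):
        if not (bcd_carry >> i) & 1:
            adjustment += 6 << (4 * i)
    return (value - adjustment) & MASK32


def bcd_add(r3, r4):
    """Mirror the three-instruction sequence above."""
    r15 = 0x66666666
    r3, _, _ = add32(r3, r15)           # ADD    R3,R3,R15
    r3, _, carries = add32(r3, r4)      # ADD    R3,R3,R4
    return adjust_bcd(r3, carries)      # ADJUST R3,BCD


print(hex(bcd_add(0x00001234, 0x00005678)))
```

A digit pair whose sum is 10 or more carries out of its nibble once the extra 6 is included, leaving the correct decimal digit behind; a pair that does not carry still holds the extra 6, which the adjustment removes.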
If R3 and R4 hold excess-3 decimal numbers, use the following instruction sequence to compute the excess-3 sum in R3:
    ADD     R3,R3,R4    ; -- primary add instruction --
    ADJUST  R3,EX3      ; correct the result using the BCD carry bits
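Because ADJUST does not change the BCD carry or the condition codes, the carry out of one ADD survives the correction and can feed a following ADDC, so these sequences extend to higher-precision BCD or excess-3 arithmetic. For a 64-bit excess-3 sum of <R3,R4> and <R5,R6>, use: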
    ADD     R3,R3,R5    ; add least significant bits
    ADJUST  R3,EX3      ; correct least significant bits
    ADDC    R4,R6       ; add most significant bits
    ADJUST  R4,EX3      ; correct most significant bits
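The excess-3 correction and its multiword use can be sketched in Python as follows. As before, this is a model under stated assumptions rather than a definitive account of the hardware: bit i of the BCD carry field is taken to be the carry out of nibble i, operands are valid excess-3 values, and each correction is applied to the word that was just added (the high word for the second ADJUST). The helper names, including `to_ex3`, are invented for the example.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """Return (sum, carry_out, bcd_carry) for a 32-bit add with carry in."""
    full = a + b + carry_in
    carry_into = a ^ b ^ full          # bit k = carry into bit position k
    bcd_carry = 0
    for i in range(8):
        bcd_carry |= ((carry_into >> (4 * (i + 1))) & 1) << i
    return full & MASK32, full >> 32, bcd_carry


def adjust_ex3(value, bcd_carry):
    """Add 3 to nibbles that produced a carry, subtract 3 from the rest."""
    adjustment = 0
    for i in range(8):
        delta = 3 if (bcd_carry >> i) & 1 else -3
        adjustment += delta << (4 * i)
    return (value + adjustment) & MASK32


def ex3_add64(r3, r4, r5, r6):
    """<R3,R4> += <R5,R6>, with R3 and R5 holding the low words."""
    r3, c, carries = add32(r3, r5)          # ADD    R3,R3,R5
    r3 = adjust_ex3(r3, carries)            # ADJUST R3,EX3 (C is untouched)
    r4, _, carries = add32(r4, r6, c)       # ADDC   R4,R6
    r4 = adjust_ex3(r4, carries)            # ADJUST R4,EX3
    return r3, r4


def to_ex3(n):
    """Encode an 8-digit decimal number, one excess-3 digit per nibble."""
    word = 0
    for i in range(8):
        word |= (n % 10 + 3) << (4 * i)
        n //= 10
    return word


low, high = ex3_add64(to_ex3(99999999), to_ex3(0), to_ex3(1), to_ex3(0))
print(hex(low), hex(high))
```

No constant like the 66666666 used for BCD is needed here, because the two excess-3 biases already add up to 6 in every digit position.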
To do a one-bit circular right shift of R1, use:
    SRU     R1,1        ; shift right
    ADJUST  R1,CMSB     ; roll the carry out into the high bit
To do a one-bit right shift of the 64-bit value in <R3,R4>, use:
    SRU     R3,1        ; shift bits 31 to 0
    SRU     R4,1        ; shift bits 63 to 32 (bit 32 goes into C)
    ADJUST  R3,CMSB     ; move bit 32 to top of lower register
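Both uses of CMSB can be checked with a small Python model. It assumes that a one-bit SRU leaves the bit it shifted out in C, as the comments in the sequences above indicate, and it treats R3 as the low word of the 64-bit pair. The function names are invented for the example.

```python
MASK32 = 0xFFFFFFFF


def sru1(x):
    """Logical right shift by one; returns (result, C), C = bit shifted out."""
    return x >> 1, x & 1


def adjust_cmsb(x, c):
    """Add the C bit to the most significant bit of a 32-bit word."""
    return (x + (c << 31)) & MASK32


def rotate_right1(r1):
    r1, c = sru1(r1)              # SRU    R1,1
    return adjust_cmsb(r1, c)     # ADJUST R1,CMSB


def shift_right64(r3, r4):
    r3, _ = sru1(r3)              # SRU    R3,1  (bits 31 to 0)
    r4, c = sru1(r4)              # SRU    R4,1  (bit 32 goes into C)
    r3 = adjust_cmsb(r3, c)       # ADJUST R3,CMSB
    return r3, r4


print(hex(rotate_right1(0x00000001)))
print([hex(w) for w in shift_right64(0x00000002, 0x00000001)])
```

Because the logical shift always clears the most significant bit, adding C there can never carry out of the word; the add simply deposits the bit.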
If you use an SR instruction to divide, for example, –10 by 8, the quotient will be –2 with a remainder of 6; the remainder has the same sign as the divisor. Naive graduates of elementary school arithmetic expect a quotient of –1 and a remainder of –2; the remainder has the same sign as the dividend. To divide a signed number in R3 by 8, following the naive rules, use:
    SR      R3,3        ; divide R3 by 8 with simple truncation
    ADJUST  R3,SSQ      ; if N and V, add 1 to truncate toward zero
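The effect of the SSQ adjustment can be sketched in Python. The model assumes that SR sets N from the sign of the result and sets V when any nonzero bit was shifted out (that is, when the remainder is nonzero); this chapter does not spell out that behavior, so treat it as an assumption. Python integers stand in for signed register values, and the function names are invented for the example.

```python
def sr(x, count):
    """Arithmetic right shift; returns (result, N, V).

    V is assumed to report that a nonzero bit was shifted out."""
    result = x >> count                      # rounds toward minus infinity
    n = result < 0
    v = (x & ((1 << count) - 1)) != 0
    return result, n, v


def adjust_ssq(x, n, v):
    """Add (N and V) to the least significant bit."""
    return x + (1 if n and v else 0)


def div8_toward_zero(r3):
    r3, n, v = sr(r3, 3)          # SR     R3,3
    return adjust_ssq(r3, n, v)   # ADJUST R3,SSQ


print(sr(-10, 3)[0], div8_toward_zero(-10))
```

Only negative quotients with a nonzero remainder are changed, so exact quotients such as –16 divided by 8 and all nonnegative quotients pass through unmodified.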
To increment an index register without changing the condition codes, as is sometimes needed in address arithmetic, use, for example:
ADJUST R3,PLUS4 ; R3 now points to the next word
|0 0 0 1||dst (nz)||0 1 0 0||src (pc)||PLUS||dst,src||r[dst] = r[dst]+r[src]|
The PLUS instruction allows register-to-register addition without changing the condition codes. This is occasionally useful, for example, when doing address arithmetic equivalent to the LEA instruction where the constant being added to a register is greater than can be represented in 16 bits. It is also useful if, for some reason, long 32-bit instructions must be avoided.
|0 0 0 1||dst (x)||0 0 1 1||src||COGET||dst,src||r[dst] = co[src]|
|0 0 0 1||srcx (0)||0 0 1 0||src||COSET||dst,src||co[src] = r[srcx]|
|COGET||NZVC depend on coprocessor currently selected|
Hawk (but not Sparrowhawk) processors include special coprocessors. Each Hawk coprocessor may have 15 internal registers. The COSET (coprocessor set) instruction is used to set the value of a coprocessor register from a general purpose register, and COGET (coprocessor get) is used to get the value of a coprocessor register into a general purpose register. Coprocessors may also use the src field for operation selection; how this is done is up to that coprocessor. On the Sparrowhawk, COGET and COSET are unimplemented instructions, although operating systems may provide virtual coprocessor support.
Coprocessors operate asynchronously from the CPU. The src field of these instructions designates the coprocessor register, while the dst field designates the CPU register, with the opcode determining the direction of the data transfer. Typically, data transfers to the coprocessor via COSET initiate expensive computations that may proceed in parallel with execution of code on the CPU, while transfers from the coprocessor via COGET get results, possibly forcing the CPU to wait in the event that the coprocessor is not yet ready, and possibly initiating additional computation in the coprocessor. COGET also sets the condition codes as determined by the coprocessor being used.
Hawk systems may include up to seven coprocessors. Coprocessor register zero (COSTAT) is shared by all of them. Bits in COSTAT indicate which coprocessors are enabled and which is active. COGET r,COSTAT sets Z if the result is zero (unimplemented bits in COSTAT always read as zero). The other condition codes are reset. To test if a coprocessor is present, attempt to enable it and then see if the enable bit was successfully set. For example, to see if the floating point coprocessor is available, use:
    LIS     R3, FPENAB
    COSET   R3, COSTAT  ; attempt to enable floating point
    COGET   R3, COSTAT  ; read back the status
    BZS     NOFLOAT     ; branch if it was not enabled
Note that enabling a coprocessor does not select it for use. Disabling unneeded coprocessors may save power but may cause loss of data from the coprocessor's registers.
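The presence test relies on unimplemented COSTAT bits reading back as zero. A minimal Python sketch of that behavior follows; the `Costat` class, the `has_float` function, and the bit value chosen for `FPENAB` are all hypothetical, since the actual COSTAT layout is not given here.

```python
FPENAB = 0x0002   # hypothetical bit position for the floating point enable


class Costat:
    """Model of coprocessor register zero with a mask of implemented bits."""

    def __init__(self, implemented):
        self.implemented = implemented
        self.value = 0

    def coset(self, value):
        # Unimplemented bits cannot be set, so they always read as zero.
        self.value = value & self.implemented

    def coget(self):
        # Returns (value, Z), where Z is set if the value is zero.
        return self.value, self.value == 0


def has_float(costat):
    costat.coset(FPENAB)          # LIS R3,FPENAB then COSET R3,COSTAT
    _, z = costat.coget()         # COGET R3,COSTAT
    return not z                  # BZS NOFLOAT is taken when Z is set


print(has_float(Costat(implemented=FPENAB)), has_float(Costat(implemented=0)))
```

Note that the probe writes only the floating point enable bit, so in this model any other coprocessor that was enabled beforehand ends up disabled.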
If the coprocessor selected by COSTAT is not present or is currently disabled, use of the COGET or COSET instructions to operate on that coprocessor will cause a coprocessor trap. It is always legal to reference COSTAT (coprocessor register zero); thus, it is always legal to deselect a coprocessor in the event that a nonexistent coprocessor was accidentally selected.
|0 0 0 1||dst (pc)||0 0 0 1||src||CPUGET||dst,src||r[dst] = cpu[src]|
|0 0 0 1||srcx (0)||0 0 0 0||src||CPUSET||dst,src||cpu[src] = r[srcx]|
CPUGET and CPUSET are privileged instructions. If they are used when the level field of the PSW is 1111, they will cause a privilege violation trap. These instructions manipulate the set of up to 16 special registers inside the CPU. These are:
|src =||0 0 0 0||—||PSW||the processor status word|
|0 0 0 1||—||TPC||the trap program counter|
|0 0 1 0||—||TMA||the trap memory address|
|0 0 1 1||—||TSV||the trap save register|
|1 0 0 0||—||CYC||the cycle-count register|
Some CPUs may include more registers. Special registers 4 through 7 are reserved for the MMU interface.
The PSW is more fully described elsewhere. When there is a trap, the program counter is saved in the TPC register. After a trap caused by memory addressing, the TMA register holds the virtual memory address that caused the trap.
The TSV register has no hardware-defined use. It may be loaded and stored by software. It is reserved for use by trap-service routines, where it is needed in order to save and restore registers after a trap.
The CYC register increments with every memory reference (fetch, load or store), making it useful for performance measurement. A counter may be added to count CPU clock cycles. Other registers may give access to CPU internals for hardware diagnostics.
If CPUGET is used with dst = 0, the program counter is loaded from the given CPU register, and as a side effect, the level field of the PSW is set to the old-level field. As a result, CPUGET R0,TPC serves as the return from trap. Assemblers should provide the mnemonic RTT (return from trap) for this:
|0 0 0 1||0 0 0 0||0 0 0 1||0 0 0 1||RTT||pc = tpc; level = prior|
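The privilege check and the special case for dst = 0 can be summarized in a small Python model. It is a simplification under several assumptions: the level and old-level fields are kept as separate attributes rather than as fields of a PSW word, the old-level field is left unchanged by the return, and the PSW itself is omitted from the register set. The `Cpu` class and its attribute names are invented for the example.

```python
class PrivilegeViolation(Exception):
    """Stands in for the privilege violation trap."""


class Cpu:
    def __init__(self):
        self.r = [0] * 16         # general registers; r[0] is not used here
        self.pc = 0
        self.level = 0b0000       # current level field (privileged here)
        self.prior = 0b1111       # old-level field
        self.special = {"TPC": 0, "TMA": 0, "TSV": 0, "CYC": 0}

    def cpuget(self, dst, name):
        if self.level == 0b1111:
            raise PrivilegeViolation("CPUGET at level 1111")
        value = self.special[name]
        if dst == 0:
            self.pc = value               # pc = cpu[src]
            self.level = self.prior       # level = prior
        else:
            self.r[dst] = value           # r[dst] = cpu[src]


cpu = Cpu()
cpu.special["TPC"] = 0x1000
cpu.cpuget(0, "TPC")                      # RTT
print(hex(cpu.pc), bin(cpu.level))
```

After the return in this example the level field is 1111, so a further CPUGET would raise the modeled privilege violation.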