11. Special Instructions
Part of the Hawk Manual
11.1. Special Instruction Format
11.2. Adjust Result of Arithmetic Operation
11.3. Add Without Change to the Condition Codes
11.4. Get and Set Coprocessor Registers
11.5. Get and Set Special CPU Registers (Privileged)
07 | 06 | 05 | 04 | 03 | 02 | 01 | 00 | 15 | 14 | 13 | 12 | 11 | 10 | 09 | 08
0 0 0 1 | dst | 0 - - - | src
The special instructions are 16 bits each, specifying a special register src and a general purpose register dst. The names src and dst are misleading in this context, because some of these instructions transfer data in the other direction.
The final instructions here, CPUGET and CPUSET, are privileged. They are only used in operating system code and may be safely ignored by most programmers.
07 | 06 | 05 | 04 | 03 | 02 | 01 | 00 | 15 | 14 | 13 | 12 | 11 | 10 | 09 | 08
0 0 0 1 | dst (nz) | 0 1 0 1 | src | ADJUST | dst,src | r[dst] = r[dst]+adj[src]
ADJUST | NZVC unchanged
ADJUST adds an adjustment to the destination register without changing the processor status word (see Section 1.3.4.1). There are two groups of adjustments: the first computes the adjustment to the destination as a function of the processor status word, while the second adds small powers of two to the destination. The src field specifies the adjustment:
src | name | adjustment
0 0 1 0 | BCD | Correct result of ADD to produce a binary-coded decimal sum. (Subtract 6 from each nibble corresponding to a zero bit in the BCD carry field of the PSW.)
0 0 1 1 | EX3 | Correct result of ADD to produce an excess-3 sum. (Add 3 to each nibble corresponding to a one bit in the BCD carry field of the PSW; subtract 3 from the other nibbles.)
0 1 0 0 | CMSB | Add the C bit from PSW to most significant bit.
0 1 0 1 | SSQ | Truncate quotient toward zero after SR. (Add (N ∧ V) to least significant bit.)
1 0 0 0 | PLUS1 | Add 1 without change to the condition codes.
1 0 0 1 | PLUS2 | Add 2 ...
1 0 1 0 | PLUS4 | Add 4
1 0 1 1 | PLUS8 | Add 8
1 1 0 0 | PLUS16 | Add 16
1 1 0 1 | PLUS32 | Add 32
1 1 1 0 | PLUS64 | Add 64
1 1 1 1 | PLUS128 | Add 128
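The second group of adjustments follows a regular pattern: for src values 1000 through 1111, the amount added is 2 raised to the power given by the low three bits of src. The following Python sketch models only that group. It assumes 32-bit registers that wrap around on overflow, and the name adjust_plus is hypothetical; it is not a symbol from hawk.h.

```python
MASK32 = 0xFFFFFFFF  # assumption: registers are 32 bits wide and wrap on overflow


def adjust_plus(dst_value, src):
    """Model ADJUST dst,PLUSn for the src codes 0b1000 through 0b1111."""
    if not 0b1000 <= src <= 0b1111:
        raise ValueError("src does not select a PLUSn adjustment")
    amount = 1 << (src & 0b0111)  # PLUS1 -> 1, PLUS2 -> 2, ..., PLUS128 -> 128
    return (dst_value + amount) & MASK32


PLUS4 = 0b1010  # encoding taken from the table above
print(hex(adjust_plus(0x1000, PLUS4)))  # 0x1004
```

The condition codes are not modeled at all here, which matches the "NZVC unchanged" note for ADJUST.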
The hawk.h header file should define all of the above symbols. To add the 8-digit binary-coded decimal number r[4] to the BCD number in r[3], use the following instruction sequence:
    LIW     R15,#66666666
    ADD     R3,R3,R15       ; add an extra 6 to each BCD digit
    ADD     R3,R3,R4        ; -- do the actual addition --
    ADJUST  R3,BCD          ; correct the result using the BCD carry bits
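The arithmetic behind this sequence can be checked with a small Python model. This is a sketch under stated assumptions, not a description of the hardware: it assumes the BCD carry field holds one carry bit per 4-bit nibble of the most recent add, and the helper names (add32, adjust_bcd, bcd_add) are invented for illustration.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b):
    """32-bit add; returns (sum, list of carries out of nibbles 0..7)."""
    s = (a + b) & MASK32
    carry_bits = (a & b) | ((a | b) & ~s)  # carry out of every bit position
    nibble_carries = [(carry_bits >> (4 * i + 3)) & 1 for i in range(8)]
    return s, nibble_carries


def adjust_bcd(value, nibble_carries):
    """Subtract 6 from each nibble whose carry bit is zero (assumed BCD rule)."""
    for i, carry in enumerate(nibble_carries):
        if carry == 0:
            nibble = ((value >> (4 * i)) & 0xF) - 6
            value = (value & ~(0xF << (4 * i))) | ((nibble & 0xF) << (4 * i))
    return value & MASK32


def bcd_add(x, y):
    t, _ = add32(x, 0x66666666)   # ADD R3,R3,R15
    s, carries = add32(t, y)      # ADD R3,R3,R4
    return adjust_bcd(s, carries)  # ADJUST R3,BCD


print(hex(bcd_add(0x19, 0x23)))  # 0x42
```

Adding the sixes first makes each digit position carry exactly when the decimal sum of that digit reaches 10, so the positions that did not carry are the ones still holding the extra 6.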
To add the 8-digit excess-three decimal number r[4] to the excess-three number in r[3], use the following instruction sequence:
    ADD     R3,R3,R4        ; -- primary add instruction --
    ADJUST  R3,EX3          ; correct the result using the BCD carry bits
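The excess-3 case needs no preliminary add because each operand digit already carries a bias of 3, so the two biases together supply the 6. A Python sketch of the same idea follows, again assuming one BCD carry bit per nibble; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b):
    """32-bit add; returns (sum, list of carries out of nibbles 0..7)."""
    s = (a + b) & MASK32
    carry_bits = (a & b) | ((a | b) & ~s)  # carry out of every bit position
    nibble_carries = [(carry_bits >> (4 * i + 3)) & 1 for i in range(8)]
    return s, nibble_carries


def adjust_ex3(value, nibble_carries):
    """Add 3 to nibbles that carried, subtract 3 from the others (assumed EX3 rule)."""
    for i, carry in enumerate(nibble_carries):
        nibble = (value >> (4 * i)) & 0xF
        nibble = nibble + 3 if carry else nibble - 3
        value = (value & ~(0xF << (4 * i))) | ((nibble & 0xF) << (4 * i))
    return value & MASK32


def ex3_add(x, y):
    s, carries = add32(x, y)       # ADD R3,R3,R4
    return adjust_ex3(s, carries)  # ADJUST R3,EX3


# 00000019 and 00000023 in excess-3 are 0x3333334C and 0x33333356;
# their sum, 00000042, is 0x33333375.
print(hex(ex3_add(0x3333334C, 0x33333356)))  # 0x33333375
```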
ADJUST does not change the BCD carry or condition codes, so it can be used for higher-precision BCD or excess-3 arithmetic. To increment a 16-digit (64 bit) binary-coded decimal value in r[3-4] by 100, use the following instruction sequence:
    LIW     R15,#66666666
    ADD     R3,R3,R15
    ADD     R4,R4,R15       ; add sixes to all digits
    ADDI    R3,R3,#100      ; -- primary add to low 32 bits --
    ADJUST  R3,BCD          ; correct low digits
    ADDC    R4,R0           ; -- propagate carry to high bits --
    ADJUST  R4,BCD          ; correct high digits
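The reason this works across two registers is that the correction of the low word leaves the carry bit alone, so the carry from the primary add is still available when it is propagated into the high word. The Python sketch below models a 16-digit BCD increment by 100 in that style: sixes are added to both halves, the low half is incremented and corrected, then the carry is added to the high half and that half is corrected. It assumes one BCD carry bit per nibble, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
SIXES = 0x66666666


def add32(a, b, carry_in=0):
    """32-bit add with carry in; returns (sum, nibble carries 0..7, carry out)."""
    total = a + b + carry_in
    s = total & MASK32
    carry_bits = (a & b) | ((a | b) & ~s)  # carry out of every bit position
    nibble_carries = [(carry_bits >> (4 * i + 3)) & 1 for i in range(8)]
    return s, nibble_carries, total >> 32


def adjust_bcd(value, nibble_carries):
    """Subtract 6 from each nibble whose carry bit is zero (assumed BCD rule)."""
    for i, carry in enumerate(nibble_carries):
        if carry == 0:
            nibble = ((value >> (4 * i)) & 0xF) - 6
            value = (value & ~(0xF << (4 * i))) | ((nibble & 0xF) << (4 * i))
    return value & MASK32


def bcd64_add_100(lo, hi):
    lo, _, _ = add32(lo, SIXES)          # add sixes to the low digits
    hi, _, _ = add32(hi, SIXES)          # add sixes to the high digits
    lo, carries, c = add32(lo, 0x100)    # primary add; BCD 100 is hex 100
    lo = adjust_bcd(lo, carries)         # correction does not disturb c
    hi, carries, _ = add32(hi, 0, c)     # propagate carry into the high word
    hi = adjust_bcd(hi, carries)
    return lo, hi


lo, hi = bcd64_add_100(0x99999950, 0x00000000)
print(hex(hi), hex(lo))  # 0x1 0x50
```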
In the above examples, note the use of LIW to load the 32-bit constant (see Section 5.2). Note also that ADJUST can be used after ADD (see Section 8.2), ADDI (see Section 3.2), and ADDSI (see Section 9.4).
To do a one-bit circular right shift (see Section 6.3) on r[3], use:
    SRU     R3,1            ; shift right
    ADJUST  R3,CMSB         ; roll the carry out into the high bit
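A minimal Python model of this pair of instructions, assuming that SRU leaves the bit shifted out in C and that registers are 32 bits wide (the function names are hypothetical):

```python
MASK32 = 0xFFFFFFFF


def sru1(value):
    """Unsigned shift right by one; returns (result, C), C being the bit shifted out."""
    value &= MASK32
    return value >> 1, value & 1


def adjust_cmsb(value, c):
    """Add the C bit to the most significant bit of a 32-bit value."""
    return (value + (c << 31)) & MASK32


def rotate_right_1(value):
    shifted, c = sru1(value)        # SRU R3,1
    return adjust_cmsb(shifted, c)  # ADJUST R3,CMSB


print(hex(rotate_right_1(0x00000001)))  # 0x80000000
```

Because the shift always clears the top bit, adding C there can never overflow, so the add behaves as a simple bit insert.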
To do a one-bit right shift on the 64-bit value in r[3-4], use:
    SRU     R3,1            ; shift bits 31 to 0
    SRU     R4,1            ; shift bits 63 to 32 (bit 32 goes into C)
    ADJUST  R3,CMSB         ; move bit 32 to top of lower register
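The order of the two shifts matters: the high word is shifted last so that C holds bit 32 when ADJUST runs. The following Python sketch models the three instructions under the same assumptions as before (SRU leaves the bit shifted out in C, registers are 32 bits); the names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sru1(value):
    """Unsigned shift right by one; returns (result, C), C being the bit shifted out."""
    value &= MASK32
    return value >> 1, value & 1


def shift_right_64(lo, hi):
    lo, _ = sru1(lo)                    # SRU R3,1
    hi, c = sru1(hi)                    # SRU R4,1 (bit 32 goes into C)
    lo = (lo + (c << 31)) & MASK32      # ADJUST R3,CMSB
    return lo, hi


lo, hi = shift_right_64(0x00000002, 0x00000001)
print(hex(hi), hex(lo))  # 0x0 0x80000001
```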
SR can be used to divide by powers of 2 (see Section 6.3), but for negative two's complement dividends, non-integer quotients will be truncated toward the more negative value. This puts the remainder between zero and the divisor, so –3/2 is –2 with a remainder of 1. Many expect the quotient to be truncated toward zero, where –3/2 is –1 with a remainder of –1.
ADJUST dst,SSQ (shift signed quotient) adjusts shift results to correspond to common expectations. For example, to divide r[3] by 8, use:
    SR      R3,3            ; divide R3 by 8 with simple truncation
    ADJUST  R3,SSQ          ; make the quotient truncate toward zero
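The SSQ rule, "add (N ∧ V) to the least significant bit", can be illustrated in Python. The sketch assumes that SR sets N when the result is negative and sets V when any one bit was shifted out; that reading of V is an assumption made here for illustration, and Section 6.3 remains the authority on how SR sets the condition codes. Python integers stand in for signed registers, and the function names are hypothetical.

```python
def sr(value, count):
    """Arithmetic right shift of a signed integer; returns (result, N, V)."""
    result = value >> count                     # floors toward the more negative value
    n = 1 if result < 0 else 0
    v = 1 if value & ((1 << count) - 1) else 0  # assumed: a one bit was shifted out
    return result, n, v


def adjust_ssq(value, n, v):
    """Add (N and V) to the least significant bit."""
    return value + (n & v)


def divide_by_8(value):
    quotient, n, v = sr(value, 3)      # SR R3,3
    return adjust_ssq(quotient, n, v)  # ADJUST R3,SSQ


for x in (-17, -16, -3, 17):
    print(x, divide_by_8(x))
```

Only negative, inexact quotients are changed: the shift rounds them down by one too far for truncation toward zero, and the adjustment adds that one back.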
To increment an index register without changing the condition codes, as is sometimes needed in address arithmetic, use, for example:
ADJUST R3,PLUS4 ; R3 now points to the next word
Note that ADDSI can only be used to add values up to 8 (see Section 9.4). If the effect on the condition codes is not important, ADJUST is more compact than ADDI and may be faster (see Section 3.2). Thus ADJUST dst,PLUS16 may be a useful alternative to ADDI dst,dst,16. It is particularly useful on the Sparrowhawk where there is no ADDI instruction (see Chapter 16).
07 | 06 | 05 | 04 | 03 | 02 | 01 | 00 | 15 | 14 | 13 | 12 | 11 | 10 | 09 | 08
0 0 0 1 | dst (nz) | 0 1 0 0 | src (pc) | PLUS | dst,src | r[dst] = r[dst]+r[src]
PLUS | NZVC unchanged
PLUS allows register-to-register addition without changing the condition codes. This is useful, for example, when LEA (see Section 3.2) cannot be used because the constant being added is too large to be represented in 16 bits.
PLUS is essential in macros that emulate long Hawk memory-reference instructions on the Sparrowhawk (see Chapter 16).
07 | 06 | 05 | 04 | 03 | 02 | 01 | 00 | 15 | 14 | 13 | 12 | 11 | 10 | 09 | 08
0 0 0 1 | dst (x) | 0 0 1 1 | src | COGET | dst,src | r[dst] = co[src]
0 0 0 1 | srcx (0) | 0 0 1 0 | src | COSET | dst,src | co[src] = r[srcx]
COGET | NZVC depend on coprocessor currently selected
COSET | NZVC unchanged
Hawk processors may include special coprocessors. Each Hawk coprocessor may have 15 internal registers. The COSET (coprocessor set) instruction is used to set the value of a coprocessor register from a general purpose register, and COGET (coprocessor get) is used to get the value of a coprocessor register into a general purpose register. Coprocessors may also use the src field for operation selection; how this is done is up to that coprocessor.
On the Sparrowhawk (see Chapter 16), COGET and COSET are unimplemented instructions; if they are used, they will trigger an instruction trap (see Chapter 13).
Coprocessors operate asynchronously from the CPU. The src field of these instructions designates the coprocessor register, while the dst field designates the CPU register, with the opcode determining the direction of the data transfer. Typically, data transfers to the coprocessor via COSET initiate expensive computations that may proceed in parallel with execution of code on the CPU, while transfers from the coprocessor via COGET get results, possibly forcing the CPU to wait in the event that the coprocessor is not yet ready, and possibly initiating additional computation in the coprocessor. COGET also sets the condition codes as determined by the coprocessor being used.
Hawk systems may include up to seven coprocessors. Coprocessor register zero (COSTAT) is shared by all of them (see Section 1.3.3). Bits in COSTAT control which coprocessors are enabled and which is active. COGET r,COSTAT sets Z if the result is zero (unimplemented bits in COSTAT always read as zero). The other condition codes are reset.
Coprocessor registers other than COSTAT refer to registers in the particular coprocessor selected by the select field of COSTAT. If the selected coprocessor is not present or not enabled, attempting to access coprocessor registers other than COSTAT will cause a coprocessor trap (see Chapter 13).
Note that enabling a coprocessor does not select it for use. Disabling unneeded coprocessors may save power but may cause loss of data from the coprocessor's registers.
07 | 06 | 05 | 04 | 03 | 02 | 01 | 00 | 15 | 14 | 13 | 12 | 11 | 10 | 09 | 08
0 0 0 1 | dst (pc) | 0 0 0 1 | src | CPUGET | dst,src | r[dst] = cpu[src]
0 0 0 1 | srcx (0) | 0 0 0 0 | src | CPUSET | dst,src | cpu[src] = r[srcx]
CPUGET CPUSET | NZVC unchanged
CPUGET and CPUSET are privileged instructions. If they are used when the level field of the PSW is 1111 (see Section 1.3.4.1), they will cause a privilege violation trap (see Chapter 13). These instructions manipulate the set of up to 16 special registers inside the CPU (see Section 1.3.4). These are:
src | name | register
0 0 0 0 | PSW | the processor status word (see Section 1.3.4.1).
0 0 0 1 | TPC | the trap program counter (see Section 1.3.4.2).
0 0 1 0 | TMA | the trap memory address (see Section 1.3.4.3).
0 0 1 1 | TSV | the trap save register (see Section 1.3.4.4).
1 0 0 0 | CYC | the cycle-count register (see Section 1.3.4.5).
The hawk.h header file should define all of the above symbols. Some CPUs may include more special registers. Special registers 4 through 7 are reserved for the MMU interface.
The PSW is more fully described elsewhere. When there is a trap, the program counter is saved in the TPC register. After a trap caused by memory addressing, the TMA register holds the virtual memory address that caused the trap.
The TSV register has no hardware-defined use. It may be loaded and stored by software. It is reserved for use by trap-service routines, where it is needed in order to save and restore registers after a trap.
The CYC register increments with every memory reference (fetch, load or store), making it useful for performance measurement. A counter may be added to count CPU clock cycles. Other registers may give access to CPU internals for hardware diagnostics.
CPUSET moves data from a general purpose register to a CPU special register. If the srcx field is zero, referencing r[0], this refers to the constant zero, so any CPU special register can be zeroed with a single instruction.
CPUGET moves data from a CPU special register to a general purpose register. If the dst field is zero, referencing r[0], this refers to the program counter. CPUGET R0,src, as a side effect, sets the level field of the PSW to the value of the old-level field. As a result, CPUGET R0,TPC serves as the return from trap. Assemblers should support RTT (return from trap) as a synonym for CPUGET R0,TPC.
07 | 06 | 05 | 04 | 03 | 02 | 01 | 00 | 15 | 14 | 13 | 12 | 11 | 10 | 09 | 08
0 0 0 1 | 0 0 0 0 | 0 0 0 1 | 0 0 0 1 | RTT | pc = tpc; level = prior
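The behavior of RTT described above can be summarized as a small state model in Python. This is only a sketch: the Cpu class and its field names are hypothetical, the PSW is represented as two separate fields rather than by its real bit layout, and the old-level field is assumed to be left unchanged by the instruction.

```python
from dataclasses import dataclass


class PrivilegeViolation(Exception):
    """Stands in for the privilege violation trap."""


@dataclass
class Cpu:
    pc: int = 0
    tpc: int = 0             # trap program counter
    level: int = 0b1111      # level field of the PSW
    old_level: int = 0b1111  # old-level field of the PSW


def rtt(cpu):
    """Model RTT, that is, CPUGET R0,TPC."""
    if cpu.level == 0b1111:      # privileged instruction used at level 1111
        raise PrivilegeViolation("CPUGET is privileged")
    cpu.pc = cpu.tpc             # r[0] as a destination refers to the pc
    cpu.level = cpu.old_level    # side effect: level = prior level


cpu = Cpu(pc=0x2000, tpc=0x1004, level=0b0000, old_level=0b1111)
rtt(cpu)
print(hex(cpu.pc), bin(cpu.level))  # 0x1004 0b1111
```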