15. The Floating-Point Coprocessor

Part of the Hawk Manual
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Contents

15.1. Floating-Point Registers
15.1.1. Coprocessor Status Register — 0
15.1.2. Floating-Point-Low Register — 1
15.1.3. Floating-Point Accumulators — 2 and 3
15.2. COGET Operations
15.2.1. Negate — op = 010₂
15.2.2. Absolute Value — op = 011₂
15.3. COSET Operations
15.3.1. Convert Integer to Floating — op = 010₂
15.3.2. Square Root — op = 011₂
15.3.3. Add — op = 100₂
15.3.4. Subtract — op = 101₂
15.3.5. Multiply — op = 110₂
15.3.6. Divide — op = 111₂
15.4. Examples


15.1. Floating-Point Registers

The floating-point coprocessor shares the COSTAT register with other coprocessors (see Section 1.3.3), and it has 3 other interface registers that can also be referenced using COGET and COSET (see Section 11.4) to allow direct access to two 64-bit floating-point accumulators.

The following constants should be defined by the assembly header file float.h to support access to these registers:

;COSTAT =       0       ; defined in hawk.h
FPLOW   =:      1       ; the low half of long floating operands
FPA0    =:      2       ; floating-point accumulator 0
FPA1    =:      3       ; floating-point accumulator 1

15.1.1. Coprocessor Status Register — 0

bits:   31:13    12   11   10:8    7:2      1    0
field:  unused   sl   u    0 0 1   unused   en   u

The floating-point coprocessor inspects only a subset of the bits in COSTAT (see Section 1.3.3). The floating-point coprocessor is coprocessor number 1. Thus it is enabled when COSTAT:1 is set in the enable field (enable), and it is selected when COSTAT:10:8 (select) is set to 001.

COSTAT:12 (sl) is part of the coprocessor operation field (coop). If this is reset, the coprocessor uses a short (32-bit) floating-point representation. If it is set, it uses the long (64-bit) representation. The other bits in coop are unused, although they may be used in enhanced floating-point coprocessors.

The following constants should be defined by the assembly header file float.h to support use of the floating-point coprocessor; these can be added together (or ored together) to create a value to be placed in COSTAT:

FPENAB  =:      #0002   ; enable the floating-point unit
FPSEL   =:      #0100   ; select the floating-point unit
FPLONG  =:      #1000   ; operate in long format
FPSHORT =:      #0000   ; operate in short format (default)

FPENBIT =:      1       ; bit number of FPENAB

Strictly speaking, there is no reason to explicitly specify short format, since it is the default, but an explicit statement of short mode makes programs easier to read.

To enable and select the floating-point coprocessor for long floating operations while turning off any other coprocessors in the system, use the following instruction sequence:

        LIL     R1,FPENAB+FPSEL+FPLONG
        COSET   R1,COSTAT

For applications that use multiple coprocessors concurrently, to select the already enabled floating-point coprocessor and set it in short mode, use the following instruction sequence:

        COGET   R1,COSTAT
        TRUNC   R1,8
        ADDI    R1,R1,FPSEL+FPSHORT
        COSET   R1,COSTAT

To test if the floating-point coprocessor is currently enabled, use

        COGET   R1,COSTAT
        BITTST  R1,FPENBIT
        BBR     NOFP
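
The way these constants combine can be checked on a host machine. The following Python sketch is only a model of the COSTAT fields described above, not part of float.h; the function names are hypothetical, and TRUNC R1,8 is assumed to keep just the low 8 bits of the register.

```python
# Values copied from the float.h definitions listed above.
FPENAB = 0x0002   # enable the floating-point unit (COSTAT bit 1)
FPSEL = 0x0100    # select coprocessor 1 (COSTAT bits 10:8 = 001)
FPLONG = 0x1000   # long format (COSTAT bit 12, sl)
FPSHORT = 0x0000  # short format (default)
FPENBIT = 1       # bit number of FPENAB


def decode_costat(costat):
    """Return the COSTAT fields that the floating-point unit inspects."""
    return {
        "enabled": bool((costat >> FPENBIT) & 1),
        "selected": ((costat >> 8) & 0b111) == 0b001,
        "long": bool((costat >> 12) & 1),
    }


def select_short(costat):
    """Model of TRUNC R1,8 followed by ADDI R1,R1,FPSEL+FPSHORT.

    Assumption: TRUNC R1,8 keeps only the low 8 bits (the enable bits).
    """
    enables = costat & 0xFF
    return enables + FPSEL + FPSHORT


if __name__ == "__main__":
    first = FPENAB + FPSEL + FPLONG
    print(hex(first), decode_costat(first))
    print(hex(select_short(0x1706)), decode_costat(select_short(0x1706)))
```

Because each constant occupies its own bit, adding the constants and oring them give the same value.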

15.1.2. Floating-Point-Low Register — 1

bits:   31:0
field:  mantissa:31:0

The floating-point low (FPLOW) register holds the least significant 32 bits of a long floating-point operand. These are the low 32 bits of the mantissa. The contents of FPLOW are not defined when operating in 32-bit mode.

In 64-bit mode, FPLOW must be set before using COSET to set or operate on the high half of a floating-point accumulator, and it is automatically set as a side effect of using COGET to read the high half of a floating-point accumulator. COGET Rd,FPLOW has an undefined effect on the condition codes.

15.1.3. Floating-Point Accumulators — 2 and 3

The floating-point coprocessor contains two 64-bit floating-point accumulators, FPA0 and FPA1. These registers look different in short floating-point mode, where they appear to be only 32 bits wide. In fact, the hardware always retains at least an 11-bit exponent and a 52-bit mantissa, with the extra bits suppressed in short mode. IEEE standard floating-point format is supported. This means that the 52-bit mantissa is effectively 53 bits, with a hidden one bit (bit 52), except in the case of unnormalized numbers, where a zero exponent signifies that this hidden bit is zero.

Short Format

bits:   31   30:23     22:0
field:  s    exp:7:0   mantissa:51:29

In short mode, the floating-point operands are 32 bits, and the floating low register is not used. IEEE floating-point format is used. In short format, the 11-bit exponent is shortened to just 8 bits, and only the most significant 24 bits of the mantissa (including the hidden bit defined by the IEEE format) are used.
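
The short format can be explored on a host machine. The following Python sketch assumes only that the short format is IEEE single precision, as stated above; the function names are hypothetical and are not part of the Hawk tools.

```python
import struct


def short_fields(x):
    """Split the IEEE single-precision encoding of x into (s, exp, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                # bit 31
    exp = (bits >> 23) & 0xFF        # bits 30:23, the 8-bit exponent
    fraction = bits & 0x7FFFFF       # bits 22:0, the stored mantissa bits
    return sign, exp, fraction


def significand24(x):
    """Return the 24-bit mantissa including the hidden bit.

    The hidden bit is one unless the exponent field is zero.
    """
    _, exp, fraction = short_fields(x)
    hidden = 1 if exp != 0 else 0
    return (hidden << 23) | fraction


if __name__ == "__main__":
    print(short_fields(1.0), hex(significand24(1.0)))
    print(short_fields(-2.5), hex(significand24(-2.5)))
```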

Long Format

bits:   31   30:20      19:0
field:  s    exp:10:0   mantissa:51:32

In long mode, the floating-point operand registers are 64 bits, and the floating low bits are used to access the least significant 32 bits of the operand. IEEE floating-point format is used. In long format, the full 11-bit exponent is used, and the most significant 21 bits of the mantissa (including the hidden bit defined by the IEEE format) are given here, while the least significant 32 bits are found in FPLOW. To load FPA1 with a 64-bit floating-point number stored in r[3] and r[4], least significant half first, use:

        COSET   R3,FPLOW
        COSET   R4,FPA1         ; uses FPLOW

The corresponding code to get a 64-bit number from FPA1 into r[3] and r[4] is:

        COGET   R4,FPA1         ; sets FPLOW
        COGET   R3,FPLOW
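
The division of a long operand between an accumulator and FPLOW can be modeled on a host machine. This Python sketch assumes only that the long format is IEEE double precision; split_long and join_long are hypothetical helper names.

```python
import struct


def split_long(x):
    """Return (high, low): the word seen through FPAx and the word in FPLOW."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    high = bits >> 32            # s, exp:10:0 and mantissa:51:32
    low = bits & 0xFFFFFFFF      # mantissa:31:0, held in FPLOW
    return high, low


def join_long(high, low):
    """Rebuild the double from the two 32-bit halves."""
    bits = (high << 32) | low
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


if __name__ == "__main__":
    high, low = split_long(0.1)
    print(hex(high), hex(low))
    print(join_long(high, low))
```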

Conversion between short and long format may be done by changing the value of the coprocessor operation field in COSTAT between the time a value is put into a floating-point operand register and the time it is retrieved.

COGET     N = r[dst]:31               -- result is negative
          Z = (r[dst] = 0)            -- result is zero
          V = 0
          C = (exp = 11111111111₂)    -- not a number

In both long and short format, COGET sets the condition codes after reading FPA0 or FPA1 to report on the entire floating-point number, including that part (if any) loaded into FPLOW. N and Z indicate a negative or zero value, while C is used to report values that are NANs (not a number) or infinity. The overflow condition code, V, is always reset. As a result, the signed comparison instructions such as BGT and BLE will correctly report the relationship of the operand to zero.
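
The condition code rules can be sketched as a small Python model for long format. This is an illustration, not a description of the hardware: the function names are hypothetical, and Z is assumed to be set only when both halves of the operand are zero, following the statement that the codes report on the entire number.

```python
import struct


def long_words(x):
    """Return the (high, low) 32-bit halves of the IEEE double x."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits >> 32, bits & 0xFFFFFFFF


def coget_cc_long(high, low):
    """Model of the (N, Z, V, C) values after COGET of FPA0 or FPA1."""
    n = (high >> 31) & 1                         # sign bit
    z = int(high == 0 and low == 0)              # assumption: whole operand
    v = 0                                        # always reset
    c = int(((high >> 20) & 0x7FF) == 0x7FF)     # exponent all ones
    return n, z, v, c


if __name__ == "__main__":
    for value in (1.0, -1.0, 0.0, float("inf")):
        print(value, coget_cc_long(*long_words(value)))
```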

15.2. COGET Operations

Several simple floating-point operations are initiated by getting coprocessor register numbers 4 through 15. That is, the coprocessor register number is used partially to select one of the floating-point accumulators and partly as a coprocessor operation code.

07060504 03020100 15141312 11100908                        
0 0 0 1 dst (0) 0 0 1 1 op fr COGET dst,src r[dstx] = op(co[fr])

To conform with the register numbers described in Section 15.1, op = 000₂ gives access to COSTAT and FPLOW, and op = 001₂ gives access to FPA(fr) (that is, FPA(0) and FPA(1)). This leaves a total of 6 additional operations that can be initiated by COGET to operate on the selected floating-point accumulator.

The following operations done by COGET are defined. None of these change the contents of the floating-point accumulators. These instructions may stall the CPU, preventing it from executing additional instructions if the floating-point unit is busy with a previously initiated computation. Because the floating-point accumulators have register numbers 2 and 3, the definitions provided in float.h are off by two from the actual values of the floating-point op field. This allows these values to be added to the accumulator numbers to form the operation code:

FPNEG   =:      2       ; op = 010 -- negate
FPABS   =:      4       ; op = 011 -- absolute-value

15.2.1. Negate — op = 010₂

COGET dst,FPAx+FPNEG retrieves the negated floating-point value of a floating-point accumulator, setting the condition codes to report on the result (see Section 1.3.3). In long mode, FPLOW is also set to hold the low bits of the result.

For example, in long mode, the value in FPA(0) may be negated and loaded into the double register r[3]-r[4] as follows:

        COGET   R4,FPA0+FPNEG   ; high half
        COGET   R3,FPLOW        ; low half

The value in FPA(0) can be negated and moved to FPA(1) via r[1], in either short or long mode, as follows:

        COGET   R1,FPA0+FPNEG
        COSET   R1,FPA1

To negate the short floating-point value in r[3] without using the floating-point unit, the following instruction sequence will work:

        CMP     R1,R1           ; this always sets C
        ADJUST  R3,CMSB         ; toggle the sign bit
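
The effect of this sequence can be checked on a host machine. The Python sketch below models ADJUST R3,CMSB as adding the carry bit at bit 31, which is an assumption based on the comments in the code above; the helper names are hypothetical.

```python
import struct


def to_short_bits(x):
    """IEEE single-precision bit pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def from_short_bits(bits):
    """Value of an IEEE single-precision bit pattern."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def negate_short_bits(bits):
    """Model of CMP R1,R1 followed by ADJUST R3,CMSB."""
    carry = 1                                   # CMP R1,R1 leaves C set
    return (bits + (carry << 31)) & 0xFFFFFFFF  # wraps, toggling bit 31


if __name__ == "__main__":
    bits = to_short_bits(1.5)
    print(hex(bits), hex(negate_short_bits(bits)))
    print(from_short_bits(negate_short_bits(bits)))
```

Adding 1 at bit 31 with wraparound changes only the sign bit, so applying the sequence twice restores the original value.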

15.2.2. Absolute Value — op = 011₂

COGET dst,FPAx+FPABS retrieves the absolute value of a floating-point accumulator, setting the condition codes to report on the result (see Section 1.3.3). In long mode, FPLOW is also set to hold the low bits of the result.

For example, in long mode, to divide FPA0 by the absolute value of FPA1 this instruction sequence will work:

        COGET   R3,FPA1+FPABS
        COSET   R3,FPDIV+FPA0

To take the absolute value of a long floating-point value in R3-R4 without using the floating-point unit, the following instruction sequence will work:

        SL      R4,1
        SRU     R4,1            ; clear the sign bit
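
The shift pair can be modeled in Python to confirm that it clears only the sign bit. This is a sketch that assumes 32-bit registers with R4 holding the high half; abs_long is a hypothetical name.

```python
def abs_long(high, low):
    """Model of SL R4,1 followed by SRU R4,1 on the high half of a long value."""
    shifted = (high << 1) & 0xFFFFFFFF   # SL R4,1 discards the sign bit
    return shifted >> 1, low             # SRU R4,1 shifts a zero back in


if __name__ == "__main__":
    print([hex(w) for w in abs_long(0xBFF00000, 0)])   # halves of -1.0
```

The low half in R3 is untouched, since the sign is held only in the high half.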

15.3. COSET Operations

Time-consuming floating-point operations are initiated by setting coprocessor register numbers 4 through 15. That is, the coprocessor register number is used partially to select one of the floating-point accumulators and partly as a coprocessor operation code.

07060504 03020100 15141312 11100908                        
0 0 0 1 srcx (0) 0 0 1 0 op fr COSET dst,src op(co[fr], r[srcx])

To conform with the register numbers described in Section 15.1, op = 000₂ gives access to COSTAT and FPLOW, and op = 001₂ gives access to FPA(fr) (that is, FPA(0) and FPA(1)). This leaves a total of 6 additional operations that can be initiated by COSET to operate on the selected floating-point accumulator.

The following operations are done by COSET. All of these change the contents of the floating-point accumulators. They may stall the CPU if the floating-point unit is already busy, and they may stall later COGET instructions until the operation they initiate is complete. Because the floating-point accumulators have register numbers 2 and 3, the definitions provided in float.h are off by two from the actual values of the floating-point op field. This allows these values to be added to the accumulator numbers to form the operation code:

FPINT   =:      2       ; op = 010 -- convert integer to floating
FPSQRT  =:      4       ; op = 011 -- square root
FPADD   =:      6       ; op = 100 -- add
FPSUB   =:      8       ; op = 101 -- subtract
FPMUL   =:      10      ; op = 110 -- multiply
FPDIV   =:      12      ; op = 111 -- divide
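
The relationship between these constants and the op and fr fields can be checked with a short Python sketch. It assumes, as the instruction format above indicates, that fr is the least significant bit of the 4-bit coprocessor register number and op is the upper 3 bits; split_regnum is a hypothetical name.

```python
# Values copied from the float.h definitions in this chapter.
FPA0, FPA1 = 2, 3
FPINT, FPSQRT, FPADD, FPSUB, FPMUL, FPDIV = 2, 4, 6, 8, 10, 12


def split_regnum(regnum):
    """Split a 4-bit coprocessor register number into (op, fr)."""
    return regnum >> 1, regnum & 1


if __name__ == "__main__":
    for name, value in (("FPA1", FPA1),
                        ("FPINT+FPA0", FPINT + FPA0),
                        ("FPDIV+FPA1", FPDIV + FPA1)):
        op, fr = split_regnum(value)
        print(f"{name}: op = {op:03b}, fr = {fr}")
```

Each constant is twice the op value minus 2, so adding FPA0 (2) or FPA1 (3) supplies the missing 2 along with the fr bit.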

15.3.1. Convert Integer to Floating — op = 010₂

COSET src,FPINT+FPAx converts the integer operand to a normalized floating-point value in FPA(fr). In long mode, FPLOW must already hold the low 32 bits of the integer operand.

For example, if r[3] and r[4] contain a 64-bit signed integer, least significant word first, and if the floating-point unit is in long mode, the integer may be converted to floating-point in FPA(1) as follows:

        COSET   R3,FPLOW
        COSET   R4,FPINT+FPA1

If r[3] contains a 32-bit unsigned integer, and if the floating-point unit is in long mode, the integer may be converted to floating-point in FPA(1) as follows:

        COSET   R3,FPLOW
        COSET   R0,FPINT+FPA1

If register r[3] contains a 32-bit signed integer, and if the floating-point unit is in short mode, the integer may be converted to floating-point in FPA(1) as follows:

        COSET   R3,FPINT+FPA1

Floating to integer conversion, truncated or rounded, must be done by software.
 

15.3.2. Square Root — op = 011₂

COSET src,FPSQRT+FPAx takes the square root of the operand and puts it in FPA(fr). In long mode, FPLOW must already hold the low 32 bits of the long floating operand.

For example, if r[3] contains a short floating-point number, and if the floating-point unit is in short mode, the square root may be taken in FPA(1) as follows:

        COSET   R3,FPSQRT+FPA1

15.3.3. Add — op = 100₂

COSET src,FPADD+FPAx adds the floating-point operand to FPA(fr). In long mode, FPLOW must already hold the low 32 bits of the long floating operand.

For example, if r[3] and r[4] contain a long floating-point number (least significant word first), and if the floating-point unit is in long mode, the number may be added to FPA(1) as follows:

        COSET   R3,FPLOW
        COSET   R4,FPADD+FPA1

15.3.4. Subtract — op = 101₂

COSET src,FPSUB+FPAx subtracts the floating-point operand from FPA(fr). In long mode, FPLOW must already hold the low 32 bits of the long floating operand.

For example, if r[3] contains a short floating-point number, and if the floating-point unit is in short mode, the number may be subtracted from FPA(1) as follows:

        COSET   R3,FPSUB+FPA1

Note that, unlike integer arithmetic, the subtract operation does not set the condition codes. To compare the two short floating-point numbers in r[3] and r[4], use this sequence:

        COSET   R3,FPA1
        COSET   R4,FPSUB+FPA1
        COGET   R0,FPA1         ; sets the condition codes

15.3.5. Multiply — op = 110₂

COSET src,FPMUL+FPAx multiplies FPA(fr) by the floating-point operand. In long mode, FPLOW must already hold the low 32 bits of the long floating operand.

For example, if r[3] and r[4] contain a long floating-point number (least significant word first), and if the floating-point unit is in long mode, FPA(1) may be multiplied by the number as follows:

        COSET   R3,FPLOW
        COSET   R4,FPMUL+FPA1

15.3.6. Divide — op = 111₂

COSET src,FPDIV+FPAx divides FPA(fr) by the floating-point operand. In long mode, FPLOW must already hold the low 32 bits of the long floating operand.

For example, if r[3] contains a short floating-point number, and if the floating-point unit is in short mode, FPA(1) may be divided by r[3] as follows:

        COSET   R3,FPDIV+FPA1

15.4. Examples

Consider computing y = ax² + bx + c, where y, a, b, c and x are short floating-point numbers stored in r[3] to r[7], in that order. Also, assume that coprocessors are turned off when not in use. The following code works, using the factored form y = (ax + b)x + c:

        LIL     R3,FPENAB+FPSEL+FPSHORT
        COSET   R3,COSTAT       ; turn on floating coprocessor

        COSET   R4,FPA1         ; FPA1 = a
        COSET   R7,FPMUL+FPA1   ; FPA1 = ax
        COSET   R5,FPADD+FPA1   ; FPA1 = ax + b
        COSET   R7,FPMUL+FPA1   ; FPA1 = (ax + b)x
        COSET   R6,FPADD+FPA1   ; FPA1 = (ax + b)x + c
        COGET   R3,FPA1

        COSET   R0,COSTAT       ; turn off floating coprocessor
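
The order of operations in this example is Horner's rule. The following Python sketch mirrors the accumulator steps one for one. It uses ordinary Python floats, so it does not model the rounding of the short format; the function name is hypothetical.

```python
def horner_quadratic(a, b, c, x):
    """Evaluate a*x*x + b*x + c in the same order as the FPA1 sequence."""
    acc = a          # COSET R4,FPA1        ; FPA1 = a
    acc = acc * x    # COSET R7,FPMUL+FPA1  ; FPA1 = ax
    acc = acc + b    # COSET R5,FPADD+FPA1  ; FPA1 = ax + b
    acc = acc * x    # COSET R7,FPMUL+FPA1  ; FPA1 = (ax + b)x
    acc = acc + c    # COSET R6,FPADD+FPA1  ; FPA1 = (ax + b)x + c
    return acc       # COGET R3,FPA1


if __name__ == "__main__":
    print(horner_quadratic(2.0, 3.0, 4.0, 5.0))
```

This form needs two multiplications and two additions, and the only intermediate result is the one held in the accumulator.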

Subroutines that use the floating-point coprocessor can be written so that they restore the coprocessor status word to its former state on return, and so that they do not turn off any other coprocessors that may be in use while they turn on and select the floating-point coprocessor. The following code illustrates this, using a variable indexed off of R2 to hold the saved floating-point status, and using R3 and R4 as temporaries:

        COGET   R3,COSTAT
        STORE   R3,R2,SVCOSTAT  ; save old COSTAT
        TRUNC   R3,8            ; clear coop and select fields
        LIL     R4,FPENAB + FPSEL ...
        OR      R3,R4           ; set enable, select and mode bits
        COSET   R3,COSTAT       ; update COSTAT

        ... code using floating point ...
        
        LOAD    R3,R2,SVCOSTAT
        COSET   R3,COSTAT       ; restore COSTAT

Many interrupt and trap service routines make no use of the floating-point coprocessor and return to the same code that was interrupted or that caused the trap. For such trap service routines, there is no need to take any action with regard to the floating-point coprocessor. When, however, a trap or interrupt service routine needs to do floating-point computation or performs a context switch, the entire state of the floating-point coprocessor must be saved.

Saving and restoring the state of the coprocessor in an interrupt service routine requires care. If the coprocessor is not enabled, only the status needs saving. If enabled, it must be selected before accessing the registers to save them. Saving the registers in long mode is always safe. Note that FPLOW must always be saved because it need not represent the low half of either accumulator at the time of interrupt.

Here is code to save the entire coprocessor state into registers R8 through R13:

        COGET   R8,COSTAT       ; COSTAT into R8
        BITTST  R8,FPENBIT
        BBR     NOFPSV          ; if floating-point coprocessor enabled
        EXTB    R9,R8,R0        ;   move just the enable bits
        ADDI    R9,R9,FPSEL+FPLONG
        COSET   R9,COSTAT       ;   select floating-point long mode
        COGET   R9,FPLOW        ;   FPLOW into R9
        COGET   R11,FPA0
        COGET   R10,FPLOW       ;   FPA0 into R10 (low) and R11 (high)
        COGET   R13,FPA1
        COGET   R12,FPLOW       ;   FPA1 into R12 (low) and R13 (high)
NOFPSV:                         ; endif

The registers cannot be restored in the order they were saved because of COSET side effects and because the value in COSTAT during saving is not the same as the saved value of COSTAT.

In addition to defining values for fields of COSTAT, float.h should define the F directive to store a word in memory holding a floating-point value, and the LIF instruction, equivalent to LIW (see Section 5.2) but with a floating-point operand. The following examples should work:

        F       3.1415          ; like W, but with a floating value
        F       3E2             ;   equivalent to 300.0
        F       +1.0e-2         ;   equivalent to 0.01
        LIF     -123.45E+7      ; like LIW, but with a floating value

Decimal floating-point values have an optional sign on both mantissa and optional exponent, which comes after the letter E or e. The fractional part of the mantissa is also optional.
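
The syntax just described can be written as a regular expression. The Python sketch below is a host-side illustration only and says nothing about how float.h is implemented. It assumes that at least one digit must precede the optional point, and that F stores the value in short (IEEE single-precision) format because it occupies one word. All names are hypothetical.

```python
import re
import struct

# Optional sign, digits, optional fraction, optional exponent after E or e.
FLOAT_LITERAL = re.compile(r"[+-]?[0-9]+(\.[0-9]*)?([Ee][+-]?[0-9]+)?")


def parse_literal(text):
    """Return the value of a literal, or raise ValueError if it is malformed."""
    if FLOAT_LITERAL.fullmatch(text) is None:
        raise ValueError("not a floating-point literal: " + repr(text))
    return float(text)


def short_word(text):
    """32-bit word holding the literal, assuming short format is stored."""
    return struct.unpack(">I", struct.pack(">f", parse_literal(text)))[0]


if __name__ == "__main__":
    for literal in ("3.1415", "3E2", "+1.0e-2", "-123.45E+7"):
        print(literal, parse_literal(literal), hex(short_word(literal)))
```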