Assignment 10, due April 10
Solutions
Part of the homework for CS:2630, Spring 2015
    1   = 000000000001 010000000000 000000000000 (exp = 1, mantissa = 1/2)
    10  = 000000000100 010100000000 000000000000 (exp = 4, mantissa = 5/8)
    0.1 = 111111111101 011001100110 011001100110 (exp = -3, mantissa = 8/10)
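To make the three encodings above concrete, here is a small Python sketch that decodes them. It assumes (consistent with the examples, but not spelled out in this text) that the first 12-bit word is a two's complement exponent and that the remaining 24 bits are a two's complement fraction with the binary point just after the sign bit. The name `decode_pdp8` is hypothetical.

```python
def decode_pdp8(words: str) -> float:
    """Decode 'eeeeeeeeeeee mmmmmmmmmmmm mmmmmmmmmmmm' into a Python float.

    Assumed layout: a 12-bit two's complement exponent, then a 24-bit
    two's complement mantissa whose binary point follows the sign bit.
    """
    exp_bits, hi, lo = words.split()
    exp = int(exp_bits, 2)
    if exp >= 1 << 11:            # sign bit of the 12-bit exponent is set
        exp -= 1 << 12
    mant = int(hi + lo, 2)
    if mant >= 1 << 23:           # sign bit of the 24-bit mantissa is set
        mant -= 1 << 24
    return (mant / (1 << 23)) * 2.0 ** exp


print(decode_pdp8("000000000001 010000000000 000000000000"))  # 1.0
print(decode_pdp8("000000000100 010100000000 000000000000"))  # 10.0
```

The third example decodes to a value slightly below 0.1, because 8/10 cannot be represented exactly in a 23-bit binary fraction.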
a) Given a floating point number in IEEE format in R3, write code to extract the exponent from that number and convert it to PDP-8 format in the least significant 12 bits of R4, leaving R3 unchanged. You may use R5 as a scratch register. Assume that the number is neither unnormalized nor a NaN. (0.5 points)
Some preliminary work: Let's look at the exponent encoding used for 0.5. In IEEE format, the exponent field is 01111110 (126), while in PDP-8 format, 0.5 is 1/2 × 2^0, so the exponent is 0 (one less than the exponent of 1 given above). So, we can convert IEEE exponents to PDP-8 exponents by subtracting 126.

        MOVE    R4,R3           ; copy the number
        SL      R4,1            ; discard the sign of the mantissa
        SRU     R4,12
        SRU     R4,12           ; align the exponent as an 8-bit unsigned integer
        ADDI    R4,R4,-126      ; remove the bias to get the PDP-8 exponent
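One way to sanity-check the offset is to model the shift sequence in Python and compare the results against the exponents of the three PDP-8 examples at the top of this problem (1, 4 and -3 for the values 1, 10 and 0.1). This is a sketch, not Hawk code: the helper names are hypothetical and a 32-bit register is simulated by masking. With an offset of 126, the model reproduces all three example exponents.

```python
import struct


def ieee_bits(x: float) -> int:
    """Return the 32-bit IEEE single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def pdp8_exponent(bits: int, offset: int = 126) -> int:
    """Mirror the Hawk steps: SL 1, SRU 12, SRU 12, then subtract the offset."""
    r4 = (bits << 1) & 0xFFFFFFFF   # SL R4,1: shift out the sign bit
    r4 >>= 12                       # SRU R4,12
    r4 >>= 12                       # SRU R4,12: biased exponent now in bits 7..0
    return r4 - offset


for x, expected in [(1.0, 1), (10.0, 4), (0.1, -3), (-10.0, 4)]:
    print(x, pdp8_exponent(ieee_bits(x)), expected)
```

The last case shows that the sign of the number has no effect on the extracted exponent, which is why the sign bit is shifted out first.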
b) Given a floating point number in IEEE format in R3, write code to extract the mantissa from that number and convert it to PDP-8 format in the least significant 24 bits of R3 (the most significant 8 bits of R3 must be set to zero). You may use R5 and R6 as scratch registers, if necessary.
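The same logic used for the exponent field applies here. Really, all we need to do is recover the hidden bit, align it just below the sign position (where it represents 1/2, as in the examples above), and change from signed magnitude representation to two's complement representation.

        MOVE    R5,R3           ; keep a copy of the number for the sign bit
        SL      R3,9            ; discard the sign and exponent
        SRU     R3,10           ; align the fraction below bit 22, dropping its low bit
        LIW     R6,#00400000
        OR      R3,R6           ; set hidden bit (worth 1/2 in PDP-8 format)
        BITTST  R5,31
        BBR     NOTNEG
        NEG     R3,R3           ; if the number was negative, negate it
        SL      R3,8
        SRU     R3,8            ; -- and clear the top 8 bits again
NOTNEG: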
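The mantissa conversion can be checked the same way. The Python sketch below is an illustration under assumptions, not the Hawk solution itself. It restores the hidden bit, shifts the 24-bit significand right one place so that the binary point falls just after the sign bit (the layout shown by the examples at the top of this problem, at the cost of the lowest fraction bit), negates for negative inputs, and masks the result to 24 bits. The helper names are hypothetical.

```python
import struct


def ieee_bits(x: float) -> int:
    """Return the 32-bit IEEE single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def pdp8_mantissa(bits: int) -> int:
    """Return the 24-bit two's complement mantissa, with the top 8 bits zero."""
    fraction = bits & 0x007FFFFF               # the 23 stored fraction bits
    magnitude = (fraction | 0x00800000) >> 1   # restore hidden bit, align at bit 22
    if bits & 0x80000000:                      # IEEE sign bit set
        magnitude = -magnitude                 # signed magnitude to two's complement
    return magnitude & 0x00FFFFFF              # keep only the low 24 bits


for x in (1.0, 10.0, 0.1, -1.0):
    print(x, format(pdp8_mantissa(ieee_bits(x)), "024b"))
```

For 1, 10 and 0.1 the printed bit strings can be compared directly with the two mantissa words of the corresponding examples.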
$\sin x \approx x - x^3/6 + x^5/120$
A Problem: Given that the Hawk floating point coprocessor is already turned on, and given the value of x in R3, write Hawk code (not a subroutine, just straight line code) that computes the above approximation for sin x, leaving the result in R3. (1 point).
Note: You may need some place to store intermediate results. You can use R4 and up if necessary.
The first solution given here is done using brute-force methods, except that all integer constants were converted to floating point in advance and we compute the last term first; we use FPA0 to compute each term while accumulating the sum in FPA1:
        COSET   R3,FPA0
        COSET   R3,FPMUL+FPA0
        COSET   R3,FPMUL+FPA0   ; *
        COSET   R3,FPMUL+FPA0   ; *
        COSET   R3,FPMUL+FPA0   ; *- x**5
        LIW     R4,#42F00000    ; -- 120.0
        COSET   R4,FPDIV+FPA0   ; -- x**5/120
        COGET   R4,FPA0         ; *
        COSET   R4,FPA1         ; -- accumulate x**5/120
        COSET   R3,FPA0
        COSET   R3,FPMUL+FPA0   ; *
        COSET   R3,FPMUL+FPA0   ; *- x**3
        LIW     R4,#40C00000    ; -- 6.0
        COSET   R4,FPDIV+FPA0   ; -- x**3/6
        COGET   R4,FPA0         ; *
        COSET   R4,FPSUB+FPA1   ; -- accumulate -x**3/6 + x**5/120
        COSET   R3,FPADD+FPA1   ; *- accumulate x - x**3/6 + x**5/120
        COGET   R3,FPA1         ; *

If we do a bit of algebra first, we can do a better job:
$x - x^3/6 + x^5/120 = x(1 - x^2/6 + x^4/120) = x(1 + x^2/(-6) + x^4/120) = x(1 + (x^2/(-6))(1 + x^2/(-20)))$

We also convert the constants and constant fractions to IEEE format:
     1    = 3F800000₁₆
    -1/6  = BE2AAAAB₁₆
    -1/20 = BD4CCCCD₁₆
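These encodings, along with the 6.0 and 120.0 used in the first solution, can be reproduced with Python's standard `struct` module, which rounds to the nearest single-precision value when packing with the `f` format. The helper name `ieee_hex` is hypothetical; this is only a checking aid, not part of the Hawk solution.

```python
import struct


def ieee_hex(x: float) -> str:
    """Return x rounded to IEEE single precision, as 8 uppercase hex digits."""
    return struct.pack(">f", x).hex().upper()


for value in (1.0, -1 / 6, -1 / 20, 6.0, 120.0):
    print(value, ieee_hex(value))
```

Note that -1/6 and -1/20 are not exactly representable, so their last hex digit reflects rounding.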
        COSET   R3,FPA0
        COSET   R3,FPA0 + FPMUL ; -- x**2
        LIW     R4,#BD4CCCCD    ; -- -1/20
        COGET   R5,FPA0         ; -- set aside a copy of x**2
        COSET   R4,FPA0 + FPMUL ; -- x**2/-20
        LIW     R6,#3F800000    ; -- 1.0
        COSET   R6,FPA0 + FPADD ; -- 1 + x**2/-20
        LIW     R4,#BE2AAAAB    ; -- -1/6
        COSET   R5,FPA0 + FPMUL ; -- x**2(1 + x**2/-20)
        COSET   R4,FPA0 + FPMUL ; *- x**2/-6(1 + x**2/-20)
        COSET   R6,FPA0 + FPADD ; *- 1 + x**2/-6(1 + x**2/-20)
        COSET   R3,FPA0 + FPMUL ; *- x(1 + x**2/-6(1 + x**2/-20))
        COGET   R3,FPA0         ; *

The original was 20 machine instructions, while this version is only 16. The improvement is actually greater than that because, whenever a computationally intensive coprocessor instruction is followed by another coprocessor instruction that operates on the same floating point accumulator, the second instruction will almost certainly have to wait. These instructions are marked with stars in the comment fields above. There are 9 such instructions in the first solution, but only 4 in the second, so the second may be significantly faster.
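As a quick check that the factored form computes the same polynomial, the following Python sketch evaluates both forms and compares them with the library sine. It runs in Python's double precision rather than on the single-precision Hawk coprocessor, and the function names are hypothetical.

```python
import math


def sin_brute(x: float) -> float:
    """Term-by-term form: x - x**3/6 + x**5/120."""
    return x - x**3 / 6 + x**5 / 120


def sin_factored(x: float) -> float:
    """Factored form: x(1 + (x**2/-6)(1 + x**2/-20))."""
    x2 = x * x                          # computed once and reused
    inner = 1.0 + x2 * (-1 / 20)        # 1 + x**2/-20
    return x * (1.0 + x2 * (-1 / 6) * inner)


for x in (0.0, 0.5, 1.0):
    print(x, sin_brute(x), sin_factored(x), math.sin(x))
```

The two forms agree to within rounding error. The factored form needs x**2 only once and multiplies by the precomputed reciprocals instead of dividing, which is where the instruction savings come from.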
    Address     Register
    FF100010    Parallel-port data register
    FF100014    Parallel-port status and control register

Fields of the status and control register:

    IE = interrupt enable (control)
    ER = error (status)
    DR = direction (control, in = 0)
    RD = data ready (status)
Parallel ports were frequently bidirectional, able to serve as either input or output ports, hence the addition of a DR control bit to set the data transfer direction. As an input port (DR = 0), RD = 1 indicates that the data register contains new input data; reading the data register will reset RD. As an output port (DR = 1), RD = 1 indicates that the data register is ready for new output data; writing the data register will reset RD. (1 point)
A problem: Write SMAL Hawk code for a PUTPAR routine that outputs one 8-bit byte to the parallel port. This should not use interrupts; it should set the direction to output, wait for ready, and then transfer data to the device.
First, here is a straightforward solution that will probably work correctly most of the time, at least if nothing complicated is going on.
; Parallel Port (PP) constant definitions:
PPBASE  =       #FF100010       ; base of I/O register block
PPDATA  =       0               ; displacement of data register
PPSTAT  =       4               ; displacement of control register
PPRDY   =       0               ; bit number of ready status bit
PPDIR   =       5               ; bit number of direction control bit

PUTPAR: ; given R3 = ch, the byte to output
        ; first, set up to address the parallel port
        LIW     R4,PPBASE       ; -- index all device regs from R4
                                ; -- (LIW because PPBASE needs all 32 bits)
        ; second, set the direction to 1 (output)
        LIS     R5,1<<PPDIR
        STORE   R5,R4,PPSTAT
        ; third, wait for device ready
PUTPPOLL:
        LOAD    R5,R4,PPSTAT
        BITTST  R5,PPRDY
        BBR     PUTPPOLL
        ; finally, output the data and return
        STORE   R3,R4,PPDATA
        JUMPS   R1

The weak point in the straightforward solution is that, as it sets the direction to out, it also resets all the other control bits; this is not necessarily a good idea, as some of these might matter. As a result, carefully written I/O drivers frequently contain code to set bits far more carefully. Here is an example rewrite of step 2 above:
        ; second, set the direction to 1 (output)
        LOAD    R5,R4,PPSTAT
        BITTST  R5,PPDIR        ; -- only change direction if necessary
        BBS     PUTPOUT         ; if (status.dir != 1) {
        LIS     R6,1<<PPDIR
        OR      R5,R6           ; -- preserves all other control bits
        STORE   R5,R4,PPSTAT    ;   status.dir = 1
PUTPOUT:                        ; }
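The difference between the two versions of step 2 can be seen in a small Python model of the status and control register. This is only a sketch: `PPDIR` matches the bit number used in the Hawk code, but `OTHER` is a hypothetical stand-in for some other control bit that happens to be set, since the positions of the remaining bits are not given here.

```python
PPDIR = 5               # bit number of the direction bit, as in the Hawk code
OTHER = 0b1000_0000     # hypothetical: another control bit that is already set


def set_dir_naive(status: int) -> int:
    """Overwrite the register with only the direction bit (LIS, STORE)."""
    return 1 << PPDIR


def set_dir_careful(status: int) -> int:
    """Read-modify-write: OR the direction bit into the old value."""
    return status | (1 << PPDIR)


print(format(set_dir_naive(OTHER), "08b"))    # the other control bit is lost
print(format(set_dir_careful(OTHER), "08b"))  # the other control bit survives
```

Applying the careful version to a register that already has the direction bit set leaves it unchanged, which is why the Hawk rewrite can skip the store in that case.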