22C:18, Lecture 26, Fall 1996

Douglas W. Jones
University of Iowa Department of Computer Science

A Large Example

Up to this point, we have looked fairly informally at how a person can hand translate high-level-language code to assembly language. The time has come to take a more formal look at this problem. We will look at this problem through the perspective of a very simple example language, the language of a simple RPN calculator. This calculator has the following commands:

E - enter;
push zero on the stack.

0-9 - any digit d;
multiply the top item on the stack by 10 and add d.

P - print;
pop the top item off the stack and print it.

+- - operators;
combine the top two items on the stack using the indicated arithmetic operator.
The use of this calculator is illustrated in the following table:
     Input  Output  Comments
      EP      0
      E5P     5
      E53P    53      Eddd pushes the decimal number ddd
      EEPP    0 0
      E5E3PP  3 5
      E5E3+P  8       5 + 3 = 8
      EE5PP   5 0
      EE5-P   -5      0 - 5 = -5
A Pascal implementation of this calculator is trivial, assuming that the ends of strings are marked by a distinctive character, endofstring:
	procedure calculate(var line: string);
        const stacksize = 80;
        var stack: array[0..stacksize] of integer;
            sp: 0..stacksize;
            i: 0..stringsize;
        begin
            sp := 0;
            i := 0;
            while s[i] <> endofstring do begin
                case s[i] of
                'E':begin
                        stack[sp] := 0;
                        sp := sp + 1;
                    end;
                '0'..'9':
                    stack[sp-1] := stack[sp-1] * 10
                                 + (ord(s[i]) - ord('0'));
                'P':begin
                        sp := sp - 1;
                        write(stack[sp]);
                    end;
                '+':begin
                        sp := sp - 1;
                        stack[sp-1] := stack[sp-1] + stack[sp];
                    end;
                '-':begin
                        sp := sp - 1;
                        stack[sp-1] := stack[sp-1] - stack[sp];
                    end;
                end;
		i := i + 1;
            end;
        end;
The above routine does no error checking, something that should be remedied in any practical implementation!

From a user's perspective, it would be most convenient to allow a programmer to enter a full line of input before the calculator does any computation, starting the calculations anew with each line of input. The Hawk monitor provides a routine, KBGETS, to read a null-terminated string from the keyboard. We can rewrite the above in terms of KBGETS, along with calls to various other monitor routines to provide for better output format to produce the following main program:

        program Calculator(input,output)
        var line: string;

            procedure clear;
            var i; int;
            begin
                for i = 0 to 70 do dspch(' ');
            end;

        begin { main program }
            repeat
		dspat(10,2);
		dspch('>');  { output a prompt }
                kbgets(line);
		dspat(10,3); { clear previous output }
		clear;
		dspat(10,3); { setup for output }
                calculate(line);
                dspat(10,2); { clear previous input }
                clear;
            forever;
        end;
This main program can be translated to a Smal Hawk program as follows:
	TITLE	Calculator main program
	USE	"/group/22c018/hawk.macs"
	S	START

; put the procedure call stack in unused memory
	EXT	UNUSED
PSTACK: W	UNUSED

; linkage for external procedures in the monitor
	EXT	DSPINI,DSPAT,DSPCH,KBGETS
PDSPINI:W	DSPINI
PDSPAT:	W	DSPAT
PDSPCH:	W	DSPCH
PKBGETS:W	KBGETS

; linkage for external procedures in this program 
	EXT	CALC
PCALC:	W	CALC

; the line buffer
	COMMON	LINE,80
PLINE:	W	LINE

; the main program
START:	LOAD	R2,PSTACK	; set up calling stack
	LOAD	R1,PDSPINI
	JSRS	R1,R1		; initialize display
LOOP:
	LIS	R3,10
	LIS	R4,2
	LOAD	R1,PDSPAT
	JSRS	R1,R1		; at (10,2)
	LIS	R3,'>'
	LOAD	R1,PDSPCH
	JSRS	R1,R1		; output prompt
	LOAD	R3,PLINE
	LOAD	R1,PKBGETS
	JSRS	R1,R1		; get line
	LIS	R3,10
	LIS	R4,3
	LOAD	R1,PDSPAT
	JSRS	R1,R1		; at (10,3)
	JSR	R1,CLEAR	; erase old output, if any
	LIS	R3,10
	LIS	R4,3
	LOAD	R1,PDSPAT
	JSRS	R1,R1		; at (10,3)
	LOAD	R3,PLINE
	LOAD	R1,PCALC
	JSRS	R1,R1		; calculate on the line
	LIS	R3,10
	LIS	R4,2
	LOAD	R1,PDSPAT
	JSRS	R1,R1		; at (10,2)
	JSR	R1,CLEAR	; erase old input, if any
	BR	LOOP		;   and then get next line

;-------------------------------
CLEAR:		; routine to clear 70 characters
		; may wipe out R3-8
		; does not use R2
	MOVE	R8,R1		; set aside return address
	LIL	R7,70
CLEARL:
	LIS	R3," "
	LOAD	R1,PDSPCH
	JSRS	R1,R1		; output blank
	ADDSI	R7,-1		; count it
	BGT	CLEARL;		;   iterate if more blanks
	JUMPS	R8		; return

	END
In coding the calculate procedure, one of the most interesting design decisions that comes up is where to put the stack. In high level languages such as C or Pascal, we were forced to put the stack in an explicitly declared array, and we can do this in assembly language. In assembly language, we have a new option, however, that of using the procedure calling stack, as illustrated in the following code:
	TITLE	Calculator
	USE	"/group/22c018/hawk.macs"

; linkage for external procedures in the monitor
	EXT	DSPCH,DSPST,DSPDEC
PDSPCH:	W	DSPCH
PDSPST:	W	DSPST
PDSPDEC:W	DSPDEC

;-------------------------------
	INT	CALC
CALC:			; procedure to handle one line of calculator input
			; expects R3 = pointer to line
			; uses R2 for the calculator stack and called AR's!
	MOVE	R8,R1		; save return address
	MOVE	R9,R3		; save pointer to command string
	MOVE	R10,R2		; save pointer to stack base
LOOP:
	LOADS	R3,R9
	EXTB	R3,R3,R9	; get a character
	BZR	NONNULL		
	MOVE	R2,R10		; restore stack (clean off unpopped stuff)
	JUMPS	R8		; return if null

NONNULL:
	LEA	R4,JUMPTAB	; setup for case statement
	ADDSL	R3,R4,2		; index into jumptab by the character
	LOADS	R3,R3		;   get the jump table entry
	JUMPS	R3		;   go process the entry

;-------------------------------
; jump table with one entry for each ASCII code
; because the table is complete, no bounds checking is needed
	ALIGN	4
JUMPTAB:
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; control
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ;  chars
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; control
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ;  chars
	W	SPACE,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ;  !"#$%&'
	W	ERROR,ERROR,ERROR,PLUS, ERROR,MINUS,ERROR,ERROR ; ()*+,-./
	W	DIGIT,DIGIT,DIGIT,DIGIT,DIGIT,DIGIT,DIGIT,DIGIT ; 01234567
	W	DIGIT,DIGIT,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; 89:;<=>?
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ENTER,ERROR,ERROR ; @ABCDEFG
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; HIJKLMNO
	W	PRINT,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; PQRSTUVW
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; XYZ[\]^_
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ENTER,ERROR,ERROR ; `abcdefg
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; hijklmno
	W	PRINT,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; pqrstuvw
	W	ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; xyz{|}~ 

;-------------------------------
; Each of the following blocks of code handles one calculator command
; The following are not procedures; the code for Each calculator command
; ends by advancing the input pointer and going to the loop top!
; On entry to each of these, the following registers are in use:
;
;	R2  - points to the first free word beyond the stack top
;	R8  - the return address
;	R9  - points to the current calculator command
;	R10 - points to the bottom word on the the calculator stack
;-------------------------------
SPACE:
	ADDSI	R9,1		; advance pointer into command string
	JUMP	LOOP
;-------------------------------
ERROR:
	LEA	R3,ERMSG1
	LOAD	R1,DSPST
	JSRS	R1,R1		; output error message prefix
	LOADS	R3,R9
	EXTB	R3,R3,R9	; get the character
	LOAD	R1,DSPCH
	JSRS	R1,R1		; output it
	LEA	R3,ERMSG2
	LOAD	R1,DSPST
	JSRS	R1,R1		; output error message suffix
	ADDSI	R9,1		; advance pointer into command string
	JUMP	LOOP

ERMSG1:	ASCII	"Error '",0
ERMSG2:	ASCII	"' ",0
	ALIGN	2
;-------------------------------
PLUS:
	ADDSI	R2,-4		; pop the stack
	CMP	R2,R10		; check for underflow!
	BLE	ERROR		;   complain if underflow
	LOADS	R3,R2		; get old (popped) stack top
	LOAD	R4,R2,-4	; get item below it (current top item)
	ADD	R4,R4,R3	;   add them
	STORE	R4,R2,-4	; put the result back on top of the stack
	ADDSI	R9,1		; advance pointer into command string
	JUMP	LOOP
;-------------------------------
MINUS:
	ADDSI	R2,-4		; pop the stack
	CMP	R2,R10		; check for underflow!
	BLE	ERROR		;   complain if underflow
	LOADS	R3,R2		; get old (popped) stack top
	LOAD	R4,R2,-4	; get item below it (current top item)
	SUB	R4,R4,R3	;   subtract them
	STORE	R4,R2,-4	; put the result back on top of the stack
	ADDSI	R9,1		; advance pointer into command string
	JUMP	LOOP
;-------------------------------
DIGIT:	
	CMP	R2,R10		; check for empty stack!
	BLE	ERROR		;   complain if empty
	LOAD	R4,R2,-4	; item on stack top
	LOADS	R3,R9		; the character needs getting again
	EXTB	R3,R3,R9	;   so get it (known to be a digit)
	ADDI	R3,-"0"		;   convert it to an integer
	ADDSL	R4,R4,2		; multiply stack top by 5
	ADDSL	R4,R3,1		; multiply by 2 and add digit
	STORE	R4,R2,-4	; return to stack top
	ADDSI	R9,1		; advance pointer into command string
	JUMP	LOOP
;-------------------------------
ENTER:
	STORES	0,R2		; push zero
	ADDSI	R2,4
	ADDSI	R9,1		; advance pointer into command string
	JUMP	LOOP
;-------------------------------
PRINT:
	CMP	R2,R10		; check for empty stack!
	BLE	ERROR		;   complain if empty
	ADDSI	R2,-4		; pop
	LOADSCC	R3,R2		; check sign
	BNR	PRINTP
	LIS	R3,"-"		; if negative
	LOAD	R1,DSPCH
	JSRS	R1,R1		;   print - first
	LOADS	R3,R2		; then recover the number
	NEG	R3,R3		;   and negate it
PRINTP:
	CLR	R4
	LOAD	R1,DSPDEC
	JSRS	R1,R1		; print in field width zero
	LIS	R3," "
	LOAD	R1,DSPCH
	JSRS	R1,R1		; print trailing blank
	ADDSI	R9,1		; advance pointer into command string
	JUMP	LOOP

	END
Note that the above code includes error checking for stack underflow (two many pop operations) and for illegal calculator commands, but it does not give any explanation with its error messages, nor does it check for stack overflow.

Checking for stack overflow is unneeded so long as the maximum length of an input string is known. In our main program, we assumed that the maximum was 80 characters, so the maximum calculator program would be 80 consecutive push operations. So long as the stack is at least 80 words long, this program cannot produce a stack overflow.

In fact, the maximum command length is not known because our calculator uses the KBGETS monitor routine, and KBGETS shares a well known flaw with the standard C library routine gets() -- it does nothing to limit the length of a string to that of the string variable it is passed! This is a major flaw in the standard C library; it has been the source of many system failures, and if you experiment with this calculator program, you will discover that it behaves strangely for excessively long input lines.

The case statement in the Pascal code above has been translated to an indexed indirect jump through the jump table in the calculator. This is very fast, and it gains speed by having an exhaustive jump table, thus eliminating any need for run-time bounds checking.