Lecture 29, Control Structures

Part of the notes for CS:4980:1
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

The do construct

The Kestrel grammar for statements and loops, as given, looks like this:

<statement> ::= <do end>
             |  <if>
             |  <case>
             |  <loop>
             |  <exception handler>
             |  <raise>
             |  <procedure call>
             |  <assignment>

<do end> ::= "do"
                 <block>
             "end"

<loop> ::= <while loop>
         |  <until loop>
         |  <for loop>

<until loop> ::= "do"
                     <block>
                 "until" <expression>

This part of the grammar is organized for discussion purposes, with gratuitous nonterminals such as <do end>, <loop> and <until loop> added simply so that each section of the language definition could be headed by the relevant parts of the grammar. The resulting grammar is unnecessarily difficult to parse. Specifically, how do you distinguish between <do end> and <until loop>? Both begin with the keyword do, so a recursive descent parser using one-symbol lookahead cannot tell them apart.

Fortunately, we can rewrite the grammar. Consider this variation:

<statement> ::= <do>
             |  <if>
             |  <case>
             |  <loop>
             |  <while loop>
             |  <for loop>
             |  <exception handler>
             |  <raise>
             |  <procedure call>
             |  <assignment>

<do> ::= "do" <block> ( "end" | "until" <expression> )

Here, we combined <do end> and <until loop> into a single syntactic category, <do>, named after the keyword that introduces both of the original categories. The difference between the two original categories is now expressed in the choice of endings for the block.

We can use this directly to write code to compile do loops and do-end blocks. In the following, we assume that class Statement in the file statement.cpp includes Statement::compile(e) as well as a number of local functions, one for each variant form of statement. It calls the appropriate local function depending on the keyword that begins the statement. Here, we give just one of these local functions:

void compile_do( Environment e ) {
	// BNF <do> ::= "do"
	//                   <block>
	//              ( "end" | "until" <expression> )

	// assert lex_this = "do"
	lex_advance();

	// label the top of the loop, just in case it iterates
	Label top = cg_newlabel();
	cg_label( top );

	// handle the statement's body
	Block body = Block::compile( e );

	// discard any local variables allocated in the block
	cg_popl( body->size );

	if (lex_iskey( lex_this, KEY_END )) {
		// it is a do-end block
		lex_advance();

	} else if (lex_iskey( lex_this, KEY_UNTIL )) {
		// it is a do-until block
		lex_advance();

		// push the value of the expression on the stack
		(void) expression::compile( e );

		// pop and test the value of the expression
		cg_bfalse( top );
	} else {
		// neither "end" nor "until": complain
		lex_gotbutwant( lex_this, ER_WANT_END_UNTIL );
	}
}

In the above code, we assumed that all entry points to the code generator are prefixed cg_. The code generator is assumed to export the type Label, and the generator function cg_newlabel() produces an unbounded series of new labels. The obvious interface to the code generator would have values of type Label be the actual textual labels used in the assembly code, but that is messy to implement. It is far easier to have values of type Label represented as integers, so cg_newlabel() just increments a hidden label counter and returns its value.

The cg_label(l) operator takes a label and generates a symbolic label in the assembly code. For example, cg_label(129) might generate the label L129: in the compiler's output in ARM assembly language. Similarly, the cg_br(l) operator generates assembly code equivalent to the high-level BR operation defined in our abstract machine code. Thus, cg_br(129) would generate the ARM branch instruction B L129.
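A minimal sketch of this label interface in C++ might look like the following. The names cg_newlabel, cg_label and cg_br match the text; the global counter, the cg_label_text helper and the use of printf output are assumptions made here purely for illustration.

```cpp
#include <cstdio>
#include <string>

typedef int Label;              // labels are just integers, as suggested above

static Label label_counter = 0; // the hidden counter behind cg_newlabel()

Label cg_newlabel() {
	// each call returns a fresh, never-before-used label
	return ++label_counter;
}

std::string cg_label_text( Label l ) {
	// the textual form of a label, e.g. label 129 becomes "L129"
	return "L" + std::to_string( l );
}

void cg_label( Label l ) {
	// define the label at the current point in the assembly output
	std::printf( "%s:\n", cg_label_text( l ).c_str() );
}

void cg_br( Label l ) {
	// unconditional branch, e.g. the ARM instruction B L129
	std::printf( "\tB %s\n", cg_label_text( l ).c_str() );
}
```

Because labels are plain integers, comparing or storing them is cheap, and the textual form is manufactured only at output time.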

The second assumption we made above is that the Block::compile() factory method creates a block and pushes all of the local variables of that block onto the stack, but it does not pop them. The need to pop the local variables depends on the context of the block. Blocks that are bodies of subroutines have their local variables popped by the subroutine return instruction. Blocks that are bodies of if statements or loops must have their local variables popped by explicit pop operations. Here, we used the cg_popl() operation, the equivalent of the POPL instruction in the high-level stack architecture understood by our compiler.

The above code does not need to check the start set because the main Statement::compile() routine guarantees that it will not call compile_do() except when the current lexeme is do. The code does not check the follow set because all statements have the same follow set, so this is better done just once at the end of Statement::compile() after this code returns.
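The dispatch in Statement::compile() can be sketched as follows. This is a toy model, not the compiler's actual code: plain strings stand in for lexemes and keyword codes, and the handler bodies are stubs that merely record which variant was selected.

```cpp
#include <string>

// toy stand-ins for the lexer state and the per-variant handlers; in the
// real compiler these are lex_this, compile_do() and so on
std::string lex_this;
std::string last_compiled;   // records which handler ran, for illustration

void compile_do()         { last_compiled = "do"; }
void compile_if()         { last_compiled = "if"; }
void compile_while()      { last_compiled = "while"; }
void compile_assignment() { last_compiled = "assignment"; }

// dispatch on the keyword that begins the statement; a lexeme that is not
// a statement keyword must begin an assignment or a procedure call
void statement_compile() {
	if        (lex_this == "do")    { compile_do();
	} else if (lex_this == "if")    { compile_if();
	} else if (lex_this == "while") { compile_while();
	} else                          { compile_assignment();
	}
	// the follow-set check shared by all statements would go here,
	// after the selected handler returns
}
```

Because the handlers are only ever called when the current lexeme has already been identified, none of them needs to re-check its start set.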

While Loops

Here is similar code for while loops, useful to contrast with the above:

void compile_while( Environment e ) {
	// BNF <while> ::= "while" <expression> [ "do" ]
	//                      <block>
	//                   "end"

	// assert lex_this = "while"
	lex_advance();

	// label the top of the loop, and prepare to label the bottom
	Label top = cg_newlabel();
	Label bottom = cg_newlabel();
	cg_label( top );

	// push the value of the expression on the stack
	(void) expression::compile( e );

	// skip the optional "do" keyword
	if (lex_iskey( lex_this, KEY_DO )) lex_advance();

	// pop and test the value of the expression for loop exit
	cg_bfalse( bottom );

	// handle the statement's body
	Block body = Block::compile( e );

	// discard any local variables allocated in the block
	cg_popl( body->size );

	// branch to the top of the loop and label the bottom
	cg_br( top );
	cg_label( bottom );

	if (lex_iskey( lex_this, KEY_END )) {
		lex_advance();

	} else {
		lex_gotbutwant( lex_this, ER_WANT_END );
	}
}

Stack Top Caches and Booleans

If the code generator does not use a stack-top cache, then code generation for the expressions used for loop exits and if statements will be extremely poor. Code for while a>b, for example, will end up something like this:

PUSH a
   push a onto stack in memory
PUSH b
   push b onto stack in memory
GT
   pop a and b into registers ra and rb
   if ra > rb
      push true onto stack in memory
   else
      push false onto stack in memory
   end
BFALSE
   pop boolean into r
   if r = false
      exit loop
   end

In the above, we've described how each instruction in the code generator's abstract virtual machine translates to machine code without giving the actual machine code. We know that we ought to be able to generate code like this:

PUSH a
   load a into ra
PUSH b
   load b into rb
GT
   compare a and b, leaving results in the condition codes
BFALSE
   if condition codes indicate a <= b
      exit loop
   end

The first part of the above is simple using a stack-top cache, but the problem is, how can we get the second part to work as well? The answer is, we need to extend our concept of a stack-top cache to include the possibility that the condition codes hold the value on the top of the stack.
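One way to model this extension is to add a third possible location for the top-of-stack value. The sketch below is illustrative only: the TopState enumeration, the string-valued registers and labels, and the particular cg_gt and cg_bfalse signatures are assumptions, not the compiler's actual interface.

```cpp
#include <string>

// possible locations of the value on the logical top of stack
enum TopState {
	TOP_IN_MEMORY,     // no caching: the top really is on the memory stack
	TOP_IN_REGISTER,   // the top is cached in a register
	TOP_IN_CONDITIONS  // the top is a boolean held in the condition codes
};

struct StackTopCache {
	TopState state;
	std::string reg;          // which register, when TOP_IN_REGISTER
	std::string branch_true;  // branch-on-true opcode, when TOP_IN_CONDITIONS
	std::string branch_false; // branch-on-false opcode, ditto
};

std::string output;           // assembly text accumulated so far
StackTopCache cache = { TOP_IN_MEMORY, "", "", "" };

// compile the > operator: emit CMP, leaving the result in the condition
// codes, and remember how to branch on true and on false
void cg_gt( const std::string &ra, const std::string &rb ) {
	output += "\tCMP " + ra + ", " + rb + "\n";
	cache.state = TOP_IN_CONDITIONS;
	cache.branch_true  = "BGT";
	cache.branch_false = "BLE";
}

// branch to label if the top of stack is false, consuming the top of stack
void cg_bfalse( const std::string &label ) {
	// with the boolean in the condition codes, a single branch suffices
	output += "\t" + cache.branch_false + " " + label + "\n";
	cache.state = TOP_IN_MEMORY;
}
```

With this state in place, compiling a>b followed by a branch-on-false emits just the CMP and one conditional branch, the two-instruction sequence we hoped for.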

We have the following six comparison operators: > <= >= < = /=. Each of these compiles to the same ARM CMP instruction, which subtracts its second operand from its first, discards the difference, and leaves the result of the comparison in the condition codes. The difference between these six comparisons lies not in how the condition codes get set, but in how they are interpreted.

The 4 condition code flags on the ARM (and its ancestor, the PDP-11) are:

    N (negative): set if the result is negative
    Z (zero): set if the result is zero
    C (carry): set if an add produced a carry; on the ARM, a subtract
        sets C when there was no borrow
    V (overflow): set if the result overflowed the signed range

These 4 bits are easy to compute in hardware, but their use for comparison is not immediately obvious. Fortunately, the designers of the PDP-11 and later processors that use these condition codes always provide a useful set of 14 conditional branch (or conditional execution) instructions. For example, the ARM supports these branches, all named in terms of the relationship between the two operands of the preceding CMP instruction:

    BEQ: branch if equal (Z set)
    BNE: branch if not equal (Z clear)
    BGT: branch if greater than (Z clear and N = V)
    BGE: branch if greater than or equal (N = V)
    BLT: branch if less than (N /= V)
    BLE: branch if less than or equal (Z set or N /= V)

All of the above apply to the comparison of signed 32-bit integer values.

What this implies is that when the boolean expression a>b is compiled on a machine using a stack-top cache, the effect is to generate a CMP instruction, recording that this consumed the top two elements of the stack (held in registers), and recording that the stack top is now stored in the condition codes in a form where BGT is the branch on true and BLE is the branch on false.

If we compile a<=b in the same context, we generate the same CMP instruction, but we record that the stack top in the condition codes is saved in a form where BLE is the branch on true and BGT is the branch on false.
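This pairing of true and false branches can be tabulated once for all six comparison operators. The table and the branch_for helper below are illustrative assumptions, not part of the compiler as described:

```cpp
#include <cstring>

// the six signed comparisons and the ARM branches that test them; the
// branch-on-false column is always the branch for the negated relation
struct BranchPair { const char *op, *on_true, *on_false; };

const BranchPair branch_table[] = {
	{ ">",  "BGT", "BLE" },
	{ "<=", "BLE", "BGT" },
	{ ">=", "BGE", "BLT" },
	{ "<",  "BLT", "BGE" },
	{ "=",  "BEQ", "BNE" },
	{ "/=", "BNE", "BEQ" },
};

// look up the branch pair for a comparison operator
const BranchPair *branch_for( const char *op ) {
	for (const BranchPair &p : branch_table) {
		if (std::strcmp( p.op, op ) == 0) return &p;
	}
	return nullptr;   // not a comparison operator
}
```

One pleasant consequence of this organization is that compiling a not operator applied to a comparison costs nothing: we simply swap the recorded true and false branches.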

In sum, the state of our stack top cache now includes the following:

    -- which top-of-stack elements, if any, are currently held in
       registers, and in which registers;
    -- whether the top of stack is a boolean currently held in the
       condition codes; and
    -- if it is, which conditional branch means branch on true and
       which means branch on false.