Lecture 41, Case Statements Revisited, Bounds Checking
Part of
the notes for 22C:196:002 (CS:4908:0002)
|
Consider this case statement, a bit more complex than the example in Lecture 39:
select i from case 1, 3, 5 .. 7: a; case 2,8 .. 10: b; else c; end;
There are two obvious ways to compile it. One is to make it equivalent to this code using an if-else-if cascade:
if (i = 1)/\(i = 3)/\((i >= 5)\/(i <= 7)) then a; else if (i = 2)/\((i >= 8)\/(i <= 10)) then b; else c; end end;
The other is to compile it to something like this, using a jump table:
ldr r3,[fp,#i] @ get operand i (a local var) cmp r3,#lower @ check lower bound blt else cmp r3,#upper @ check upper bound bgt else adr r4,table ldr pc,[r4,r3,lsl#2]@ indirect indexed jump through table case1: ---- code for a ---- b end case2: ---- code for b ---- b end lower = 1 upper = 10 table = .-(lower<<2) @ address of element zero of table .word case1 @ table[1] .word case2 @ table[2] .word case1 @ table[3] .word else @ table[4] .word case1 @ table[5] .word case1 @ table[6] .word case1 @ table[7] .word case2 @ table[8] .word case2 @ table[9] .word case2 @ table[10] else: ---- code for c ---- end:
Note that, in this case, the jump-table implementation is definitely a better implementation than the if-else-if cascade. The table reaches all of the cases at exactly the same speed, while the if-else-if cascade is a linear search. The memory requirements for the if-else-if cascade are far greater than the jump table because it takes one comparison plus one conditioinal branch for each individual case. Only if the majority of the table entries were empty (that is, referred to the else clause instead of some specific case) would the if-else-if cascade begin to compete with the jump table.
Note in the above code that both the jump table and the if-else-if cascade find uses for a range-check, that is, code equivalent to one of these:
That is, checking to see whether b is in the range from a to c (inclusive) or is outside that range. Range checking is also important in the context of assignment. In the assignment a=b, when a is a variable of a subrange type, we must check that the value of b is in that subrange. Similarly, when passing parameters to a subroutine, when the formal parameter is of a subrange type, the actual parameter must be checked to see if it is inside that subrange. The same applies to array subscripts.
Of course, all of these bounds checks, even the checks involved in case statements, may be optimized away by examination of the type of the value being used. If the type is itself constrained to be in a subrange that is stricter than (or the same as) the range being checked, there is no need for bounds checking. If just one bound on the type is within the allowed range, then checking of that bound is not needed.
Several computers designed in the 1950s and 1960s, before the invention of condition codes, had comparison instructions that operated as follows:
The typical use of this instruction was as follows:
CMP a,b BR ALESS ; executed if a < b BR ASAMEB ; executed if a = b BR ABIGG ; executed if a > b
A less than obvious use of this instruction was in bounds checking:
CMP a,max CMP a,min ; if a < max BR OUT ; if a = max or (a < max and a < min) BR OUT ; if a > max or (a < max and a = min) ; if a < max and a > min
The above code branches to OUT if a is in the range from min to max exclusive of the end points. This is the kind of trick you can fail to invent after years of using a machine.
The original Fortran language had a bizarre if statement that only makes sense in the context of this strange comparison instruction:
if (a) 10,20,30
This branches to the label 10 if a is less than zero, to the label 20 if a is zero, and to 30 if a is greater. Labels in Fortran were numeric, and if you left out a label in the above, it meant "fall through" to the next statement.
The ARM instruction set has an equally strange but delightfully optimal coding trick for bounds checking. Every ARM instruction can optionally be made contingent on the condition codes. As a result, you can write bounds checking code as follows:
cmp a,upper cmple lower,b ble inrange @ (a <= upper) and (lower <= a)
This kind of tricky code is obvious for register-to-register operations, but within the rather tricky constraints of the ARM instruciton set concerning immediate operands, they may also be used.