Lecture 41, Case Statements Revisited, Bounds Checking

Part of the notes for 22C:196:002 (CS:4908:0002)
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Case Statements Revisited

Consider this case statement, a bit more complex than the example in Lecture 39:

select i from
   case 1, 3, 5 .. 7:
      a;
   case 2,8 .. 10:
      b;
   else
      c;
end;

There are two obvious ways to compile it. One is to make it equivalent to this code using an if-else-if cascade:

if (i = 1)/\(i = 3)/\((i >= 5)\/(i <= 7)) then
   a;
else
   if (i = 2)/\((i >= 8)\/(i <= 10)) then
      b;
   else
      c;
   end
end;

The other is to compile it to something like this, using a jump table:

        ldr     r3,[fp,#i]      @ get operand i (a local var)

        cmp     r3,#lower       @ check lower bound
        blt     else
        cmp     r3,#upper       @ check upper bound
        bgt     else

        adr     r4,table
        ldr     pc,[r4,r3,lsl#2]@ indirect indexed jump through table
case1:
        ---- code for a ----
        b       end
case2:
        ---- code for b ----
        b       end
lower   =       1
upper   =       10
table   =       .-(lower<<2)    @ address of element zero of table
        .word   case1           @ table[1]
        .word   case2           @ table[2]
        .word   case1           @ table[3]
        .word   else            @ table[4]
        .word   case1           @ table[5]
        .word   case1           @ table[6]
        .word   case1           @ table[7]
        .word   case2           @ table[8]
        .word   case2           @ table[9]
        .word   case2           @ table[10]
else:
        ---- code for c ----
end:

Note that, in this case, the jump-table implementation is definitely a better implementation than the if-else-if cascade. The table reaches all of the cases at exactly the same speed, while the if-else-if cascade is a linear search. The memory requirements for the if-else-if cascade are far greater than the jump table because it takes one comparison plus one conditioinal branch for each individual case. Only if the majority of the table entries were empty (that is, referred to the else clause instead of some specific case) would the if-else-if cascade begin to compete with the jump table.

Bounds Checking

Note in the above code that both the jump table and the if-else-if cascade find uses for a range-check, that is, code equivalent to one of these:

if (a <= b)/\(b <= c)
if (b < a)\/(c < b)

That is, checking to see whether b is in the range from a to c (inclusive) or is outside that range. Range checking is also important in the context of assignment. In the assignment a=b, when a is a variable of a subrange type, we must check that the value of b is in that subrange. Similarly, when passing parameters to a subroutine, when the formal parameter is of a subrange type, the actual parameter must be checked to see if it is inside that subrange. The same applies to array subscripts.

Of course, all of these bounds checks, even the checks involved in case statements, may be optimized away by examination of the type of the value being used. If the type is itself constrained to be in a subrange that is stricter than (or the same as) the range being checked, there is no need for bounds checking. If just one bound on the type is within the allowed range, then checking of that bound is not needed.

Subrange Tricks with Antique Instruction Sets

Several computers designed in the 1950s and 1960s, before the invention of condition codes, had comparison instructions that operated as follows:

CMP a,b: if a < b then pc = pc + 1;
if a = b then pc = pc + 2;
if a > b then pc = pc + 3;

The typical use of this instruction was as follows:

CMP     a,b
BR      ALESS   ; executed if a < b
BR      ASAMEB  ; executed if a = b
BR      ABIGG   ; executed if a > b

A less than obvious use of this instruction was in bounds checking:

CMP     a,max
CMP     a,min   ; if a < max
BR      OUT     ; if a = max or (a < max and a < min)
BR      OUT     ; if a > max or (a < max and a = min)
                ; if a < max and a > min

The above code branches to OUT if a is in the range from min to max exclusive of the end points. This is the kind of trick you can fail to invent after years of using a machine.

The Fortran `if` Statement

The original Fortran language had a bizarre if statement that only makes sense in the context of this strange comparison instruction:

if (a) 10,20,30

This branches to the label 10 if a is less than zero, to the label 20 if a is zero, and to 30 if a is greater. Labels in Fortran were numeric, and if you left out a label in the above, it meant "fall through" to the next statement.

Efficient Bounds Checking on the ARM

The ARM instruction set has an equally strange but delightfully optimal coding trick for bounds checking. Every ARM instruction can optionally be made contingent on the condition codes. As a result, you can write bounds checking code as follows:

cmp     a,upper
cmple   lower,b
ble     inrange         @ (a <= upper) and (lower <= a)