Lecture 36, Exceptions

Part of the notes for CS:4908:0004 (22C:196:004)
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Consider this fragment of Kestrel code:
        
e: exception;
f: procedure
    if (conditiona) f; end;
    if (conditionb) raise e; end;
end

try
    f;
    if (conditionc) raise e; end;
handle e
    f;
end

How does this work? First, note that Kestrel exceptions must be declared, and the declaration must be visible from both the point where the exception is raised and the point where it is handled. The declaration itself creates a default handler for that exception, one that terminates the program ungracefully.

A compiler designed for student use should have a nice default handler, perhaps one that outputs something like "unhandled e exception", but the formal semantics of the language does not specify a particular behavior for unhandled exceptions. They are considered to be errors, and the formal semantics only specifies the behavior of correct programs.

All exception handling models share the following basic idea. At the start of the try block, the previous handler for the exception in question is saved (conceptually pushed on a stack), and the new handler is installed. At the end of the try block, the previous handler is restored. On entry to a handler, the previous handler is also restored, so that raising an exception within the handler cannot lead to a loop.
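The save, install and restore discipline described above can be sketched in Python. This is only a model, not Kestrel's implementation: the names Jump, handler_stack, raise_exception and try_handle are invented for illustration, and a Python exception stands in for the machine-level jump to the handler. The point of the demonstration is that a raise inside the inner handler reaches the outer handler rather than looping back into the inner one.

```python
class Jump(Exception):
    """Stands in for the machine-level transfer of control to a handler."""

handler_stack = ["default"]  # bottom entry models the default handler
log = []                     # records which handler each raise reached

def raise_exception():
    # A raise always goes to whichever handler is currently installed.
    log.append("raise -> " + handler_stack[-1])
    raise Jump()

def try_handle(name, body, handler):
    handler_stack.append(name)   # start of try: save old handler, install new
    try:
        body()
    except Jump:
        handler_stack.pop()      # entry to handler: restore previous handler
        handler()                # a raise in here reaches the outer handler
    else:
        handler_stack.pop()      # normal end of try: restore previous handler

def outer_body():
    # The inner handler itself raises; this must not re-enter "inner".
    try_handle("inner", raise_exception, raise_exception)

try_handle("outer", outer_body, lambda: log.append("outer handled"))
print(log)
```

After the run, handler_stack again holds only the default entry, which is what restoring the previous handler on every exit path is meant to guarantee.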

What is an exception? There are two or perhaps three models that are common in different programming languages:

  1. Each exception declaration defines a new object of type exception. To support this, when we compile a try block, we must know which exceptions we are installing handlers for.

  2. There is just one global exception object. Exception identifiers are actually members of an implicit global enumerated type that is extended with new members each time a new exception is declared. The handlers at the end of any try block are, in effect, a case-select construct that selects a handler based on the value selected from the enumeration. The Ada programming language used this model.

  3. There is just one global exception object. When you throw an exception, you pass an arbitrary object to the handler. The handler must be able to examine this object in order to determine its class. Many programmers use this model as if there were a constraint that the object thrown must be a character string, the name of the exception, but in languages like Python, this is not a requirement, merely a convention.

Kestrel try blocks do not declare, up front, which handler is to be installed for that block, and the handlers at the end of the block may include an else handler for all exceptions other than those named explicitly. These two language design features force us to use the second exception handling model.

Exception Objects

So what is an exception object? At the very minimum, an exception object must obviously record the address of the code for the handler. Otherwise, how could the raise statement possibly transfer control to the handler?

But, the handler runs in the context of some Kestrel block, and all Kestrel blocks must be presumed to be dynamically allocated, with, at the very least, a frame pointer allowing reference to variables in that block. Therefore, we must also include the correct value for the frame pointer. We could also include a saved stack pointer, but in Kestrel, the displacement from the stack pointer to the frame pointer is a constant that can be statically computed at every point in the code. Therefore, if the frame pointer is restored, we can immediately compute the correct stack pointer value and restore the stack pointer that way.
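As a small illustration of why no saved stack pointer is needed, the following Python sketch tracks the displacement at compile time. The class FrameLayout and its methods are hypothetical, and the sketch assumes, purely for illustration, that the stack grows toward higher addresses and is measured in words; the notes do not say which way the Kestrel stack grows.

```python
class FrameLayout:
    """Compile-time bookkeeping for the displacement sp - fp (hypothetical)."""

    def __init__(self, locals_words):
        # On entry to the block, the locals sit between fp and sp.
        self.displacement = locals_words

    def push(self, words=1):
        # The compiler knows every push it emits, so it can update the constant.
        self.displacement += words

    def pop(self, words=1):
        self.displacement -= words

    def stack_pointer(self, fp):
        # What a handler would compute on entry: sp = fp + a static constant.
        return fp + self.displacement

layout = FrameLayout(locals_words=3)
layout.push(2)   # two temporaries pushed while compiling an expression
layout.pop()     # one of them consumed
sp = layout.stack_pointer(fp=1000)
print(sp)
```

Because the displacement at the handler's entry point is a constant, the handler only needs the restored frame pointer to recover the stack pointer.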

In sum, we can imagine the standard prologue for a Kestrel program as including something equivalent to the following declarations, all of which are invisible to the programmer:

TYPEexception: type record
    handler: var MACHINEword;
    frame: var MACHINEword;
end;
ENUMexception: enum ( );
VARexception: var TYPEexception;

That is, we have two implicit types: TYPEexception, the type of the single exception variable, named here VARexception, that is statically allocated and declared globally, and ENUMexception, an enumeration type that is, anomalously, initially empty and that has members added to it each time there is a new exception declaration.
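One way a compiler might keep track of these implicit declarations is sketched below in Python. The class names ExceptionRecord and ExceptionEnum, and the second exception g, are invented for this sketch; only the two fields and the initially empty enumeration come from the description above.

```python
from dataclasses import dataclass

@dataclass
class ExceptionRecord:       # models TYPEexception
    handler: int = 0         # address of the handler's code
    frame: int = 0           # frame pointer of the block holding the handler

class ExceptionEnum:         # models the compiler's view of ENUMexception
    def __init__(self):
        self.names = []      # the enumeration starts out empty

    def declare(self, name):
        # Each exception declaration adds one member and yields its ordinal.
        self.names.append(name)
        return len(self.names) - 1

var_exception = ExceptionRecord()    # models the single global VARexception
enum_exception = ExceptionEnum()
e = enum_exception.declare("e")      # from "e: exception;"
g = enum_exception.declare("g")      # a second, hypothetical declaration
print(e, g, enum_exception.names)
```

The ordinal returned by declare is the value that a raise would later place in the special register.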

Actually, there should probably be some predefined exceptions. Consider, for example, RangeError, a possible name for the exception raised when a program tries to assign a value to a variable where that value is outside the range of values permitted by the type of that variable. This exception should also apply when the value of an expression is outside the range of values permitted for that expression, for example, when used as an array subscript or as an actual parameter.

Raising an Exception

The above provides us with enough information to write code to raise an exception.

RAISE e
Raise the exception e by first placing the value e in a designated special register (see below); the value e must be one of the constants of type ENUMexception. Second, the frame pointer is set to VARexception.frame and the program counter is set to VARexception.handler.
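The three register transfers that make up RAISE can be modeled on a toy machine state. In this Python sketch, the machine and var_exception objects, the field names, and the addresses are all made up for illustration; only the order and meaning of the three assignments follow the description above.

```python
from types import SimpleNamespace

def raise_exception(machine, var_exception, e):
    """Model of RAISE e on a toy machine (all names are hypothetical)."""
    machine.special = e                  # record which exception was raised
    machine.fp = var_exception.frame     # switch to the handler's block
    machine.pc = var_exception.handler   # transfer control to the handler

# Toy state: the addresses are arbitrary values chosen for the example.
machine = SimpleNamespace(pc=0x100, fp=0x8000, special=None)
var_exception = SimpleNamespace(handler=0x240, frame=0x7F00)

raise_exception(machine, var_exception, e=1)
print(hex(machine.pc), hex(machine.fp), machine.special)
```

Note that nothing here unwinds intermediate activation records one at a time; replacing the frame pointer abandons them all at once.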

Also note that locations on the stack can themselves be statically allocated as variables in the activation record, since at any point in the code, the displacement of the stack pointer from the frame pointer can be computed as a static constant.

Handlers

Now, look at how we compile the try-handle statement that was given in the original example.

-- try
SAVEexception: var TYPEexception;
SAVEexception = VARexception;   -- save the old handler's identity
VARexception.handler = HANDLE;
VARexception.frame = fp;        -- new handler is installed
    f;
    if (conditionc) -- raise e;
        SPECIALregister = e
        JUMP HANDLE
    end;
-- handle e
VARexception = SAVEexception;   -- restore the old handler's identity
JUMP DONE
HANDLE:
VARexception = SAVEexception;   -- restore the old handler's identity
if SPECIALregister = e
    f;
else
    fp = VARexception.frame;
    JUMP VARexception.handler
end;
-- end
DONE:

Note three things here: First, the old value of the global VARexception is saved in a local variable SAVEexception. The programmer never explicitly mentions either of these variables, but actual RAM must be allocated for both the global variable and the save location for each and every try. Of course, this local variable may be deallocated as soon as it is used at the start of the first handler following the try block.

Second, the above code illustrates a sensible (but unnecessary) optimization of raise statements that are statically nested in the try block. In this case, the raise operation simply jumps to the handler instead of using the global exception variable. The reason for this is that the frame pointer is already correct and need not be restored, and we know the correct value for the program counter: it is a normal assembly-time forward reference, one we have already used when we generated the code to arm the handler.

Third, note the special register, referred to above as SPECIALregister. This can be any register, so long as it is not touched by the code to restore the old handler. If there are multiple handlers following a try, the if statement can be replaced by a case-select construct or it can be replaced by an if-else-if cascade. The else clause catches all other exceptions, and if the else is missing, the compiler implicitly re-throws the exception that was not handled here.
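The selection among several handlers, including the implicit re-throw when there is no else, might be modeled as follows. This Python sketch is an illustration only: the function dispatch, the ordinals E, RANGE_ERROR and G, and the returned tuples are assumptions made for the example, and the dictionary lookup plays the role of the case-select construct.

```python
E, RANGE_ERROR, G = range(3)   # hypothetical ENUMexception ordinals

def dispatch(special_register, handlers, else_handler=None):
    """Model of the code at HANDLE, after the old handler has been restored.

    handlers maps an exception ordinal to the code for its handler. If no
    handler matches and there is no else, report that the exception must be
    re-raised to the (already restored) enclosing handler.
    """
    handler = handlers.get(special_register, else_handler)
    if handler is None:
        return ("reraise", special_register)   # implicit re-throw
    handler()
    return ("handled", special_register)

ran = []
handlers = {
    E: lambda: ran.append("handle e"),
    RANGE_ERROR: lambda: ran.append("handle RangeError"),
}

r1 = dispatch(E, handlers)                     # a named handler matches
r2 = dispatch(G, handlers)                     # no match and no else
r3 = dispatch(G, handlers, else_handler=lambda: ran.append("else"))
print(r1, r2, r3, ran)
```

In generated code the "reraise" outcome corresponds to the else branch shown earlier, which reloads the frame pointer and jumps through the restored VARexception.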

This does suggest that Kestrel is missing some features. There is no generic raise for use within an exception handler to re-throw the exception. In a handler for one of many exceptions, it would be nice to be able to re-throw whatever exception got us here the way the compiler does automatically in the case of a missing else.