Lecture 37, Exceptions

Part of the notes for CS:4980:1
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Exceptions in Kestrel

Consider this fragment of Kestrel code:

        
e: exception;
f: procedure
    if (conditiona) f; end;
    if (conditionb) raise e; end;
end

catch e in
    f;
    if (conditionc) raise e; end;
case e
    f;
end

How does this work? First, note that Kestrel exceptions must be declared, and the declaration must be visible from both the point where the exception is raised and the point where it is handled. The declaration itself creates a default handler for that exception, one that terminates the program ungracefully.

A compiler designed for student use should have a nice default handler, perhaps one that outputs something like "unhandled e exception", but the formal semantics of the language does not specify a particular behavior for unhandled exceptions. They are considered to be errors, and the formal semantics only specifies the behavior of correct programs.

All exception handling models share the following basic idea. At the start of the exception handler block, the previous handler for the exception in question is saved (conceptually pushed on a stack), and the new handler is installed. If an exception is raised, the previous handler is restored as control is transferred to the case for that exception.
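The save, install and restore idea can be sketched in Python. This is only an illustration, not Kestrel's implementation: the names declare, raise_exception, catch and run are hypothetical, the conditions from the fragment above become parameters, and a private Python exception (Transfer) stands in for the jump to the handler.

```python
class Transfer(Exception):
    """Stand-in for a jump to a handler; carries the handler that was chosen."""
    def __init__(self, target):
        super().__init__()
        self.target = target

class UnhandledException(Exception):
    """Raised by the default handler (instead of killing the program)."""

handler_stack = {}   # exception name -> stack of installed handlers

def declare(name):
    handler_stack[name] = []        # empty stack: only the default handler applies

def raise_exception(name):
    stack = handler_stack[name]
    if not stack:
        raise UnhandledException(f"unhandled {name} exception")
    raise Transfer(stack.pop())     # restore the previous handler, then transfer

def catch(name, body, case):
    me = object()                   # identity of this handler installation
    handler_stack[name].append(me)  # the old handler stays saved underneath
    try:
        body()
    except Transfer as t:
        if t.target is not me:
            handler_stack[name].pop()   # leaving this block: uninstall first
            raise                       # aimed at another handler, keep going
        case()                      # the previous handler is already back in force
        return
    handler_stack[name].pop()       # normal exit: uninstall the new handler

def run(condition_b, condition_c):
    """Rough analogue of the Kestrel fragment, recording what executes."""
    trace = []
    declare("e")
    def f():
        trace.append("f")
        if condition_b:
            raise_exception("e")
    def body():
        f()
        if condition_c:
            raise_exception("e")
        trace.append("body done")
    def case():
        trace.append("case e")
    catch("e", body, case)
    return trace
```

Calling run(False, False) runs the body to completion, while run(True, False) and run(False, True) both end up in the case for e, whether the raise happened inside f or directly in the body.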

Exception Models

What is an exception? There are two or perhaps three models that are common in different programming languages:

Each exception declaration defines a new object of type exception. To support this, when we compile a try-catch block, we must know which exceptions we are installing handlers for. Kestrel follows this model, which is why Kestrel's exception handling block requires the exception that will be caught to be declared up front.
There is just one global exception object. Exception identifiers are actually members of an implicit global enumerated type that is extended with new members each time a new exception is declared. The handlers at the end of any try-catch block are, in effect, a case-select construct that selects a handler based on the value selected from the enumeration. The Ada programming language used this model.
There is just one global exception object. When you throw an exception, you pass an arbitrary object to the handler, or an object that is an arbitrary subclass of an exception class. The handler must be able to examine this object in order to determine the class (or the subclass). Java's Throwable and Exception classes fit this model. Many programmers use this model as if there were a constraint that the object thrown must be a character string, the name of the exception.
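To make the difference between the second and third models concrete, here is a small Python sketch. Python itself follows the third model; the second model is imitated by carrying an enumeration value inside a single exception class. All of the names here (ExceptionName, Raised, AppError, ParseError, thrower and the two handle functions) are invented for the illustration.

```python
from enum import Enum

# Second model (imitated): exception names are members of one enumeration,
# and the handlers form a case-select on the value that was raised.
class ExceptionName(Enum):
    RANGE = 1
    OVERFLOW = 2

class Raised(Exception):
    def __init__(self, which):
        super().__init__(which.name)
        self.which = which

def handle_model2(action):
    try:
        action()
    except Raised as r:
        if r.which is ExceptionName.RANGE:
            return "range handler"
        elif r.which is ExceptionName.OVERFLOW:
            return "overflow handler"
        raise                      # no case for this value: pass it on
    return "no exception"

# Third model: an object is thrown, and the handler is selected by
# examining its class (or subclass).
class AppError(Exception):
    pass

class ParseError(AppError):
    pass

def handle_model3(action):
    try:
        action()
    except ParseError:
        return "parse handler"
    except AppError:
        return "general handler"
    return "no exception"

def thrower(exc):
    """Build a parameterless action that raises the given exception object."""
    def action():
        raise exc
    return action
```

For example, handle_model3(thrower(ParseError())) selects the parse handler, while handle_model3(thrower(AppError())) falls through to the general one.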

Exception Objects

What is an exception object? At the very minimum, an exception object must record the address of the code for the handler. This allows control to transfer to the exception handler when the exception is raised.

The handler runs in the context of some block, and regardless of whether we are talking about Kestrel or Java, blocks can be allocated dynamically, with, at the very least, a frame pointer allowing reference to variables in that block. Therefore, when control is transferred to a handler, the frame pointer must be restored to the correct value for that block, implying that the exception object must record the frame pointer for the handler's block along with the address of the handler.

It may also be necessary to include the stack pointer as part of the exception object, but in Kestrel and many other languages, the displacement from the stack pointer to the frame pointer is a constant that can be statically computed at every point in the code. Therefore, if the frame pointer is restored, we can immediately compute the correct stack pointer, or vice versa.
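A worked example may help. The sketch below models an exception object as a two-field record and recovers the stack pointer from the frame pointer. The addresses, the displacement table and the assumption that the stack grows toward higher addresses (so the stack pointer is the frame pointer plus the displacement) are all made up for the illustration.

```python
from dataclasses import dataclass

@dataclass
class ExceptionRecord:
    handler: int   # address of the handler's code
    frame: int     # frame pointer of the block containing the handler

# Hypothetical table a compiler could build: for each handler address, the
# statically known distance from the frame pointer to the stack pointer.
SP_DISPLACEMENT = {0x0400: 12, 0x0480: 20}

def restore_registers(rec):
    """Return the (pc, fp, sp) values in force once the handler gets control."""
    fp = rec.frame
    sp = fp + SP_DISPLACEMENT[rec.handler]   # assumes an upward-growing stack
    pc = rec.handler
    return pc, fp, sp
```

With a handler at 0x0400 and a frame at 0x8000, restore_registers gives a stack pointer of 0x800C without the record ever storing it.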

Implementing Exceptions in Kestrel

We can imagine the standard prologue for a Kestrel program as including something equivalent to the following declarations, all of which are invisible to the programmer:

TYPEexception: type record
    handler: var MACHINEword;
    frame: var MACHINEword;
end;
range: var TYPEexception;

That is, we have an implicit type, TYPEexception, the type of the variable created for every exception. In addition, the globally predefined exception, range, is implemented as a global variable holding one value of this implicitly defined type. Each additional exception declaration is implemented as another TYPEexception variable.

The type TYPEexception is not available to the Kestrel programmer; it is an internal type used only in the implementation of exceptions. The details of this type are known only to the code generator.

Raising an Exception

The above provides us with enough information to write code for raise e. This will first push the address of the exception variable e onto the stack, and then transfer control to the handler referenced by that variable. We will do the control transfer with a new instruction in the abstract high-level architecture that serves as the interface to the code generator.

RAISE: Pop the address of the exception e from the stack, where e is a TYPEexception variable, and then restore the frame pointer, stack pointer and program counter from this variable.
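The effect of RAISE can be sketched on a toy abstract machine in Python. The Machine class, the memory layout (handler word first, frame word second) and the helper names are assumptions made for this sketch. The stack pointer is left out on the assumption, discussed earlier, that it can be recomputed from the frame pointer.

```python
from dataclasses import dataclass, field

HANDLER_OFFSET = 0   # assumed layout of a TYPEexception variable:
FRAME_OFFSET = 1     # handler word first, frame word second

@dataclass
class Machine:
    memory: dict = field(default_factory=dict)   # address -> word
    stack: list = field(default_factory=list)    # evaluation stack
    pc: int = 0                                  # program counter
    fp: int = 0                                  # frame pointer

def push_address(m, addr):
    """What the compiler emits before RAISE: push the address of e."""
    m.stack.append(addr)

def raise_instruction(m):
    """RAISE: pop the address of e, then reload fp and pc from that variable."""
    e = m.stack.pop()
    m.fp = m.memory[e + FRAME_OFFSET]
    m.pc = m.memory[e + HANDLER_OFFSET]   # execution continues in the handler
```

If the exception variable lives at address 100 with handler 0x0400 and frame 0x8000, then push_address(m, 100) followed by raise_instruction(m) leaves the machine with pc equal to 0x0400 and fp equal to 0x8000.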

Handlers

Now, look at how we compile the exception handler statement that was given in the original example.


--  catch e in
SAVEexception: var TYPEexception;
SAVEexception = e; -- save the old handler
e.handler = HANDLE;
e.frame = fp;      -- new handler is installed
    f;
    if (conditionc) raise e; end;
--  case e
e = SAVEexception; -- restore the old handler's identity
BR DONE
HANDLE:
e = SAVEexception; -- restore the old handler's identity
    f;
--  end
DONE:

Note three things here: First, the old value of the exception object e is saved in a local variable SAVEexception. The programmer never explicitly mentions either of these variables, but actual RAM must be allocated for both the global variable and the save location, and the save location must be local to the context where the handler is installed. Of course, this local variable may be deallocated at the end of the block.

Second, on exit from the first block of the exception handler statement, the previous handler is re-installed. The new handler was only in place briefly during this block.

Finally, on entry to the case that handles this exception, the old handler is also re-installed. In the event that this exception handler handles multiple exceptions, all of them are restored to their original state at exit from the first block and at the entrance to any of the cases.
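The three points above can be checked with a small Python simulation of the compiled code. It is a sketch under stated assumptions: ExceptionVar plays the role of a TYPEexception variable, a private Python exception (Jump) stands in for the RAISE control transfer, and the local variable save plays the role of SAVEexception. Unlike the code above, the sketch also restores the old handler when a jump aimed at some other handler passes through the statement, a case the notes do not discuss.

```python
class Jump(Exception):
    """Stand-in for RAISE; carries the label of the target handler."""
    def __init__(self, label):
        super().__init__()
        self.label = label

class Unhandled(Exception):
    """Raised in place of the default handler."""

class ExceptionVar:
    """Plays the role of one TYPEexception variable."""
    def __init__(self, name):
        self.name = name
        self.handler = None   # None stands for the default handler
        self.frame = None

def raise_(e):
    if e.handler is None:
        raise Unhandled(f"unhandled {e.name} exception")
    raise Jump(e.handler)

def catch(e, body, case, frame="fp"):
    save = (e.handler, e.frame)          # SAVEexception = e
    label = object()                     # the HANDLE label
    e.handler, e.frame = label, frame    # the new handler is installed
    try:
        body()
    except Jump as j:
        e.handler, e.frame = save        # old handler restored on entry to case
        if j.label is not label:
            raise                        # aimed elsewhere (sketch-only path)
        case()
        return
    e.handler, e.frame = save            # old handler restored on normal exit

def demo():
    trace = []
    e = ExceptionVar("e")
    def inner_body():
        trace.append("inner body")
        raise_(e)                        # goes to the inner case
    def inner_case():
        trace.append("inner case")
        raise_(e)                        # outer handler is back in force here
    def outer_body():
        catch(e, inner_body, inner_case)
        trace.append("not reached")
    def outer_case():
        trace.append("outer case")
    catch(e, outer_body, outer_case)
    trace.append("default restored" if e.handler is None else "handler leaked")
    return trace
```

Running demo() visits the inner body, the inner case and then the outer case, and finishes with the default handler back in place. The raise inside the inner case reaches the outer handler precisely because the old handler was re-installed on entry to that case.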

Note that if an exception is raised directly in the first block of the exception handling statement, it can be compiled directly to a branch to the appropriate handler case. This optimization is an easy one and it can halve the overhead of exception handling.

If the only raise statement for an exception is local to the handler statement, and if that raise statement is replaced by a branch, there is no need to save and restore the exception variable. This optimization is more difficult because it involves global knowledge. There is an easy special case, though, when the exception is declared locally in a block that contains no procedure or function declarations, and where the only place that exception is raised is in a handler statement in that block. Using this optimization makes local exception handling as inexpensive as an if statement. Because of this, Kestrel does not have a break statement. If you need a multi-exit loop, just use a local exception.
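The control structure of a multi-exit loop built from a local exception can be sketched in Python. Python does not perform the optimization described above (its raise is a genuine exception), so this only illustrates the shape of the code; the function search and the exception Found are names invented for the example.

```python
def search(values, target):
    class Found(Exception):      # a local exception, visible only in this block
        pass

    i = 0
    try:
        while i < len(values):
            if values[i] == target:
                raise Found      # the extra exit from the loop
            i += 1
        return f"{target} not found"          # normal loop exit
    except Found:
        return f"{target} found at index {i}" # exit taken through the raise
```

Here search([3, 5, 7], 5) leaves the loop through the raise, and search([3, 5, 7], 4) leaves it through the loop condition. Because the exception is local and raised only inside the handler statement, a compiler using the optimization above could turn the raise into a plain branch.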

The Kestrel exception mechanism is missing some features that exception handling in Java and C++ supports. The biggest of these is that there is no way to catch an unknown exception; you must know the name of the exception you want to catch. The lack of a general catch mechanism is a price we pay for the potential efficiency of Kestrel exceptions. Raising an exception in Java or C++ is typically slower, because the run-time system must search for a matching handler as it unwinds the stack.