G++ internals - Exception Handling


Exception Handling

Note: exception handling in g++ is still under development.

This section describes how C++ exceptions in the C++ front-end are mapped into the back-end exception handling framework.

The basic mechanism of exception handling in the back-end is unwind-protect, a la elisp. This is a general, robust, and language-independent representation for exceptions.

The C++ front-end maps C++ exceptions into the unwind-protect semantics. The mapping is described below.

When -frtti is used, rtti is used to do exception object type checking; when it isn't used, the encoded name for the type of the object being thrown is used instead. All code that originates exceptions (even code that throws exceptions as a side effect, like dynamic casting) and all code that catches exceptions must be compiled consistently: either all with -frtti, or all with -fno-rtti. It is not possible to mix rtti-based exception handling objects with code that doesn't use rtti. The exceptions to this rule are code that doesn't catch or throw exceptions, catch (...), and code that just rethrows an exception.

Currently we use the normal mangling used in building function names (int is "i", const char * is "PCc") to build the non-rtti-based type descriptors for exception handling. These descriptors are just plain NUL-terminated strings, and internally they are passed around as char *.
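As a rough illustration of what matching on such descriptors amounts to, the sketch below compares two encoded names for exact equality. The names type_descriptor and descriptor_matches are hypothetical and are not taken from the g++ sources; only the idea of comparing plain char * strings comes from the text above.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a non-rtti type descriptor: a plain
// NUL-terminated string holding the mangled type name.
typedef const char *type_descriptor;

// Hypothetical matcher: the encoded names must be identical, so no
// base-class or other conversions are considered.
bool descriptor_matches(type_descriptor thrown, type_descriptor wanted)
{
    assert(thrown != nullptr && wanted != nullptr);
    return std::strcmp(thrown, wanted) == 0;
}
```

With this model, a thrown int ("i") matches a handler for int, but not a handler for const char * ("PCc").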

In C++, all cleanups should be protected by exception regions. The region starts just after the reason why the cleanup is created has ended. For example, with an automatic variable that has a constructor, it would be right after the constructor is run. The region ends just before the finalization is expanded. The backend may expand the cleanup multiple times along different paths: once for normal end of the region, once for non-local gotos, once for returns, and so on. It must therefore take special care to protect the finalization expansion if the expansion is for any reason other than normal region end and it is `inline' (it is inside the exception region). The backend can either move such expansions out of line, or it can create an exception region over the finalization to protect it. The handler associated with that region would not run the finalization as it otherwise would have, but would rather just rethrow to the outer handler, careful to skip the normal handler for the original region.
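The user-visible effect of such a region can be seen with an ordinary automatic variable. The following is only a sketch in standard C++; the names Guard, may_throw, run and demo are made up for illustration, and the comments mark where the region would conceptually begin and end.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Records the order of events so the effect of the cleanup is visible.
static std::string trace;

struct Guard {
    Guard()  { trace += "ctor "; }
    ~Guard() { trace += "dtor "; }   // the cleanup (finalization)
};

static void may_throw(bool do_throw)
{
    Guard g;   // constructor runs here
    // Conceptually, the exception region protecting g's cleanup starts
    // right after the constructor has finished.
    trace += "body ";
    if (do_throw)
        throw std::runtime_error("hit");
    // The region ends just before the destructor call that is expanded
    // for the normal end of the scope.
}

std::string run(bool do_throw)
{
    trace.clear();
    try {
        may_throw(do_throw);
    } catch (const std::runtime_error &) {
        trace += "caught ";
    }
    return trace;
}

// Both paths run the cleanup exactly once.
void demo()
{
    assert(run(false) == "ctor body dtor ");
    assert(run(true) == "ctor body dtor caught ");
}
```

On the throwing path the destructor runs during unwinding, before the handler body is entered.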

In Ada, they will use the more runtime intensive approach of having fewer regions, but at the cost of additional work at run time, to keep a list of things that need cleanups. When a variable has finished construction, they add the cleanup to the list; when they come to the end of the lifetime of the variable, they run the list down. If they take a hit before the section finishes normally, they examine the list for actions to perform. I hope they add this logic into the back-end, as it would be nice to get that alternative approach in C++.
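A minimal sketch of that list-based alternative, written in C++ for consistency with the rest of this section. The class CleanupList and the function simulate are hypothetical, and running the list in reverse order of registration is an assumption, not something stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical runtime cleanup list: one list instead of many regions.
class CleanupList {
public:
    // Called when a variable has finished construction.
    void add(std::function<void()> cleanup)
    {
        pending_.push_back(std::move(cleanup));
    }

    // Called at the normal end of the lifetime, or when an exception
    // hits: run the pending cleanups, most recently added first
    // (assumed order).
    void run_down()
    {
        while (!pending_.empty()) {
            std::function<void()> cleanup = std::move(pending_.back());
            pending_.pop_back();
            cleanup();
        }
        assert(pending_.empty());
    }

    std::size_t size() const { return pending_.size(); }

private:
    std::vector<std::function<void()>> pending_;
};

// Simulates a section that constructs a, then b. If a hit occurs
// before b is constructed, only a's cleanup is on the list.
std::string simulate(bool take_hit)
{
    std::string log;
    CleanupList list;
    list.add([&log] { log += "~a "; });
    if (!take_hit)
        list.add([&log] { log += "~b "; });
    list.run_down();
    return log;
}
```

The trade-off is the one described above: no per-variable region is needed, but every construction pays for a list update at run time.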

On an rs6000, xlC stores exception objects on the stack, under the try block. When it unwinds down into a handler, the frame pointer is adjusted back to the normal value for the frame in which the handler resides, and the stack pointer is left unchanged from the time at which the object was thrown. This is so that there is always someplace for the exception object, and nothing can overwrite it once we start throwing. The only bad part is that the stack remains large.

The following points out some flaws in g++'s exception handling as it now stands.

Only exact type matching or reference matching of throw types works when -fno-rtti is used.
Exception handling only works on SPARC (like Suns), i386, arm and rs6000 machines. Partial support is also in for alpha, hppa, m68k and mips machines, but a stack unwinder called __unwind_function has to be written and added to libgcc2 for them. See below for details on __unwind_function.
All completely constructed temps and local variables are cleaned up in all unwound scopes.
Completed parts of partially constructed objects are cleaned up, with the exception that partially built arrays are not cleaned up as required.
Don't expect exception handling to work right if you optimize; in fact, the compiler will probably core dump.
If two EH regions are the exact same size, the backend cannot tell which one is first. It punts by picking the last one if they tie. This is usually right. We really should stick in a nop if they are the same size.

When we invoke the copy constructor for an exception object because it is passed by value, and we take a hit (exception) inside the copy constructor someplace, where do we go? I have tentatively chosen not to catch throws by the outer block at the same unwind level, if one exists, but rather to allow the frame to unwind into the next series of handlers, if any. If this is the wrong way to do it, we will need to protect the rest of the handler in some fashion. Maybe just changing the handler's handler to protect the whole series of handlers is the right way to go. This part is wrong. We should call terminate if an exception is thrown while doing things like trying to copy the exception object.

Exception specifications are handled syntactically, but not semantically. build_exception_variant should sort the incoming list, so that it implements set comparison, not exact list equality. Type smashing should smash exception specifications using set union.
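The difference between list equality and set comparison can be sketched as follows. This is not the code of build_exception_variant; it models an exception specification as a list of type names, and ExceptionSpec, canonicalize, same_spec and merge_specs are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical model: an exception specification as a list of type names.
typedef std::vector<std::string> ExceptionSpec;

// Sort and drop duplicates, so that list equality on the result
// behaves like set equality on the input.
ExceptionSpec canonicalize(ExceptionSpec spec)
{
    std::sort(spec.begin(), spec.end());
    spec.erase(std::unique(spec.begin(), spec.end()), spec.end());
    return spec;
}

// throw (A, B) and throw (B, A) compare equal under this test.
bool same_spec(const ExceptionSpec &a, const ExceptionSpec &b)
{
    return canonicalize(a) == canonicalize(b);
}

// Set union of two specifications, as suggested for type smashing.
ExceptionSpec merge_specs(const ExceptionSpec &a, const ExceptionSpec &b)
{
    ExceptionSpec sa = canonicalize(a), sb = canonicalize(b), out;
    std::set_union(sa.begin(), sa.end(), sb.begin(), sb.end(),
                   std::back_inserter(out));
    assert(std::is_sorted(out.begin(), out.end()));
    return out;
}
```

Sorting once when the list is built keeps every later comparison a plain element-by-element check.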

Thrown objects are allocated on the heap, in the usual way, but they are never deleted. They should be deleted by the catch clauses. If one runs out of heap space, throwing an object will probably never work. This could be relaxed some by passing an __in_chrg parameter to track who has control over the exception object.

When the backend returns a value, it can create new exception regions that need protecting. The new region should rethrow the object in the context of the last associated cleanup that ran to completion.

The __unwind_function takes a pointer to the throw handler, and is expected to pop the stack frame that was built to call it, as well as the frame underneath, and then jump to the throw handler. It must not change the three registers allocated for the pointer to the exception object, the pointer to the type descriptor that identifies the type of the exception object, and the pointer to the code that threw. On hppa, these are %r5, %r6, %r7. On m68k, these are a2, a3, a4. On mips, they are s0, s1, s2. On Alpha, these are $9, $10, $11. It takes about a day to write this routine. If someone wants to volunteer to write this routine for any architecture, exception support for that architecture will be added to g++. Please send in those code donations.

The backend must be extended to fully support exceptions. Right now there are a few hooks from that backend into the alpha exception handling backend that resides in the C++ frontend, and these hooks allow exception handling to work in g++. An exception region is a segment of generated code that has a handler associated with it. The exception regions are represented in the generated code as address ranges, each given by a starting PC value and an ending PC value. Some of the limitations with this scheme are:

The backend replicates insns for such things as loop unrolling and function inlining. Right now, there are no hooks into the frontend's exception handling backend to handle the replication of insns. When replication happens, a new exception region descriptor needs to be generated for the new region.
The backend expects to be able to rearrange code, for things like jump optimization. Any rearranging of the code needs to have exception region descriptors updated appropriately.
The backend can eliminate dead code. Any associated exception region descriptor that refers to fully contained code that has been eliminated should also be removed, although not doing this is harmless in terms of semantics.
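To make the address-range scheme concrete, here is a sketch of how a handler might be looked up from a throwing PC. The Region layout and find_handler are hypothetical and not the actual g++ data structures; treating the range as half-open is an assumption. The size comparison mirrors the tie rule mentioned in the list of flaws earlier in this section (the last region wins when two are the same size).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical descriptor for one exception region: a PC range
// [start, end) (half-open by assumption) plus its handler address.
struct Region {
    std::uintptr_t start;
    std::uintptr_t end;
    std::uintptr_t handler;
};

// Find the handler for a throwing PC. Nested regions all contain the
// PC, so the smallest one is taken to be the innermost. On a size tie
// the later table entry wins. Returns 0 when no region covers the PC.
std::uintptr_t find_handler(const std::vector<Region> &table,
                            std::uintptr_t pc)
{
    const Region *best = nullptr;
    for (const Region &r : table) {
        assert(r.start <= r.end);
        if (pc < r.start || pc >= r.end)
            continue;
        if (best == nullptr ||
            (r.end - r.start) <= (best->end - best->start))
            best = &r;
    }
    return best ? best->handler : 0;
}
```

A table like this is only valid while the code layout stays fixed, which is why replication, rearrangement and dead code elimination each require the descriptors to be regenerated or updated.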

The above is not meant to be exhaustive, but does include all things I have thought of so far. I am sure other limitations exist.
