//===----------------------------------------------------------------------===// // LLVM Exception Handling Changes //===----------------------------------------------------------------------===// 8/25/2004 - initial revision The LLVM invoke/unwind mechanism is totally adequate for representing EH in C and C++, but is lacking for languages like Java/C# and other common languages where random instructions can cause catchable exceptions to be thrown. In particular, consider the Java expression "X/Y". Because this operation throws an exception if Y is 0, the LLVM optimizer should not delete it. Also, if there is a try block in the method containing the divide, the exception must be caught by the local handler (and we have no way to represent this directly). Currently, to support languages like this (which are becoming MORE, not LESS common), the front-end has to emit explicit checking code: if (Y == 0) call ThrowNewDivideByZeroError(); // Turns into an invoke as necessary. X/Y; This bloats the code and is quite slow: it's not feasible to use hardware traps to implement these checks. Even with C and c++ exceptions, invoke is not wonderful in certain situations. For example, consider the following function: void foo() { X a, b, c, d, e, f; // Assume class X has a dtor. bar(); // bar can throw. } In this case, if bar throws, we have to run the dtors for f,e,d,c,b, and a in that order. However, if any of the dtors throws, we must call std::unexpected, which will probably terminate the program. This causes a dense sequence of single instruction BasicBlocks, that looks like this: invoke bar() to cont unwind %BarThrew BarThrew: invoke ~X(&F) to cont2 unwind %unexp cont2: invoke ~X(&E) to cont3 unwind %unexp cont3: invoke ~X(&D) to cont4 unwind %unexp cont4: invoke ~X(&C) to cont5 unwind %unexp cont5: invoke ~X(&B) to cont6 unwind %unexp cont6: invoke ~X(&A) to cont7 unwind %unexp cont7: unwind unexp: call std::unexpected() unwind This is clearly not good. Notice how all of the sequential blocks have the same exception destination. Finally, it is problematic in LLVM for there to be two call instructions: one that can throw and one that can't (invoke/call). Because there are two of them, and a lot of code wants to treat them generically, we have classes like CallSite that have grown to add a layer of abstraction. This layer slows the compiler down. Additionally "invoke" is a really poor name. //===----------------------------------------------------------------------===// // Suggested approach to eliminate the invoke instruction // I suggest that we extend LLVM to have follow the model that many Java and C# compilers do. Extend the BasicBlock class to allow for early exits from the middle of the basic block if an exception is thrown. This means that we can eliminate the invoke instruction (always using call), and have code like this: BarThrew: unwind to %unexp call ~X(&F) call ~X(&E) call ~X(&D) call ~X(&C) call ~X(&B) call ~X(&A) unwind unexp: call std::unexpected() unwind Note how the basic block itself has a single unwind destination that may be specified for it (if not specified, the unwind request is propagated outside of the function [aka, is not caught]). Note that this is an extremely nontrivial change to LLVM, which will have far-reaching effects on many aspects of the system. However, this is needed to be able to efficiently support languages like java. Given an unwind destination per-basic block, it's trivial to have catch blocks for divide instructions and other traping operations: Foo: unwind to %catchit %X = div int 1, 0 ret void %catchit: control comes here Notice also that the CFG simplify pass will not be able to merge basic blocks in some situations. In particular, if two different blocks have different handlers, they will have to be kept separate. In particular, the following two blocks could not be merged: A: unwind to %handler1 call foo() br label %B B: unwind to %handler2 call foo() br label %C This corresponds roughly to regions in different try blocks. Note that despite the fact that we can't merge all blocks like this, we can merge a LOT more than we can in the current system (where each potentially trapping operation is REQUIRED to become a new basic block. //===----------------------------------------------------------------------===// // Indicating what can trap and what cannot // The one thing that the above description has lost is the ability to mark what can and cannot trap. In particular, we need to be able to represent what operations have defined results (i.e., the cause an unwind) if they trap. The most straight-forward way to implement this is to add a bit to each instruction that can possibly cause traps (div/rem/load/store/call/etc...). If the bit is set to true and the exception traps at runtime, control is require to be reflected in a stack unwind operation. In practice, this means that the runtime needs to be able to find the handler in scope when a trap happens. Note that this bit is sufficient to represent many possible scenarios. In particular, a Java compiler would mark just about every load, store and other exception inducing operation as traping. If a load is marked potentially trapping, the optimizer is required to preserve it (even if its value is not used) unless it can prove that it dynamically cannot trap. In practice, this means that standard LLVM analyses would be used to prove that exceptions cannot happen, then clear the bit. As the bits are cleared, exception handlers can be deleted and dead loads (for example) can also be removed. This bit also exposes a lot of flexibility and control to the front-end. In particular, gcc supports the -fhonor-snans flag, which would enable the trap bit on (basically) all floating point operations. The presense of this flag would trigger generation of the fcomi instruction instead of fucomi (on X86) for example, and would inhibit optimizations that would break the semantics of the program. To recap, if the "trap" bit is set on an instruction, this means that any exception (either an explicit unwind or a hardware trap) will allow the runtime library to find the address of the unwind handler. If the bit is clear, this means that the behavior of the program is undefined if an exception is thrown, and that the runtime library probably won't be able to find the address of an unwind handler. This bit can be cleared if the source language itself does not define what happens in the case of a hardware trap (like C), or if static analysis can prove that no trap or exception can occur (e.g. the -prune-eh pass).