//===----------------------------------------------------------------------===//
//                       LLVM Exception Handling Changes
//===----------------------------------------------------------------------===//

8/25/2004 - initial revision

The LLVM invoke/unwind mechanism is totally adequate for representing EH in C
and C++, but is lacking for languages like Java/C# and other common languages
where random instructions can cause catchable exceptions to be thrown.  In
particular, consider the Java expression "X/Y".  Because this operation throws
an exception if Y is 0, the LLVM optimizer should not delete it.  Also, if
there is a try block in the method containing the divide, the exception must
be caught by the local handler (and we have no way to represent this directly).

Currently, to support languages like this (which are becoming MORE, not LESS
common), the front-end has to emit explicit checking code:

  if (Y == 0)
    call ThrowNewDivideByZeroError();  // Turns into an invoke as necessary.
  X/Y;

This bloats the code and is quite slow: it's not feasible to use hardware traps
to implement these checks.

Even with C and c++ exceptions, invoke is not wonderful in certain situations.
For example, consider the following function:

void foo() {
  X a, b, c, d, e, f;   // Assume class X has a dtor.
  bar();                // bar can throw.
}

In this case, if bar throws, we have to run the dtors for f,e,d,c,b, and a in
that order.  However, if any of the dtors throws, we must call std::unexpected,
which will probably terminate the program.  This causes a dense sequence of
single instruction BasicBlocks, that looks like this:

   invoke bar() to cont unwind %BarThrew

BarThrew:
   invoke ~X(&F) to cont2 unwind %unexp
cont2:
   invoke ~X(&E) to cont3 unwind %unexp
cont3:
   invoke ~X(&D) to cont4 unwind %unexp
cont4:
   invoke ~X(&C) to cont5 unwind %unexp
cont5:
   invoke ~X(&B) to cont6 unwind %unexp
cont6:
   invoke ~X(&A) to cont7 unwind %unexp
cont7:
   unwind
unexp:
   call std::unexpected()
   unwind


This is clearly not good.  Notice how all of the sequential blocks have the
same exception destination.

Finally, it is problematic in LLVM for there to be two call instructions: one
that can throw and one that can't (invoke/call).  Because there are two of them,
and a lot of code wants to treat them generically, we have classes like CallSite
that have grown to add a layer of abstraction.  This layer slows the compiler
down.  Additionally "invoke" is a really poor name.


//===----------------------------------------------------------------------===//
// Suggested approach to eliminate the invoke instruction
//

I suggest that we extend LLVM to have follow the model that many Java and C#
compilers do.  Extend the BasicBlock class to allow for early exits from the
middle of the basic block if an exception is thrown.  This means that we can
eliminate the invoke instruction (always using call), and have code like this:

BarThrew: unwind to %unexp
   call ~X(&F)
   call ~X(&E)
   call ~X(&D)
   call ~X(&C)
   call ~X(&B)
   call ~X(&A)
   unwind
unexp:
   call std::unexpected()
   unwind

Note how the basic block itself has a single unwind destination that may be
specified for it (if not specified, the unwind request is propagated outside
of the function [aka, is not caught]).

Note that this is an extremely nontrivial change to LLVM, which will have
far-reaching effects on many aspects of the system.  However, this is needed
to be able to efficiently support languages like java.  Given an unwind
destination per-basic block, it's trivial to have catch blocks for divide 
instructions and other traping operations:

Foo: unwind to %catchit
  %X = div int 1, 0
  ret void
%catchit:
  control comes here

Notice also that the CFG simplify pass will not be able to merge basic
blocks in some situations.  In particular, if two different blocks have
different handlers, they will have to be kept separate.  In particular,
the following two blocks could not be merged:

A: unwind to %handler1
   call foo()
   br label %B

B: unwind to %handler2
   call foo()
   br label %C

This corresponds roughly to regions in different try blocks.  Note that
despite the fact that we can't merge all blocks like this, we can merge
a LOT more than we can in the current system (where each potentially
trapping operation is REQUIRED to become a new basic block.


//===----------------------------------------------------------------------===//
// Indicating what can trap and what cannot
//

The one thing that the above description has lost is the ability to mark what
can and cannot trap.  In particular, we need to be able to represent what
operations have defined results (i.e., the cause an unwind) if they trap.  The
most straight-forward way to implement this is to add a bit to each instruction
that can possibly cause traps (div/rem/load/store/call/etc...).  If the bit is
set to true and the exception traps at runtime, control is require to be
reflected in a stack unwind operation.  In practice, this means that the
runtime needs to be able to find the handler in scope when a trap happens.

Note that this bit is sufficient to represent many possible scenarios.  In
particular, a Java compiler would mark just about every load, store and other
exception inducing operation as traping.  If a load is marked potentially
trapping, the optimizer is required to preserve it (even if its value is not
used) unless it can prove that it dynamically cannot trap.  In practice, this
means that standard LLVM analyses would be used to prove that exceptions
cannot happen, then clear the bit.  As the bits are cleared, exception handlers
can be deleted and dead loads (for example) can also be removed.

This bit also exposes a lot of flexibility and control to the front-end.  In
particular, gcc supports the -fhonor-snans flag, which would enable the trap
bit on (basically) all floating point operations.  The presense of this flag
would trigger generation of the fcomi instruction instead of fucomi (on X86)
for example, and would inhibit optimizations that would break the semantics of
the program.

To recap, if the "trap" bit is set on an instruction, this means that any
exception (either an explicit unwind or a hardware trap) will allow the
runtime library to find the address of the unwind handler.  If the bit is
clear, this means that the behavior of the program is undefined if an
exception is thrown, and that the runtime library probably won't be able to
find the address of an unwind handler.  This bit can be cleared if the source
language itself does not define what happens in the case of a hardware trap
(like C), or if static analysis can prove that no trap or exception can occur
(e.g. the -prune-eh pass).