//===----------------------------------------------------------------------===//
// Representing and optimizing arithmetic based on integer overflow
//===----------------------------------------------------------------------===//

Feb 7, 2009 - Initial Revision

This note talks about a way we could model, reason about and optimize based on
integer overflow information.

LLVM 2.5 supports several sorts of overflow handling:

1) all integer arithmetic operations are guaranteed to be well defined in
   overflow situations, behaving according to 2's complement integer semantics.
2) The getelementptr instruction is always assumed to be undefined on overflow.
3) LLVM has intrinsics for doing an integer computation and determining whether
   or not it overflowed.  These intrinsics return a result and a bool
   indicating whether overflow happened.

This set of capability is an important first step, but it misses several
opportunities and has at least one problem.  The correctness problem that we
theoretically suffer but does not seem to impact code in practice is that we
"raise" arithmetic (which has defined overflow semantics) to getelementptr
instructions in some cases (which is undefined on overflow).  This could
manifest in a miscompilation in contrived cases.

Opportunities we are missing:

1) In C, signed integers are undefined on overflow.  Capturing this property
   is very important for loop optimizers in some cases (e.g. PR3328).  For 
   example, if overflow is defined, then you can't analyze the trip count of 
   a loop like "for (i = 0; i <= N; ++i)" because if N is maxint, the loop 
   iterates infinitely.  If overflow is undefined, it is easy to analyze this 
   trip count.  Once we support this, we would also want to support the GCC 
   -fwrapv option.
2) We currently don't have a good way to model saturation.  Saturated
   arithmetic is somewhat rare for scalars, but is very common in vector ISAs.
   Adding this support to the IR would allow elimination of a bunch of
   target-specific intrinsics (they could lower to normal arithmetic instead).


Note that the equivalent of -ftrapv is currently implementable with our
existing "do arithmetic and tell me if it overflowed" support.  I strongly
believe that this is lowering problem that the front-end should take care of.

//===----------------------------------------------------------------------===//
// Implementation Approach
//===----------------------------------------------------------------------===//

In my opinion, the only reasonable way to implement this is as a per-operation
tag that specifies the overflow behavior of operations (add/sub/mul/div/gep).
As such, I'd suggest the following changes:

1. Split add/sub/mul into add/fadd, sub/fsub, mul/fmul operations.  Doing this
   will also improve consistency with the div and rem instructions.
2. Introduce a new "overflow behavior" enum, and stick it in the integer
   operation's "SubClassData" field.  The overflow behavior should have these
   possibilities:
     Wrapping          // Op is fully defined to use 2's complement overflow.
     UndefinedSigned   // Op is undefined on signed overflow
     UndefinedUnsigned // Op is undefined on unsigned overflow
     SaturatedSigned   // Op saturates on signed overflow
     SaturatedUnsigned // Op saturates on unsigned overflow
   
   GEP should not support saturation, but the other operations should.
3. Introduce support to the CBE and legalizer to lower these into horrible code
   sequences that are valid but slow.  Over time, this can be improved.
4. Update optimizations to only apply when safe: for example, optimizations
   that assume GEP overflow is undefined should be updated to check the flag.
   Update other optimizations to propagate overflow behavior, the default
   (wrapping) is always safe to preserve for the current optimizers.
5. Change front-ends to start tagging operations more aggressively (e.g. signed
   integers as UndefinedSigned when not in -fwrapv mode.
6. Implement lots of new optimizations (e.g. x*8/8 is safe when the mul and div
   are undefined on overflow)!  Also infer cases where overflow can't happen
   and convert defined-on-overflow operations to undefined-on-overflow when it
   is provable it can't overflow (e.g. for (i = 0; i < N; ++i) the ++ can't
   overflow).

I'd strongly suggest tackling just the wrapping/unsigned behaviors in the first
pass and adding saturation as a later pass.  Many optimizations in (e.g.
instcombine) are not safe for saturation, and getting them exactly right will
require some hacking.  An initial cut would just have InstCombine's "visitAdd"
method just refuse to touch saturated adds though.