//===----------------------------------------------------------------------===// // Representing and optimizing arithmetic based on integer overflow //===----------------------------------------------------------------------===// Feb 7, 2009 - Initial Revision This note talks about a way we could model, reason about and optimize based on integer overflow information. LLVM 2.5 supports several sorts of overflow handling: 1) all integer arithmetic operations are guaranteed to be well defined in overflow situations, behaving according to 2's complement integer semantics. 2) The getelementptr instruction is always assumed to be undefined on overflow. 3) LLVM has intrinsics for doing an integer computation and determining whether or not it overflowed. These intrinsics return a result and a bool indicating whether overflow happened. This set of capability is an important first step, but it misses several opportunities and has at least one problem. The correctness problem that we theoretically suffer but does not seem to impact code in practice is that we "raise" arithmetic (which has defined overflow semantics) to getelementptr instructions in some cases (which is undefined on overflow). This could manifest in a miscompilation in contrived cases. Opportunities we are missing: 1) In C, signed integers are undefined on overflow. Capturing this property is very important for loop optimizers in some cases (e.g. PR3328). For example, if overflow is defined, then you can't analyze the trip count of a loop like "for (i = 0; i <= N; ++i)" because if N is maxint, the loop iterates infinitely. If overflow is undefined, it is easy to analyze this trip count. Once we support this, we would also want to support the GCC -fwrapv option. 2) We currently don't have a good way to model saturation. Saturated arithmetic is somewhat rare for scalars, but is very common in vector ISAs. Adding this support to the IR would allow elimination of a bunch of target-specific intrinsics (they could lower to normal arithmetic instead). Note that the equivalent of -ftrapv is currently implementable with our existing "do arithmetic and tell me if it overflowed" support. I strongly believe that this is lowering problem that the front-end should take care of. //===----------------------------------------------------------------------===// // Implementation Approach //===----------------------------------------------------------------------===// In my opinion, the only reasonable way to implement this is as a per-operation tag that specifies the overflow behavior of operations (add/sub/mul/div/gep). As such, I'd suggest the following changes: 1. Split add/sub/mul into add/fadd, sub/fsub, mul/fmul operations. Doing this will also improve consistency with the div and rem instructions. 2. Introduce a new "overflow behavior" enum, and stick it in the integer operation's "SubClassData" field. The overflow behavior should have these possibilities: Wrapping // Op is fully defined to use 2's complement overflow. UndefinedSigned // Op is undefined on signed overflow UndefinedUnsigned // Op is undefined on unsigned overflow SaturatedSigned // Op saturates on signed overflow SaturatedUnsigned // Op saturates on unsigned overflow GEP should not support saturation, but the other operations should. 3. Introduce support to the CBE and legalizer to lower these into horrible code sequences that are valid but slow. Over time, this can be improved. 4. Update optimizations to only apply when safe: for example, optimizations that assume GEP overflow is undefined should be updated to check the flag. Update other optimizations to propagate overflow behavior, the default (wrapping) is always safe to preserve for the current optimizers. 5. Change front-ends to start tagging operations more aggressively (e.g. signed integers as UndefinedSigned when not in -fwrapv mode. 6. Implement lots of new optimizations (e.g. x*8/8 is safe when the mul and div are undefined on overflow)! Also infer cases where overflow can't happen and convert defined-on-overflow operations to undefined-on-overflow when it is provable it can't overflow (e.g. for (i = 0; i < N; ++i) the ++ can't overflow). I'd strongly suggest tackling just the wrapping/unsigned behaviors in the first pass and adding saturation as a later pass. Many optimizations in (e.g. instcombine) are not safe for saturation, and getting them exactly right will require some hacking. An initial cut would just have InstCombine's "visitAdd" method just refuse to touch saturated adds though.