//===----------------------------------------------------------------------===// // Loop Optimizer Notes //===----------------------------------------------------------------------===// 8/18/2004 - Initial revision 10/25/2004 - Added example Loop optimizations are a very important class of program transformation for many programs, and they are also very diverse in many ways. To support them effectively, we need first-class support for them in the PassManager, and we need to provide the right infrastructure for making them trivial to implement. In particular, most loop optimizations need alias analysis support, and almost all (but not all) operate on the loop nest from inner loops to outer loops (bottom-up on the loop hierarchy tree). This argues for the following support: //===----------------------------------------------------------------------===// // The LoopPass Class The LoopPass class is the base class used by all passes that conform to the requirements imposed here. Deriving from LoopPass makes a transformation a "loop optimization" in the LLVM system. The LoopPass class requires the runOnLoop method to be implemented, and can optional implement the runOnFunctionBody method, with the following signatures. bool runOnLoop(Loop &L, LoopPassManager &LPM); bool runOnFunctionBody(Function &F, LoopPassManager &LPM); The runOnLoop method is defined to return true if a change to the code was made, and the runOnLoop method is allowed to loop at the specified loop and any (inner) loops of it (but not at code outside of the loop). This means that a loop interchange pass, for example, would take action when run on the outer loop, not when run on the inner loop. The runOnFunctionBody is used by passes that want to see the whole function as well as loop bodies. A LoopFusion pass would be an example of this. The LPM argument is used for transformations that need to modify the loop nest. For example, if the LoopUnrolling pass completely unrolls a loop, it might modify the program, then return with "return LPM.removeLoop(L);", to inform the LPM that the loop has been eliminated from the nesting tree. The LPM also needs to be informed of all other changes to the loop nest, including changes loop fission optimizations (like unswitching), loop fusion optimizations, etc. //===----------------------------------------------------------------------===// // The LoopPassManager Class The LoopPassManager class is a bit tricky to write, because it has to accept very aggressive changes to the loop nest, and keep making progress. In particular, loop transformations often feed into each other and can require iteration of certain components to work well. To support this, the LPM should probably maintain an ordered worklist of loops to process (a priority queue?), ordered by their depth in the nesting tree. Because we require LoopPasses to only inspect and modify contained loops (not containing loops) of the one they are being run on, they shouldn't be able to modify any loop in the queue, so the queue can't be invalidated by transformations. The other tricky part is that the LoopPassManager, with some (potentially sizable) list of LoopPasses to run, needs to be able to interrupt the sequence of passes at any time (due to a loop being deleted) and a LoopPass should also be able to request that a loop be reprocessed from the beginning (causing it to be pushed onto the front of the priority queue). LoopPasses could obviously cause infinite loops with this sort of behavior (always marking the same loop to be reprocessed), but this should not be used commonly, and only be used when substantial surgery is performed on a loop. //===----------------------------------------------------------------------===// // Important LoopPass Analyses An important analysis pass is the LoopAliasAnalysis pass. This pass provides an AliasSetTracker object for the current loop, as well as exposing the normal AA and Mod/Ref functions. This interface provides a concise and easy way to inspect and deal with alias information in the current loop, and is implemented the same way that the LICM pass currently does has it. In this framework, dependence analysis is just another LLVM analysis that can (like AliasAnalysis) implemented in many different ways, but all provide the same analysis interface. Unlike The AliasAnalysis interface though, DependenceAnalysis should probably specify the representation of the analysis result instead of using virtual methods to access the representation (this how the CallGraph class will operate in the future as well). //===----------------------------------------------------------------------===// // Important Loop Optimizations The current LLVM loop transformations should be upgraded and enhanced to work with this framework, which will probably substantially simplify them. For example the LICM pass can be broken up into the following passes: LICM (hoisting and sinking of invariants) LoopPromote (promote memory locations to registers), and LoopAliasAnalysis (provide the AliasSetTracker and AliasAnalysis interface for all loop opts). In addition, the indvar simplify pass, unroller, and loop unswitcher will all trivially fit into this model. To make the LoopUnswitcher actually useful (which it is not now, due to the lack of this infrastructure), we need passes such as LoopSimplifyCFG and LoopInstCombine. These passes operate just like their FunctionPass counterparts, but do so a Loop at a time. Once the basics are implemented, we can implement some other transformations, such as transformation of memcpy/memset loops into their equivalent function, unimodular loop transformation support, etc. //===----------------------------------------------------------------------===// // Secondary loop optimizations 179.art has a loop that looks like this: for (i = A; i < B; ++i) if (i == C && ...) That should be changed to: if (C >= A && C < B && ...) { i = C; stuff } ------- Occurs often in some programs: Turn: for(i = 0, NotFound = 1; NotFound && i