//===----------------------------------------------------------------------===// // Explicitly Managed Stack Frames in LLVM //===----------------------------------------------------------------------===// Sep 5, 2004 - Initial revision Some languages are not content to use the program stack to run their programs: for whatever reason, they want to manage their stacks explicitly (common reasons include garbage collected closures, and languages that permit forking stacks, though there are probably countless others). As of this writing, LLVM provides no mechanisms to implement this, and this is a deficiency in LLVM. This note describes how support for correct tail calls permits the program to explicit manage its own stack in an arbitrary way (assuming that that machine stack still exists, and any calls to external C functions use it). //===----------------------------------------------------------------------===// // The idea // The idea is trivially simple: given guaranteed efficient tail call support, all calls must use the explicitly managed stack, and all calls are transformed by the front-end into tail calls. This first transformation is trivial, as is the second. The idea is to transform functions to use continuations instead of standard calls for non-tail function calls in the program. Consider this pseudo-function: int foo() { int X = 4 int Y = bar(14); return X*Y; } Assuming the function is unoptimized, the call to bar is not a tail-call: it does not occur in the tail position. Also, the call to bar may well access the value contained by X through language-specific means. Because of this, the stack frame for foo must be explicitly manages (e.g. by a garbage collector). It is straight-forward to transform foo into the following function: int foo() { fooSF = gcalloc { X: int; Y: int } fooSF->X = 4; fooSF->Y = bar(14); return fooSF->X * fooSF->Y; } This successfully decouples the storage of X from the stack to an explicitly managed piece of memory (in this case the GC heap), but the call to bar still occurs on the system stack (e.g. 14 and the return addrss in foo may be pushed onto the stack). To rectify this, we transform the function into continuation passing style, a well known process that I will not describe here in any more details (google can tell you lots about it, I'm sure). Transformed, this function might look like this: int foo(continuation C) { fooSF = gcalloc { C: continuation; X: int; Y: int } fooSF->C = C; fooSF->X = 4; continuation Next = { foo2, fooSF }; return bar(Next, 14); } int foo2(continuation C, int Y) { fooSF = C->SF; fooSF->Y = Y; continuation Next = fooSF->C; return Next->F(Next, fooSF->X*fooSF->Y); } To someone unfamiliar with CPS, the transformation above will look very strange. However, note that now both calls in the resultant functions occur as tail calls. The result of applying these two transformations to the whole program will result in a program that runs in a constant amount of system stack space, and whose program stack can be managed by any means required by the source-language.