//===----------------------------------------------------------------------===//
// LLVM Inline Asm Support
//===----------------------------------------------------------------------===//

11/20/2005 - initial revision
12/23/2006 - Various minor tweaks
12/26/2006 - Major edits.

Native support for inline assembly is a tricky issue, but one which LLVM
certainly needs to support. The 'asm' keyword is used for many purposes in GCC,
including (but probably not limited to):

1. Defining "top-level" asm blocks at file scope.
2. Adding inline asm blocks to functions.
3. Assigning explicit registers to automatic variables.
4. Renaming global variables and functions to a specific name.
5. Assigning (and reserving) global variables for specific physical registers.

This proposal is concerned with addressing issues #1-3. LLVM already supports
#4, and #5 will be left to a separate proposal.

//===----------------------------------------------------------------------===//
// Adding File-Scope Asm Declarations to LLVM
//===----------------------------------------------------------------------===//

File scope asm blocks can be handled with a single string that is declared with
syntax like this:

  module asm "abc"

To handle this, Module can have a single std::string to hold file-scope asms.
When linking, or if a .c file has multiple file-scope asms, they can be
concatenated together to form a single one. Note that no attempt is made to
preserve locality between file-scope asm blocks and the functions that they are
defined around.

//===----------------------------------------------------------------------===//
// Adding inline asm Expressions to LLVM
//===----------------------------------------------------------------------===//

To support inline assembly blocks, I propose adding a new LLVM InlineAsm class,
which is a unique'd top-level object.
In a .ll file, an InlineAsm looks like this:

  void(int) asm [sideeffect] "xyz", "r"

This declares an inline asm block that has 'xyz' as the asm string, and 'r' as
the constraint list. Since InlineAsm's are Value*'s in the LLVM system, they
have types, and (like globals and functions) the value implicitly has a pointer
appended to it (thus the actual type is "void(int)*" in this case).

Inline asm strings may be declared as 'sideeffect', in the same sense that GCC
inline asms are volatile.

InlineAsms may *only* be used as the callee operand of a call instruction, in
the same way that intrinsic functions may only be used that way. It is not
legal to do anything else with an inline asm, just like it is not legal to do
anything else with an intrinsic function address. Because inline asms are used
as callees, any transformation that treats calls to unknown callees
conservatively will handle them correctly.

Note that InlineAsm does not derive from GlobalValue (nor any other class below
Value), thus it has no linkage and does not participate in linking at all.

//===----------------------------------------------------------------------===//
// The Asm String
//===----------------------------------------------------------------------===//

The format of the asm string is the same as that used by the LLVM asmwriter
generator. Operands are denoted with $num, '$' characters are escaped with $$,
and asm variants are denoted with {x|y|z}. This syntax differs from GNU asm
syntax, which uses %'s for escaping. This implies that the C-frontend will need
to do some simple translations of %0/%1 -> $0/$1 or something, and escapes need
to be translated.

//===----------------------------------------------------------------------===//
// The Constraint List
//===----------------------------------------------------------------------===//

The constraint list is a canonicalized and unified version of the GNU C
Compiler constraint list for a target.
For example, consider this i386 inline asm:

  asm volatile ("xorl %%eax, %%eax \n\t"
                "xorl %%esi, %%esi \n\t"
                "addl %1, %0 \n\t"
                "addl %2, %0 \n\t"
                "addl %3, %0 \n\t"
                "addl %4, %0 \n\t"
                "addl %5, %0 \n\t"
                "addl %6, %0"
                : "+r" (out)
                : "r" (a), "r" (b), "r" (c), "g" (d), "g" (e), "g" (f)
                : "%eax", "%esi");

First, the "+" constraint is expanded out into two constraints: an '='
constraint and an input constraint of "0". The clobbers are stripped to their
bare register names and prefixed with '~' to indicate that they are clobbers.
Finally, the entire list of constraints is unified together into one
comma-separated string, yielding "=r,0,r,r,r,g,g,g,~eax,~esi". As is standard
with the GNU convention, output constraints must start with = as their first
character.

Note that we also want to define a way to assign a value to any *named*
register without using the GNU constraints. It should be possible to assign an
input to 'r13' on PowerPC (for example). GCC targets that have constraint
letters for single registers (e.g. a,b,c,d on X86) should be transformed to
refer to the single register, instead of making the LLVM backend decode it.

//===----------------------------------------------------------------------===//
// The Type
//===----------------------------------------------------------------------===//

The type is formed by adding one return value per register output constraint,
and one argument per input constraint. Memory outputs are implemented by adding
an argument that specifies the address of the memory to update. The type of the
example above is:

  int(int,int,int,int,int,int,int)

Note that this approach depends on MultipleReturnValues.txt being implemented,
as inline-asms may have multiple register outputs (but see below).
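
As a rough illustration of the constraint-list and type rules above, the
following Python sketch rebuilds the unified constraint string for the i386
example and derives the call type from it. Everything named here is an
assumption made for illustration and not part of this proposal: the function
names unify_constraints and derive_type, the choice to place tied inputs
directly after all outputs, treating every operand as 'int', and recognizing
memory outputs only when spelled exactly "=m".

```python
def unify_constraints(outputs, inputs, clobbers):
    """Build one comma-separated constraint string from GNU-style lists.

    Hypothetical helper: 'outputs', 'inputs' and 'clobbers' are lists of the
    constraint strings as written in the GNU asm statement.
    """
    out_parts, tied_inputs = [], []
    for index, constraint in enumerate(outputs):
        if constraint.startswith("+"):
            # "+r" expands to an "=r" output plus an input tied to that
            # output, written as the output's index.
            out_parts.append("=" + constraint[1:])
            tied_inputs.append(str(index))
        else:
            out_parts.append(constraint)
    # Clobbers are reduced to the bare register name and prefixed with '~'.
    clobber_parts = ["~" + reg.lstrip("%") for reg in clobbers]
    # Assumption: tied inputs come right after the outputs.
    return ",".join(out_parts + tied_inputs + list(inputs) + clobber_parts)


def derive_type(constraints, value_type="int"):
    """Return the call type implied by a unified constraint string."""
    results, args = [], []
    for constraint in constraints.split(","):
        if not constraint or constraint.startswith("~"):
            continue  # clobbers contribute nothing to the type
        if constraint == "=m":
            # Memory output: the address to update is passed as an argument.
            args.append(value_type + "*")
        elif constraint.startswith("="):
            results.append(value_type)  # register output: one return value
        else:
            args.append(value_type)  # input: one argument
    if len(results) > 1:
        # Several register outputs would need multiple return values.
        raise ValueError("more than one register output")
    return "%s(%s)" % (results[0] if results else "void", ",".join(args))


unified = unify_constraints(
    ["+r"], ["r", "r", "r", "g", "g", "g"], ["%eax", "%esi"])
print(unified)               # =r,0,r,r,r,g,g,g,~eax,~esi
print(derive_type(unified))  # int(int,int,int,int,int,int,int)
```

The sketch does not renumber operands in the asm string, and it does not map
single-register constraint letters to register names; both would be needed in
a real front-end.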

//===----------------------------------------------------------------------===//
// Variables with Explicitly Assigned Registers
//===----------------------------------------------------------------------===//

GCC inline asm does not have the ability to assign an expression to an
arbitrary machine register. Because we do/will (see above) this means that we
can translate code like this:

  register int *p1 asm ("r0") = x;
  register int *p2 asm ("r1") = y;
  register int *result asm ("r0");
  asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));

into a single asm block that specifies the input and output registers as part
of the constraint list.

//===----------------------------------------------------------------------===//
// Implementation in the Code Generator
//===----------------------------------------------------------------------===//

Inline asm blocks are directly turned into SelectionDAG nodes and MachineInstrs
with the appropriate number of inputs and outputs.

//===----------------------------------------------------------------------===//
// Raising inline-asm blocks
//===----------------------------------------------------------------------===//

It would be nice for Targets to be able to implement a TargetInlineAsm
interface which the front-end and optimizer can use to understand inline asms
for the target. One particularly interesting use for this interface is to
'raise' inline asms that are simple enough to LLVM code.

Many inline asms are very stereotyped. Inline asm is commonly used for atomic
memory access, use of instructions (like popcnt or ctlz) that don't have a
direct C analogue, etc. It would be better to raise these to LLVM code and
intrinsics where possible. This is particularly important for the JIT, which
will not understand inline asm blocks.
Note that (if sufficient trickiness is desired) the JIT could actually extract
inline asm blocks out into their own functions, send them through the static
compiler, an assembler, then dynamically load the result. This would be a bit
of work to get right, but would allow the JIT to (slowly) handle un'raisable'
inline asm correctly.

//===----------------------------------------------------------------------===//
// Implementation in the short-term
//===----------------------------------------------------------------------===//

In the short-term, it would be nice to get inline asm that is operational, even
if it is not optimal. To do this, we can make the following concessions:

1. Instead of adding support for multiple return values, we can define a new
   output type (denoted with a leading == instead of =). These outputs
   indicate a register output constraint. Unlike normal register output
   constraints though, this one is not returned. Instead, a memory address is
   passed in and an implicit store to that address is issued after the asm
   block.

2. Instead of teaching the LLVM code generators about all of the register
   constraints that GCC has, we can have the initial stages of the codegen
   actually do complete register assignment for the asm. For constraints we do
   directly support (e.g. "r") we can choose to not force specific regs,
   letting the RA do its thing. The TargetInlineAsm interface should indicate
   a register class (if we have one) or a list of elements (if we don't) to
   use for each target register class. This approach also makes it so that
   each RA doesn't have to handle complexities like '&' and others.

In time, the implementation can be refined to allow better code to be
generated.
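
Concession 1 can be illustrated with a small Python sketch that computes the
call type when '==' outputs are used. The name short_term_type, the use of
'int' for every operand, and the placement of each address argument at the
output's position in the constraint list are assumptions made for
illustration; the proposal does not fix these details.

```python
def short_term_type(constraints, value_type="int"):
    """Return the call type for a unified constraint string in which some
    register outputs use the short-term '==' form (hypothetical helper)."""
    results, args = [], []
    for constraint in constraints.split(","):
        if not constraint or constraint.startswith("~"):
            continue  # clobbers do not appear in the type
        if constraint.startswith("=="):
            # Not returned: the caller passes an address, and a store to that
            # address is implied after the asm block.
            args.append(value_type + "*")
        elif constraint.startswith("="):
            results.append(value_type)  # ordinary register output
        else:
            args.append(value_type)  # input
    if len(results) > 1:
        raise ValueError("more than one returned register output")
    return "%s(%s)" % (results[0] if results else "void", ",".join(args))


# The i386 example with its output rewritten to the '==' form.
print(short_term_type("==r,0,r,r,r,g,g,g,~eax,~esi"))
# void(int*,int,int,int,int,int,int,int)

# Two register outputs no longer need multiple return values.
print(short_term_type("==r,==r,r"))
# void(int*,int*,int)
```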