//===----------------------------------------------------------------------===//
//                          LLVM Inline Asm Support
//===----------------------------------------------------------------------===//

11/20/2005 - initial revision
12/23/2006 - Various minor tweaks
12/26/2006 - Major edits.

Native support for inline assembly is a tricky issue, but one that LLVM
certainly needs to address.  The 'asm' keyword is used for many purposes in GCC,
including (but probably not limited to):

1. Defining "top-level" asm blocks at file scope.
2. Adding inline asm blocks to functions.
3. Assigning explicit registers to automatic variables.
4. Renaming global variables and functions to a specific name.
5. Assigning (and reserving) global variables for specific physical registers.

This proposal is concerned with addressing issues #1-3.  LLVM already supports
#4, and #5 will be left to a separate proposal.


//===----------------------------------------------------------------------===//
//  Adding File-Scope Asm Declarations to LLVM
//===----------------------------------------------------------------------===//

File scope asm blocks can be handled with a single string that is declared with
syntax like this:

module asm "abc"

To handle this, Module can have a single std::string to hold file-scope asms.
When linking, or if a .c file has multiple file-scope asms, they can be 
concatenated together to form a single one.  Note that no attempt is made to
preserve locality between file-scope asm blocks and the functions that they are
defined around.
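
The concatenation described above can be sketched as follows.  This is only an
illustration: the helper name merge_module_asm and the choice of a newline as
the separator are assumptions, not part of the proposal.

```python
def merge_module_asm(blocks):
    """Fold several file-scope asm strings into the single string a Module holds.

    Assumption: blocks are joined with a newline so that statements from
    different blocks do not run together. Empty blocks are dropped.
    """
    return "\n".join(b for b in blocks if b)
```

The same helper would serve both cases mentioned above: several file-scope
asms in one .c file, and the asm strings of two modules being linked.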


//===----------------------------------------------------------------------===//
//  Adding inline asm Expressions to LLVM
//===----------------------------------------------------------------------===//

To support inline assembly blocks, I propose adding a new LLVM InlineAsm class,
which is a unique'd top-level object.  In a .ll file, an InlineAsm looks like 
this:

   void(int) asm [sideeffect] "xyz", "r"

This declares an inline asm block that has 'xyz' as the asm string, and 'r' as 
the constraint list.  Since InlineAsms are Value*s in the LLVM system, they
have types, and (like globals and functions) the type of the value implicitly
has a pointer appended to it (thus the actual type is "void(int)*" in this
case).  Inline asms may be declared 'sideeffect', in the same sense that GCC
inline asms may be declared volatile.

InlineAsms may *only* be used as the callee operand of a call instruction, in
the same way that intrinsic functions may only be used that way.  It is not
legal to do anything else with an inline asm, just like it is not legal to do
anything else with an intrinsic function address.  Because inline asms are used
as callees, any transformation that treats calls to unknown callees
conservatively will handle them correctly.

Note that InlineAsm does not derive from GlobalValue (nor any other class below
Value), thus it has no linkage and does not participate in linking at all.

//===----------------------------------------------------------------------===//
//  The Asm String
//===----------------------------------------------------------------------===//

The format of the asm string is the same as that used by the LLVM asmwriter
generator.  Operands are denoted with $num, '$' characters are escaped with $$,
and asm variants are denoted with {x|y|z}.  This syntax differs from GNU asm
syntax, which denotes operands with %num and escapes '%' characters with %%.
This implies that the C front-end will need to translate operand references
(%0/%1 -> $0/$1), and escapes need to be translated as well (%% -> %, and a
literal $ -> $$).
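
A minimal sketch of that front-end translation follows.  The function name
gnu_to_llvm_asm is hypothetical, and the sketch only covers numbered operands
and the two escapes; GNU operand modifiers and named operands are assumed to be
out of scope and are copied through unchanged.

```python
def gnu_to_llvm_asm(template):
    """Translate a GNU-style asm template into the $-based asm string syntax."""
    digits = "0123456789"
    out = []
    i = 0
    n = len(template)
    while i < n:
        ch = template[i]
        if ch == "$":
            out.append("$$")              # a literal '$' must be escaped as '$$'
            i += 1
        elif ch == "%" and i + 1 < n and template[i + 1] == "%":
            out.append("%")               # GNU '%%' is a literal '%'
            i += 2
        elif ch == "%" and i + 1 < n and template[i + 1] in digits:
            j = i + 1
            while j < n and template[j] in digits:
                j += 1
            out.append("$" + template[i + 1:j])   # %N becomes $N
            i = j
        else:
            out.append(ch)                # everything else is copied as is
            i += 1
    return "".join(out)
```

For instance, "addl %1, %0" would become "addl $1, $0", and "xorl %%eax, %%eax"
would become "xorl %eax, %eax".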


//===----------------------------------------------------------------------===//
//  The Constraint List
//===----------------------------------------------------------------------===//

The constraint list is a canonicalized and unified version of the GNU C Compiler
constraint list for a target.  For example, consider this i386 inline asm:

  asm volatile ("xorl %%eax, %%eax      \n\t"
                "xorl %%esi, %%esi      \n\t"
                "addl %1, %0            \n\t"
                "addl %2, %0            \n\t"
                "addl %3, %0            \n\t"
                "addl %4, %0            \n\t"
                "addl %5, %0            \n\t"
                "addl %6, %0"
                : "+r" (out)
                : "r" (a), "r" (b), "r" (c), "g" (d), "g" (e), "g" (f)
                : "%eax", "%esi");

First, the "+" constraint is expanded out into two constraints: an '='
constraint and an input constraint of "0".  The clobbers are stripped to their
bare register names and prefixed with '~' to indicate that they are clobbers.
Finally, the entire list of constraints is unified together into one comma-
separated string, yielding "=r,0,r,r,r,g,g,g,~eax,~esi".
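
The three steps above can be sketched as a small helper.  The name
build_constraint_list is hypothetical, and placing the tied inputs generated
from "+" constraints ahead of the ordinary inputs is an assumption that simply
matches the example; the proposal does not spell out the order when there are
several "+" outputs.

```python
def build_constraint_list(outputs, inputs, clobbers):
    """Unify GNU-style outputs, inputs and clobbers into one constraint string."""
    outs = []
    tied = []
    for index, constraint in enumerate(outputs):
        if constraint.startswith("+"):
            # "+x" expands to an "=x" output plus an input tied to that output.
            outs.append("=" + constraint[1:])
            tied.append(str(index))
        else:
            outs.append(constraint)
    # Clobbers are stripped to the bare register name and prefixed with '~'.
    clobs = ["~" + reg.lstrip("%") for reg in clobbers]
    return ",".join(outs + tied + list(inputs) + clobs)
```

Called with the operands of the i386 example (outputs ["+r"], inputs
["r", "r", "r", "g", "g", "g"], clobbers ["%eax", "%esi"]), this reproduces the
string given above.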

As is standard with the GNU convention, output constraints must start with = as
their first character.

Note that we also want to define a way to assign a value to any *named* register
without using the GNU constraints.  It should be possible to assign an input to
'r13' on PowerPC (for example).  GCC targets that have constraint letters for
single registers (e.g. a,b,c,d on X86) should be transformed to refer to the
single register, instead of making the LLVM backend decode it.


//===----------------------------------------------------------------------===//
//  The Type
//===----------------------------------------------------------------------===//

The type is formed by adding one return value per register output constraint,
and one argument per input constraint.  Memory outputs are implemented by
adding an argument that specifies the address of the memory to update.

The resulting type of the example above is:

int(int,int,int,int,int,int,int)

Note that this approach depends on MultipleReturnValues.txt being implemented,
as inline asms may have multiple register outputs (but see below).
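
As a rough illustration of the rule, the type can be derived from the unified
constraint string.  The helper asm_type is hypothetical and makes several
simplifying assumptions: every operand has the same type (int by default), any
output whose constraint contains 'm' is treated as a memory output, the address
argument for a memory output is placed at the position of its constraint, and
multiple register outputs are rejected because they depend on multiple return
values.

```python
def asm_type(constraints, ty="int"):
    """Derive the function type of an inline asm from its constraint string."""
    rets = []
    args = []
    for c in constraints.split(","):
        if not c or c.startswith("~"):
            continue                      # clobbers contribute nothing to the type
        if c.startswith("="):
            if "m" in c[1:]:
                args.append(ty + "*")     # memory output: address of memory to update
            else:
                rets.append(ty)           # register output: one return value
        else:
            args.append(ty)               # input constraint: one argument
    if len(rets) > 1:
        raise NotImplementedError("needs multiple return values")
    return (rets[0] if rets else "void") + "(" + ",".join(args) + ")"
```

Under these assumptions, the constraint string of the i386 example yields the
type shown above, and the constraint list "r" yields void(int).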


//===----------------------------------------------------------------------===//
//  Variables with Explicitly Assigned Registers
//===----------------------------------------------------------------------===//

GCC inline asm does not have the ability to assign an expression to an arbitrary
machine register.  Because LLVM inline asm will (see the discussion of named
registers above), we can translate code like this:

     register int *p1 asm ("r0") = x;
     register int *p2 asm ("r1") = y;
     register int *result asm ("r0");
     asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));

into a single asm block that specifies the input and output registers as part
of the constraint list.


//===----------------------------------------------------------------------===//
//  Implementation in the Code Generator
//===----------------------------------------------------------------------===//

Inline asm blocks are directly turned into SelectionDAG nodes and MachineInstrs
with the appropriate number of inputs and outputs.


//===----------------------------------------------------------------------===//
//  Raising inline-asm blocks
//===----------------------------------------------------------------------===//

It would be nice for Targets to be able to implement a TargetInlineAsm interface
which the front-end and optimizer can use to understand inline asms for the
target.  One particularly interesting use for this interface is to 'raise'
inline asms that are simple enough into LLVM code.

Many inline asms are very stereotyped.  Inline asm is commonly used for atomic
memory access, use of instructions (like popcnt or ctlz) that don't have a 
direct C analogue, etc.  It would be better to raise these to LLVM code and
intrinsics where possible.  This is particularly important for the JIT, which
will not understand inline asm blocks.

Note that (if sufficient trickiness is desired) the JIT could actually extract
inline asm blocks out into their own functions, send them through the static
compiler, an assembler, then dynamically load the result.  This would be a bit
of work to get right, but would allow the JIT to (slowly) handle un'raisable'
inline asm correctly.


//===----------------------------------------------------------------------===//
//  Implementation in the short-term
//===----------------------------------------------------------------------===//

In the short-term, it would be nice to get inline asm that is operational, even
if it is not optimal.  To do this, we can make the following concessions:

1. Instead of adding support for multiple return values, we can define a new
   output type (denoted with a leading == instead of =).  These outputs indicate
   a register output constraint.  Unlike normal register output constraints
   though, this one is not returned.  Instead, a memory address is passed in and
   an implicit store to that address is issued after the asm block.
2. Instead of teaching the LLVM code generators about all of the register
   constraints that GCC has, we can have the initial stages of the codegen 
   actually do complete register assignment for the asm.  For constraints we
   do directly support (e.g. "r") we can choose to not force specific regs,
   letting the RA do its thing.  The TargetInlineAsm interface should indicate a 
   register class (if we have one) or a list of elements (if we don't) to use
   for each target register class.  This approach also makes it so that each RA 
   doesn't have to handle complexities like '&' and others.

In time, the implementation can be refined to allow better code to be generated.