The RTL representation of the code for a function is a doubly-linked chain of objects called insns. Insns are expressions with special codes that are used for no other purpose. Some insns are actual instructions; others represent dispatch tables for switch statements; others represent labels to jump to or various sorts of declarative information.
In addition to its own specific data, each insn must have a unique id-number that distinguishes it from all other insns in the current function (after delayed branch scheduling, copies of an insn with the same id-number may be present in multiple places in a function, but these copies will always be identical and will only appear inside a sequence), and chain pointers to the preceding and following insns. These three fields occupy the same position in every insn, independent of the expression code of the insn. They could be accessed with XEXP and XINT, but instead three special macros are always used:
INSN_UID (i)
PREV_INSN (i)
NEXT_INSN (i)
The first insn in the chain is obtained by calling get_insns; the last insn is the result of calling get_last_insn. Within the chain delimited by these insns, the NEXT_INSN and PREV_INSN pointers must always correspond: if insn is not the first insn,
NEXT_INSN (PREV_INSN (insn)) == insn
is always true and if insn is not the last insn,
PREV_INSN (NEXT_INSN (insn)) == insn
is always true.
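The two invariants above can be checked mechanically. The following is a minimal sketch in C on a simplified stand-in structure; `struct toy_insn` and `toy_chain_is_consistent` are hypothetical names invented for illustration and are not the compiler's actual rtx representation or API. The sketch assumes the chain is not cyclic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn: only the three mandatory fields.  */
struct toy_insn {
    int uid;                 /* plays the role of INSN_UID (i)  */
    struct toy_insn *prev;   /* plays the role of PREV_INSN (i) */
    struct toy_insn *next;   /* plays the role of NEXT_INSN (i) */
};

/* Return true if walking forward from FIRST ends exactly at LAST and
   every insn on the way satisfies both chain invariants.  */
bool toy_chain_is_consistent(const struct toy_insn *first,
                             const struct toy_insn *last)
{
    const struct toy_insn *insn;
    const struct toy_insn *seen_last = NULL;

    for (insn = first; insn != NULL; insn = insn->next) {
        /* Not the first insn: NEXT_INSN (PREV_INSN (insn)) == insn.  */
        if (insn != first
            && (insn->prev == NULL || insn->prev->next != insn))
            return false;
        /* Not the last insn: PREV_INSN (NEXT_INSN (insn)) == insn.  */
        if (insn != last
            && (insn->next == NULL || insn->next->prev != insn))
            return false;
        seen_last = insn;
    }
    return seen_last == last;
}
```

In this model, `first` and `last` correspond to the results of get_insns and get_last_insn.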
After delay slot scheduling, some of the insns in the chain might be sequence expressions, which contain a vector of insns. The value of NEXT_INSN in all but the last of these insns is the next insn in the vector; the value of NEXT_INSN of the last insn in the vector is the same as the value of NEXT_INSN for the sequence in which it is contained. Similar rules apply for PREV_INSN.
This means that the above invariants are not necessarily true for insns inside sequence expressions. Specifically, if insn is the first insn in a sequence, NEXT_INSN (PREV_INSN (insn)) is the insn containing the sequence expression, as is the value of PREV_INSN (NEXT_INSN (insn)) if insn is the last insn in the sequence expression. You can use these expressions to find the containing sequence expression.
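As an illustration of the linking rules for insns inside a sequence, here is a small C sketch on a toy model. All names (`struct seq_insn`, `seq_link_contents`, `seq_container_of_first`, `seq_container_of_last`) are hypothetical, and the `is_sequence` flag is an assumption standing in for a test of the pattern's expression code.

```c
#include <stddef.h>

/* Hypothetical simplified insn; IS_SEQUENCE marks a sequence container.  */
struct seq_insn {
    int uid;
    int is_sequence;
    struct seq_insn *prev;   /* plays the role of PREV_INSN */
    struct seq_insn *next;   /* plays the role of NEXT_INSN */
};

/* Link the N insns in VEC as the contents of SEQ, which must already
   be linked into the outer chain.  Inner insns point at each other,
   except that the first and last borrow SEQ's outer neighbours.  */
void seq_link_contents(struct seq_insn *seq, struct seq_insn **vec, size_t n)
{
    size_t i;

    seq->is_sequence = 1;
    for (i = 0; i < n; i++) {
        vec[i]->prev = (i == 0) ? seq->prev : vec[i - 1];
        vec[i]->next = (i + 1 == n) ? seq->next : vec[i + 1];
    }
}

/* If INSN is the first insn of a sequence, return the containing
   insn, found as NEXT_INSN (PREV_INSN (insn)); otherwise NULL.  */
struct seq_insn *seq_container_of_first(const struct seq_insn *insn)
{
    struct seq_insn *cand;

    if (insn->prev == NULL)
        return NULL;
    cand = insn->prev->next;
    return (cand != NULL && cand != insn && cand->is_sequence) ? cand : NULL;
}

/* If INSN is the last insn of a sequence, return the containing
   insn, found as PREV_INSN (NEXT_INSN (insn)); otherwise NULL.  */
struct seq_insn *seq_container_of_last(const struct seq_insn *insn)
{
    struct seq_insn *cand;

    if (insn->next == NULL)
        return NULL;
    cand = insn->next->prev;
    return (cand != NULL && cand != insn && cand->is_sequence) ? cand : NULL;
}
```

Note that the lookup only works when the sequence has an outer neighbour on the relevant side; a sequence at the very start or end of the chain returns NULL in this sketch.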
Every insn has one of the following six expression codes:
insn
The expression code insn is used for instructions that do not jump and do not do function calls. sequence expressions are always contained in insns with code insn even if one of those insns should jump or do function calls.
Insns with code insn have four additional fields beyond the three mandatory ones listed above. These four are described in a table below.
jump_insn
The expression code jump_insn is used for instructions that may jump (or, more generally, may contain label_ref expressions). If there is an instruction to return from the current function, it is recorded as a jump_insn.
jump_insn insns have the same extra fields as insn insns, accessed in the same way, and in addition contain a field JUMP_LABEL which is defined once jump optimization has completed.
For simple conditional and unconditional jumps, this field contains the code_label to which this insn will (possibly conditionally) branch. In a more complex jump, JUMP_LABEL records one of the labels that the insn refers to; the only way to find the others is to scan the entire body of the insn.
Return insns count as jumps, but since they do not refer to any labels, they have zero in the JUMP_LABEL field.
call_insn
The expression code call_insn is used for instructions that may do function calls. It is important to distinguish these instructions because they imply that certain registers and memory locations may be altered unpredictably.
call_insn insns have the same extra fields as insn insns, accessed in the same way, and in addition contain a field CALL_INSN_FUNCTION_USAGE, which contains a list (chain of expr_list expressions) containing use and clobber expressions that denote hard registers used or clobbered by the called function. A register specified in a clobber in this list is modified after the execution of the call_insn, while a register in a clobber in the body of the call_insn is clobbered before the insn completes execution. clobber expressions in this list augment registers specified in CALL_USED_REGISTERS (see Register Basics).
code_label
A code_label insn represents a label that a jump insn can jump to. It contains two special fields of data in addition to the three standard ones. CODE_LABEL_NUMBER is used to hold the label number, a number that identifies this label uniquely among all the labels in the compilation (not just in the current function). Ultimately, the label is represented in the assembler output as an assembler label, usually of the form `Ln' where n is the label number.
When a code_label appears in an RTL expression, it normally appears within a label_ref which represents the address of the label, as a number.
The field LABEL_NUSES is only defined once the jump optimization phase is completed and contains the number of times this label is referenced in the current function.
barrier
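Barriers are placed in the instruction stream when control cannot flow past them. They are placed after unconditional jump instructions to indicate that the jumps are unconditional and after calls to volatile functions, which do not return (e.g., exit). They contain no information beyond the three standard fields.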
note
note insns are used to represent additional debugging and declarative information. They contain two nonstandard fields, an integer which is accessed with the macro NOTE_LINE_NUMBER and a string accessed with NOTE_SOURCE_FILE.
If NOTE_LINE_NUMBER is positive, the note represents the position of a source line and NOTE_SOURCE_FILE is the source file name that the line came from. These notes control generation of line number data in the assembler output.
Otherwise, NOTE_LINE_NUMBER is not really a line number but a code with one of the following values (and NOTE_SOURCE_FILE must contain a null pointer):
NOTE_INSN_DELETED
NOTE_INSN_BLOCK_BEG
NOTE_INSN_BLOCK_END
NOTE_INSN_LOOP_BEG
NOTE_INSN_LOOP_END
These types of notes indicate the position of the beginning and end of a while or for loop. They enable the loop optimizer to find loops quickly.
NOTE_INSN_LOOP_CONT
Appears at the place in a loop that continue statements jump to.
NOTE_INSN_LOOP_VTOP
NOTE_INSN_FUNCTION_END
Appears near the end of the function body, just before the label that return statements jump to (on machines where a single instruction does not suffice for returning). This note may be deleted by jump optimization.
NOTE_INSN_SETJMP
Appears following each call to setjmp or a related function.
These codes are printed symbolically when they appear in debugging dumps.
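The distinction between a source-line note and a coded note can be sketched in C as follows. The structure, the function names, and the specific negative values of the codes are hypothetical; the text only establishes that a positive NOTE_LINE_NUMBER is a real line number and that coded notes must carry a null NOTE_SOURCE_FILE. Requiring a non-null file name for source-line notes is an extra assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical numeric values: the only property relied on here is
   that the codes are not positive, so they cannot be mistaken for
   line numbers.  */
enum toy_note_code {
    TOY_NOTE_INSN_DELETED = -1,
    TOY_NOTE_INSN_BLOCK_BEG = -2,
    TOY_NOTE_INSN_BLOCK_END = -3
};

/* Hypothetical stand-in for the two nonstandard fields of a note.  */
struct toy_note {
    int line_number;          /* plays the role of NOTE_LINE_NUMBER */
    const char *source_file;  /* plays the role of NOTE_SOURCE_FILE */
};

/* Nonzero if NOTE records the position of a source line.  */
int toy_note_is_source_line(const struct toy_note *note)
{
    return note->line_number > 0;
}

/* Nonzero if the two fields agree with each other: a source-line
   note has a file name (assumption), a coded note has none.  */
int toy_note_is_well_formed(const struct toy_note *note)
{
    if (toy_note_is_source_line(note))
        return note->source_file != NULL;
    return note->source_file == NULL;
}
```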
The machine mode of an insn is normally VOIDmode, but some phases use the mode for various purposes; for example, the reload pass sets it to HImode if the insn needs reloading but not register elimination and QImode if both are required. The common subexpression elimination pass sets the mode of an insn to QImode when it is the first insn in a block that has already been processed.
Here is a table of the extra fields of insn, jump_insn and call_insn insns:
PATTERN (i)
An expression for the side effect performed by this insn. This must be one of the following codes: set, call, use, clobber, return, asm_input, asm_output, addr_vec, addr_diff_vec, trap_if, unspec, unspec_volatile, parallel, or sequence. If it is a parallel, each element of the parallel must be one of these codes, except that parallel expressions cannot be nested and addr_vec and addr_diff_vec are not permitted inside a parallel expression.
INSN_CODE (i)
An integer that says which pattern in the machine description matches this insn, or -1 if the matching has not yet been attempted.
Such matching is never attempted and this field remains -1 on an insn whose pattern consists of a single use, clobber, asm_input, addr_vec or addr_diff_vec expression.
Matching is also never attempted on insns that result from an asm statement. These contain at least one asm_operands expression. The function asm_noperands returns a non-negative value for such insns.
In the debugging output, this field is printed as a number followed by a symbolic representation that locates the pattern in the `md' file as some small positive or negative offset from a named pattern.
LOG_LINKS (i)
A list (chain of insn_list expressions) giving information about dependencies between instructions within a basic block. Neither a jump nor a label may come between the related insns.
REG_NOTES (i)
A list (chain of expr_list and insn_list expressions) giving miscellaneous information about the insn. It is often information pertaining to the registers used in this insn.
The LOG_LINKS field of an insn is a chain of insn_list expressions. Each of these has two operands: the first is an insn, and the second is another insn_list expression (the next one in the chain). The last insn_list in the chain has a null pointer as second operand. The significant thing about the chain is which insns appear in it (as first operands of insn_list expressions). Their order is not significant.
This list is originally set up by the flow analysis pass; it is a null pointer until then. Flow only adds links for those data dependencies which can be used for instruction combination. For each insn, the flow analysis pass adds a link to insns which store into registers values that are used for the first time in this insn. The instruction scheduling pass adds extra links so that every dependence will be represented. Links represent data dependencies, antidependencies and output dependencies; the machine mode of the link distinguishes these three types: antidependencies have mode REG_DEP_ANTI, output dependencies have mode REG_DEP_OUTPUT, and data dependencies have mode VOIDmode.
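To make the shape of a LOG_LINKS chain concrete, the following C sketch models each link as a node with a mode, a first operand and a pointer to the rest of the chain. The types and function names are hypothetical, insns are identified by a plain integer uid for brevity, and the enum merely stands in for the three modes named above.

```c
#include <stddef.h>

/* Stand-ins for the three modes a link may carry.  */
enum toy_dep_kind {
    TOY_DEP_DATA,     /* plays the role of VOIDmode       */
    TOY_DEP_ANTI,     /* plays the role of REG_DEP_ANTI   */
    TOY_DEP_OUTPUT    /* plays the role of REG_DEP_OUTPUT */
};

/* Hypothetical model of one insn_list expression.  */
struct toy_link {
    enum toy_dep_kind mode;   /* the machine mode of the link           */
    int insn_uid;             /* first operand: the insn, given by uid  */
    struct toy_link *rest;    /* second operand: next link, or NULL     */
};

/* Nonzero if the insn with id UID appears anywhere in LINKS.
   Only membership matters; the order of the chain does not.  */
int toy_links_contain(const struct toy_link *links, int uid)
{
    for (; links != NULL; links = links->rest)
        if (links->insn_uid == uid)
            return 1;
    return 0;
}

/* Count the links in LINKS whose mode is KIND.  */
size_t toy_count_links(const struct toy_link *links, enum toy_dep_kind kind)
{
    size_t count = 0;

    for (; links != NULL; links = links->rest)
        if (links->mode == kind)
            count++;
    return count;
}
```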
The REG_NOTES field of an insn is a chain similar to the LOG_LINKS field but it includes expr_list expressions in addition to insn_list expressions. There are several kinds of register notes, which are distinguished by the machine mode, which in a register note is really understood as being an enum reg_note. The first operand op of the note is data whose meaning depends on the kind of note.
The macro REG_NOTE_KIND (x) returns the kind of register note. Its counterpart, the macro PUT_REG_NOTE_KIND (x, newkind) sets the register note type of x to be newkind.
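The idea of storing the note kind in the mode slot, with one macro to read it and one to write it, can be sketched as below. This is a toy model only: the `TOY_` macros, the structure and the enum values are hypothetical and do not reproduce the real definitions of REG_NOTE_KIND or PUT_REG_NOTE_KIND.

```c
#include <stddef.h>

/* Hypothetical subset of note kinds, with made-up values.  */
enum toy_reg_note {
    TOY_REG_DEAD = 1,
    TOY_REG_EQUAL,
    TOY_REG_EQUIV,
    TOY_REG_UNUSED
};

/* Hypothetical model of one element of a REG_NOTES chain.  */
struct toy_note_link {
    int mode;                     /* mode slot, reused to hold the kind */
    const void *op;               /* first operand; meaning depends on kind */
    struct toy_note_link *rest;   /* next note in the chain, or NULL */
};

/* Read and write the kind through the mode slot.  */
#define TOY_REG_NOTE_KIND(x) ((enum toy_reg_note) (x)->mode)
#define TOY_PUT_REG_NOTE_KIND(x, newkind) ((x)->mode = (int) (newkind))

/* Return the first note of kind KIND in NOTES, or NULL if none.  */
struct toy_note_link *toy_find_note(struct toy_note_link *notes,
                                    enum toy_reg_note kind)
{
    for (; notes != NULL; notes = notes->rest)
        if (TOY_REG_NOTE_KIND(notes) == kind)
            return notes;
    return NULL;
}
```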
Register notes are of three classes: They may say something about an input to an insn, they may say something about an output of an insn, or they may create a linkage between two insns. There is also a set of values that are only used in LOG_LINKS.
These register notes annotate inputs to an insn:
REG_DEAD
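The value in op dies in this insn; that is to say, altering the value immediately after this insn would not affect the future behavior of the program.
This does not necessarily mean that the register op has no useful value after this insn since it may also be an output of the insn. In such a case, however, a REG_DEAD note would be redundant and is usually not present until after the reload pass, but no code relies on this fact.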
REG_INC
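The register op is incremented (or decremented; at this level there is no distinction) by an embedded side effect inside this insn. This means it appears in a post_inc, pre_inc, post_dec or pre_dec expression.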
REG_NONNEG
The register op is known to have a nonnegative value when this insn is reached.
The REG_NONNEG note is added to insns only if the machine description has a `decrement_and_branch_until_zero' pattern.
REG_NO_CONFLICT
This insn does not cause a conflict between op and the item being set by this insn even though it might appear that it does.
Insns with this note are usually part of a block that begins with a clobber insn specifying a multi-word pseudo register (which will be the output of the block), a group of insns that each set one word of the value and have the REG_NO_CONFLICT note attached, and a final insn that copies the output to itself with an attached REG_EQUAL note giving the expression being computed. This block is encapsulated with REG_LIBCALL and REG_RETVAL notes on the first and last insns, respectively.
REG_LABEL
This insn uses op, a code_label, but is not a jump_insn. The presence of this note allows jump optimization to be aware that op is, in fact, being used.
The following notes describe attributes of outputs of an insn:
REG_EQUIV
REG_EQUAL
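This note is only valid on an insn that sets only one register and indicates that that register will be equal to op at run time; the scope of this equivalence differs between the two types of notes. The value which the insn explicitly copies into the register may look different from op, but they will be equal at run time. If the output of the single set is a strict_low_part expression, the note refers to the register that is contained in SUBREG_REG of the subreg expression.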
For REG_EQUIV, the register is equivalent to op throughout the entire function, and could validly be replaced in all its occurrences by op. (``Validly'' here refers to the data flow of the program; simple replacement may make some insns invalid.) For example, when a constant is loaded into a register that is never assigned any other value, this kind of note is used.
When a parameter is copied into a pseudo-register at entry to a function, a note of this kind records that the register is equivalent to the stack slot where the parameter was passed. Although in this case the register may be set by other insns, it is still valid to replace the register by the stack slot throughout the function.
In the case of REG_EQUAL, the register that is set by this insn will be equal to op at run time at the end of this insn but not necessarily elsewhere in the function. In this case, op is typically an arithmetic expression. For example, when a sequence of insns such as a library call is used to perform an arithmetic operation, this kind of note is attached to the insn that produces or copies the final value.
These two notes are used in different ways by the compiler passes. REG_EQUAL is used by passes prior to register allocation (such as common subexpression elimination and loop optimization) to tell them how to think of that value. REG_EQUIV notes are used by register allocation to indicate that there is an available substitute expression (either a constant or a mem expression for the location of a parameter on the stack) that may be used in place of a register if insufficient registers are available.
Except for stack homes for parameters, which are indicated by a REG_EQUIV note and are not useful to the early optimization passes, and pseudo registers that are equivalent to a memory location throughout their entire life, which is not detected until later in the compilation, all equivalences are initially indicated by an attached REG_EQUAL note. In the early stages of register allocation, a REG_EQUAL note is changed into a REG_EQUIV note if op is a constant and the insn represents the only set of its destination register.
Thus, compiler passes prior to register allocation need only check for REG_EQUAL notes and passes subsequent to register allocation need only check for REG_EQUIV notes.
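The rule for turning a REG_EQUAL note into a REG_EQUIV note (op is a constant and the insn is the only set of its destination register) can be illustrated with a short C sketch. Everything here is hypothetical: each insn is reduced to a record of its destination register, whether its note operand is a constant, and which note it carries; the real pass works on RTL and is not shown.

```c
#include <stddef.h>

enum toy_equiv_kind {
    TOY_NOTE_NONE,
    TOY_NOTE_REG_EQUAL,
    TOY_NOTE_REG_EQUIV
};

/* Hypothetical summary of an insn that sets a single register.  */
struct toy_set_insn {
    int dest_regno;             /* register set by the insn            */
    int op_is_constant;         /* nonzero if the note's op is constant */
    enum toy_equiv_kind note;   /* note currently attached, if any      */
};

/* Count how many of the N insns set register REGNO.  */
static size_t toy_count_sets(const struct toy_set_insn *insns, size_t n,
                             int regno)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (insns[i].dest_regno == regno)
            count++;
    return count;
}

/* Change REG_EQUAL to REG_EQUIV wherever op is a constant and the
   insn is the only set of its destination.  Return how many notes
   were changed.  */
size_t toy_promote_equal_notes(struct toy_set_insn *insns, size_t n)
{
    size_t i, promoted = 0;

    for (i = 0; i < n; i++) {
        if (insns[i].note == TOY_NOTE_REG_EQUAL
            && insns[i].op_is_constant
            && toy_count_sets(insns, n, insns[i].dest_regno) == 1) {
            insns[i].note = TOY_NOTE_REG_EQUIV;
            promoted++;
        }
    }
    return promoted;
}
```

The quadratic scan keeps the sketch short; a per-register set count computed once would serve the same purpose.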
REG_UNUSED
The register op being set by this insn will not be used in a subsequent insn. This differs from a REG_DEAD note, which indicates that the value in an input will not be used subsequently. These two notes are independent; both may be present for the same register.
REG_WAS_0
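The single output of this insn contained zero before this insn. op is the insn that set it to zero. You can rely on this note if it is present and op has not been deleted or turned into a note; its absence implies nothing.
These notes describe linkages between insns. They occur in pairs: one insn has one of a pair of notes that points to a second insn, which has the inverse note pointing back to the first insn.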
REG_RETVAL
This insn copies the value of a multi-insn sequence (for example, a library call), and op is the first insn of the sequence.
Loop optimization uses this note to treat such a sequence as a single operation for code motion purposes and flow analysis uses this note to delete such sequences whose results are dead.
A REG_EQUAL note will also usually be attached to this insn to provide the expression being computed by the sequence.
REG_LIBCALL
This is the inverse of REG_RETVAL: it is placed on the first insn of a multi-insn sequence, and it points to the last one.
REG_CC_SETTER
REG_CC_USER
On machines that use cc0, the insns which set and use cc0 are adjacent. However, when branch delay slot filling is done, this may no longer be true. In this case a REG_CC_USER note will be placed on the insn setting cc0 to point to the insn using cc0 and a REG_CC_SETTER note will be placed on the insn using cc0 to point to the insn setting cc0.
These values are only used in the LOG_LINKS field, and indicate the type of dependency that each link represents. Links which indicate a data dependence (a read after write dependence) do not use any code; they simply have mode VOIDmode, and are printed without any descriptive text.
REG_DEP_ANTI
REG_DEP_OUTPUT
For convenience, the machine mode in an insn_list or expr_list is printed using these symbolic codes in debugging dumps.
The only difference between the expression codes insn_list and expr_list is that the first operand of an insn_list is assumed to be an insn and is printed in debugging dumps as the insn's unique id; the first operand of an expr_list is printed in the ordinary way as an expression.