Intermediate representation TREE
In Tiger, the Tree language is used as both HIR (High-Level Intermediate Representation) and LIR (Low-Level Intermediate Representation).
- In the industry there are different intermediate languages, here are some of them:
LLVM-IR use in Clang.
GIMPLE use in GCC.
JAVA Bytecode use in Java.
- Tree language
Here’s a description of the Tree language used as an intermediate representation in compilers, This language was designed by Andrew Appel.
The Tree language is a intermediate representation used in compilers to represent source code before it is transformed into machine code. It is a low-level language that is easier to manipulate and optimize than the original source code.
- HIR is defined by the following grammar:
(* Expressions creations. *) Exp = "const" int | "name" Label | "temp" Temp | "binop" Oper Exp Exp | "mem" Exp | "call" Exp [{Exp}] "call end" | "eseq" Stm Exp ; (* Statements creations. *) Stm = "move" Exp Exp | "sxp" Exp | "jump" Exp Label | "cjump" Relop Exp Exp Label Label | "seq" [{Stm}] "seq end" | "label" Label | "label" Label Literal ; (* Operators creations. *) Oper = "add" | "sub" | "mul" | "div" | "mod"; Relop = "eq" | "ne" | "lt" | "gt" | "le" | "ge"; Label = Ident; Temp = fp | rv | sp | Ident; fp = "fp" | "$fp"; sp = "sp" | "$sp"; rv = "rv" | "$v0"; Ident = "[$a-zA-Z_][$a-zA-Z_0-9]*"; (* In addition, the following alternative syntax for operators is supported *) Oper = "(+)" | "(-)" | "(*)" | "(/)" | "(%)"; Relop = "(=)" | "(<>)" | "(<)" | "(>)" | "(<=)" | "(>=)";
- List of trees Instructions
Here is the list of all instructions in the Tree language, and you can find more details in the book (Modern Compiler Implementation).
CONST(i)
:The integer constant
i
.
NAME(n)
:The symbolic constant
n
(corresponding to an assembly registry)
BINOP (o, e1,e2)
:The application of binary operator
o
to operandse1
ande2
.There are three types of operators and the integer arithmetic operators (
PLUS, MINUS, MUL, DIV
) the integer bitwise logical operators (AND, OR, XOR
) and the integer shift operators (LSHIFT, RSHIFT
).
MEM(e)
:The contents of wordSize bytes of memory starting at address
e
.
CALL(f, list)
:A procedure call: the application of function
f
to argument listlist
.
ESEQ(s, e)
:The statement
s
is evaluated for side effects, thene
is evaluated for a result.
MOVE(TEMP t, e)
:Evaluate
e
and move it into temporaryt
.
MOVE(MEM(e1), e2)
:Evaluate
e1
yielding addressa
. Then evaluatee2
, and store the result at the memory starting ata
.
EXP(e)
:Evaluate
e
and return the result.
JUMP(e, labels)
:Transfer control (Jump) to address
e
. The list of labelslabels
specifies all possible locations where the expressione
can be evaluated, which is necessary for performing dataflow analysis later on.
CJUMP(o, e1, e2, l1, l2)
:Evaluate expression
e1
ande2
, yielding valuesa
,b
.Then compare
a
,b
using the operatoro
. If the result is true , jump tol1
; otherwise jump tol2
.
SEQ(s1, s2, ..., sN)
:The statement
s1
followed bys2
and another and another.
LABEL(n)
:Define the constant value of name
n
. This is like a label definition in assembly language.
- Special Temporaries
Some of the temporaries have a special meaning.
fp
,$fp
The frame pointer.
sp
,$sp
The stack pointer. One cannot read beyond sp.
rv
,$v0
The result register. Functions should store their result there.