Intermediate representation TREE
In Tiger, the Tree language is used as both HIR (High-Level Intermediate Representation) and LIR (Low-Level Intermediate Representation).
- In the industry there are different intermediate languages, here are some of them:
LLVM-IR use in Clang.
GIMPLE use in GCC.
JAVA Bytecode use in Java.
- Tree language
Here’s a description of the Tree language used as an intermediate representation in compilers. This language was designed by Andrew Appel.
It is used in compilers to represent source code before it is transformed into machine code. It is a low-level language that is easier to manipulate and optimize than the original source code.
- HIR is defined by the following grammar:
(* Expressions creation. *) Exp = "const" int | "name" Label | "temp" Temp | "binop" Oper Exp Exp | "mem" Exp | "call" Exp [{Exp}] "call end" | "eseq" Stm Exp ; (* Statements creation. *) Stm = "move" Exp Exp | "sxp" Exp | "jump" Exp Label | "cjump" Relop Exp Exp Label Label | "seq" [{Stm}] "seq end" | "label" Label | "label" Label Literal ; (* Operators creation. *) Oper = "add" | "sub" | "mul" | "div" | "mod"; Relop = "eq" | "ne" | "lt" | "gt" | "le" | "ge"; Label = Ident; Temp = fp | rv | sp | Ident; fp = "fp" | "$fp"; sp = "sp" | "$sp"; rv = "rv" | "$v0"; Ident = "[$a-zA-Z_][$a-zA-Z_0-9]*"; (* In addition, the following alternative syntax for operators is supported *) Oper = "(+)" | "(-)" | "(*)" | "(/)" | "(%)"; Relop = "(=)" | "(<>)" | "(<)" | "(>)" | "(<=)" | "(>=)";
- List of trees Instructions
Here is the list of all instructions in the Tree language, and you can find more details in the book (Modern Compiler Implementation).
CONST(i)
:The integer constant
i
.
NAME(n)
:The symbolic constant
n
(corresponding to an assembly register)
BINOP (o, e1,e2)
:The application of binary operator
o
to operandse1
ande2
.There are three types of operators: the integer arithmetic operators (
PLUS, MINUS, MUL, DIV
) the integer bitwise logical operators (AND, OR, XOR
), and the integer shift operators (LSHIFT, RSHIFT
).
MEM(e)
:The contents of wordSize bytes of memory starting at address
e
.
CALL(f, list)
:A procedure call: the application of function
f
to argument listlist
.
ESEQ(s, e)
:The statement
s
is evaluated for side effects, thene
is evaluated for a result.
MOVE(TEMP t, e)
:Evaluate
e
and move it into temporaryt
.
MOVE(MEM(e1), e2)
:Evaluate
e1
yielding addressa
. Then evaluatee2
, and store the result at the memory starting ata
.
EXP(e)
:Evaluate
e
and return the result.
JUMP(e, labels)
:Transfer control (Jump) to address
e
. The list of labelslabels
specifies all possible locations where the expressione
can be evaluated, which is necessary for performing dataflow analysis later on.
CJUMP(o, e1, e2, l1, l2)
:Evaluate expressions
e1
ande2
, yielding valuesa
,b
.Then compare
a
,b
using the operatoro
. If the result is true , jump tol1
; otherwise jump tol2
.
SEQ(s1, s2, ..., sN)
:The statement
s1
followed bys2
and another and another.
LABEL(n)
:Define the constant value of name
n
. This is like a label definition in assembly language.
- Special Temporaries
Some of the temporaries have a special meaning.
fp
,$fp
The frame pointer.
sp
,$sp
The stack pointer. One cannot read beyond sp.
rv
,$v0
The result register. Functions should store their result there.