Intermediate representation TREE
In Tiger, the Tree language is used as both HIR (High-Level Intermediate Representation) and LIR (Low-Level Intermediate Representation).
- In the industry there are different intermediate languages, here are some of them:
LLVM-IR use in Clang.
GIMPLE use in GCC.
JAVA Bytecode use in Java.
- Tree language
Here is a description of the Tree language used as an intermediate representation in compilers. This language was designed by Andrew Appel.
It is used in compilers to represent source code before it is transformed into machine code. It is a low-level language that is easier to manipulate and optimize than the original source code.
- HIR is defined by the following grammar:
Note
The grammar is also applicable to
LIR
. It requires anHIR
program to verify specific constraints.(* Expressions creation. *) Exp = "const" int | "name" Label | "temp" Temp | "binop" Oper Exp Exp | "mem" Exp | "call" Exp [{Exp}] "call end" | "eseq" Stm Exp ; (* Statements creation. *) Stm = "move" Exp Exp | "sxp" Exp | "jump" Exp | "cjump" Relop Exp Exp Label Label | "seq" [{Stm}] "seq end" | "label" Label | "label" Label Literal ; (* Operators creation. *) Oper = "add" | "sub" | "mul" | "div" | "mod"; Relop = "eq" | "ne" | "lt" | "gt" | "le" | "ge"; Label = Ident; Temp = fp | rv | sp | Ident; fp = "fp" | "$fp"; sp = "sp" | "$sp"; rv = "rv" | "$v0"; Ident = "[$a-zA-Z_][$a-zA-Z_0-9]*"; (* In addition, the following alternative syntax for operators is supported *) Oper = "(+)" | "(-)" | "(*)" | "(/)" | "(%)"; Relop = "(=)" | "(<>)" | "(<)" | "(>)" | "(<=)" | "(>=)";
- List of trees Instructions
Here is the list of all instructions in the Tree language, and you can find more details in the book (Modern Compiler Implementation).
CONST(i)
:The integer constant
i
.
NAME(n)
:The symbolic constant
n
(corresponding to an assembly register)
BINOP (o, e1,e2)
:The application of binary operator
o
to operandse1
ande2
.There are three types of operators: the integer arithmetic operators (
PLUS, MINUS, MUL, DIV
) the integer bitwise logical operators (AND, OR, XOR
), and the integer shift operators (LSHIFT, RSHIFT
).
MEM(e)
:The contents of
word size
bytes of memory starting at addresse
.
CALL(f, list)
:A procedure call: the application of function
f
to arguments listlist
.
ESEQ(s, e)
:The statement
s
is evaluated for side effects, thene
is evaluated for a result.
MOVE(TEMP t, e)
:Evaluate
e
and move it into temporaryt
.
MOVE(MEM(e1), e2)
:Evaluate
e1
yielding addressa
. Then evaluatee2
, and store the result at the memory starting ata
.
EXP(e)
:Evaluate
e
and return the result.
JUMP(e)
:Transfer control (Jump) to address
e
.This instruction is simpler than Appel’s (
JUMP(e, labels)
).labels
specifies all possible locations where the expressione
can be evaluated, which is necessary for performing dataflow analysis later on.Our implementation assumes the expression
e
, used as a destination, is always straightforward.
CJUMP(o, e1, e2, l1, l2)
:Evaluate expressions
e1
ande2
, yielding valuesa
,b
.Then compare
a
,b
using the operatoro
. If the result is true , jump tol1
; otherwise jump tol2
.
SEQ(s1, s2, ..., sN)
:The statement
s1
followed bys2
and another and another.
LABEL(n)
:Define the constant value of name
n
. This is like a label definition in assembly language.
- Special Temporaries
Some of the temporaries have a special meaning.
fp
,$fp
The frame pointer.
sp
,$sp
The stack pointer. One cannot read beyond
sp
.
rv
,$v0
The result register. Functions should store their result there.