Intermediate representation TREE

In Tiger, the Tree language is used as both HIR (High-Level Intermediate Representation) and LIR (Low-Level Intermediate Representation).

In the industry there are different intermediate languages, here are some of them:
Tree language

Here’s a description of the Tree language used as an intermediate representation in compilers. This language was designed by Andrew Appel.

It is used in compilers to represent source code before it is transformed into machine code. It is a low-level language that is easier to manipulate and optimize than the original source code.

HIR is defined by the following grammar:
(* Expressions creation. *)
Exp = "const" int
 |  "name" Label
 |  "temp" Temp
 |  "binop" Oper Exp Exp
 |  "mem" Exp
 |  "call" Exp [{Exp}] "call end"
 |  "eseq" Stm Exp
;

(* Statements creation. *)
Stm = "move" Exp Exp
 |  "sxp" Exp
 |  "jump" Exp Label
 |  "cjump" Relop Exp Exp Label Label
 |  "seq" [{Stm}] "seq end"
 |  "label" Label
 |  "label" Label Literal
 ;

(* Operators creation. *)
Oper  = "add" | "sub" | "mul" | "div" | "mod";
Relop = "eq" | "ne" | "lt" | "gt" | "le" | "ge";
 
Label = Ident;
Temp = fp | rv | sp | Ident;
fp = "fp" | "$fp";
sp    = "sp" | "$sp";
rv = "rv" | "$v0";
Ident = "[$a-zA-Z_][$a-zA-Z_0-9]*";

(* In addition, the following alternative syntax for operators is supported *)
Oper = "(+)"  | "(-)" | "(*)" | "(/)" | "(%)";
Relop = "(=)" | "(<>)" | "(<)" | "(>)" | "(<=)" | "(>=)";
List of trees Instructions

Here is the list of all instructions in the Tree language, and you can find more details in the book (Modern Compiler Implementation).

CONST(i):
  • The integer constant i.

NAME(n):
  • The symbolic constant n (corresponding to an assembly register)

BINOP (o, e1,e2):
  • The application of binary operator o to operands e1 and e2.

  • There are three types of operators: the integer arithmetic operators (PLUS, MINUS, MUL, DIV) the integer bitwise logical operators (AND, OR, XOR), and the integer shift operators (LSHIFT, RSHIFT).

MEM(e):
  • The contents of wordSize bytes of memory starting at address e.

CALL(f, list):
  • A procedure call: the application of function f to argument list list.

ESEQ(s, e):
  • The statement s is evaluated for side effects, then e is evaluated for a result.

MOVE(TEMP t, e):
  • Evaluate e and move it into temporary t.

MOVE(MEM(e1), e2):
  • Evaluate e1 yielding address a. Then evaluate e2, and store the result at the memory starting at a.

EXP(e):
  • Evaluate e and return the result.

JUMP(e, labels):
  • Transfer control (Jump) to address e. The list of labels labels specifies all possible locations where the expression e can be evaluated, which is necessary for performing dataflow analysis later on.

CJUMP(o, e1, e2, l1, l2):
  • Evaluate expressions e1 and e2, yielding values a, b.

  • Then compare a, b using the operator o. If the result is true , jump to l1; otherwise jump to l2.

SEQ(s1, s2, ..., sN):
  • The statement s1 followed by s2 and another and another.

LABEL(n):
  • Define the constant value of name n. This is like a label definition in assembly language.

Special Temporaries

Some of the temporaries have a special meaning.

fp, $fp
  • The frame pointer.

sp, $sp
  • The stack pointer. One cannot read beyond sp.

rv, $v0
  • The result register. Functions should store their result there.