Intermediate representation TREE

In Tiger, the Tree language is used as both HIR (High-Level Intermediate Representation) and LIR (Low-Level Intermediate Representation).

In the industry there are different intermediate languages, here are some of them:
Tree language

Here is a description of the Tree language used as an intermediate representation in compilers. This language was designed by Andrew Appel.

It is used in compilers to represent source code before it is transformed into machine code. It is a low-level language that is easier to manipulate and optimize than the original source code.

HIR is defined by the following grammar:

Note

The grammar is also applicable to LIR. It requires an HIR program to verify specific constraints.

(* Expressions creation. *)
Exp = "const" int
 |  "name" Label
 |  "temp" Temp
 |  "binop" Oper Exp Exp
 |  "mem" Exp
 |  "call" Exp [{Exp}] "call end"
 |  "eseq" Stm Exp
;

(* Statements creation. *)
Stm = "move" Exp Exp
 |  "sxp" Exp
 |  "jump" Exp
 |  "cjump" Relop Exp Exp Label Label
 |  "seq" [{Stm}] "seq end"
 |  "label" Label
 |  "label" Label Literal
 ;

(* Operators creation. *)
Oper  = "add" | "sub" | "mul" | "div" | "mod";
Relop = "eq" | "ne" | "lt" | "gt" | "le" | "ge";
 
Label = Ident;
Temp = fp | rv | sp | Ident;
fp = "fp" | "$fp";
sp    = "sp" | "$sp";
rv = "rv" | "$v0";
Ident = "[$a-zA-Z_][$a-zA-Z_0-9]*";

(* In addition, the following alternative syntax for operators is supported *)
Oper = "(+)"  | "(-)" | "(*)" | "(/)" | "(%)";
Relop = "(=)" | "(<>)" | "(<)" | "(>)" | "(<=)" | "(>=)";
List of trees Instructions

Here is the list of all instructions in the Tree language, and you can find more details in the book (Modern Compiler Implementation).

CONST(i):
  • The integer constant i.

NAME(n):
  • The symbolic constant n (corresponding to an assembly register)

BINOP (o, e1,e2):
  • The application of binary operator o to operands e1 and e2.

  • There are three types of operators: the integer arithmetic operators (PLUS, MINUS, MUL, DIV) the integer bitwise logical operators (AND, OR, XOR), and the integer shift operators (LSHIFT, RSHIFT).

MEM(e):
  • The contents of word size bytes of memory starting at address e.

CALL(f, list):
  • A procedure call: the application of function f to arguments list list.

ESEQ(s, e):
  • The statement s is evaluated for side effects, then e is evaluated for a result.

MOVE(TEMP t, e):
  • Evaluate e and move it into temporary t.

MOVE(MEM(e1), e2):
  • Evaluate e1 yielding address a. Then evaluate e2, and store the result at the memory starting at a.

EXP(e):
  • Evaluate e and return the result.

JUMP(e):
  • Transfer control (Jump) to address e.

  • This instruction is simpler than Appel’s (JUMP(e, labels)). labels specifies all possible locations where the expression e can be evaluated, which is necessary for performing dataflow analysis later on.

  • Our implementation assumes the expression e, used as a destination, is always straightforward.

CJUMP(o, e1, e2, l1, l2):
  • Evaluate expressions e1 and e2, yielding values a, b.

  • Then compare a, b using the operator o. If the result is true , jump to l1; otherwise jump to l2.

SEQ(s1, s2, ..., sN):
  • The statement s1 followed by s2 and another and another.

LABEL(n):
  • Define the constant value of name n. This is like a label definition in assembly language.

Special Temporaries

Some of the temporaries have a special meaning.

fp, $fp
  • The frame pointer.

sp, $sp
  • The stack pointer. One cannot read beyond sp.

rv, $v0
  • The result register. Functions should store their result there.