TC-1 Lexing & Parsing Samples

The only information the compiler provides at this stage is about lexical and syntax errors. If there are no errors, the compiler stays silent and exits successfully:

test01.tig
/* An array type and an array variable. */
let
  type  arrtype = array of int
  var arr1 : arrtype := arrtype [10] of 0
in
  arr1[2]
end
tc -X --parse test01.tig
$ tc -X --parse test01.tig

$ echo $?
0

If there are lexical errors, the exit status is 2, and an error message is printed on the standard error stream. Its format is standard and mandatory: the file name, the (precise) location, and then the message (See Errors).

unterminated-comment.tig
1
 /* This comment starts at /* 2.2 */
tc -X --parse unterminated-comment.tig
$ tc -X --parse unterminated-comment.tig
unterminated-comment.tig:2.2-3.0: unexpected end of file in a comment
$ echo $?
2

If there are syntax errors, the exit status is set to 3:

type-nil.tig
let var a : nil := ()
in
  1
end
tc -X --parse type-nil.tig
$ tc -X --parse type-nil.tig
type-nil.tig:1.13-15: syntax error, unexpected nil, expecting identifier or _namety
$ echo $?
3

If there are errors that are neither lexical nor syntactic, the exit status is set to 1:

tc no_path/no_file.tig
$ tc no_path/no_file.tig
tc: cannot open `no_path/no_file.tig': No such file or directory
$ echo $?
1

The option --scan-trace activates the debug mode of the lexer, which makes it possible to see the tokens it produces.
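
If the scanner is generated by Flex, as is common alongside Bison, such a debug mode is typically compiled in with %option debug and switched on at run time. The following minimal sketch shows how a --scan-trace flag could be wired to a yyFlexLexer-based C++ scanner; the function and parameter names are illustrative, not tc's actual interface.

// Minimal sketch: enabling Flex's built-in tracing for --scan-trace.
// Assumptions: the scanner was compiled with `%option debug` and derives
// from yyFlexLexer; the names here are illustrative, not tc's actual code.
#include <FlexLexer.h>

void
enable_scan_trace(yyFlexLexer& lexer, bool scan_trace_p)
{
  // When tracing is compiled in, set_debug(1) makes the scanner report
  // each rule it matches on the standard error output.
  lexer.set_debug(scan_trace_p ? 1 : 0);
}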

The option --parse-trace, which relies on Bison’s %debug and %printer directives, must work properly [1] (a sketch of the directives involved is given at the end of this section):

a+a.tig
a + "a"
tc -X --parse-trace --parse a+a.tig
$ tc -X --parse-trace --parse a+a.tig
Parsing file: "a+a.tig"
Starting parse
Entering state 0
Reading a token
Next token is token identifier (a+a.tig:1.1: a)
Shifting token identifier (a+a.tig:1.1: a)
Entering state 2
Reading a token
Next token is token + (a+a.tig:1.3: )
Reducing stack 0 by rule 92 (line 612):
   $1 = token identifier (a+a.tig:1.1: a)
-> $$ = nterm varid (a+a.tig:1.1: a)
Entering state 35
Reducing stack 0 by rule 39 (line 407):
   $1 = nterm varid (a+a.tig:1.1: a)
-> $$ = nterm lvalue (a+a.tig:1.1: a)
Entering state 27
Next token is token + (a+a.tig:1.3: )
Reducing stack 0 by rule 36 (line 400):
   $1 = nterm lvalue (a+a.tig:1.1: a)
-> $$ = nterm exp.1 (a+a.tig:1.1: a)
Entering state 26
Reducing stack 0 by rule 35 (line 395):
   $1 = nterm exp.1 (a+a.tig:1.1: a)
-> $$ = nterm exp (a+a.tig:1.1: a)
Entering state 25
Next token is token + (a+a.tig:1.3: )
Shifting token + (a+a.tig:1.3: )
Entering state 76
Reading a token
Next token is token string (a+a.tig:1.5-7: a)
Shifting token string (a+a.tig:1.5-7: a)
Entering state 1
Reducing stack 0 by rule 4 (line 300):
   $1 = token string (a+a.tig:1.5-7: a)
-> $$ = nterm exp (a+a.tig:1.5-7: "a")
Entering state 122
Reading a token
Now at end of input.
Reducing stack 0 by rule 29 (line 380):
   $1 = nterm exp (a+a.tig:1.1: a)
   $2 = token + (a+a.tig:1.3: )
   $3 = nterm exp (a+a.tig:1.5-7: "a")
-> $$ = nterm exp (a+a.tig:1.1-7: a + "a")
Entering state 25
Now at end of input.
Reducing stack 0 by rule 1 (line 289):
   $1 = nterm exp (a+a.tig:1.1-7: a + "a")
-> $$ = nterm program (a+a.tig:1.1-7: )
Entering state 24
Now at end of input.
Shifting token end of file (:2.0: )
Entering state 65
Cleanup: popping token end of file (:2.0: )
Cleanup: popping nterm program (a+a.tig:1.1-7: )
Parsing string: function _main() = (_exp(0); ())
Starting parse
Entering state 0
Reading a token
Next token is token function (:1.1-8: )
Shifting token function (:1.1-8: )
Entering state 8
Reading a token
Next token is token identifier (:1.10-14: _main)
Shifting token identifier (:1.10-14: _main)
Entering state 45
Reading a token
Next token is token ( (:1.10-15: )
Shifting token ( (:1.10-15: )
Entering state 96
Reading a token
Next token is token ) (:1.10-16: )
Reducing stack 0 by rule 97 (line 632):
-> $$ = nterm funargs (:1.16: )
Entering state 148
Next token is token ) (:1.10-16: )
Shifting token ) (:1.10-16: )
Entering state 191
Reading a token
Next token is token = (:1.18: )
Reducing stack 0 by rule 88 (line 594):
-> $$ = nterm typeid.opt (:1.17: )
Entering state 220
Next token is token = (:1.18: )
Shifting token = (:1.18: )
Entering state 236
Reading a token
Next token is token ( (:1.20: )
Shifting token ( (:1.20: )
Entering state 12
Reading a token
Next token is token _exp (:1.20-24: )
Shifting token _exp (:1.20-24: )
Entering state 20
Reading a token
Next token is token ( (:1.20-25: )
Shifting token ( (:1.20-25: )
Entering state 61
Reading a token
Next token is token integer (:1.20-26: 0)
Shifting token integer (:1.20-26: 0)
Entering state 108
Reading a token
Next token is token ) (:1.20-27: )
Shifting token ) (:1.20-27: )
Entering state 167
Reducing stack 0 by rule 38 (line 402):
   $1 = token _exp (:1.20-24: )
   $2 = token ( (:1.20-25: )
   $3 = token integer (:1.20-26: 0)
   $4 = token ) (:1.20-27: )
-> $$ = nterm exp.1 (:1.20-27: a + "a")
Entering state 26
Reducing stack 0 by rule 35 (line 395):
   $1 = nterm exp.1 (:1.20-27: a + "a")
-> $$ = nterm exp (:1.20-27: a + "a")
Entering state 50
Reading a token
Next token is token ; (:1.20-28: )
Reducing stack 0 by rule 50 (line 434):
   $1 = nterm exp (:1.20-27: a + "a")
-> $$ = nterm exps.1 (:1.20-27: a + "a")
Entering state 51
Next token is token ; (:1.20-28: )
Shifting token ; (:1.20-28: )
Entering state 102
Reading a token
Next token is token ( (:1.30: )
Shifting token ( (:1.30: )
Entering state 12
Reading a token
Next token is token ) (:1.30-31: )
Reducing stack 0 by rule 54 (line 446):
-> $$ = nterm exps.0.2 (:1.31: )
Entering state 53
Next token is token ) (:1.30-31: )
Shifting token ) (:1.30-31: )
Entering state 103
Reducing stack 0 by rule 11 (line 325):
   $1 = token ( (:1.30: )
   $2 = nterm exps.0.2 (:1.31: )
   $3 = token ) (:1.30-31: )
-> $$ = nterm exp (:1.30-31: ())
Entering state 157
Reading a token
Next token is token ) (:1.30-32: )
Reducing stack 0 by rule 53 (line 441):
   $1 = nterm exps.1 (:1.20-27: a + "a")
   $2 = token ; (:1.20-28: )
   $3 = nterm exp (:1.30-31: ())
-> $$ = nterm exps.2 (:1.20-31: a + "a", ())
Entering state 52
Reducing stack 0 by rule 55 (line 447):
   $1 = nterm exps.2 (:1.20-31: a + "a", ())
-> $$ = nterm exps.0.2 (:1.20-31: a + "a", ())
Entering state 53
Next token is token ) (:1.30-32: )
Shifting token ) (:1.30-32: )
Entering state 103
Reducing stack 0 by rule 11 (line 325):
   $1 = token ( (:1.20: )
   $2 = nterm exps.0.2 (:1.20-31: a + "a", ())
   $3 = token ) (:1.30-32: )
-> $$ = nterm exp (:1.20-32: (
  a + "a";
  ()
))
Entering state 244
Reading a token
Now at end of input.
Reducing stack 0 by rule 95 (line 625):
   $1 = token function (:1.1-8: )
   $2 = token identifier (:1.10-14: _main)
   $3 = token ( (:1.10-15: )
   $4 = nterm funargs (:1.16: )
   $5 = token ) (:1.10-16: )
   $6 = nterm typeid.opt (:1.17: )
   $7 = token = (:1.18: )
   $8 = nterm exp (:1.20-32: (
  a + "a";
  ()
))
-> $$ = nterm fundec (:1.1-32: 
function _main() =
  (
    a + "a";
    ()
  ))
Entering state 37
Now at end of input.
Reducing stack 0 by rule 93 (line 620):
   $1 = nterm fundec (:1.1-32: 
function _main() =
  (
    a + "a";
    ()
  ))
-> $$ = nterm funchunk (:1.1-32: 
function _main() =
  (
    a + "a";
    ()
  ))
Entering state 36
Now at end of input.
Reducing stack 0 by rule 56 (line 466):
-> $$ = nterm chunks (:1.33: )
Entering state 85
Reducing stack 0 by rule 59 (line 470):
   $1 = nterm funchunk (:1.1-32: 
function _main() =
  (
    a + "a";
    ()
  ))
   $2 = nterm chunks (:1.33: )
-> $$ = nterm chunks (:1.1-32: 
function _main() =
  (
    a + "a";
    ()
  ))
Entering state 29
Reducing stack 0 by rule 2 (line 292):
   $1 = nterm chunks (:1.1-32: 
function _main() =
  (
    a + "a";
    ()
  ))
-> $$ = nterm program (:1.1-32: )
Entering state 24
Now at end of input.
Shifting token end of file (:1.32: )
Entering state 65
Cleanup: popping token end of file (:1.32: )
Cleanup: popping nterm program (:1.1-32: )
$ echo $?
0

Several points have to be noted. First, --parse is needed. Second, the program cannot see that the variable is not declared, nor that there is a type error, since binding and type checking are yet to be implemented at this stage. Finally, the output might differ slightly depending on the version of Bison you use; what matters is that one can see the tokens "identifier" (a) and "string" (a).
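
For reference, the --parse-trace output above comes straight from Bison's tracing machinery: the grammar enables it with %debug (or %define parse.trace), and the %printer directives decide how each semantic value is rendered in the trace, which is how the a and "a" above are displayed. The following minimal sketch shows how such a flag could be switched on from the C++ side of a lalr1.cc parser; the header, class, and function names are illustrative, not tc's actual interface.

// Minimal sketch: enabling Bison's tracing for --parse-trace.
// Assumptions: a lalr1.cc parser generated with `%debug` (or
// `%define parse.trace`); "parse/parser.hh" and the names below are
// illustrative, not tc's actual code.
#include "parse/parser.hh"

void
enable_parse_trace(yy::parser& parser, bool parse_trace_p)
{
  // set_debug_level() is available only when tracing support is compiled in;
  // a nonzero level makes the parser dump its shifts, reductions, and the
  // semantic values rendered by the %printer directives.
  parser.set_debug_level(parse_trace_p ? 1 : 0);
}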