TC-1/2-Lexer FAQ
- Translating escapes in the scanner (or not)
Escapes in strings can be translated at the scanning stage, or kept as is. That is, the string "\n" can produce a token STRING whose semantic value is either the newline character (translation) or the two characters \n, a backslash followed by n (no translation). You are free to choose your favorite implementation, but keep in mind that if you translate, you’ll have to untranslate later (i.e. convert the newline character back to \n).
We encourage you to do this translation, but the other solution is also correct, as long as the next steps of your compiler follow the same conventions as your input.
You must check for bad escapes whatever solution you choose (see Lexical Specifications).
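As a rough illustration of the "translate, then untranslate" option, here is a minimal Python sketch. It is not the project's scanner: the names `ESCAPES`, `translate` and `untranslate` are hypothetical, and the escape table is a deliberately small assumed subset (the authoritative list is in the Lexical Specifications). It only shows the round trip and the bad-escape check.

```python
# Assumed subset of escapes, for illustration only; the real list lives in
# the Lexical Specifications.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}
# Inverse table: translated character -> its escaped spelling.
UNESCAPES = {value: "\\" + key for key, value in ESCAPES.items()}


def translate(body: str) -> str:
    """Translate the raw text found between the quotes of a string literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        # A backslash must be followed by a known escape character.
        if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
            raise ValueError(f"bad escape at offset {i}")
        out.append(ESCAPES[body[i + 1]])
        i += 2
    return "".join(out)


def untranslate(value: str) -> str:
    """Convert a translated value back to its escaped spelling."""
    return "".join(UNESCAPES.get(ch, ch) for ch in value)
```

With this sketch, `translate("a\\nb")` yields a three-character string containing a real newline, and `untranslate` maps it back to the four-character spelling. If you keep escapes untranslated instead, only the bad-escape check remains in the scanner.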
- What values can be represented by an int?
The set of valid integer values is the set of signed 32-bit integers in two’s complement, that is, the integer interval \([-2^{31}, 2^{31}-1]\).
- What values can be represented by an integer literal?
Although an integer value can be any number in \([-2^{31}, 2^{31}-1]\), the value \(-2^{31} (= -2147483648)\) cannot be written as a literal, for technical reasons.
Because the unary - operator is handled independently of integer literals, a literal’s value is limited to \(2^{31}-1 (= 2147483647)\). Even though -2147483648 would denote a valid integer, it is scanned as two tokens, - and 2147483648. The latter exceeds \(2^{31}-1\), so it is not a valid integer literal. It is however possible to create an integer value representing this number.
To put it in a nutshell, in the following declarations the first one is not valid whereas the second one is:
/* invalid, produces a lexing error */
var i := -2147483648
/* valid, is correctly lexed */
var i := -2147483647 - 1
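The behaviour of these two declarations can be reproduced with a toy scanner. The Python sketch below is only an assumption-laden illustration: the token names `INT` and `MINUS`, the function `lex` and the error messages are hypothetical, and it recognizes nothing but digits, - and blanks. What it does show is that the range check applies to the digits alone, before any unary minus is considered.

```python
import re

INT_MAX = 2**31 - 1  # largest value an integer literal may denote

# Optional blanks, then either a run of digits or a single minus sign.
TOKEN = re.compile(r"\s*(?:(\d+)|(-))")


def lex(source: str):
    """Split a string of digits, '-' and blanks into (kind, value) tokens."""
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        digits, _minus = match.groups()
        if digits is not None:
            value = int(digits)
            # The sign is a separate token, so only the magnitude is checked.
            if value > INT_MAX:
                raise ValueError(f"integer literal out of range: {digits}")
            tokens.append(("INT", value))
        else:
            tokens.append(("MINUS", None))
        pos = match.end()
    return tokens
```

Here `lex("-2147483648")` raises an error because 2147483648 is checked on its own, while `lex("-2147483647 - 1")` succeeds and the expression later evaluates to \(-2^{31}\).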