Compiler Construction Insights
Syntax-Directed Translation in Semantics Phase
The first method we present for the semantics phase is syntax-directed translation.
Goal 1: Semantic analysis:
a) Check the program to find semantic errors, e.g. type errors, undefined variables, different number of actual and formal parameters in a procedure, ...
b) Gather information for the code generation phase, e.g.
    var a: real;
        b: integer
    begin
        a := b;
        ...
generates code for the transformation:
    a := IntToReal(b); // Note: IntToReal is a function for changing integers to a floating-point value.

Goal: Intermediate Code Generation
Another representation of the source code is generated, a so-called intermediate code representation.
Generation of intermediate code has, among others, the following advantages:
The internal form is:
+ machine-independent
+ not profiled for a certain language
+ suitable for optimization
+ can be used for interpreting
P. Fritzson, C. Kessler, IDA, Linköpings universitet. TDDD55/TDDB44 Compiler Construction, 2011
When the expression is completed, the result is on the top of the stack.
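A minimal sketch of stack-based postfix evaluation, to illustrate the statement above. The function name `eval_postfix`, the `env` dictionary for variable values and the restriction to `+`, `-`, `*` are assumptions made for this example only.

```python
def eval_postfix(tokens, env):
    """Evaluate a postfix token list with an operand stack."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            right = stack.pop()      # the second operand was pushed last
            left = stack.pop()
            stack.append(ops[tok](left, right))
        elif tok.isdigit():
            stack.append(int(tok))   # integer constant
        else:
            stack.append(env[tok])   # variable: look up its value
    return stack[-1]                 # the result is on top of the stack


# 10 + k * 30 in postfix, with k = 2 (value chosen for illustration)
result = eval_postfix("10 k 30 * +".split(), {"k": 2})
```

Operators pop their operands and push the result, so after the last token exactly one value remains.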
Extending Polish/Postfix Notation: Assignment Statement
Assignment
:= binary operator,
   lowest priority for infix form,
   uses the l-value for its first operand
Example:
    x := 10 + k * 30
    ⇓
    x 10 k 30 * + :=

Extending Polish/Postfix Notation: Conditional Statement
We need to introduce the unconditional jump, JUMP, and the conditional jump, JEQZ (Jump if EQual to Zero), and we also need to specify the jump location, LABEL.
    L1 LABEL (or L1: )
    <label> JUMP
    <value> <label> JEQZ
    (value = 0 ⇒ false, otherwise ⇒ true)
Example 1:
    IF <expr> THEN <statement1> ELSE <statement2>
gives us
    <expr> L1 JEQZ <statement1> L2 JUMP L1: <statement2> L2:
Example 2 (a nested IF statement) gives us
a b + L1 JEQZ
c d - L2 JEQZ
x 10 := L3 JUMP
L2: y 20 := L4 JUMP
L1: z 30 := L3: L4:
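A sketch of how conditional and loop statements could be flattened into postfix with JEQZ, JUMP and labels. The tuple encoding of statements, the `"raw"` node kind for already translated fragments and the function name `to_postfix` are assumptions for illustration. Label numbers are allocated in order of first use, so the output is equivalent to the listing above but places `L3:` directly after the inner statement instead of merging it with `L4:`.

```python
import itertools


def to_postfix(stmt, labels=None):
    """Flatten a statement tree into postfix tokens with JEQZ/JUMP/labels."""
    if labels is None:
        labels = itertools.count(1)
    kind = stmt[0]
    if kind == "raw":                        # ("raw", "x 10 :="): already postfix
        return stmt[1].split()
    if kind == "if":                         # ("if", cond, then_part, else_part)
        _, cond, then_part, else_part = stmt
        out = to_postfix(cond, labels)
        else_label = f"L{next(labels)}"
        out += [else_label, "JEQZ"]          # skip then-part if cond is 0
        out += to_postfix(then_part, labels)
        end_label = f"L{next(labels)}"
        out += [end_label, "JUMP", else_label + ":"]
        out += to_postfix(else_part, labels)
        out.append(end_label + ":")
        return out
    if kind == "while":                      # ("while", cond, body)
        _, cond, body = stmt
        exit_label = f"L{next(labels)}"
        top_label = f"L{next(labels)}"
        out = [top_label + ":"] + to_postfix(cond, labels)
        out += [exit_label, "JEQZ"]          # leave the loop if cond is 0
        out += to_postfix(body, labels)
        out += [top_label, "JUMP", exit_label + ":"]
        return out
    raise ValueError(f"unknown statement kind: {kind}")


nested = ("if", ("raw", "a b +"),
          ("if", ("raw", "c d -"), ("raw", "x 10 :="), ("raw", "y 20 :=")),
          ("raw", "z 30 :="))
nested_code = " ".join(to_postfix(nested))

loop = ("while", ("raw", "<expr>"), ("raw", "<stat>"))
loop_code = " ".join(to_postfix(loop))
```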
Representing While
    while <expr> do <stat>
gives us
    L2: <expr> L1 JEQZ <stat> L2 JUMP L1:

Exercise
Translate the repeat and for statements to postfix notation.

Suitable Data Structure for Postfix Code
An array where label corresponds to index.
Array elements:
Operand – pointer to the symbol table.
Operator – a numeric code, for example, which does not collide with the symbol table index.

Abstract Syntax Trees (AST)
ASTs are a reduced variant of parse trees. A parse tree contains redundant information, see the figures below.
Example: Parse tree for a := b * c + d
    <assign>
      <id>  a
      :=
      <expr>
        <expr>
          <term>
            <term>
              <factor>
                <id>  b
            *
            <factor>
              <id>  c
        +
        <term>
          <factor>
            <id>  d
Abstract syntax tree for a := b * c + d :
    :=
      a
      +
        *
          b
          c
        d
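A small sketch connecting the two representations: the abstract syntax tree for a := b * c + d stored as linked nodes, and a post-order walk that yields the postfix form. The class name `Node` and the function `postorder` are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                       # operator symbol or identifier name
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def postorder(node):
    """Visit left subtree, right subtree, then the node itself."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.label]


# AST for  a := b * c + d
ast = Node(":=", Node("a"),
           Node("+", Node("*", Node("b"), Node("c")), Node("d")))
postfix = " ".join(postorder(ast))
```

The AST keeps only operators and operands; the chain nodes of the parse tree (<expr>, <term>, <factor>) carry no extra information and are dropped.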
Properties of Abstract Syntax Trees
Three-address Code and Quadruples
Control Structures Using Quadruples
Example:
    if a = b
    then x := x + 1
    else y := 20;

    Quad-no  op    arg1  arg2  res
    1        =     a     b     T1
    2        JEQZ  T1    (6) †
    3        +     x     1     T2
    4        :=    T2          x

Procedure call
Example: f(a1, a2, ..., an)

    Quad-no  op     arg1  arg2  res
    1        param  a1
    2        param  a2
    ...      ...    ...
    n        param  an
    n+1      call   f     n

Example: READ(X)

Array-reference
A[I] := B

    Quad-no  op   arg1  arg2  res
    1        []=  A     I     T1
    2        :=   B           T1

[ ]= is called l-value; it specifies the address to an element. In l-value context we obtain the storage address from the value of T1.

B := A[I]

    Quad-no  op   arg1  arg2  res
    1        =[]  A     I     T2

Quadruples vs triples
Triples (also called two-address code)
Triples form:
Example: A := B * C + D
No temporary name!
Quadruples:
− Temporary variables take up space in the symbol table.
+ Good control over temporary variables.
+ Easier to optimise and move code around.
Triples:
− Know nothing about temporary variables.
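To make the comparison concrete, here is a sketch that generates both forms for A := B * C + D. The tuple encoding of expressions, the temporary names T1, T2 and the `(n)` notation for referring to an earlier triple are assumptions for this example; the layout of the assignment triple in particular is one common convention, not necessarily the one used in the course.

```python
import itertools


def gen_quads(expr, quads, temps):
    """Return the name holding expr's value, appending quadruples as needed."""
    if isinstance(expr, str):              # leaf: a variable name
        return expr
    op, left, right = expr
    a1 = gen_quads(left, quads, temps)
    a2 = gen_quads(right, quads, temps)
    res = f"T{next(temps)}"                # fresh temporary for the result
    quads.append((op, a1, a2, res))
    return res


def gen_triples(expr, triples):
    """Return an operand: a name, or (n) referring to triple number n."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    a1 = gen_triples(left, triples)
    a2 = gen_triples(right, triples)
    triples.append((op, a1, a2))           # no result field, no temporary name
    return f"({len(triples)})"


rhs = ("+", ("*", "B", "C"), "D")          # B * C + D

quads = []
result = gen_quads(rhs, quads, itertools.count(1))
quads.append((":=", result, "", "A"))      # op, arg1, arg2, res

triples = []
value = gen_triples(rhs, triples)
triples.append((":=", "A", value))
```

In the quadruple list each intermediate value has a name (T1, T2), which makes it possible to reorder quadruples; in the triple list a value is known only by the position of the triple that computes it.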
Methods for Syntax-Directed Translation
There are two main methods:

1. Attribute Grammars
Attribute grammars, 'attributed translation grammars'
Describe the translation process using
a) a CFG
b) a number of attributes that are attached to terminal and nonterminal symbols, and
c) a number of semantic rules that are attached to the rules in the grammar which calculate the value of the attribute.

2. Syntax Directed Translation Scheme
Describe the translation process using:
a) a CFG
b) a number of semantic operations
   e.g. a rule: A → XYZ {semantic operation}
Semantic operations are performed:
   when reduction occurs (bottom-up), or
   during expansion (top-down).
This method is a more procedural form of the previous one (contains implementation details), which explicitly shows the evaluation order of semantic rules.
    ...
    }/* switch */;
    }/* while */;
    }/* parser */;

    Productions          Semantic operations
    1  E → E1 + T        {print('+')}
    2    | T             . . .
    3  T → T1 * F        {print('*')}
    4    | F             . . .
    5  F → ( E )         . . .
    6    | id            {print(id)}

[Figure: parse tree with leaves a, b, d, annotated with the order of the reductions]
Output: a b d * +
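A sketch of the translation scheme above as a recursive-descent (top-down) translator. The left-recursive rules are rewritten as loops, which is an assumption of this example and not the bottom-up parser the listing fragment belongs to; the `print` actions are collected in a list so the result can be inspected. The function name `translate` is hypothetical.

```python
def translate(tokens):
    """Emit postfix for an infix expression, mirroring the print actions."""
    out = []
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():                      # E -> T { + T }
        term()
        while peek() == "+":
            advance()
            term()
            out.append("+")          # action of rule 1: print('+')

    def term():                      # T -> F { * F }
        factor()
        while peek() == "*":
            advance()
            factor()
            out.append("*")          # action of rule 3: print('*')

    def factor():                    # F -> ( E ) | id
        tok = peek()
        if tok == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif tok is not None and tok.isidentifier():
            advance()
            out.append(tok)          # action of rule 6: print(id)
        else:
            raise SyntaxError(f"unexpected token: {tok!r}")

    expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input: {peek()!r}")
    return " ".join(out)


postfix = translate("a + b * d".split())
```

Each operator is emitted only after both of its operands have been translated, which is exactly what placing the semantic operation at the end of the rule achieves.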
TDDD55 Compilers and Interpreters
TDDB44 Compiler Construction
Generating Quadruples
op opnd1 res
Generate Quadruples for if-then-else (2)
Factorised grammar:
1. <ifstmt> ::= <truepart> S2
2. <truepart> ::= <ifclause> S1 else
3. <ifclause> ::= if E then

Attributes:
addr = address to the symbol table entry for result of E
quad = quadruple number

Generate Quadruples for if-then-else (3)
3. <ifclause> ::= if E then
    { <ifclause>.quad = currentquad + 1;
      // save address p of jump over S1 for later in <ifclause>.quad
      GEN ( JEQZ, E.addr, 0, 0 );
      // jump to S2. Target q+1 not known yet.
    }
2. <truepart> ::= <ifclause> S1 else
    { <truepart>.quad = currentquad + 1;
      // save address q of jump over S2 for later
      GEN ( JUMP, 0, 0, 0 );
      // jump over S2. Target r not known yet.
      QUADRUPLE[ <ifclause>.quad ][ 2 ] = currentquad + 1;
      // backpatch JEQZ target to q+1
    }
1. <ifstmt> ::= <truepart> S2
Generate Quadruples for if-then-else (4)
3. <ifclause> ::= if E then
    …
2. <truepart> ::= <ifclause> S1 else
    { <truepart>.quad = currentquad + 1;
      // save address q of jump over S2 for later
      GEN ( JUMP, 0, 0, 0 );
      // jump over S2. Target r not known yet.
      QUADRUPLE[ <ifclause>.quad ][ 2 ] = currentquad + 1;
      // backpatch JEQZ target to q+1
    }
1. <ifstmt> ::= <truepart> S2
    { QUADRUPLE[ <truepart>.quad ][ 1 ] = currentquad + 1;
      // backpatch JUMP target to (r-1)+1
    }
Similarly: while statement, repeat statement …

Generate Quadruples for a while statement
WHILE <E> DO <S>
    in:   quadruples for Temp := <E>
    p:    JEQZ Temp q+1        Jump over <S> if <E> false
          quadruples for <S>
    q:    JUMP in              Jump to the loop-predicate
    q+1:  ...
The grammar factorises on:
1. <while-stat> ::= <while-clause> <S>
2. <while-clause> ::= <while> <E> DO
3. <while> ::= WHILE
An extra attribute, NXTQ, must be introduced here. It has the same meaning as QUAD in the previous example.
3. { <while>.QUAD := NEXTQUAD }
      Rule to find start of <E>
2. { <while-clause>.QUAD := <while>.QUAD;
      Move along start of <E>
     <while-clause>.NXTQ := NEXTQUAD;
      Save the address to the next quadruple.
     GEN(JEQZ, <E>.ADDR, 0, 0)
      Jump position not yet known!
   }
1. { GEN(JUMP, <while-clause>.QUAD, 0, 0);
      Loop, i.e. jump to beginning <E>
     QUADR[<while-clause>.NXTQ, 3] := NEXTQUAD
      (backpatch) Position at the end of <S>
   }
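The backpatching idea can be sketched in a few lines. The class `QuadGen` and the helpers `gen_if` and `gen_while` are hypothetical names for this example; quadruples are stored as `[op, arg1, arg2, res]`, so field 2 is the JEQZ target and field 1 the JUMP target, following the if-then-else rules. Instead of grammar attributes, the saved quadruple numbers live in local variables.

```python
class QuadGen:
    """Tiny quadruple buffer with backpatching (quadruple numbers start at 1)."""

    def __init__(self):
        self.quads = []

    @property
    def next_quad(self):
        return len(self.quads) + 1

    def gen(self, op, arg1=0, arg2=0, res=0):
        self.quads.append([op, arg1, arg2, res])
        return len(self.quads)             # number of the quadruple just emitted

    def backpatch(self, quad_no, field, target):
        self.quads[quad_no - 1][field] = target


def gen_if(g, cond_addr, gen_then, gen_else):
    jeqz = g.gen("JEQZ", cond_addr)        # target not known yet
    gen_then(g)
    jump = g.gen("JUMP")                   # target not known yet
    g.backpatch(jeqz, 2, g.next_quad)      # the else part starts here
    gen_else(g)
    g.backpatch(jump, 1, g.next_quad)      # first quadruple after the statement


def gen_while(g, gen_cond, gen_body):
    start = g.next_quad                    # first quadruple of the condition
    cond_addr = gen_cond(g)
    jeqz = g.gen("JEQZ", cond_addr)        # exit target not known yet
    gen_body(g)
    g.gen("JUMP", start)                   # loop back to the condition
    g.backpatch(jeqz, 2, g.next_quad)      # exit: first quadruple after the loop


# if a = b then x := x + 1 else y := 20
g = QuadGen()
g.gen("=", "a", "b", "T1")
gen_if(
    g, "T1",
    lambda q: (q.gen("+", "x", 1, "T2"), q.gen(":=", "T2", 0, "x")),
    lambda q: q.gen(":=", 20, 0, "y"),
)


def loop_cond(q):
    q.gen("<", "i", "n", "T1")
    return "T1"


w = QuadGen()
gen_while(w, loop_cond, lambda q: q.gen("+", "i", 1, "i"))
```

A jump is emitted with a placeholder target of 0 and its quadruple number is remembered; once the target position is reached, the placeholder is overwritten.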
(cont.)
Attribute grammar for syntax-directed type checking
...
E → num             { E.type = Inttype; }
E → num . num       { E.type = Realtype; }
E → id              { E.type = lookup(id.name).type; }
...
E → E1 op E2        { E.type = (E1.type == Inttype && E2.type == Inttype) ? Inttype :
                        ( E1.type == Inttype && E2.type == Realtype
                          || E1.type == Realtype && E2.type == Inttype
                          || E1.type == Realtype && E2.type == Realtype ) ?
                        Realtype :
                        error("Type error"), Notype; }
type is a synthesised attribute: information flows right-to-left, bottom-up

(cont.)
Attribute grammar extended for assignment statement with implicit type conversion from integer to Real
...
E → E1 op E2        { E.type = ... }
...
S → V := E          { if (V.type == E.type)
                        ... // generate code directly according to type
                      else
                      if (V.type == Inttype && E.type == Realtype)
                        error("Type error");
                      else
                      if (V.type == Realtype && E.type == Inttype) {
                        // Code generation / evaluation with type conversion:
                        E.value = ... ;
                        V.value = ConvertIntToReal( E.value );
                      }
                    }
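A sketch of the same type rules as executable code. The tuple encoding of expression trees, the dictionary used as symbol table, the `errors` list and the returned action strings are assumptions for this example; the undefined-variable check is added for completeness and is not part of the rules above.

```python
INT, REAL, NOTYPE = "Inttype", "Realtype", "Notype"


def expr_type(node, symtab, errors):
    """Synthesise the type attribute bottom-up over a small expression tree."""
    kind = node[0]
    if kind == "num":                        # ("num", "25") or ("num", "2.5")
        return REAL if "." in node[1] else INT
    if kind == "id":                         # ("id", "a")
        if node[1] not in symtab:
            errors.append(f"undefined variable {node[1]}")
            return NOTYPE
        return symtab[node[1]]
    _, _op, left, right = node               # ("op", "+", left, right)
    lt = expr_type(left, symtab, errors)
    rt = expr_type(right, symtab, errors)
    if lt == INT and rt == INT:
        return INT
    if {lt, rt} <= {INT, REAL}:              # at least one Real operand
        return REAL
    errors.append("Type error")
    return NOTYPE


def check_assign(var, expr, symtab, errors):
    """Return the action needed for var := expr, or None on a type error."""
    vt = symtab[var]
    et = expr_type(expr, symtab, errors)
    if et == NOTYPE:
        return None                          # error already reported
    if vt == et:
        return "assign"
    if vt == REAL and et == INT:
        return "IntToReal then assign"       # implicit conversion
    errors.append("Type error")              # Real value into Integer variable
    return None


symtab = {"a": REAL, "b": INT}
errors = []
widening = check_assign("a", ("id", "b"), symtab, errors)
narrowing = check_assign("b", ("id", "a"), symtab, errors)
mixed = expr_type(("op", "+", ("num", "1"), ("num", "2.5")), symtab, [])
```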
(cont.)
Calculator input: 25 + 4 * 3 =
[Figure: annotated parse tree; the root E carries the value 37, and S performs display(37)]
LR Implementation of Attribute Grammars
In an LR parser:
Semantic stack in parallel with the parse stack (common stack pointer)
Each entry can store all attributes of a nonterminal

LR Implementation of Attribute Grammars (comment to picture on the previous slide)
In an LR parser, a semantic action:
    E.val = E1.val + T.val
is translated to a statement:
    val[stkp-2] = val[stkp-2] + val[stkp]
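A minimal sketch of one reduction on the semantic stack, using the calculator input 25 + 4 * 3. The function name `reduce_add` and the convention that the operator token occupies a stack entry holding `None` are assumptions for this example.

```python
def reduce_add(val, stkp):
    """Apply E -> E1 + T on the semantic stack; return the new stack pointer."""
    # val[stkp-2] holds E1.val, val[stkp-1] the '+' entry, val[stkp] T.val
    val[stkp - 2] = val[stkp - 2] + val[stkp]
    return stkp - 2                  # the three entries are replaced by E


# State just before the reduction: 4 * 3 has already been reduced to T.val = 12
val = [25, None, 12]
stkp = 2
stkp = reduce_add(val, stkp)
```

Because the semantic stack shares its pointer with the parse stack, the attribute of the new left-hand side ends up in the slot where the first right-hand-side symbol was stored.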