
Module 6 - Intermediate Code

Uploaded by

fitara8437
Copyright
© © All Rights Reserved

BCSE307L

Compiler Design
MODULE - 4
Dr. WI. Sureshkumar
Associate Professor
School of Computer Science and Engineering (SCOPE)
VIT Chennai
wi.sureshkumar@vit.ac.in
Module -4
Intermediate Code Generation
• Intermediate Languages
• Declarations
• Assignment Statements
• Boolean Expressions
• Case Statements
• Backpatching
• Procedure Calls.
Intermediate code generator
• While translating a source program into its equivalent object code, a
compiler may first generate an intermediate representation.
• The advantages of generating an intermediate representation are
• Ease of conversion from the source program to the intermediate code
• Ease of subsequent processing starting from the intermediate code

Parser --(parse tree)--> Intermediate Code Generator --(intermediate code)--> Code Generator

• The output of a parser is some representation of a parse tree. The
intermediate code generation phase transforms this parse tree into an
intermediate language representation.
Intermediate Representations
There are three kinds of intermediate representations
1. Syntax tree
2. Postfix notation
3. Three address code
Syntax tree
A syntax tree is a condensed form of parse tree useful for representing
language constructs.
Example: syntax tree for a – 4 + c

        +
       / \
      -   c
     / \
    a   4
Construction of syntax tree for expression
• The construction of a syntax tree for an expression is similar to the translation of the
expression into postfix form.
• Each node in a syntax tree can be implemented as a record with several fields.
• The following functions are used to create the nodes of a syntax tree for an
expression with binary operators

1. mknode(op, left, right)
• creates an operator node with label op and two fields containing pointers to left and right
2. mkleaf(id, entry)
• creates an identifier node with label id and a field containing entry, a pointer to the symbol table entry of the identifier
3. mkleaf(num, val)
• creates a number node with label num and a field containing val, the value of the number
Construction of syntax tree for expression
The syntax tree for the expression a – 4 + c
1. P1 = mkleaf(id, entry_a)
2. P2 = mkleaf(num, 4)
3. P3 = mknode(“-”, P1, P2)
4. P4 = mkleaf(id, entry_c)
5. P5 = mknode(“+”, P3, P4)
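The five calls above can be sketched in Python. This is only an illustration: the `Node` record, its field names, and the use of plain strings such as "entry_a" in place of real symbol table pointers are assumptions, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    label: str                           # operator symbol, "id", or "num"
    left: Optional["Node"] = None        # left child (operator nodes only)
    right: Optional["Node"] = None       # right child (operator nodes only)
    value: Union[str, int, None] = None  # symbol table entry or number (leaves only)


def mknode(op, left, right):
    """Create an operator node with label op and pointers to both operands."""
    return Node(op, left, right)


def mkleaf(label, value):
    """Create a leaf with label "id" or "num" and its entry or value."""
    return Node(label, value=value)


# Syntax tree for a - 4 + c, built bottom-up exactly as in steps 1 to 5.
p1 = mkleaf("id", "entry_a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "entry_c")
p5 = mknode("+", p3, p4)
```

The root `p5` is the `+` node, its left child is the `-` node over a and 4, and its right child is the leaf for c.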
DAG (Directed Acyclic Graph)
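• A DAG identifies the common subexpressions of an expression
• A node N in a DAG has more than one parent if N represents a common subexpression
• In a syntax tree, the common subexpression would be replicated as
many times as the subexpression appears in the original expression
DAG for the expression
A DAG for the expression a + a * (b – c) + (b – c) * d is built by the calls listed next. When a call asks for a node that already exists, the existing node is returned, so P2 is the same node as P1, P8 as P3, P9 as P4 and P10 as P5.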
1. P1 = mkleaf(id, entry_a)
2. P2 = mkleaf(id, entry_a)
3. P3 = mkleaf(id, entry_b)
4. P4 = mkleaf(id, entry_c)
5. P5 = mknode(“-”, P3, P4)
6. P6 = mknode(“*”, P2, P5)
7. P7 = mknode(“+”, P1, P6)
8. P8 = mkleaf(id, entry_b)
9. P9 = mkleaf(id, entry_c)
10. P10 = mknode(“-”, P8, P9)
11. P11 = mkleaf(id, entry_d)
12. P12 = mknode(“*”, P10, P11)
13. P13 = mknode(“+”, P7, P12)
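A minimal sketch of how the same thirteen calls can yield a DAG rather than a tree: each constructor first looks the requested node up in a table and returns the existing node if there is one. The `nodes` dictionary and the choice of keys are assumptions made for this illustration.

```python
nodes = {}  # signature of a node -> the node already built for it


def mkleaf(label, value):
    """Return the existing leaf for (label, value), creating it only once."""
    key = (label, value)
    if key not in nodes:
        nodes[key] = {"label": label, "value": value}
    return nodes[key]


def mknode(op, left, right):
    """Return the existing operator node for (op, left, right), creating it only once."""
    key = (op, id(left), id(right))  # children are identified by object identity
    if key not in nodes:
        nodes[key] = {"label": op, "left": left, "right": right}
    return nodes[key]


# The call sequence for a + a * (b - c) + (b - c) * d.
p1 = mkleaf("id", "entry_a")
p2 = mkleaf("id", "entry_a")   # same node as p1
p3 = mkleaf("id", "entry_b")
p4 = mkleaf("id", "entry_c")
p5 = mknode("-", p3, p4)
p6 = mknode("*", p2, p5)
p7 = mknode("+", p1, p6)
p8 = mkleaf("id", "entry_b")   # same node as p3
p9 = mkleaf("id", "entry_c")   # same node as p4
p10 = mknode("-", p8, p9)      # same node as p5
p11 = mkleaf("id", "entry_d")
p12 = mknode("*", p10, p11)
p13 = mknode("+", p7, p12)

print(len(nodes))  # 9 distinct nodes instead of the 13 a syntax tree would need
```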
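[Figure: graph for a + a * (b – c) + (b – c) * d. The root + has children + and *. The inner + has children a and *, and that * has children a and –. The right * has children – and d. Each – has children b and c.]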
Postfix notation
Infix:   a – 4 + c
Postfix: a 4 – c +

Infix:   a + a * (b – c) + (b – c) * d
Postfix: a a b c – * + b c – d * +
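Postfix notation is what a postorder walk of the syntax tree produces: left operand, right operand, then the operator. The sketch below assumes a hypothetical tuple encoding `(op, left, right)` for operator nodes and plain strings for leaves.

```python
def postfix(tree):
    """Return the postfix token list for a tree of (op, left, right) tuples."""
    if isinstance(tree, str):      # leaf: an identifier or a constant
        return [tree]
    op, left, right = tree
    return postfix(left) + postfix(right) + [op]


e1 = ("+", ("-", "a", "4"), "c")
b_minus_c = ("-", "b", "c")
e2 = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))

print(" ".join(postfix(e1)))  # a 4 - c +
print(" ".join(postfix(e2)))  # a a b c - * + b c - d * +
```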
Three address code
• Three address code is a sequence of statements of the form
x := y op z
x, y, z are names, constants, or compiler generated
temporaries and op stands for operator.
Ex: Three address code for the exp a + b * c + d
t1 = b * c
t2 = a + t1
t3 = t2 + d
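One way to obtain such a sequence is a postorder walk that creates a fresh temporary for every operator node. The following is a sketch under the assumption that the expression is given as nested `(op, left, right)` tuples; the function name `gen_tac` is made up for this example.

```python
def gen_tac(tree):
    """Generate three address statements for a tree of (op, left, right) tuples."""
    code = []
    counter = 0

    def walk(t):
        nonlocal counter
        if isinstance(t, str):          # a name or constant is its own address
            return t
        op, left, right = t
        l = walk(left)                  # code for the operands comes first
        r = walk(right)
        counter += 1
        temp = f"t{counter}"            # compiler generated temporary
        code.append(f"{temp} = {l} {op} {r}")
        return temp

    walk(tree)
    return code


# a + b * c + d, with * binding tighter than + and + associating to the left
for line in gen_tac(("+", ("+", "a", ("*", "b", "c")), "d")):
    print(line)
```

For this input the walk emits t1 = b * c, t2 = a + t1 and t3 = t2 + d, the same three statements as the example.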

Exercise: write the three address code for x = (x - y) + (z - x) * (x - z)


Types of Three address statements
• x = y op z (op is binary)
• x = op y (op is unary)
• x=y
• goto L (unconditional jump to label L)
• if x goto L and ifFalse x goto L (conditional jumps to label L)
• if x relop y goto L
Types of Three address statements
• Procedure call p(x1, x2, ..., xn):
– param x1
– param x2
– ...
– param xn
– call p, n
• x = y[i] and x[i] = y (indexed copy)
• x = &y, x = *y, and *x = y (address and pointer assignments)
Data structures for three address codes
• Quadruples
• Has four fields: op, arg1, arg2 and result
• Triples
• Temporaries are not used and instead references to
instructions are made
• Indirect triples
• In addition to triples we use a list of pointers to triples
Quadruples, Triples and Indirect Triples
a = b * minus c + b * minus c

Three address code
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

Quadruples
op      arg1    arg2    result
minus   c               t1
*       b       t1      t2
minus   c               t3
*       b       t3      t4
+       t2      t4      t5
=       t5              a

Triples
        op      arg1    arg2
0       minus   c
1       *       b       (0)
2       minus   c
3       *       b       (2)
4       +       (1)     (3)
5       =       a       (4)

Indirect Triples
statement list          op      arg1    arg2
35      (0)     0       minus   c
36      (1)     1       *       b       (0)
37      (2)     2       minus   c
38      (3)     3       *       b       (2)
39      (4)     4       +       (1)     (3)
40      (5)     5       =       a       (4)
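The relationship between the representations can be sketched in code. The tuple layout and the helper `to_triples` are assumptions for illustration. The conversion replaces every temporary by a reference to the triple that computes it, and the indirect form adds a separate list of pointers into the triple table.

```python
quads = [  # (op, arg1, arg2, result)
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def to_triples(quads):
    """Rewrite quadruples as triples, replacing temporaries by positions."""
    where = {}    # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        return f"({where[arg]})" if arg in where else arg

    for op, arg1, arg2, result in quads:
        if op == "=":
            triples.append(("=", result, ref(arg1)))   # target, then source
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            where[result] = len(triples) - 1
    return triples


triples = to_triples(quads)
statement_list = list(range(len(triples)))  # indirect triples: pointers into the table

for i in statement_list:
    print(i, triples[i])
```

With indirect triples an optimizer can reorder `statement_list` without renumbering the references stored inside the triples themselves.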
Example: Three address code for a < b or c < d and e < f
100: if a < b goto 103
101: t1 = 0
102: goto 104
103: t1 = 1
104: if c < d goto 107
105: t2 = 0
106: goto 108
107: t2 = 1
108: if e < f goto 111
109: t3 = 0
110: goto 112
111: t3 = 1
112: t4 = t2 and t3
113: t5 = t1 or t4
Example:2
while (a < b) do
    if (c < d)
        x = y + z
    else
        x = y – z

100 if a < b goto 102
101 goto 110
102 if c < d goto 104
103 goto 107
104 t1 = y + z
105 x = t1
106 goto 100
107 t2 = y - z
108 x = t2
109 goto 100
110
Example: 3
while (a < c and b > d) do
    if (a = 1)
        c = c + 1
    else
        while (a <= d) do
            a = a + 3

10 if a < c goto 12
11 goto 24
12 if b > d goto 14
13 goto 24
14 if a = 1 goto 16
15 goto 19
16 t1 = c + 1
17 c = t1
18 goto 10
19 if a <= d goto 21
20 goto 10
21 t2 = a + 3
22 a = t2
23 goto 19
24
Example: 4
switch (i + j)
{
    case 1: x = y + z; break;
    case 2: u = v + w; break;
    default: p = q + r;
}

10 t1 = i + j
11 goto 21
12 t2 = y + z
13 x = t2
14 goto 24
15 t3 = v + w
16 u = t3
17 goto 24
18 t4 = q + r
19 p = t4
20 goto 24
21 if t1 = 1 goto 12
22 if t1 = 2 goto 15
23 goto 18
24
Example: 5
switch (a + b)
{
    case 2: { x = y; break; }
    case 5: { switch x
              {
                  case 0: { a = b + 1; break; }
                  case 1: { a = b + 3; break; }
                  default: { a = 2; break; }
              }
              break;
            }
    case 9: { x = y - 1; break; }
    default: { a = 2; break; }
}

10 t1 = a + b
11 goto 32
12 x = y
13 goto 36
14 goto 23
15 t2 = b + 1
16 a = t2
17 goto 36
18 t3 = b + 3
19 a = t3
20 goto 36
21 a = 2
22 goto 36
23 if x = 0 goto 15
24 if x = 1 goto 18
25 goto 21
26 goto 36
27 t4 = y - 1
28 x = t4
29 goto 36
30 a = 2
31 goto 36
32 if t1 = 2 goto 12
33 if t1 = 5 goto 14
34 if t1 = 9 goto 27
35 goto 30
36
Syntax Directed Definition
• The following CFG
S → ABC , A → aA / a , B → bB / b , C → cC / c generates
L = { a^n b^m c^p | m, n, p ≥ 1 }
• We define an attribute grammar (AG) based on the above CFG to
recognize
L = { a^n b^n c^n | n ≥ 1 }
• All the nonterminals have only synthesized attributes:
S.equal , A.count , B.count , C.count
SDD
S → ABC { S.equal = if A.count = B.count & B.count = C.count then T else F }
A → a { A.count = 1 }
A1 → a A2 { A1.count = A2.count + 1 }
B → b { B.count = 1 }
B1 → b B2 { B1.count = B2.count + 1 }
C → c { C.count = 1 }
C1 → c C2 { C1.count = C2.count + 1 }
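The synthesized attributes can be evaluated by a small recursive sketch. It assumes the input is split into its runs of a, b and c with a regular expression instead of a real parser; `count` and `s_equal` are illustrative names.

```python
import re


def count(run):
    """X -> x gives count = 1; X1 -> x X2 gives X1.count = X2.count + 1."""
    if len(run) == 1:
        return 1
    return count(run[1:]) + 1


def s_equal(text):
    """Evaluate S.equal for S -> A B C, or return None if the CFG rejects text."""
    m = re.fullmatch(r"(a+)(b+)(c+)", text)
    if m is None:
        return None                      # not of the form a^n b^m c^p
    a_count, b_count, c_count = (count(run) for run in m.groups())
    return a_count == b_count and b_count == c_count


print(s_equal("aabbcc"))   # True
print(s_equal("aabbbcc"))  # False
print(s_equal("abca"))     # None
```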
• Attribute grammar (AG) for evaluating a real number from its
binary representation, for example
110.101 = 6.625
N → L . R , L → B L / B , R → B R / B , B → 0 / 1
• All the nonterminals have only synthesized attributes.
N { value : real }
L { value : real , length : integer }
B { value : real }
R { value : real }
SDD
• N → L . R { N.value = L.value + R.value }
• L → B { L.value = B.value , L.length = 1 }
• L1 → B L2 { L1.length = L2.length + 1 ,
L1.value = B.value * 2^(L2.length) + L2.value }
• R → B { R.value = B.value / 2 }
• R1 → B R2 { R1.value = (B.value + R2.value) / 2 }
• B → 0 { B.value = 0 }
• B → 1 { B.value = 1 }
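The rules can be checked on 110.101 with a direct transcription into recursive functions. This is a sketch that assumes well-formed input containing exactly one dot; the function names mirror the nonterminals and are not from the slides.

```python
def eval_L(bits):
    """Return (L.value, L.length) for the integer part."""
    b = int(bits[0])                          # B.value
    if len(bits) == 1:                        # L -> B
        return b, 1
    value2, length2 = eval_L(bits[1:])        # L1 -> B L2
    return b * 2 ** length2 + value2, length2 + 1


def eval_R(bits):
    """Return R.value for the fractional part."""
    b = int(bits[0])                          # B.value
    if len(bits) == 1:                        # R -> B
        return b / 2
    return (b + eval_R(bits[1:])) / 2         # R1 -> B R2


def eval_N(text):
    """N -> L . R with N.value = L.value + R.value."""
    left, right = text.split(".")
    return eval_L(left)[0] + eval_R(right)


print(eval_N("110.101"))  # 6.625
```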
Syntax Directed Translation (SDT)
• Most programming language constructs, such as declarations,
expressions, control flow statements and Boolean expressions, are
specified by a CFG.
• The productions of the grammar alone are not sufficient to implement
these constructs; some additional information is
required.
• So we associate each production with a set of
semantic rules.
• A grammar whose productions carry semantic rules is called a Syntax Directed Translation.
SDT for Assignment statement
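Statement: a := b + c * d

Three address code:
t1 = c * d
t2 = b + t1
a = t2

[Figure: parse tree for a := b + c * d. S derives id := E with id = a. E derives E + E, where the left E derives id (b) and the right E derives E * E over id (c) and id (d).]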
SDT for Expressions
E → E1 + T { E.val := E1.val + T.val }
E → T { E.val := T.val }
T → T1 * F { T.val := T1.val * F.val }
T → F { T.val := F.val }
F → id { F.val := id.lexval }
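Example: 5 + 6 * 4

[Figure: parse tree for 5 + 6 * 4. E derives E + T. The left E derives T, then F, then id (5). The right T derives T * F, where T derives F, then id (6), and F derives id (4).]

[Figure: annotated parse tree for 5 + 6 * 4. On the left, F.val = 5 and T.val = 5 give E.val = 5. On the right, F.val = 6, T.val = 6 and F.val = 4 give T.val = 6 * 4 = 24. At the root, E.val = 5 + 24 = 29.]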
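The val attributes can be computed while parsing. The sketch below is a hand-written recursive descent evaluator, so the left-recursive productions are replaced by loops; that restructuring, the tokenizer and the function names are assumptions of this example, and operands are taken to be unsigned integers.

```python
import re


def evaluate(text):
    """Evaluate an expression over +, * and integers, returning E.val."""
    tokens = re.findall(r"\d+|[+*]", text)
    pos = 0

    def F():                       # F -> id   { F.val := id.lexval }
        nonlocal pos
        val = int(tokens[pos])
        pos += 1
        return val

    def T():                       # T -> T1 * F  { T.val := T1.val * F.val }
        nonlocal pos
        val = F()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            val = val * F()
        return val

    def E():                       # E -> E1 + T  { E.val := E1.val + T.val }
        nonlocal pos
        val = T()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            val = val + T()
        return val

    return E()


print(evaluate("5+6*4"))  # 29
```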
SDT for Boolean Expression
Example: Three address code for a < b or c < d and e < f
100: if a < b goto 103
101: t1 = 0
102: goto 104
103: t1 = 1
104: if c < d goto 107
105: t2 = 0
106: goto 108
107: t2 = 1
108: if e < f goto 111
109: t3 = 0
110: goto 112
111: t3 = 1
112: t4 = t2 and t3
113: t5 = t1 or t4
SDT for Flow of Control Statements
if-then:
if (B.code)
{
B.true:

}
B.false:

if-then-else:
if (B.code)
{
B.true:

}
goto S.next
{
B.false:

}
SDT for Declarations
D → T L { L.type := T.type }
T → int { T.type := int }
T → real { T.type := real }
L → L1 , id { L1.type := L.type }
{ enter(id.ptr, L.type) }
L → id { enter(id.ptr, L.type) }
Example: int id1, id2, id3

[Figure: annotated parse tree for int id1, id2, id3. D has children T and L with T.type = int, and L.type = T.type = int. Each L in the chain L → L1 , id inherits L.type = int, down to the last L → id.]
T → B C { T.a := C.a , C.b := B.a }
B → int { B.a := int }
B → float { B.a := float }
C → [num] C1 { C.a := array(num.val, C1.a) ,
C1.b := C.b }
C → ε { C.a := C.b }
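In these rules the inherited attribute b carries the basic type down the chain of C nodes, and the synthesized attribute a builds the array type on the way back up. A sketch of that flow follows, assuming type expressions are written like int[2][3] and represented as nested tuples; `parse_type` is a made-up helper.

```python
import re


def parse_type(text):
    """Return T.a for input such as "int[2][3]"."""
    m = re.fullmatch(r"(int|float)((?:\[\d+\])*)", text)
    if m is None:
        raise ValueError(f"not a type expression: {text!r}")
    base = m.group(1)                                   # B.a
    dims = [int(d) for d in re.findall(r"\[(\d+)\]", m.group(2))]

    def C(dims, b):
        """b is the inherited C.b; the return value is the synthesized C.a."""
        if not dims:                                    # C -> epsilon
            return b
        return ("array", dims[0], C(dims[1:], b))       # C -> [num] C1

    return C(dims, base)                                # T -> B C


print(parse_type("int[2][3]"))  # ('array', 2, ('array', 3, 'int'))
```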
Back-patching
• The easiest way to implement an SDT is to use two passes. In the first pass, construct the
syntax tree for the input. In the second pass, walk through the tree in
depth-first order, computing the translations given in the definition.
• The main problem in generating three address code for Boolean
expressions and flow of control statements in a single pass is that we may not
know the labels that control must go to at the time the jump statements
are generated.
• We can solve this problem by generating a series of branching statements
with target of the jumps temporarily unspecified.
• Each such statement will be put on a list of goto statements whose label
will be filled in when the proper label can be determined.
• We call this subsequent filling in of labels backpatching.
We show how backpatching can be used to generate code for Boolean
expression in one pass.
To manipulate list of labels, we use three functions:
makelist(i) - creates a new list containing only i, an index into the
array of quadruples.
merge(p1, p2) – concatenates the lists pointed to by p1 and p2 , and
returns a pointer to the concatenated list.
backpatch(p, i) – inserts i as the target label for each of the statements
on the list pointed to by p.
Boolean Expressions

(1) E → E1 or M E2
(2) E → E1 and M E2
(3) E → not E1
(4) E → (E1)
(5) E → id1 relop id2
(6) E → true
(7) E → false
(8) M→ ε
• Consider the production E → E1 and M E2 .
• If E1 is false, then E is also false, so E1.falselist become part of
E.falselist.
• If E1 is true, then we must next test E2. So the target for the
statements E1.truelist must be the beginning of the code generated
for E2.
• This target is obtained using the marker nonterminal M.
• Attribute M.quad records the number of the first statement of E2.code
• With the production M → ε we associate the semantic action
{M.quad := nextquad}
Example: a < b or c < d and e < f
Apply production (5) for a < b
100 if a < b goto ___
101 goto ___

Apply production (5) for c < d
102 if c < d goto ___
103 goto ___

Apply production (5) for e < f
104 if e < f goto ___
105 goto ___
[Figure: parse tree for a < b or c < d and e < f]
E
    E1: a < b
    or
    M: ε
    E2
        E1: c < d
        and
        M: ε
        E2: e < f
We now reduce E → E1 and M E2. The corresponding semantic action
calls backpatch({102}, 104).
This call to backpatch fills in 104 in statement 102.
100 if a < b goto ___
101 goto ___
102 if c < d goto 104
103 goto ___
104 if e < f goto ___
105 goto ___
The semantic action associated with the final reduction E → E1 or M E2
calls backpatch({101}, 102).

This call to backpatch fills in 102 in statement 101.


100 if a < b goto ___
101 goto 102
102 if c < d goto 104
103 goto ___
104 if e < f goto ___
105 goto ___
• The entire expression is true if and only if the goto of statement
100 or of statement 104 is reached, so E.truelist = {100, 104}.
• The entire expression is false if and only if the goto of statement
103 or of statement 105 is reached, so E.falselist = {103, 105}.
• These instructions will have their targets filled in later in the
compilation, when it is seen what must be done depending on the truth
or falsehood of the expression.
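The whole walkthrough can be replayed in a short sketch. The helper names makelist, merge and backpatch follow the slides. The quadruple array as a list of strings, the `___` placeholder, and the actions used for or and and (the usual textbook scheme, which the slides describe only informally) are assumptions of this example.

```python
START = 100
quads = []                       # quads[i] holds statement number START + i


def nextquad():
    return START + len(quads)


def emit(text):
    quads.append(text)


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def backpatch(p, target):
    """Fill target into the unfinished jump of every statement on list p."""
    for i in p:
        quads[i - START] = quads[i - START].replace("___", str(target))


def relop(a, op, b):             # E -> id1 relop id2
    truelist = makelist(nextquad())
    falselist = makelist(nextquad() + 1)
    emit(f"if {a} {op} {b} goto ___")
    emit("goto ___")
    return truelist, falselist


def and_(e1, m_quad, e2):        # E -> E1 and M E2
    backpatch(e1[0], m_quad)     # E1 true: continue with E2
    return e2[0], merge(e1[1], e2[1])


def or_(e1, m_quad, e2):         # E -> E1 or M E2
    backpatch(e1[1], m_quad)     # E1 false: continue with E2
    return merge(e1[0], e2[0]), e2[1]


# Reductions in bottom-up order for a < b or c < d and e < f
e_ab = relop("a", "<", "b")      # statements 100 and 101
m1 = nextquad()                  # M.quad = 102
e_cd = relop("c", "<", "d")      # statements 102 and 103
m2 = nextquad()                  # M.quad = 104
e_ef = relop("e", "<", "f")      # statements 104 and 105
truelist, falselist = or_(e_ab, m1, and_(e_cd, m2, e_ef))

for n, q in enumerate(quads, START):
    print(n, q)
print(truelist, falselist)       # [100, 104] [103, 105]
```

Statements 101 and 102 end up with targets 102 and 104, and the four jumps left open are exactly the ones on the final truelist and falselist.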
Procedure calls
Consider a grammar for a simple procedure call statement
(1) S → call id ( Elist ) { for each item p on queue do
emit(‘param’ p)
emit(‘call’ id.place, Elist.count) }
(2) Elist → Elist1 , E { append E.place at the end of the queue
Elist.count = Elist1.count + 1 }
(3) Elist → E { initialize queue to contain only E.place
Elist.count = 1 }
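Example: call read ( initial , final )

[Figure: parse tree. S derives call id ( Elist ) with id = read. Elist derives Elist , E, where the inner Elist derives E (initial) and the outer E is final. The queue first holds initial, then initial, final.]

Generated code:
param initial
param final
call read , 2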
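The three actions can be sketched as one function that receives the places of the already translated arguments in source order. The name `translate_call` and the use of plain strings for places are assumptions for illustration.

```python
def translate_call(name, arg_places):
    """Emit param statements for the queued arguments, then the call itself."""
    queue = []
    count = 0
    for place in arg_places:
        queue.append(place)      # (3) starts the queue, (2) appends to it
        count += 1               # Elist.count
    code = [f"param {p}" for p in queue]       # (1) one param per queued item
    code.append(f"call {name}, {count}")
    return code


for line in translate_call("read", ["initial", "final"]):
    print(line)
```

Queuing the places and emitting all param statements together keeps them adjacent to the call, even when evaluating an argument generates code of its own.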
