Contents
• Variants of Syntax trees
• Three Address Code Types
• Declarations
• Procedures
• Assignment Statements
• Translation of Expressions
• Control Flow
• Back Patching
• Switch
• Case Statements
Intermediate Code
• A language between the source language and the target language
• Intermediate code is generated because producing good machine code
directly from the source program in one pass is difficult.
• The front end translates the source program into intermediate code, from
which the back end can generate machine code efficiently
• Provides an intermediate level of abstraction
• More details than source
• Fewer details than the target
Benefits of Intermediate Codes
• A compiler for a different machine can be created by attaching
a back end for the new machine to an existing front end.
• A compiler for a different source language (on the same
machine) can be created by attaching a front end for the
new source language to an existing back end.
• A machine independent code optimizer can be applied to
intermediate code in order to optimize the code generation.
Different Intermediate Forms
1. Abstract syntax tree & DAG
2. Postfix notation
3. Three address code
1. Abstract syntax tree & DAG
• A syntax tree depicts the natural hierarchical structure of a source
program.
• A DAG gives the same information but in a more compact way because
common sub-expressions are identified.
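• Ex: a=b*-c+b*-c
[Figure: syntax tree and DAG for a = b * -c + b * -c. In the syntax tree the
subtree b * (uminus c) appears twice under +; in the DAG both operands of +
point to one shared b * (uminus c) node.]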
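The sharing of common sub-expressions can be sketched in a few lines of Python. This is only an illustration, not the construction used in the slides: the class name `DagBuilder` and its methods are hypothetical, and a node is reused whenever the same operator has already been applied to the same children.

```python
class DagBuilder:
    """Hypothetical helper: creates a node only if its signature is new."""

    def __init__(self):
        self.nodes = []   # node id -> (op, left id, right id)
        self.index = {}   # (op, left id, right id) -> node id

    def node(self, op, left=None, right=None):
        key = (op, left, right)
        if key not in self.index:          # first time this value is seen
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]             # otherwise reuse the old node

    def leaf(self, name):
        return self.node(name)


# a = b * -c + b * -c
dag = DagBuilder()
a, b, c = dag.leaf("a"), dag.leaf("b"), dag.leaf("c")
left = dag.node("*", b, dag.node("uminus", c))
right = dag.node("*", dag.leaf("b"), dag.node("uminus", dag.leaf("c")))
root = dag.node("=", a, dag.node("+", left, right))

print(left == right)     # True: both operands of + are the same node
print(len(dag.nodes))    # 7 nodes, against 11 in the syntax tree
```

The node ids handed out here play the role of the value numbers asked for in Exercise 2.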
Directed Acyclic Graph (DAG)
• A DAG can be built from the three-address code produced by
intermediate code generation: each computed value gets one node, and
that node is shared by every statement that uses the value.
• It shows how the value computed by a statement is used in
subsequent statements
• Example :
T1 = a + b
T2 = T1 + c
T3 = T1 * T2
Example - DAG
a + a * (b - c) + (b - c) * d
Exercise 1
Construct the DAG for the expression
((x + y) - ((x + y) * (x - y))) + ((x + y) * (x - y))
Exercise 2
Construct the DAG and identify the value numbers for the
subexpressions of the following expressions, assuming +
associates from the left.
a) a + b + (a + b)
b) a + b + a + b
c) a + a + (a + a + a + (a + a + a + a))
Solution
a) b) c)
Syntax-directed definition to produce
syntax trees or DAG
a +a*(b-c)+ (b-c)*d
Steps for constructing the DAG
2. Postfix Notation
• Postfix notation is a linearization of a syntax tree.
• In postfix notation the operands come first, followed by the
operator that applies to them.
• Ex: (A + B) * (C + D)
Postfix notation: A B + C D + *
• Ex: (A + B) * C
Postfix notation: A B + C *
• Ex: (A * B) + (C * D)
Postfix notation: A B * C D * +
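Since postfix notation is a linearization of the syntax tree, it can be obtained with a post-order walk. The sketch below assumes a tree encoded as nested tuples `(op, left, right)` with strings as leaves; that encoding and the function name `postfix` are choices made here for illustration.

```python
def postfix(tree):
    """Post-order walk: left operand, right operand, then the operator."""
    if isinstance(tree, str):          # a leaf is just an operand name
        return [tree]
    op, left, right = tree
    return postfix(left) + postfix(right) + [op]


# (A + B) * (C + D)
print(" ".join(postfix(("*", ("+", "A", "B"), ("+", "C", "D")))))
# A B + C D + *
```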
3. Three address code
• Three address code is a sequence of statements of the general form,
a:= b op c
• Where a, b and c are operands (names, constants or compiler-generated
temporaries) and op stands for any operator.
• Example: a = b + c + d
t1=b+c
t2=t1+d
a=t2
• Here t1 and t2 are the temporary names generated by the compiler.
• It is called three-address code because each statement has at most three
addresses (two for the operands and one for the result).
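A minimal sketch of how the temporaries t1, t2 arise, assuming the expression is already available as a nested tuple `(op, left, right)`. The function name `gen_tac` and the tuple encoding are hypothetical; a real compiler would do this inside its syntax-directed translation.

```python
import itertools


def gen_tac(target, tree):
    """Emit one three-address statement per operator, bottom-up."""
    code = []
    counter = itertools.count(1)

    def walk(node):
        if isinstance(node, str):      # name or constant: no code needed
            return node
        op, left, right = node
        l = walk(left)
        r = walk(right)
        temp = f"t{next(counter)}"     # fresh compiler-generated temporary
        code.append(f"{temp}={l}{op}{r}")
        return temp

    result = walk(tree)
    code.append(f"{target}={result}")
    return code


# a = b + c + d, with + associating from the left
for line in gen_tac("a", ("+", ("+", "b", "c"), "d")):
    print(line)
```

For this input the sketch prints t1=b+c, t2=t1+d and a=t2, matching the example above.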
Different Representation of Three Address Code
• There are three types of representation used for three address
code:
1. Quadruples
2. Triples
3. Indirect triples
• Ex: x= -a*b + -a*b
t1= - a
t2 = t1 * b
t3= - a
t4 = t3 * b
t5 = t2 + t4
x= t5
Quadruple
• A quadruple is a structure with at most four fields: op, arg1,
arg2 and result.
• The op field holds the internal code for the operator.
• The arg1 and arg2 fields hold the two operands.
• The result field holds the result of the expression.
Quadruple
Statement forms and the fields they use:
x = y op z
x = op y                   [No arg2]
x = y                      [No arg2]
param x                    [No arg2, result]
goto L                     [L result]
if x relop y then goto L   [L result]

x= -a*b + -a*b
No.   Statement       Operator   Arg1   Arg2   Result
(0)   t1 = - a        uminus     a             t1
(1)   t2 = t1 * b     *          t1     b      t2
(2)   t3 = - a        uminus     a             t3
(3)   t4 = t3 * b     *          t3     b      t4
(4)   t5 = t2 + t4    +          t2     t4     t5
(5)   x = t5          =          t5            x
Each name is a pointer to an entry in the symbol table (including the temp
variables).
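One way to picture a quadruple is as a record with four named fields. The sketch below uses a `NamedTuple`; the type name `Quad` and the helper `show` are illustrative assumptions, and plain strings stand in for the symbol-table pointers.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]    # None when the statement has no second operand
    result: str


# x = -a*b + -a*b
quads = [
    Quad("uminus", "a", None, "t1"),
    Quad("*", "t1", "b", "t2"),
    Quad("uminus", "a", None, "t3"),
    Quad("*", "t3", "b", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad("=", "t5", None, "x"),
]


def show(q):
    """Turn a quadruple back into a three-address statement."""
    if q.op == "=":
        return f"{q.result} = {q.arg1}"
    if q.arg2 is None:
        symbol = "-" if q.op == "uminus" else q.op
        return f"{q.result} = {symbol} {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"


for number, q in enumerate(quads):
    print(f"({number}) {show(q)}")
```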
Triple
• To avoid entering temporary names into the symbol table, we might refer
a temporary value by the position of the statement that computes it.
• If we do so, three address statements can be represented by records with
only three fields: op, arg1 and arg2.
x= -a*b + -a*b

Quadruple
No.   Operator   Arg1   Arg2   Result
(0)   uminus     a             t1
(1)   *          t1     b      t2
(2)   uminus     a             t3
(3)   *          t3     b      t4
(4)   +          t2     t4     t5
(5)   =          t5            x

Triple
No.   Operator   Arg1   Arg2
(0)   uminus     a
(1)   *          (0)    b
(2)   uminus     a
(3)   *          (2)    b
(4)   +          (1)    (3)
(5)   =          x      (4)
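The step from quadruples to triples can be sketched as a small conversion: every temporary name is replaced by the position of the statement that computes it. The function name `quads_to_triples` and the tuple encoding are assumptions made for this illustration; integers stand for positions such as (0) and strings for names.

```python
def quads_to_triples(quads):
    """Replace each temporary by the index of the triple that computes it."""
    where = {}      # temporary name -> position of its defining triple
    triples = []
    for op, arg1, arg2, result in quads:
        a1 = where.get(arg1, arg1)
        a2 = where.get(arg2, arg2)
        if op == "=":
            # copy statement: target name in arg1, value in arg2
            triples.append((op, result, a1))
        else:
            where[result] = len(triples)
            triples.append((op, a1, a2))
    return triples


quads = [
    ("uminus", "a", None, "t1"),
    ("*", "t1", "b", "t2"),
    ("uminus", "a", None, "t3"),
    ("*", "t3", "b", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "x"),
]

for number, triple in enumerate(quads_to_triples(quads)):
    print(number, triple)
```

No temporary name survives in the output, which is why triples need no symbol-table entries for t1 to t5.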
Indirect Triple
• In the indirect triple representation, a separate list of pointers to
triples gives the order of the statements, and this list is used instead
of the statement numbers themselves.
• This implementation is called indirect triples.
Triple
No.   Operator   Arg1   Arg2
(0)   uminus     a
(1)   *          (0)    b
(2)   uminus     a
(3)   *          (2)    b
(4)   +          (1)    (3)
(5)   =          x      (4)

Indirect Triple
Statement       No.   Operator   Arg1   Arg2
(0)   (14)      (0)   uminus     a
(1)   (15)      (1)   *          (14)   b
(2)   (16)      (2)   uminus     a
(3)   (17)      (3)   *          (16)   b
(4)   (18)      (4)   +          (15)   (17)
(5)   (19)      (5)   =          x      (18)
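A small sketch of why the extra pointer list is useful. It assumes the triples are stored at positions 14 to 19, the values shown in the statement list, and that operands refer to those positions; the variable names are hypothetical. Reordering statements then changes only the pointer list, never the operand references.

```python
# Triples stored by position; integers in arg fields refer to positions.
triples = {
    14: ("uminus", "a", None),
    15: ("*", 14, "b"),
    16: ("uminus", "a", None),
    17: ("*", 16, "b"),
    18: ("+", 15, 17),
    19: ("=", "x", 18),
}

# Statement list: execution order expressed as pointers to triples.
statements = [14, 15, 16, 17, 18, 19]

# Move the second uminus ahead of the first multiplication.
# Only the pointer list changes; no triple has to be renumbered.
statements[1], statements[2] = statements[2], statements[1]

for position, pointer in enumerate(statements):
    print(position, pointer, triples[pointer])
```

With plain triples the same move would change statement numbers and force every reference such as (1) or (2) to be rewritten.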
Example 2
Translate the arithmetic expression a + -(b + c) into:
a) A syntax tree.
b) Quadruples.
c) Triples.
d) Indirect triples.
Solution
Syntax Tree Quadruples
Triples Indirect Triples
Exercise
Write the quadruples, triples and indirect triples for the following:
1. a = b[i] + c[j]
2. a[i] = b*c - b*d
3. x = f(y+1) + 2
Solution 1
Quadruples Indirect Triples
Triples
Solution 2
Quadruples Indirect Triples
Triples
Solution 3
Quadruples Indirect Triples
Triples
Types of three address statements
1. Binary Assignment statements
2. Unary Assignment statements
3. Copy statements
4. Unconditional jump
5. Conditional jump
6. Procedural call
7. Indexed assignments
8. Address and pointer assignments
1. & 2. Assignment Statements
1. Binary Assignment statements
• Syn: X = Y op Z where X, Y, Z are names, constants or compiler-generated temporaries
• Ex: a = b+c*d
Three address statement
t1 = c * d
t2 = b + t1
a = t2
2. Unary Assignment statements
• Syn: X = op Y
• Ex: a = b*-c
Three address statement
t1 = -c
t2 = b * t1
a = t2
3. Copy statements
• Syn: X = Y
• Ex: a = t2
Three address statement
a = t2
4. Unconditional jump & 5. Conditional jump
• Unconditional jump
• Syn: goto L [the control will be transferred to the three
address statement labelled with L]
• Conditional jump
• Syn: if x relop y goto L
• Ex: if x > y goto L
• True: control is transferred to the statement labelled with L
• False: control continues with the statement after the conditional jump
6. Procedure call
• Syn: P (x1, x2, …, xn)
• Three address statement
Param x1
Param x2
…
Param xn
Call P, n
• Syn: return y
• Three address statement
return y
7. Indexed Assignments
• Syn: x = y[index] & y[index] = x
• Ex: a[i]=b[i]
Three address statement
t=b[i]
a[i]=t
8. Address and pointer assignments
• Syn: x = & y, x = *y, *x = y
• Ex: *x = *y
Three address statement
t = *y
*x = t
Boolean Expression (True / False)
1. Compute logical values (T / F)
2. Conditional expressions in flow of control statements
• Ex: if BE then S
if BE then S1 else S2
while BE do S
• Boolean operators: and, or, not
Operators Precedence Associativity
or 3 [Low] Left
and 2 Left
not 1 [High] Right
Translations schemes for Boolean
Expressions
• To translate the Boolean Expression into 3-Address Code
• Types
1. Numerical representation
[1 True; 0 False]
2. Flow of control method [Short Circuit method]
[a or b => if a is true no need to verify b]
[a and b => if a is false no need to verify b]
Example
1. a or b and not c
Three address representation
t1 = not c
t2 = b and t1
t3 = a or t2
2. a < b
if a < b then 1 else 0
Three address representation
100: if a < b then goto 103 [100 + 3]
101: t = 0
102: goto 104 [102 + 2]
103: t = 1
1. SDT using numerical representation for Boolean expressions
(emit produces a 3ac statement in the output file)
Boolean Expression     SDT [Semantics]
E → E1 or E2           {E.place = newtemp; emit(E.place = E1.place ‘or’ E2.place);}
E → E1 and E2          {E.place = newtemp; emit(E.place = E1.place ‘and’ E2.place);}
E → not E1             {E.place = newtemp; emit(E.place = not E1.place);}
E → (E1)               {E.place = E1.place;}
E → id1 relop id2      {E.place = newtemp;
                        emit(if id1.place relop id2.place then goto curstmt+3);
                        emit(E.place = 0);
                        emit(goto curstmt+2);
                        emit(E.place = 1);}
E → true               {E.place = newtemp; emit(E.place = 1);}
E → false              {E.place = newtemp; emit(E.place = 0);}
Example
• a<b or c<d and e<f
100: if a<b then goto 103
101: t1=0
102: goto 104
103: t1=1
104: if c<d then goto 107
105: t2=0
106: goto 108
107: t2=1
108: if e<f then goto 111
109: t3=0
110: goto 112
111: t3=1
112: t4 = t2 and t3
113: t5 = t1 or t4
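The listing above can be reproduced with a short sketch of the numerical scheme. It assumes the expression is given as nested tuples, `("relop", op, x, y)` for comparisons and `(op, left, right)` for `and` / `or`; the function names are hypothetical and `curstmt` is taken to mean the address of the next statement to be emitted.

```python
import itertools


def translate(expr, start=100):
    """Numerical representation: every subexpression gets a 0/1 temporary."""
    code = []
    temps = itertools.count(1)

    def newtemp():
        return f"t{next(temps)}"

    def curstmt():
        return start + len(code)       # address of the next statement

    def emit(text):
        code.append(f"{curstmt()}: {text}")

    def place(e):
        if e[0] == "relop":
            _, op, x, y = e
            t = newtemp()
            emit(f"if {x}{op}{y} then goto {curstmt() + 3}")
            emit(f"{t}=0")
            emit(f"goto {curstmt() + 2}")
            emit(f"{t}=1")
            return t
        if e[0] == "not":
            p = place(e[1])
            t = newtemp()
            emit(f"{t} = not {p}")
            return t
        op, left, right = e            # 'and' / 'or'
        p1 = place(left)
        p2 = place(right)
        t = newtemp()
        emit(f"{t} = {p1} {op} {p2}")
        return t

    place(expr)
    return code


# a<b or c<d and e<f   ('and' binds tighter than 'or')
expr = ("or", ("relop", "<", "a", "b"),
        ("and", ("relop", "<", "c", "d"), ("relop", "<", "e", "f")))
print("\n".join(translate(expr)))
```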
2. Control Flow Translation of Boolean
Expressions [Short circuit code or jump code]
• Generate code without generating code for Boolean operators
• Generate code without evaluating the entire expression
• Ex:
A < B
if A < B goto E.true
goto E.false
SDT to construct 3AC for Booleans
Example:
a<b or c<d and e<f
Three address code for a<b or c<d and e<f
if a<b then goto Ltrue
goto L1
L1: if c<d then goto L2
goto Lfalse
L2: if e<f then goto Ltrue
goto Lfalse
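The jump code above can be generated by passing each subexpression the labels of its true and false exits. The following is a sketch under assumptions: expressions are nested tuples, the function name `jump_code` is hypothetical, and fresh labels L1, L2, ... mark where the second operand of `or` / `and` begins.

```python
import itertools


def jump_code(expr, true_label, false_label):
    """Short-circuit translation: no code is emitted for and / or / not."""
    labels = itertools.count(1)
    code = []

    def gen(e, on_true, on_false):
        kind = e[0]
        if kind == "relop":
            _, op, x, y = e
            code.append(f"if {x}{op}{y} then goto {on_true}")
            code.append(f"goto {on_false}")
        elif kind == "or":
            second = f"L{next(labels)}"      # E2 is tried only if E1 is false
            gen(e[1], on_true, second)
            code.append(f"{second}:")
            gen(e[2], on_true, on_false)
        elif kind == "and":
            second = f"L{next(labels)}"      # E2 is tried only if E1 is true
            gen(e[1], second, on_false)
            code.append(f"{second}:")
            gen(e[2], on_true, on_false)
        elif kind == "not":
            gen(e[1], on_false, on_true)     # swap the two exits

    gen(expr, true_label, false_label)
    return code


expr = ("or", ("relop", "<", "a", "b"),
        ("and", ("relop", "<", "c", "d"), ("relop", "<", "e", "f")))
print("\n".join(jump_code(expr, "Ltrue", "Lfalse")))
```

The sketch prints each label on its own line; otherwise the output follows the same order of tests and jumps as the code above.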
Flow of control statements
1. Code for if-then
2. Code for if-then-else
3. Code for while-do
Example
Case Statements
Translation of a case statement
Procedure calls
Function Calling: P(X1,X2,….,Xn)
Three address code:
Param x1
Param x2
.
.
.
Param xn
Call P, n

main()
{
.
.
add(x,y)
.
.
}
void add(int a, int b)
{
.
.
}
SDT for procedure calls
S → call id (Elist)    {for each item p on queue do emit(‘param’ p);
                        emit(‘call’ id.place)}
Elist → Elist, E       {append E.place to the end of queue}
Elist → E              {initialize queue to contain E.place}
Example
add (a, b)
QUEUE: a, b (each queue item is referred to as p)
Three address statement
param a
param b
call add
NOTE:
[Link] = for storing the number of actual parameters
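The queue-based scheme for calls can be sketched as follows. The function name `translate_call` is hypothetical, and the argument places are assumed to be already computed (here they are just the names a and b).

```python
from collections import deque


def translate_call(name, arg_places):
    """Queue the argument places, then emit the params followed by the call."""
    queue = deque()
    for place in arg_places:       # Elist -> E and Elist -> Elist, E
        queue.append(place)
    code = [f"param {p}" for p in queue]   # S -> call id (Elist)
    code.append(f"call {name}")
    return code


print("\n".join(translate_call("add", ["a", "b"])))
```

Collecting the places first keeps all param statements together, immediately before the call, even when evaluating the arguments emits code of its own.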
Declaration statements
SDT for declarations
{offset:=0}
Example
{
a: integer;
b: real;
c: array[10] of integer;
d: ^real;
}
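A sketch of how the offset advances through the declarations above, starting from offset 0. The widths are assumptions chosen for illustration (integer 4 bytes, real 8 bytes, pointer 4 bytes), as are the function names and the tuple encoding of array and pointer types.

```python
# Assumed widths in bytes; real targets may differ.
WIDTH = {"integer": 4, "real": 8}
POINTER_WIDTH = 4


def width(t):
    if isinstance(t, str):                 # basic type
        return WIDTH[t]
    if t[0] == "array":                    # ("array", size, element type)
        return t[1] * width(t[2])
    if t[0] == "pointer":                  # ("pointer", target type)
        return POINTER_WIDTH
    raise ValueError(f"unknown type {t!r}")


def layout(decls):
    """Assign each name the current offset, then advance by its width."""
    offset = 0
    table = {}
    for name, t in decls:
        table[name] = offset
        offset += width(t)
    return table, offset


decls = [
    ("a", "integer"),
    ("b", "real"),
    ("c", ("array", 10, "integer")),
    ("d", ("pointer", "real")),
]
table, total = layout(decls)
print(table, total)
```

Under these assumed widths the offsets are a: 0, b: 4, c: 12, d: 52, and the block needs 56 bytes in total.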
Backpatching
• Problem: when short-circuit code is generated in a single pass, the target
of a jump is often not yet known at the point where the jump is emitted
• Backpatching delays filling in such jump targets: the compiler emits the
jump with a placeholder and keeps a list of the incomplete jumps
• Once the target address is known, every jump on the list is updated
("backpatched") with the correct address
• Backpatching is commonly used for Boolean expressions and
flow-of-control statements
• The label is left unspecified and filled in later
• Ex: a<b or c>d and e<f
Functions in backpatching
• Makelist(i) – creates a new list containing only i, an index into the array of quadruples
• Merge(L1, L2) – concatenates list L1 with list L2 and returns the result
• Backpatch(L, label) – fills in label as the target of every 3ac statement on list L
• Nextquad – gives the index of the next quadruple to be generated
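The four functions are small enough to sketch directly. The representation below is an assumption: each quadruple is a two-element list holding the statement text and a target that stays `None` until it is backpatched.

```python
quads = []   # each entry: [statement text, jump target or None]


def nextquad():
    return len(quads)              # index of the next quadruple


def emit(text):
    quads.append([text, None])     # target left unspecified for now


def makelist(i):
    return [i]


def merge(l1, l2):
    return l1 + l2


def backpatch(lst, label):
    for i in lst:
        quads[i][1] = label        # fill in the missing target


emit("if a<b goto")                # quad 0, target unknown
emit("goto")                       # quad 1, target unknown
pending = merge(makelist(0), makelist(1))
backpatch(pending, nextquad())     # both jumps now point at quad 2
print(quads)
```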
SDT for Boolean expression using
backpatching
Example 1
a<b or c>d and e<f
Translation scheme for Boolean expressions
Example 2
Generate Intermediate code for
the following expression using
backpatching
x < 100 || x > 200 && x != y
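A sketch of one-pass translation with backpatching for this expression. It is an illustration under assumptions: the expression is encoded as nested tuples, quadruples are numbered from 100, each subexpression returns a truelist and a falselist of incomplete jumps, and `_` marks a target that is still open at the end.

```python
def translate(expr, start=100):
    quads = []                                 # [text, target or None]

    def nextquad():
        return start + len(quads)

    def emit(text):
        quads.append([text, None])

    def backpatch(lst, target):
        for addr in lst:
            quads[addr - start][1] = target

    def gen(e):
        kind = e[0]
        if kind == "relop":
            _, op, x, y = e
            truelist = [nextquad()]            # makelist(nextquad)
            emit(f"if {x} {op} {y} goto")
            falselist = [nextquad()]
            emit("goto")
            return truelist, falselist
        if kind == "not":
            t, f = gen(e[1])
            return f, t                        # swap the two lists
        t1, f1 = gen(e[1])
        m = nextquad()                         # where the second operand starts
        t2, f2 = gen(e[2])
        if kind == "or":
            backpatch(f1, m)                   # E1 false: go on to E2
            return t1 + t2, f2                 # merge the truelists
        backpatch(t1, m)                       # 'and': E1 true, go on to E2
        return t2, f1 + f2                     # merge the falselists

    truelist, falselist = gen(expr)
    listing = [
        f"{start + i}: {text} {'_' if target is None else target}"
        for i, (text, target) in enumerate(quads)
    ]
    return listing, truelist, falselist


expr = ("or", ("relop", "<", "x", "100"),
        ("and", ("relop", ">", "x", "200"), ("relop", "!=", "x", "y")))
listing, truelist, falselist = translate(expr)
print("\n".join(listing))
print("truelist:", truelist, "falselist:", falselist)
```

The jumps at 100 and 104 remain on the truelist and those at 103 and 105 on the falselist; they are filled in once the enclosing statement knows where its true and false exits are.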