Unit 5 (1):
Code Generation
• Design Issues
• The target language
• Address in target code
• Basic blocks
• Flow graphs
• Optimization of basic blocks
• Code Generator
Introduction
• Final phase of compiler
• Creates assembly language/machine language
• Properties required of a code generator:
• Correctness of code:
• It should produce correct code that preserves the meaning of the source program.
• High-quality code:
• It should produce high-quality code.
• Efficient use of resources, such as memory
• Quick code generation
Design Issues
• The main design issues in the code generation phase are:
1. Input to the code generator
2. Target program
3. Memory management
4. Instruction selection
5. Register allocation
6. Evaluation order
The Target Language
Absolute Code:
• It is machine code that contains references to actual addresses within the program's address space.
• It can be placed in a fixed location in memory and executed immediately (used for small programs).
Relocatable Code:
• It allows subprograms to be compiled separately.
• A set of relocatable object modules can be linked together and loaded by a linking loader.
Assembly language:
• It makes the process of code generation easier.
Address in target code
• Prior knowledge of the target machine and its instruction set is required to design a good code generator.
• Use registers: three-address code is translated into instructions of the form OP source, destination
• OP:
• MOV
• ADD
• SUB
Addressing modes
Basic Block
• A basic block is a sequence of consecutive statements.
• The flow of control enters at the beginning of the block and leaves at the end, without halting or branching except at the end.
• The following sequence of three-address statements forms a basic block:
t1:= x*x
t2:= x*y
t3:= 2 * t2
t4:= t1 + t3
t5:= y*y
t6:= t4 + t5
Basic Terminology Used in Basic Blocks
• Define
• Use
Example: a:=b+c
define: a
use: b, c
• Live: a name is live at a given point if its value is used after that point.
• Dead: a name is dead (not live) at a given point if its value is not used after that point.
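The define/use/live terms can be made concrete with a backward scan over the basic block shown earlier. This is a minimal sketch, not a definitive implementation: the tuple encoding of statements and the function name `liveness` are assumptions made for illustration, and it assumes that only t6 is live on exit from the block.

```python
# Each statement is (defined_name, [used_names]). This encoding is an
# assumption made for illustration; constants such as 2 are not names.
block = [
    ("t1", ["x", "x"]),    # t1 := x*x
    ("t2", ["x", "y"]),    # t2 := x*y
    ("t3", ["t2"]),        # t3 := 2*t2
    ("t4", ["t1", "t3"]),  # t4 := t1+t3
    ("t5", ["y", "y"]),    # t5 := y*y
    ("t6", ["t4", "t5"]),  # t6 := t4+t5
]


def liveness(block, live_on_exit):
    """Return, for each statement, the set of names live just after it."""
    live = set(live_on_exit)
    live_after = [None] * len(block)
    # Scan backward: a name is live before a statement if the statement
    # uses it, or if it is live afterward and not defined by the statement.
    for i in range(len(block) - 1, -1, -1):
        defined, used = block[i]
        live_after[i] = set(live)
        live.discard(defined)
        live.update(used)
    return live_after


result = liveness(block, live_on_exit={"t6"})
for (defined, _), live in zip(block, result):
    print(f"after defining {defined}: live = {sorted(live)}")
```

In this run t2 is dead after the statement that defines t3, because nothing later in the block uses it.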
Algorithm for Basic Block Construction
(Partition Algorithm)
• Input: a sequence of three-address statements
• Output: a list of basic blocks, with each three-address statement in exactly one block
• Method: First identify the leaders in the code. The rules for finding leaders are as follows:
• The first statement is a leader.
• Statement L is a leader if it is the target of a conditional or unconditional goto, such as: if ... goto L or goto L
• A statement is a leader if it immediately follows a conditional or unconditional goto, such as: if ... goto B or goto B
• For each leader, its basic block consists of the leader and all statements up to, but not including, the next leader or the end of the program.
Example:
Source program:
prod := 0;
i := 1;
do
{
prod := prod + a[i] * b[i];
i := i + 1;
} while (i <= 10);
Three-address code:
(1) prod := 0
(2) i := 1
(3) t1 := 4* i
(4) t2 := a[t1]
(5) t3 := 4* i
(6) t4 := b[t3]
(7) t5 := t2*t4
(8) t6 := prod+t5
(9) prod := t6
(10) t7 := i+1
(11) i := t7
(12) if i<=10 goto (3)
Statements (1) and (3) are leaders, so the basic blocks are (1)-(2) and (3)-(12).
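The leader rules can be applied mechanically to the three-address code of this example. The following is a rough sketch under stated assumptions: jump targets are written as `goto (n)` exactly as in the example, and the function names `find_leaders` and `partition` are hypothetical names chosen for illustration.

```python
import re

# Three-address code from the example; list index 0 holds statement (1).
code = [
    "prod := 0",
    "i := 1",
    "t1 := 4 * i",
    "t2 := a[t1]",
    "t3 := 4 * i",
    "t4 := b[t3]",
    "t5 := t2 * t4",
    "t6 := prod + t5",
    "prod := t6",
    "t7 := i + 1",
    "i := t7",
    "if i <= 10 goto (3)",
]


def find_leaders(code):
    """Return the sorted 1-based statement numbers that are leaders."""
    leaders = {1}  # rule 1: the first statement is a leader
    for number, stmt in enumerate(code, start=1):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)))  # rule 2: target of a goto
            if number < len(code):
                leaders.add(number + 1)       # rule 3: statement after a goto
    return sorted(leaders)


def partition(code):
    """Split the code into basic blocks, each a list of statement numbers."""
    leaders = find_leaders(code)
    ends = leaders[1:] + [len(code) + 1]
    # Each block runs from its leader up to, but not including, the next leader.
    return [list(range(start, end)) for start, end in zip(leaders, ends)]


print(find_leaders(code))
print(partition(code))
```

For this input the leaders are statements (1) and (3), giving one block for the initialization and one for the loop body.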
Flow Graph
• Flow graph is a directed graph.
• It contains the flow-of-control information for a set of basic blocks.
• A control flow graph depicts how program control is passed among the blocks.
• It is useful in loop optimization.
Example:
(1) prod := 0
(2) i := 1
(3) t1 := 4* i
(4) t2 := a[t1]
(5) t3 := 4* i
(6) t4 := b[t3]
(7) t5 := t2*t4
(8) t6 := prod+t5
(9) prod := t6
(10) t7 := i+1
(11) i := t7
(12) if i<=10 goto (3)
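For this example the flow graph has two nodes: the block of statements (1)-(2) and the block of statements (3)-(12). A possible way to derive its edges is sketched below. The block names B1 and B2, the dictionary layout, and the function name `flow_graph` are assumptions made for illustration.

```python
import re

# Basic blocks of the example. Each entry is a list of (statement_number, text).
# The names B1 and B2 are chosen here for illustration only.
blocks = {
    "B1": [(1, "prod := 0"), (2, "i := 1")],
    "B2": [
        (3, "t1 := 4 * i"), (4, "t2 := a[t1]"), (5, "t3 := 4 * i"),
        (6, "t4 := b[t3]"), (7, "t5 := t2 * t4"), (8, "t6 := prod + t5"),
        (9, "prod := t6"), (10, "t7 := i + 1"), (11, "i := t7"),
        (12, "if i <= 10 goto (3)"),
    ],
}


def flow_graph(blocks):
    """Return the set of edges (from_block, to_block) of the flow graph."""
    names = list(blocks)
    # Map the statement number of each leader to the block it starts.
    block_of_leader = {stmts[0][0]: name for name, stmts in blocks.items()}
    edges = set()
    for index, name in enumerate(names):
        _, last = blocks[name][-1]
        jump = re.search(r"goto \((\d+)\)", last)
        if jump:
            # Edge to the block whose leader is the jump target.
            edges.add((name, block_of_leader[int(jump.group(1))]))
        # Control falls through to the next block unless this block ends
        # in an unconditional goto.
        falls_through = not last.startswith("goto")
        if falls_through and index + 1 < len(names):
            edges.add((name, names[index + 1]))
    return edges


edges = flow_graph(blocks)
print(sorted(edges))
```

The edge from B2 back to itself is the loop, which is why the flow graph is useful for loop optimization.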
Optimization of Basic Blocks
• Optimization can be applied within a basic block.
• During optimization, the set of expressions computed by the block must not change.
• There are two types of basic block optimization:
1. Structure-Preserving Transformations
2. Use of Algebraic Identities
1. Structure-Preserving Transformations
• Based on DAG
• Structure-Preserving Transformations can be applied
using following techniques:
• Common sub-expression elimination
• Dead code elimination
• Renaming of temporary variables
• Interchange of two independent adjacent statements
Example: common sub-expression elimination
Before:
a:=b+c
b:=a-d
c:=b+c
d:=a-d
After:
a:=b+c
b:=a-d
c:=b+c
d:=b
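One way to find such common sub-expressions inside a block is to remember which variable currently holds each computed expression. The sketch below is an illustration only and swaps in a simple version-counting table in place of the DAG mentioned above. The tuple encoding and the names `eliminate_common_subexpressions` and `show` are assumptions, and commutativity is not exploited.

```python
def eliminate_common_subexpressions(block):
    """Local CSE on a list of (dest, left, op, right) statements.

    A recomputed expression is replaced by a copy (dest, holder, None, None)
    from a variable that still holds the same value.
    """
    version = {}    # variable -> number of assignments seen so far
    available = {}  # (left, left_version, op, right, right_version) -> holder
    result = []
    for dest, left, op, right in block:
        key = (left, version.get(left, 0), op, right, version.get(right, 0))
        holder = available.get(key)
        if holder is None:
            result.append((dest, left, op, right))
        else:
            result.append((dest, holder, None, None))
        # Assigning dest destroys whatever value dest held before.
        available = {k: v for k, v in available.items() if v != dest}
        version[dest] = version.get(dest, 0) + 1
        if holder is None:
            available[key] = dest
    return result


def show(stmt):
    """Format a statement in the notation used in the example."""
    dest, left, op, right = stmt
    return f"{dest}:={left}" if op is None else f"{dest}:={left}{op}{right}"


block = [
    ("a", "b", "+", "c"),
    ("b", "a", "-", "d"),
    ("c", "b", "+", "c"),
    ("d", "a", "-", "d"),
]
optimized = eliminate_common_subexpressions(block)
print([show(s) for s in optimized])
```

The third statement is left alone even though it also reads b+c: b was reassigned by the second statement, so it is a different value from the one stored in a.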
2. Use of Algebraic Identities
• Algebraic identities are used in the peephole optimization technique.
• Simple examples:
a+0=a
a*1=a
a/1=a
Example:
2*a -------- a+a // strength reduction
a/2 -------- a*0.5 // strength reduction
a=2*6.2 -------- a=12.4 // constant folding
x:=y*z, t:=z*r*y -------- x:=y*z, t:=x*r // common subexpression elimination, use of associativity, use of commutativity
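The identities above can be applied by a small rewriting function over single expressions. This is a minimal sketch under stated assumptions: operands are either Python numbers or variable names given as strings, the function name `simplify` is hypothetical, and the a/2 rewrite is left out because it is only valid for floating-point operands.

```python
def is_number(value):
    """True for compile-time constants in this sketch."""
    return isinstance(value, (int, float))


def simplify(left, op, right):
    """Simplify one binary expression.

    Returns a number, a variable name, or a (left, op, right) tuple.
    """
    # Constant folding: both operands are known at compile time.
    if is_number(left) and is_number(right):
        if op == "+":
            return left + right
        if op == "-":
            return left - right
        if op == "*":
            return left * right
        if op == "/" and right != 0:
            return left / right
    # Algebraic identities: a+0=a, a*1=a, a/1=a.
    if op == "+" and right == 0:
        return left
    if op == "+" and left == 0:
        return right
    if op == "*" and right == 1:
        return left
    if op == "*" and left == 1:
        return right
    if op == "/" and right == 1:
        return left
    # Strength reduction: 2*a becomes a+a.
    if op == "*" and left == 2:
        return (right, "+", right)
    if op == "*" and right == 2:
        return (left, "+", left)
    return (left, op, right)


print(simplify("a", "+", 0))
print(simplify(2, "*", "a"))
print(simplify(2, "*", 6.2))
```

Expressions that match no rule, such as y*z, are returned unchanged.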
Code Generation
• Conversion of three address code to target code (Assembly Language)
• Use registers for the conversion
• Operations: MOV, ADD, SUB, MUL, DIV
Example: Generate target code for the expression:
x:=(a+b)*(c-d)+((e/f)*(a+b))
Three-address code:
t1:=a+b
t2:=c-d
t3:=e/f
t4:=t1*t2
t5:=t3*t1
t6:=t4+t5
Target code:
MOV a, R1
ADD b, R1
MOV c, R2
SUB d, R2
MOV e, R3
DIV f, R3
MUL R1, R2
MUL R3, R1
ADD R2, R1
The result t6 is left in R1; storing it in x requires a final MOV R1, x.
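For comparison with the target code in this example, a naive translation handles each three-address statement on its own: load the left operand, apply the operation, store the result. The sketch below assumes the "OP source, destination" instruction form used in these notes. The function name `generate` and the single register R0 are choices made for illustration, not the method used to produce the listing above.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(three_address_code, register="R0"):
    """Translate (dest, left, op, right) statements one at a time."""
    target = []
    for dest, left, op, right in three_address_code:
        target.append(f"MOV {left}, {register}")             # load left operand
        target.append(f"{OPCODES[op]} {right}, {register}")  # apply the operation
        target.append(f"MOV {register}, {dest}")             # store the result
    return target


tac = [
    ("t1", "a", "+", "b"),
    ("t2", "c", "-", "d"),
    ("t3", "e", "/", "f"),
    ("t4", "t1", "*", "t2"),
    ("t5", "t3", "*", "t1"),
    ("t6", "t4", "+", "t5"),
]
listing = generate(tac)
print("\n".join(listing))
```

This produces three instructions per statement, 18 in total, because every temporary is stored to memory and reloaded. Keeping temporaries in registers, as the example's target code does, avoids most of those moves.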
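Example:
• a=b+c
• d=a+e
Target code:
MOV b, R1
ADD c, R1
ADD e, R1
MOV R1, d
Here a is kept only in R1, which assumes a is not used afterward; otherwise MOV R1, a is needed after ADD c, R1.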
Example:
(a+b)-(c-(d+e))
Three-address code:
T1:=a+b
T2:=d+e
T3:=c-T2
T4:=T1-T3
• Using two registers:
MOV d, R0
ADD e, R0
MOV c, R1
SUB R0, R1
MOV a, R0
ADD b, R0
SUB R1, R0
• Using one register:
MOV d, R0
ADD e, R0
MOV R0, t1
MOV c, R0
SUB t1, R0
MOV R0, t2
MOV a, R0
ADD b, R0
SUB t2, R0
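Both instruction sequences for (a+b)-(c-(d+e)) can be checked by running them on a tiny simulator. This is a sketch only: it assumes "OP source, destination" means destination := destination OP source, treats registers and memory names as one namespace, and uses arbitrary sample values. The function name `run` is hypothetical.

```python
def run(listing, memory):
    """Execute 'OP source, destination' instructions and return the final state."""
    state = dict(memory)  # registers and memory names share one namespace here
    for line in listing:
        opcode, operands = line.split(None, 1)
        source, destination = [part.strip() for part in operands.split(",")]
        value = state[source]
        if opcode == "MOV":
            state[destination] = value
        elif opcode == "ADD":
            state[destination] += value
        elif opcode == "SUB":
            state[destination] -= value
        elif opcode == "MUL":
            state[destination] *= value
        elif opcode == "DIV":
            state[destination] /= value
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return state


two_registers = [
    "MOV d, R0", "ADD e, R0", "MOV c, R1", "SUB R0, R1",
    "MOV a, R0", "ADD b, R0", "SUB R1, R0",
]
one_register = [
    "MOV d, R0", "ADD e, R0", "MOV R0, t1", "MOV c, R0", "SUB t1, R0",
    "MOV R0, t2", "MOV a, R0", "ADD b, R0", "SUB t2, R0",
]

# Arbitrary sample values chosen for the check.
values = {"a": 7, "b": 5, "c": 20, "d": 3, "e": 4}
expected = (values["a"] + values["b"]) - (values["c"] - (values["d"] + values["e"]))
print(run(two_registers, values)["R0"], run(one_register, values)["R0"], expected)
```

Both versions leave the same result in R0. The one-register version pays for the missing register with two extra stores to the memory temporaries t1 and t2.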