Compiler Ch9

This document discusses code generation in compilers. It describes [1] the position of code generators in compilers, [2] issues in designing code generators such as memory management, instruction selection, register allocation and evaluation order, [3] inputs to code generators such as intermediate representations, and [4] target programs including machine language and assembly language. It also covers specific topics in code generation like instruction selection, register allocation, basic blocks and flow graphs.

Copyright
© Attribution Non-Commercial (BY-NC)

Code Generation

Dewan Tanvir Ahmed


Assistant Professor, CSE
BUET
Introduction

[Figure: Position of code generator]
Source program -> Front end -> intermediate code -> Code optimizer -> intermediate code -> Code generator -> target program
The symbol table is drawn alongside these phases.


Issues in the Design of a Code Generator

 Details depend on
– Target language
– Operating System
 But the following issues are inherent in all code
generation problems
– Memory management
– Instruction Selection
– Register allocation and
– Evaluation order
Input to the Code Generator

 We assume the front end has
– Scanned, parsed and translated the source program
into a reasonably detailed intermediate
representation
– Performed type checking and type conversion, and
already detected obvious semantic errors
– Filled the symbol table so that it can provide the run-time
addresses of the data objects
– The intermediate representation may be
• Postfix notations
• Three address representations
• Stack machine code
• Syntax tree
• DAG
Target Programs

 The output of the code generator is the target
program.
 The target program may be
– Absolute machine language
• It can be placed in a fixed location in memory and
immediately executed
– Relocatable machine language
• Allows subprograms to be compiled separately
• A set of relocatable object modules can be linked
together and loaded for execution by a linker
– Assembly language
• Easier to generate: the code generator emits symbolic
instructions and an assembler produces the machine code
Instruction Selection

 The nature of the instruction set of the target
machine determines the difficulty of
instruction selection.
 Uniformity and completeness of the instruction
set are important
 Instruction speeds are also important
– Say, x = y + z
MOV y, R0
ADD z, R0
MOV R0, x
– Statement-by-statement code generation often produces poor
code
Instruction Selection (2)
a = b + c
d = a + e

MOV b, R0
ADD c, R0
MOV R0, a    (needed only if a is subsequently used)
MOV a, R0    (redundant: R0 already holds a)
ADD e, R0
MOV R0, d
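The redundancy in this listing can be illustrated with a small sketch. It assumes a toy generator that always uses R0 and a single peephole rule; the function names `gen_statement` and `drop_redundant_loads` are hypothetical and are not part of the slides.

```python
def gen_statement(dst, left, right):
    """Naive code for dst = left + right, generated one statement at a time."""
    return [f"MOV {left}, R0", f"ADD {right}, R0", f"MOV R0, {dst}"]


def drop_redundant_loads(code):
    """Peephole pass: remove 'MOV x, R0' when it directly follows 'MOV R0, x'."""
    out = []
    for instr in code:
        if out and out[-1].startswith("MOV R0, "):
            stored = out[-1][len("MOV R0, "):]
            if instr == f"MOV {stored}, R0":
                continue  # R0 already holds the value just stored
        out.append(instr)
    return out


# a = b + c followed by d = a + e, as in the listing above
naive = gen_statement("a", "b", "c") + gen_statement("d", "a", "e")
improved = drop_redundant_loads(naive)
print("\n".join(improved))
```

The pass keeps `MOV R0, a`, because deciding whether that store can also go requires knowing whether a is used later, which this sketch does not track.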
Instruction Selection (3)

 The quality of the generated code is
determined by its speed and size.
 The cost difference between different
implementations may be significant.
– Say a = a + 1
MOV a, R0
ADD #1, R0
MOV R0, a
– If the target machine has an increment instruction (INC),
we can write
INC a
Register Allocation

 Instructions involving register operands are
usually shorter and faster than those involving
operands in memory.
 Efficient utilization of registers is particularly
important in code generation.
 The use of registers is subdivided into two
subproblems
– During register allocation, we select the set of
variables that will reside in registers at a point in the
program.
– During a subsequent register assignment phase, we
pick the specific register that a variable will reside in.
Basic Blocks and Flow Graphs

 A graph representation of three-address
statements is called a flow graph.
 Nodes in the flow graph represent
computations
 Edges represent the flow of control
Basic Block:
A basic block is a sequence of consecutive statements
in which flow of control enters at the beginning and leaves at
the end without halt or possibility of branching except at the
end.
Basic Blocks and Flow Graphs (2)

t1 = a*a
t2 = a*b
t3 = 2*t2
t4 = t1+t3
t5 = b*b
t6 = t4+t5
This is a basic block.

A three-address statement x = y + z is said to define x and to use y and z.

A name in a basic block is said to be live at a given point if its
value is used after that point in the program, perhaps in another
basic block.
Basic Blocks and Flow Graphs (3)

 Partition into basic blocks
– Method
• We first determine the set of leaders
– The first statement is a leader
– Any statement that is the target of a conditional or
unconditional goto is a leader
– Any statement that immediately follows a conditional or
unconditional goto statement is a leader
• For each leader, its basic block consists of the
leader and all the statements up to but not
including the next leader or the end of the
program.
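The leader rules can be sketched in a few lines. This is a minimal illustration that assumes statements are plain strings, numbered from 1, with jump targets written as `goto (n)`; the helper names and the loop body in the sample program are hypothetical.

```python
import re


def find_leaders(stmts):
    """Return the sorted 1-based numbers of the leader statements."""
    if not stmts:
        return []
    leaders = {1}                              # rule 1: the first statement
    for i, text in enumerate(stmts, start=1):
        m = re.search(r"goto \((\d+)\)", text)
        if m:
            leaders.add(int(m.group(1)))       # rule 2: target of a goto
            if i < len(stmts):
                leaders.add(i + 1)             # rule 3: statement after a goto
    return sorted(leaders)


def basic_blocks(stmts):
    """Split the program into blocks, each a list of statement numbers."""
    leaders = find_leaders(stmts)
    bounds = leaders + [len(stmts) + 1]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]


program = [
    "prod = 0",             # (1)
    "i = 1",                # (2)
    "t1 = 4*i",             # (3)
    "prod = prod + t1",     # (4) hypothetical loop body, not from the slides
    "i = i + 1",            # (5)
    "if i <= 20 goto (3)",  # (6)
]
print(find_leaders(program))
print(basic_blocks(program))
```

Statements (1) and (3) come out as leaders, so the program splits into one block holding (1) and (2) and a second block holding (3) through (6).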
Basic Blocks and Flow Graphs (4)

B1:
(1) prod = 0
(2) i = 1

B2:
(3) t1 = 4*i
------------
(11) i = t7
(12) if i <= 20 goto (3)
Transformations on Basic Blocks

 A basic block computes a set of expressions.
 Transformations are useful for improving the
quality of code.
 Two important classes of local optimizations
can be applied to basic blocks
– Structure Preserving Transformations
– Algebraic Transformations
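Structure Preserving Transformations

 Common sub-expression elimination

Before         After
a = b + c      a = b + c
b = a - d      b = a - d
c = b + c      c = b + c
d = a - d      d = b

The second and fourth statements compute the same expression a - d, so the
fourth becomes d = b. The first and third statements do not compute the same
value, even though both have b + c on the right, because b is redefined in
between.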
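The four-statement example can be reproduced mechanically. The sketch below assumes statements are tuples `(dst, op, left, right)` and keeps a table of expressions whose value is still held in some variable; the function name `eliminate_cse` is hypothetical, and the sketch ignores commutativity (it would not match b + c against c + b).

```python
def eliminate_cse(block):
    """Rewrite a basic block so a repeated expression becomes a copy.

    block: list of (dst, op, left, right) tuples.
    A copy is emitted as (dst, "=", holder, None).
    """
    available = {}  # (op, left, right) -> variable currently holding that value
    out = []
    for dst, op, left, right in block:
        key = (op, left, right)
        if key in available:
            out.append((dst, "=", available[key], None))
        else:
            out.append((dst, op, left, right))
        # dst is redefined: forget expressions that read dst or were held in dst
        available = {k: v for k, v in available.items()
                     if dst not in (k[1], k[2]) and v != dst}
        # dst now holds the expression, unless dst is one of its own operands
        if dst not in (left, right):
            available.setdefault(key, dst)
    return out


block = [
    ("a", "+", "b", "c"),
    ("b", "-", "a", "d"),
    ("c", "+", "b", "c"),
    ("d", "-", "a", "d"),
]
for stmt in eliminate_cse(block):
    print(stmt)
```

Only the last statement changes, to a copy from b. The third statement is left alone because redefining b removed b + c from the table.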
Structure Preserving Transformations (2)

 Dead-Code Elimination

Say x is dead, that is, never subsequently used, at the point where the
statement x = y + z appears in a block.

We can safely remove the statement x = y + z.

 Renaming Temporary Variables
– Say t = b+c, where t is a temporary variable.
– If we change it to u = b+c, where u is a new temporary,
and change all uses of this instance of t to u,
the value of the block is unchanged.
 Interchange of Statements
– t1 = b + c
– t2 = x + y
– We can interchange the two statements iff neither x nor y is t1 and
neither b nor c is t2
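Dead-code elimination within one block can be sketched as a backward scan. The version below assumes each statement is a pair `(dst, operands)` with no side effects, and that the set of names live on exit from the block is supplied by the caller; `eliminate_dead_code` is a hypothetical name.

```python
def eliminate_dead_code(block, live_out):
    """Drop statements whose result is never used afterwards.

    block: list of (dst, operands) pairs, operands being a tuple of names.
    live_out: names that are live after the last statement of the block.
    """
    live = set(live_out)
    kept = []
    for dst, operands in reversed(block):
        if dst in live:
            kept.append((dst, operands))
            live.discard(dst)      # this statement defines dst ...
            live.update(operands)  # ... and uses its operands
        # otherwise dst is dead at this point and the statement is dropped
    kept.reverse()
    return kept


block = [
    ("x", ("y", "z")),   # x = y + z, never used afterwards
    ("t", ("a", "b")),   # t = a + b
    ("u", ("t", "c")),   # u = t + c
]
print(eliminate_dead_code(block, live_out={"u"}))
```

With only u live on exit, the statement defining x is removed and the other two are kept in their original order.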
Algebraic Transformations

 Replace expensive expressions by cheaper
ones
– X = X + 0 can be eliminated
– X = X * 1 can be eliminated
– X = y**2 can be replaced by X = y * y
(why is ** expensive? It is normally
implemented by a function call)
 Flow graph:
– We can add flow of control information to the set of
basic blocks making up a program by constructing a
directed graph called a flow graph.
– There is a directed edge from block B1 to block B2 if
• There is a conditional or unconditional jump from
the last statement of B1 to the first statement of
B2 or
• B2 immediately follows B1 in the order of the
program, and B1 does not end in an unconditional jump
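The two edge rules can be sketched as follows. The sketch assumes each block is a list of `(number, text)` statements, that jump targets are written as `goto (n)`, and that a fall-through edge is added only when the block does not end in an unconditional goto. The function name and the third block are hypothetical.

```python
import re


def flow_graph(blocks):
    """Return the set of edges (from_index, to_index) between basic blocks."""
    # Map the number of each block's first statement to the block's index.
    leader_of = {block[0][0]: idx for idx, block in enumerate(blocks)}
    edges = set()
    for idx, block in enumerate(blocks):
        last = block[-1][1]
        m = re.search(r"goto \((\d+)\)", last)
        if m:
            edges.add((idx, leader_of[int(m.group(1))]))  # jump edge
        unconditional = last.startswith("goto")
        if not unconditional and idx + 1 < len(blocks):
            edges.add((idx, idx + 1))                     # fall-through edge
    return edges


blocks = [
    [(1, "prod = 0"), (2, "i = 1")],                                  # B1
    [(3, "t1 = 4*i"), (4, "i = i + 1"), (5, "if i <= 20 goto (3)")],  # B2
    [(6, "x = prod")],                                                # B3 (hypothetical)
]
print(sorted(flow_graph(blocks)))
```

Block indices are zero-based here, so the edge (1, 1) is the loop from B2 back to itself and (1, 2) is the exit to the following block.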
Loops

 A loop is a collection of nodes in a flow graph
such that
– All nodes in the collection are strongly connected,
that is, from any node in the loop to any other there
is a path of length one or more, wholly within the
loop, and
– The collection of nodes has a unique entry, that is, a
node in the loop such that the only way to reach a
node of the loop from a node outside the loop is to first go
through the entry.
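Both conditions can be checked directly on a small flow graph. The sketch below assumes the graph is given as a set of `(from, to)` edges and treats a node as an entry if it has a predecessor outside the candidate set; it does not handle the initial node of the flow graph specially. The names `reachable` and `is_loop` are hypothetical.

```python
def reachable(start, nodes, edges):
    """Nodes reached from start by paths of length >= 1 that stay inside nodes."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        for a, b in edges:
            if a == n and b in nodes and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def is_loop(nodes, edges):
    """Check strong connectivity and the unique-entry condition."""
    nodes = set(nodes)
    strongly_connected = all(reachable(n, nodes, edges) == nodes for n in nodes)
    entries = {b for a, b in edges if a not in nodes and b in nodes}
    return strongly_connected and len(entries) == 1


# Flow graph with blocks 0 -> 1, 1 -> 1 (back edge), 1 -> 2
edges = {(0, 1), (1, 1), (1, 2)}
print(is_loop({1}, edges))
print(is_loop({0, 1}, edges))
```

Here {1} qualifies because of its self-edge and its single entry from block 0, while {0, 1} fails strong connectivity since nothing leads back to block 0.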
The DAG representation of Basic Block

(1) A = 4*i
(2) B = a[A]
(3) C = 4*i
(4) D = b[C]
(5) E = B*D
(6) F = prod + E
(7) prod = F
(8) G = i+1
(9) i = G
(10) if i <= 20 goto (1)

[Figure: DAG for the block above. Interior nodes: +, *, [], [], *, +, <=. Leaves: prod, a, b, 4, i0, 1, 20.]
Exercise

 Given the code fragment

x := a*a + 2*a*b + b*b;
y := a*a - 2*a*b + b*b;

draw the dependency graph before and after
common subexpression elimination.
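Answers

dependency graph before CSE

x = + ( + ( *(a, a), *( *(2, a), b ) ), *(b, b) )
y = + ( - ( *(a, a), *( *(2, a), b ) ), *(b, b) )

Every operator and every occurrence of a, b and 2 is a separate node;
nothing is shared between x and y.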
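Answers

dependency graph after CSE

n1 = a * a
n2 = 2 * a
n3 = n2 * b
n4 = b * b
n5 = n1 + n3
n6 = n1 - n3
x = n5 + n4
y = n6 + n4

Each line is one node of the graph (the labels n1 to n6 are added for
readability). The leaves a, b and 2 and the nodes n1, n3 and n4 are shared
between x and y.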
The End
