Code Generation

The document discusses the design and implementation of a code generator, emphasizing the importance of correct and high-quality output code. Key issues include instruction selection, register allocation, and evaluation order, along with transformations on basic blocks to improve code quality. It also outlines algorithms for managing registers and generating code for expressions, highlighting the significance of efficient code generation in programming languages.

Code Generation

• output code must be correct


• output code must be of high quality
• code generator should run efficiently
Code Generation Reshma Pise 1
Issues in the design of a code generator
1. Input: intermediate representation together with the symbol table
– assume that the input has been validated by the front end

2. Target programs:
– absolute machine language: fast for small programs
– relocatable machine code: requires a linker and loader
– assembly code: requires an assembler, linker, and loader
3. Instruction selection
The nature of the instruction set of the target machine determines the
difficulty of instruction selection. Factors:
• Instructions used, e.g. a = a + 1 can be coded as
Mov A, a
Add A, 1
• Uniformity
• Completeness
• Instruction speed, e.g. a = a * 2

4. Register allocation
– instructions with register operands are faster
– keep long-lived values and counters in registers

5. Evaluation order


Instruction Selection
The nature of the instruction set of the target machine
determines the difficulty of instruction selection. Instruction
selection depends on:
• Instructions used: which instructions to choose when
several instruction sequences do the same job.
• Uniformity: how uniformly the instruction set supports the different
data types, i.e. which op-codes are applicable to which data types.
• Completeness: whether the instruction set provides every operation the
source language needs. Not every operation maps directly to an
instruction on every machine. E.g., a processor with no multiply
instruction must implement multiplication with shifts and adds or a
library routine.
• Instruction speed: when several sequences are possible, the faster one
gives better performance.
Instruction Selection
• straightforward code if efficiency is not an issue

a = b + c    Mov b, R0
             Add c, R0
             Mov R0, a
d = a + e    Mov a, R0    (can be eliminated: R0 already holds a)
             Add e, R0
             Mov R0, d

a = a + 1    Mov a, R0
             Add #1, R0
             Mov R0, a
The last three instructions can be replaced by the single instruction Inc a.
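The statement-by-statement translation above can be sketched in a few lines of Python. This is only an illustration: the function name `select`, the tuple layout `(dst, left, op, right)`, and the choice to route everything through R0 are assumptions, and only `+` and `-` are handled.

```python
OPCODES = {"+": "Add", "-": "Sub"}

def select(stmt):
    """Translate one statement 'dst = left op right' into target instructions.

    Operands are strings; a right operand made of digits is an immediate.
    """
    dst, left, op, right = stmt
    # Machine idiom: a = a + 1 becomes a single increment instruction.
    if op == "+" and dst == left and right == "1":
        return [f"Inc {dst}"]
    source = f"#{right}" if right.isdigit() else right
    # Naive template: load, operate, store, always through R0.
    return [f"Mov {left}, R0", f"{OPCODES[op]} {source}, R0", f"Mov R0, {dst}"]

code = select(("a", "b", "+", "c")) + select(("d", "a", "+", "e"))
for line in code:
    print(line)
```

The concatenated output still contains the redundant `Mov a, R0` after `Mov R0, a`; a template-based selector works one statement at a time and cannot see it.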
Register Allocation:
Finding an optimal assignment of registers to variables is difficult.
Instructions involving register operands are usually faster than those
involving memory operands.
Keep long-lived, frequently used values in registers.

Evaluation Order:
The order in which computations are performed affects the efficiency
of the target code: some orders need fewer registers to hold
intermediate results than others. Picking the best order is difficult.

x=a+d
Basic blocks
• A basic block is a sequence of consecutive statements in which flow of
control enters at the beginning and leaves at the end without halting or
branching except at the end.
Algorithm to identify basic blocks
1. Determine the leaders:
- the first statement is a leader
- any statement that is the target of a conditional or unconditional
goto is a leader
- any statement that immediately follows a conditional or unconditional
goto is a leader
2. For each leader, its basic block consists of the leader and all
statements up to, but not including, the next leader or the end of the
program
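The leader rules can be sketched as follows. The function names and the convention that jumps are written as `goto (n)` inside statement strings are assumptions for this illustration; statements are numbered from 1.

```python
import re

def find_leaders(code):
    """Return the sorted statement numbers that start a basic block."""
    leaders = {1}                                  # rule: first statement
    for number, stmt in enumerate(code, start=1):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)))       # rule: target of a goto
            if number < len(code):
                leaders.add(number + 1)            # rule: statement after a goto
    return sorted(leaders)

def basic_blocks(code):
    """Return (first, last) statement numbers of each basic block."""
    leaders = find_leaders(code)
    next_starts = leaders[1:] + [len(code) + 1]
    return [(start, nxt - 1) for start, nxt in zip(leaders, next_starts)]

code = [
    "prod := 0", "i := 1", "t1 := 4 * i", "t2 := a[t1]",
    "t3 := 4 * i", "t4 := b[t3]", "t5 := t2 * t4", "t6 := prod + t5",
    "prod := t6", "t7 := i + 1", "i := t7", "if i <= 20 goto (3)",
]
blocks = basic_blocks(code)
print(blocks)
```

For this twelve-statement dot-product loop the leaders are statements 1 and 3, so the initialisation and the loop body form two blocks.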
Flow graphs
• A graph representation of three-address statements.
• add control flow information to basic blocks
• nodes are the basic blocks
• there is a directed edge from B1 to B2 if B2 can follow B1 in
some execution sequence, that is, if
- there is a jump from the last statement of B1 to the first
statement of B2, or
- B2 immediately follows B1 in the order of the program and
B1 does not end in an unconditional jump.
• initial node: block with first statement as leader
Example

Begin
prod := 0;
i := 1;
do begin
prod := prod + a[i] * b[i];
i := i + 1;
end
while i<= 20
end



Example: three-address code for the program above

(1) prod := 0
(2) i := 1
(3) t1 := 4 * i
(4) t2 := a[t1]
(5) t3 := 4 * i
(6) t4 := b[t3]
(7) t5 := t2 * t4
(8) t6 := prod + t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
Example
(1) i := m - 1
(2) j := n
(3) t1 := 4 * n
(4) v := a[t1]
(5) i := i + 1
(6) t2 := 4 * i
(7) t3 := a[t2]
(8) if t3 < v goto (5)
(9) j := j - 1
(10) t4 := 4 * j
(11) t5 := a[t4]
(12) if t5 > v goto (9)
(13) if i >= j goto (23)
(14) t6 := 4 * i
(15) x := a[t6]
(16) t7 := 4 * i
(17) t8 := 4 * j
(18) t9 := a[t8]
(19) a[t7] := t9
(20) t10 := 4 * j
(21) a[t10] := x
(22) goto (5)
(23) t11 := 4 * i
(24) x := a[t11]
(25) t12 := 4 * i
(26) t13 := 4 * n
(27) t14 := a[t13]
(28) a[t12] := t14
(29) t15 := 4 * n
(30) a[t15] := x


Example: CFG
Applying the leader rules to the thirty statements of the previous
example gives the leaders (1), (5), (9), (13), (14) and (23), and
therefore six basic blocks:
B1: (1)-(4)
B2: (5)-(8)
B3: (9)-(12)
B4: (13)
B5: (14)-(22)
B6: (23)-(30)
Edges of the flow graph:
B1 → B2
B2 → B2, B2 → B3
B3 → B3, B3 → B4
B4 → B5, B4 → B6
B5 → B2
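The two edge rules for flow graphs can be sketched in Python. This is an illustrative sketch: `flow_graph_edges`, the `(first, last)` block pairs and the `jumps` dictionary are assumed representations, and the block boundaries used here are the ones obtained by splitting the thirty-statement example at its leaders (1), (5), (9), (13), (14) and (23).

```python
def flow_graph_edges(blocks, jumps):
    """blocks: list of (first, last) statement numbers, in program order.
    jumps: statement number -> ("if" or "goto", target statement number).
    Returns sorted (from_block, to_block) pairs using 1-based block numbers.
    """
    block_of_start = {first: k for k, (first, _) in enumerate(blocks, start=1)}
    edges = set()
    for k, (_, last) in enumerate(blocks, start=1):
        kind, target = jumps.get(last, (None, None))
        if kind is not None:
            edges.add((k, block_of_start[target]))   # jump edge
        if kind != "goto" and k < len(blocks):
            edges.add((k, k + 1))                    # fall-through edge
    return sorted(edges)

blocks = [(1, 4), (5, 8), (9, 12), (13, 13), (14, 22), (23, 30)]
jumps = {8: ("if", 5), 12: ("if", 9), 13: ("if", 23), 22: ("goto", 5)}
edges = flow_graph_edges(blocks, jumps)
for source, destination in edges:
    print(f"B{source} -> B{destination}")
```

A conditional jump contributes both a jump edge and a fall-through edge, while the unconditional `goto (5)` at the end of B5 contributes only the jump edge.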


TRANSFORMATIONS ON BASIC BLOCKS
Transformations are useful for improving the quality of code generated
from a basic block.
Classification of local transformations-
1 STRUCTURE-PRESERVING TRANSFORMATIONS
2 ALGEBRAIC TRANSFORMATIONS

STRUCTURE-PRESERVING TRANSFORMATIONS
• The primary structure-preserving transformations on basic blocks are:
1. common sub-expression elimination
2. dead-code elimination
3. renaming of temporary variables
4. interchange of two independent adjacent statements



1. Common sub-expression elimination
Consider the basic block.
a:= b+c
b:= a-d
c:= b+c
d:= a-d

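The second and fourth statements compute the same expression, a-d, and
neither a, d nor b is redefined between them. (The first and third
statements both read b+c, but b changes in between, so they are not
common sub-expressions.) This basic block may therefore be transformed
into the equivalent block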

a:= b+c
b:= a-d
c:= b+c
d:= b
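A minimal sketch of local common sub-expression elimination is shown here. The function name `local_cse` and the tuple layout are assumptions; the sketch matches expressions textually and does not exploit commutativity.

```python
def local_cse(block):
    """block: list of (dst, left, op, right). Returns rewritten statements."""
    available = {}      # (left, op, right) -> name that currently holds it
    out = []
    for dst, left, op, right in block:
        key = (left, op, right)
        if key in available:
            out.append(f"{dst} := {available[key]}")      # reuse earlier value
        else:
            out.append(f"{dst} := {left} {op} {right}")
        # Assigning dst invalidates expressions that read dst or live in dst.
        available = {expr: holder for expr, holder in available.items()
                     if dst not in (expr[0], expr[2]) and holder != dst}
        # Record the expression unless dst is one of its own operands.
        if key not in available and dst not in (left, right):
            available[key] = dst
    return out

block = [("a", "b", "+", "c"), ("b", "a", "-", "d"),
         ("c", "b", "+", "c"), ("d", "a", "-", "d")]
result = local_cse(block)
for line in result:
    print(line)
```

On the four-statement block only the last statement is rewritten: the entry for b+c is discarded as soon as b is assigned, so the third statement is recomputed.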
2. Dead-code elimination
Suppose x is dead, that is, never subsequently used,
at the point where the statement x := y+z appears in a basic block.
Then this statement may be safely removed without changing the value
of the basic block.
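Dead-code elimination within one block can be sketched as a backward scan that tracks which names are still needed. The names `eliminate_dead_code` and `live_on_exit` are assumptions for this illustration, and the sketch assumes that statements have no side effects other than assigning their target.

```python
def eliminate_dead_code(block, live_on_exit):
    """block: list of (dst, operands). Drops assignments to dead names."""
    live = set(live_on_exit)
    kept = []
    for dst, operands in reversed(block):
        if dst in live:
            kept.append((dst, operands))
            live.discard(dst)          # this definition satisfies later uses
            live.update(operands)      # its operands are needed before it
        # otherwise dst is dead at this point and the statement is dropped
    kept.reverse()
    return kept

block = [("x", ("y", "z")), ("a", ("b", "c")), ("d", ("a", "b"))]
result = eliminate_dead_code(block, live_on_exit={"d"})
print(result)
```

Here x is never used after x := y+z and is not live on exit, so only the assignments to a and d survive.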


3. Renaming temporary variables
Suppose we have a statement t := b+c, where t is a temporary.
If we change this statement to u := b+c, where u is a new temporary
variable, and change all uses of this instance of t to u,
then the value of the basic block is not changed.
In this way a basic block can always be transformed into an equivalent
block in which each statement that defines a temporary defines a new
temporary. We call such a basic block a normal-form block.


4. Interchange of statements
Suppose we have a block with the two adjacent statements
t1:= b+c
t2:= x+y

• Then we can interchange the two statements without affecting the value
of the block if and only if neither x nor y is t1 and neither b nor c is t2.



ALGEBRAIC TRANSFORMATION
• Algebraic transformations can be used to change the set of expressions
computed by a basic block into an algebraically equivalent set.
Statements such as
x := x + 0
or
x := x * 1
can be eliminated from a basic block without changing the set of
expressions it computes.
The exponentiation operator in the statement
x := y ** 2
usually requires a function call to implement. Using an algebraic
transformation, this statement can be replaced by the cheaper but
equivalent statement
x := y * y
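These rewrites can be expressed as a small rule function. The name `simplify` and the tuple layout are assumptions for this sketch; it covers only the three identities named above.

```python
def simplify(stmt):
    """stmt: (dst, left, op, right) with string operands.

    Returns None when the statement can be deleted, otherwise a
    possibly cheaper equivalent statement.
    """
    dst, left, op, right = stmt
    # x := x + 0 and x := x * 1 leave x unchanged, so they can be dropped.
    if dst == left and (op, right) in {("+", "0"), ("*", "1")}:
        return None
    # x := y ** 2 becomes x := y * y, avoiding a call to a power routine.
    if op == "**" and right == "2":
        return (dst, left, "*", left)
    return stmt

block = [("x", "x", "+", "0"), ("x", "y", "**", "2"), ("z", "x", "*", "1")]
simplified = [s for s in map(simplify, block) if s is not None]
print(simplified)
```

Note that `z := x * 1` is kept: it is a copy from x to z, not a no-op, so only a statement whose target equals its left operand is deleted.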
Next use information

• needed for register and temporary allocation: a variable can be
removed from its register if it is not used again
• the statement X = Y op Z defines X and uses Y and Z
• scan each basic block backwards
• assume all temporaries are dead on exit and all user variables
are live on exit


Algorithm to compute next use information

• Suppose we reach statement i : X := Y op Z in the backward scan
1. attach to statement i the information currently in the symbol table
about the liveness and next use of X, Y, Z
2. in the symbol table, set X to "not live" and "no next use"
3. in the symbol table, set Y and Z to "live" and their next use to i

• Note that the order of steps (2) and (3) may not be
interchanged because X may be Y or Z.
• If three-address statement i is of the form X := Y or X := op Y,
the steps are the same as above, ignoring Z.


• Example
Consider the following basic block:
1: t1 = a * a
2: t2 = a * b
3: t3 = 2 * t2
4: t4 = t1 + t3
5: t5 = b * b
6: t6 = t4 + t5
7: X = t6
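The backward scan can be sketched in Python and applied to this seven-statement block. The function name `next_use_info`, the `(live, next_use)` pairs and the use of a plain dictionary in place of the symbol table are assumptions; constants are written as integers and ignored.

```python
def next_use_info(block, live_on_exit):
    """block: list of (x, y, z) for 'x = y op z'; z is None for copies.

    Returns, per statement, a dict name -> (live, next_use) describing the
    situation just after that statement. Statement numbers start at 1.
    """
    table = {}                                   # stands in for the symbol table

    def lookup(name):
        return table.get(name, (name in live_on_exit, None))

    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        x, y, z = block[i]
        names = [n for n in (x, y, z) if isinstance(n, str)]
        info[i] = {n: lookup(n) for n in names}  # step 1: attach current facts
        table[x] = (False, None)                 # step 2: x not live, no next use
        for n in (y, z):                         # step 3: operands live, used at i
            if isinstance(n, str):
                table[n] = (True, i + 1)
    return info

block = [("t1", "a", "a"), ("t2", "a", "b"), ("t3", 2, "t2"), ("t4", "t1", "t3"),
         ("t5", "b", "b"), ("t6", "t4", "t5"), ("X", "t6", None)]
info = next_use_info(block, live_on_exit={"a", "b", "X"})
for number, facts in enumerate(info, start=1):
    print(number, facts)
```

Step 2 runs before step 3 so that a statement such as x := x + 1 leaves x marked live with its next use at that statement.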




A simple Code Generator
• Considers each three-address statement in turn
• uses descriptors to keep track of register contents and the addresses
of names.
1. Register descriptor
- Keeps track of what is currently in each register.
- Initially all the registers are empty.
2. Address descriptor
- Keeps track of the location(s) where the current value of a name
can be found at run time.
- The location might be a register, a stack location, or a memory address.
Example of RD & AD Usage

3-address code:
x = z * 2
y = x
a = a + 10
t = a + b

Memory: x: 5 => 14, y: 10, a: 20, b: 30, z: 7

Register descriptor
Initially R1, R2, R3 = Empty
R1: x, y (value 14)
R2: a (value 30)
R3: b (value 30)

Address descriptor
x: R1
y: R1
a: R2
b: R3, Memory
Code Generation Algorithm
For each three-address statement X = Y op Z do:

1. Invoke a function getreg( ) to determine the location L where the
result X must be stored. Usually L is a register.

2. Consult the address descriptor of Y to determine Y', one of the
current locations of Y. Prefer a register for Y'.
If the value of Y is not already in L, generate
MOV Y', L

3. Generate the instruction
OP Z', L
where Z' is one of the current locations of Z. Again prefer a register
for Z'.
Update the address descriptor of X to indicate that X is in L.
If L is a register, update its descriptor to indicate that it
contains X, and remove X from all other register descriptors.

4. If the current values of Y and/or Z have no next use, are dead on
exit from the block, and are in registers, change the register
descriptors to indicate that those registers no longer contain Y and/or Z.
Function getreg( ) for X = Y op Z
1. If Y is in a register that holds no other values, and
Y is not live and has no next use after X = Y op Z,
then return the register of Y for L
and update the address descriptor of Y to indicate that Y is no longer in L.
2. Failing (1), return an empty register if there is one.
3. Failing (2), if X has a next use in the block or
op requires a register (such as indexing), then
get an occupied register R, store its contents into a memory location M
(by Mov R, M) and use it. (Select the register that needs the fewest spills.)
4. Else select the memory location of X as L.

Examples for x = y + z:
1) RD: R1: y, R2: z; AD: y: R1
getreg returns R1 and the AD becomes y: _
Code: ADD R2, R1; afterwards RD: R1: x
2) x = y + z … v = y + z (y is used again, so R1 cannot be reused)
getreg returns the empty register R3
Code: Mov R1, R3 , ADD R2, R3
3) RD: R1: y, R2: z, R3: v, R4: s, t (no register is empty)
getreg returns R3 or R4
Example: use the simple code generator algorithm to produce code
for the assignment
d := (a - b) + (a - c) + (a - c)

It might be translated into the following three-address code sequence:

t1 := a - b
t2 := a - c
t3 := t1 + t2
d := t3 + t2

with d live at the end.
Assumptions: a, b, c are always in memory;
t1, t2, t3, being temporaries, are not in memory unless
we explicitly store their values with a MOV instruction.
Available registers: R0, R1.
For example, the assignment d := (a - b) + (a - c) + (a - c)

Stmt           Code generated   Register descriptor   Address descriptor

t1 = a - b     MOV a, R0        R0 contains t1        t1 in R0
               SUB b, R0

t2 = a - c     MOV a, R1        R0 contains t1        t1 in R0
               SUB c, R1        R1 contains t2        t2 in R1

t3 = t1 + t2   ADD R1, R0       R0 contains t3        t3 in R0
                                R1 contains t2        t2 in R1

d = t3 + t2    ADD R1, R0       R0 contains d         d in R0
               MOV R0, d                              d in R0 and memory
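The worked example can be reproduced with a simplified sketch of the algorithm in Python. Several simplifications are assumptions of this sketch and not part of the slides: spilling (getreg cases 3 and 4) is omitted, "no next use" is approximated by "not read by a later statement and not live on exit", and all names (`generate`, `needed_after`, and so on) are illustrative.

```python
OPCODES = {"+": "ADD", "-": "SUB"}

def generate(block, live_on_exit, registers=("R0", "R1")):
    """block: list of (x, y, op, z) for x = y op z. Returns target code."""
    reg_desc = {r: set() for r in registers}   # register -> names it holds
    addr_desc = {}                             # name -> set of locations
    code = []

    def locations(name):
        # Names never assigned in this block live in memory under their own name.
        return addr_desc.get(name, {name})

    def best(name):
        regs = sorted(loc for loc in locations(name) if loc in reg_desc)
        return regs[0] if regs else name       # prefer a register

    def needed_after(name, i):
        later = any(name in (y2, z2) for _, y2, _, z2 in block[i + 1:])
        return later or name in live_on_exit

    def getreg(y, i):
        for r in registers:                    # case 1: reuse Y's register
            if reg_desc[r] == {y} and not needed_after(y, i):
                return r
        for r in registers:                    # case 2: an empty register
            if not reg_desc[r]:
                return r
        raise NotImplementedError("spilling is omitted in this sketch")

    for i, (x, y, op, z) in enumerate(block):
        L = getreg(y, i)
        if L not in locations(y):
            code.append(f"MOV {best(y)}, {L}")
        code.append(f"{OPCODES[op]} {best(z)}, {L}")
        for name in reg_desc[L]:               # old contents of L are gone
            addr_desc[name] = locations(name) - {L}
        for r in registers:
            reg_desc[r].discard(x)
        reg_desc[L] = {x}
        addr_desc[x] = {L}
        for name in (y, z):                    # step 4: free dead operands
            if name != x and not needed_after(name, i):
                for r in registers:
                    if name in reg_desc[r]:
                        reg_desc[r].discard(name)
                        addr_desc[name] = locations(name) - {r}

    for name in sorted(live_on_exit):          # store live names held only in registers
        if name in addr_desc and name not in addr_desc[name]:
            code.append(f"MOV {best(name)}, {name}")
            addr_desc[name].add(name)
    return code

block = [("t1", "a", "-", "b"), ("t2", "a", "-", "c"),
         ("t3", "t1", "+", "t2"), ("d", "t3", "+", "t2")]
code = generate(block, live_on_exit={"d"})
print("\n".join(code))
```

With two registers the sketch emits the same seven instructions as the table; a block that needs a third value in a register would hit the `NotImplementedError`, which is where a real getreg would choose a register to spill.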
Peephole Optimization

• target code often contains redundant instructions
and suboptimal constructs

• examine a short sequence of target instructions
(the peephole) and replace it by a shorter or faster
sequence

• the peephole is a small moving window on the target
program


Peephole Optimization
Characteristic peephole transformations
1. redundant instruction elimination (including unreachable code)
2. flow of control optimizations
3. algebraic simplification
4. use of machine idioms


Peephole optimization examples…
Redundant loads and stores

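• Consider the code sequence

(1) Mov R0, a
(2) Mov a, R0

• Instruction (2) can be removed, because (1) guarantees that the value
of a is already in R0, provided that (2) does not have a label. If (2)
had a label, control could reach it without executing (1) first.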
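A two-instruction peephole for this pattern can be sketched as follows. The tuple layout `(label, opcode, src, dst)` and the function name are assumptions for this illustration; the sketch also assumes ordinary memory, where storing and reloading a value has no other effect.

```python
def remove_redundant_moves(instructions):
    """instructions: list of (label, opcode, src, dst); label is None if absent."""
    result = []
    for instr in instructions:
        label, opcode, src, dst = instr
        if result and label is None and opcode == "MOV":
            _, prev_opcode, prev_src, prev_dst = result[-1]
            # 'MOV R0, a' followed by an unlabeled 'MOV a, R0' copies the
            # value straight back, so the second instruction is dropped.
            if prev_opcode == "MOV" and (prev_src, prev_dst) == (dst, src):
                continue
        result.append(instr)
    return result

program = [(None, "MOV", "R0", "a"), (None, "MOV", "a", "R0"),
           (None, "ADD", "b", "R0")]
labeled = [(None, "MOV", "R0", "a"), ("L1", "MOV", "a", "R0")]
print(remove_redundant_moves(program))
print(remove_redundant_moves(labeled))
```

The labeled variant is left untouched, since a jump to L1 could arrive with a different value in R0.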


Peephole optimization examples…
Unreachable code
• Consider the following code sequence
#define debug 0
if (debug) {
print debugging info
}
This may be translated as
if debug = 1 goto L1
goto L2
L1: print debugging info
L2:

Eliminating the jump over a jump gives
if debug <> 1 goto L2
print debugging information
L2:
Unreachable code example …
Constant propagation replaces debug by its value 0:
if 0 <> 1 goto L2
print debugging information
L2:

Evaluate the boolean expression. Since the condition 0 <> 1 is always
true, the code becomes
goto L2
print debugging information
L2:

The print statement is now unreachable, and the remaining goto L2 jumps
to the very next statement. Therefore, the code becomes
L2:
Peephole optimization examples…
• flow of control: replace the jump sequence

goto L1
…
L1: goto L2

by

goto L2
…
L1: goto L2

• Simplify algebraic expressions

remove x := x+0 or x := x*1
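The jump-to-jump replacement can be sketched in Python. The instruction layout `(label, op, target)` and the name `thread_jumps` are assumptions for this illustration, and only unconditional jumps are redirected.

```python
def thread_jumps(instructions):
    """Redirect 'goto L1' to L2 whenever the code at L1 is itself 'goto L2'."""
    goto_at = {label: target for label, op, target in instructions
               if label is not None and op == "goto"}
    result = []
    for label, op, target in instructions:
        if op == "goto":
            seen = set()
            # Follow chains of jumps; 'seen' stops on a cycle such as L: goto L.
            while target in goto_at and target not in seen:
                seen.add(target)
                target = goto_at[target]
        result.append((label, op, target))
    return result

program = [(None, "goto", "L1"), (None, "print", None),
           ("L1", "goto", "L2"), ("L2", "halt", None)]
threaded = thread_jumps(program)
print(threaded)
```

The instruction at L1 stays in place; it can be removed by a later pass only if no other jump targets L1 and control cannot fall into it.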


Peephole optimization examples…
• Strength reduction
– Replace X^2 by X*X
– Replace multiplication by a power of two by a left shift
– Replace division by a power of two by a right shift

• Use faster machine instructions (machine idioms)
replace Add #1, R
by Inc R
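A minimal sketch of the multiplication rule is given here, with `reduce_strength` and the tuple layout as assumed names. Division is deliberately left out: a right shift matches division by a power of two for non-negative values, but for negative values it differs from the truncating division used by languages such as C.

```python
def reduce_strength(stmt):
    """stmt: (dst, left, op, right); right may be an int constant.

    Rewrites multiplication by a power of two (2, 4, 8, ...) as a left shift.
    """
    dst, left, op, right = stmt
    is_power_of_two = (isinstance(right, int) and right > 1
                       and right & (right - 1) == 0)
    if op == "*" and is_power_of_two:
        return (dst, left, "<<", right.bit_length() - 1)
    return stmt

print(reduce_strength(("x", "y", "*", 8)))
print(reduce_strength(("x", "y", "*", 6)))
```

The test `right & (right - 1) == 0` holds exactly when a positive integer has a single bit set, and `bit_length() - 1` gives the position of that bit, which is the shift amount.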


Generating code from DAG
Consider the following basic block

t1 = a + b
t2 = c + d
t3 = e - t2
t4 = t1 - t3

and its DAG:

t4: -  with children t1 and t3
t1: +  with children a and b
t3: -  with children e and t2
t2: +  with children c and d
Code generated for the basic block in its original order
(assuming only two registers are available)

MOV a, R0
ADD b, R0
MOV c, R1
ADD d, R1
MOV R0, t1
MOV e, R0
SUB R1, R0
MOV t1, R1
SUB R0, R1
MOV R1, t4
Generating code from DAG contd..

Rearranged order of the statements

t2 = c + d
t3 = e - t2
t1 = a + b
t4 = t1 - t3


Rearranging order …
Rearranging the code as
t2 = c + d
t3 = e - t2
t1 = a + b
t4 = t1 - t3
gives
MOV c, R0
ADD d, R0
MOV e, R1
SUB R0, R1
MOV a, R0
ADD b, R0
SUB R1, R0
MOV R0, t4
Rearranging order …
Code for the DAG, assuming only two registers are available:

Original order        Rearranged order
t1 = a + b            t2 = c + d
t2 = c + d            t3 = e - t2
t3 = e - t2           t1 = a + b
t4 = t1 - t3          t4 = t1 - t3

MOV a, R0             MOV c, R0
ADD b, R0             ADD d, R0
MOV c, R1             MOV e, R1
ADD d, R1             SUB R0, R1
MOV R0, t1            MOV a, R0
MOV e, R0             ADD b, R0
SUB R1, R0            SUB R1, R0
MOV t1, R1            MOV R0, t4
SUB R0, R1
MOV R1, t4

The rearranged order needs 8 instructions instead of 10, because t1 no
longer has to be stored to memory and reloaded.
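That both instruction sequences compute the same t4 can be checked with a tiny interpreter for the two-operand instructions used in these slides (`OP source, destination`). The interpreter, its single shared namespace for registers and memory, and the sample input values are assumptions made for this sketch.

```python
def run(program, memory):
    """Execute 'MOV/ADD/SUB src, dst' lines and return the final state."""
    state = dict(memory)            # registers and memory share one namespace
    for line in program:
        opcode, operands = line.split(maxsplit=1)
        src, dst = [part.strip() for part in operands.split(",")]
        if opcode == "MOV":
            state[dst] = state[src]
        elif opcode == "ADD":
            state[dst] = state[dst] + state[src]
        elif opcode == "SUB":
            state[dst] = state[dst] - state[src]
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return state

original = ["MOV a, R0", "ADD b, R0", "MOV c, R1", "ADD d, R1", "MOV R0, t1",
            "MOV e, R0", "SUB R1, R0", "MOV t1, R1", "SUB R0, R1", "MOV R1, t4"]
rearranged = ["MOV c, R0", "ADD d, R0", "MOV e, R1", "SUB R0, R1",
              "MOV a, R0", "ADD b, R0", "SUB R1, R0", "MOV R0, t4"]

memory = {"a": 7, "b": 5, "c": 3, "d": 2, "e": 11}
print(run(original, memory)["t4"], len(original))
print(run(rearranged, memory)["t4"], len(rearranged))
```

For these inputs t4 = (7 + 5) - (11 - (3 + 2)) = 6 in both cases, with ten instructions for the original order and eight for the rearranged one.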
