Lecture Notes On Loop-Invariant Code Motion
Lecture 17
1 Introduction
In this lecture we discuss an important instance of Partial Redundancy
Elimination (PRE) and aspects that will be the basis for almost every
subsequent higher-end optimization. More information can be found in
[App98, Ch 18.1-18.3] and [Muc97].
2 Loop-Invariant Code Motion
An expression is loop invariant if its value does not change while the
loop executes. We detect loop-invariant expressions e, written inv(e),
with the following rules, where def(l, xi) means that xi is defined at
instruction l:

n literal          def(l, xi)   l outside loop          inv(a)   inv(b)
--------- LI0      -------------------------- LI1      --------------- LI2
 inv(n)                    inv(xi)                        inv(a ⊕ b)
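Iterated to a fixpoint, the rules LI0-LI2 mark all loop-invariant definitions. Here is a minimal Python sketch on a hypothetical IR where each instruction in the loop body is a pair (dest, operands); the representation and names are illustrative, not from the notes:

```python
# Hypothetical IR: each instruction is (dest, operands); an operand is a
# variable name or an integer literal. Rules LI0-LI2 run to a fixpoint.
def loop_invariant(body, defined_outside):
    defs_in_loop = {dest for dest, _ in body}
    inv = set()  # variables known to hold loop-invariant values

    def inv_operand(x):
        if isinstance(x, int):          # LI0: literals are invariant
            return True
        if x not in defs_in_loop:       # LI1: only defined outside the loop
            return x in defined_outside
        return x in inv                 # defined in the loop: need inv already

    changed = True
    while changed:
        changed = False
        for dest, operands in body:
            # LI2: d = a (+) b is invariant if all operands are invariant
            if dest not in inv and all(inv_operand(x) for x in operands):
                inv.add(dest)
                changed = True
    return inv
```

For the loop body i = i + 1; d = a ⊕ b; t = d ⊕ 4 with a and b defined outside, the sketch marks d and t but not i.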
We want to move a loop-invariant computation j = loopinv out of the
loop. Turn
while (e) {
j = loopinv
S
}
into
j = loopinv
while (e) {
S
}
a Good:
L0: d = 0
    d = a ⊕ b
L1: i = i + 1
    M[i] = d
    if (i<N) goto L1
L2: x = d

b Bad:
L0: d = 0
    d = a ⊕ b
L1: if (i>=N) goto L2
    i = i + 1
    M[i] = d
    goto L1
L2: x = d

c Bad:
L0: d = 0
L1: i = i + 1
    d = a ⊕ b
    M[i] = d
    d = 0
    M[j] = d
    if (i<N) goto L1
L2: x = d

d Bad:
L0: d = 0
L1: M[j] = d
    i = i + 1
    d = a ⊕ b
    M[i] = d
    if (i<N) goto L1
L2:

Figure 1: Good and bad examples for code motion of the loop-invariant
computation d = a ⊕ b in non-SSA form. a: good. b: bad, because d is
used after the loop, yet d must not change if the loop iterates 0 times.
c: bad, because d is reassigned in the loop body, so the moved
computation would be killed. d: bad, because the initial d is used in
the loop body before d = a ⊕ b is computed.
We may move a loop-invariant definition d = a ⊕ b before the loop if
1. the definition of d dominates all loop exits (violated in Figure 1b),
2. and d is only defined once in the loop body (violated in Figure 1c),
3. and d is not live after the block before the loop (violated in Figure 1d).
Condition 2 is trivial in SSA. Condition 3 is simple, too: in SSA, d can
only be defined once, that definition is in the loop body, and thus d
cannot be live before the loop. Condition 1 holds in SSA if we make sure
that we do not assign to the same variable in unrelated parts of the SSA
graph (every variable assigned only once, statically and globally). The
definition does not generally need to dominate all loop exits in SSA
form, but if the variable is live at an exit, it will dominate that
exit. If it does not dominate one of the loop exits (and thus the
variable is not live after it), then loop-invariant code motion will
compute the expression in vain, but that still pays off if the loop
usually executes often.
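For the non-SSA case, the three side conditions can be sketched as a predicate over a toy representation of the loop body; the event-list encoding and the dominance flag are illustrative assumptions, not from the notes:

```python
# events: the loop body as a list of ("def", v) / ("use", v) events for
# the candidate variable d; dominates_exits stands in for a CFG check.
def may_hoist(d, events, dominates_exits):
    # Condition 1: the definition dominates all loop exits (Figure 1b);
    # supplied as a flag here since we carry no CFG in this sketch.
    if not dominates_exits:
        return False
    # Condition 2: d is defined exactly once in the body (Figure 1c).
    if sum(1 for kind, v in events if kind == "def" and v == d) != 1:
        return False
    # Condition 3 (Figure 1d): the value of d from before the loop must
    # not be used, i.e. the first occurrence of d in the body is its
    # definition, not a use.
    first = next(kind for kind, v in events if v == d)
    return first == "def"
```

On the Figure 1 examples, c fails the single-definition check and d fails the use-before-definition check.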
While-loops more often violate condition 1, because the loop body doesn’t
dominate the statements following the loop. A way around that is to turn
while-loops into repeat-until-loops by prefixing them with an if statement
testing if they will be executed at all. Turn
while (e) {
T
j = loopinv // does not dominate all loop exits
S
}
into
if (e) {
repeat {
T
j = loopinv // dominates all loop exits
S
} until (!e)
}
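To see that this rewrite preserves behavior while letting the hoisted assignment dominate all loop exits, here is a source-level Python sketch; Python has no repeat-until, so while True with a break stands in, and the invariant computation 10 * 10 is a made-up placeholder:

```python
def while_version(xs):
    # Original while loop: the invariant computation sits in the body.
    trace = []
    i = 0
    while i < len(xs):
        j = 10 * 10              # loop-invariant computation inside the loop
        trace.append(xs[i] + j)
        i += 1
    return trace

def inverted_hoisted_version(xs):
    # if + repeat-until form: the guard ensures the hoisted computation
    # runs only if the loop body executes at least once.
    trace = []
    i = 0
    if i < len(xs):              # does the loop run at all?
        j = 10 * 10              # hoisted; now dominates all loop exits
        while True:              # repeat { ... } until (!e)
            trace.append(xs[i] + j)
            i += 1
            if not (i < len(xs)):
                break
    return trace
```

Both versions produce the same trace, including for the zero-iteration case where the hoisted computation is skipped entirely.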
3 Finding Loops
In source code, loops are obvious. But how do we find them in an interme-
diate code representation? Initially, we can easily tag loops, because they
come from source code. Depending on the optimizations, this may become
a little more tricky, however, if previous optimizations aggressively
shuffled code around. More generally: how do we find where the loops are
in quite arbitrary intermediate code graphs?
We have already seen dominators in the theory behind SSA construc-
tion. There, the dominance frontier gives the minimal φ-node placement.
Here we are not really interested in the dominance frontier, just in the
dominator relation itself. We recap: a node d dominates a node n,
written d ≥ n, if every path from the entry node to n passes through d.
A back edge is an edge n → h whose target h dominates its source n
(h ≥ n). Near such a back edge there is a loop. But where exactly is it?
The natural loop of the back edge n → h consists of the header h
together with all nodes a that h dominates and that have a path from a
to the back edge source n without passing through h:
{a : h ≥ a and a → s1 → s2 → ... → sk → n with si ≠ h}
Note that the header does not uniquely identify the loop, because the same
header node could be the target of multiple back edges coming from a
branching structure into two natural loops. But the back edge uniquely
identifies its natural loop.
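The definition translates directly into a backward search from n that stops at h: collect every node from which n is reachable without crossing the header. A sketch over a successor-map CFG (the dictionary representation is an assumption for illustration):

```python
# Natural loop of a back edge n -> h: h and n themselves, plus all nodes
# that reach n without passing through h. We search backwards over
# predecessors, never expanding past the header h.
def natural_loop(cfg, n, h):
    preds = {v: set() for v in cfg}
    for u, succs in cfg.items():
        for v in succs:
            preds[v].add(u)
    loop = {h, n}
    stack = [n]
    while stack:
        a = stack.pop()
        if a == h:
            continue                 # do not walk past the header
        for p in preds[a]:
            if p not in loop:
                loop.add(p)
                stack.append(p)
    return loop
```

For a CFG entry → h → b → n with back edge n → h and exit edge h → exit, the natural loop of n → h is {h, b, n}; entry and exit stay outside.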
In loop optimization, it almost always makes sense to follow the golden
rule of optimizing inner loops first, because that is where most of the
execution time is spent.
4 Strength Reduction
The basic idea of strength reduction is to replace computationally expen-
sive operations by simpler ones that still have an equivalent effect. The
primary application is to simplify multiplication by index variables to ad-
ditions within loops. This optimization is crucial for computers where ad-
dition is a lot faster than multiplication and can gain a factor of 3 for nu-
merical programs.
The simplest instance of strength reduction turns a multiplication
operation x ∗ 2^n into a shift operation x << n. More tricky uses of
strength reduction occur frequently in loop traversals. Suppose we have
a programming language with two-dimensional array operations (or
equivalent array packing optimizations) occurring in a loop that
traverses an n × m array a. The address arithmetic for accessing a[i,j]
is more involved, because it uses the base address a of the array and
the size s of the base type to compute
a + i*m*s + j*s
t ← a;
e ← a + n*m*s;
if (t >= e) goto E;
L:
use *t ...;
t ← t + s;
if (t < e) goto L;
E:
This optimized version only needs one addition per loop iteration. It is
essentially based on the insight that a[i+1,j] is the same memory
location as a[i,j+m]. The optimization we have used here assumes that i
and j are not used otherwise in the loop body, so that their computation
can be eliminated. Otherwise, they stay.
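The equivalence of the two address computations can be checked directly. The following Python sketch contrasts the naive per-access arithmetic with the strength-reduced pointer walk; the names a, n, m, s follow the text, and the list-of-addresses framing is illustrative:

```python
def naive_addresses(a, n, m, s):
    # Naive traversal: two multiplications in every address computation.
    out = []
    for i in range(n):
        for j in range(m):
            out.append(a + i * m * s + j * s)
    return out

def reduced_addresses(a, n, m, s):
    # Strength-reduced traversal: one addition per iteration.
    out = []
    t = a
    e = a + n * m * s        # end address, computed once before the loop
    while t < e:
        out.append(t)        # use *t
        t += s
    return out
```

Both functions enumerate the same n*m addresses in the same row-major order, which is exactly the claim behind the pointer-based loop above.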
In order to perform this strength reduction, however, we need to know
which of the variables change linearly in the loop. It certainly would
be incorrect if there were a nonlinear change of i, as in i = i ∗ (i + 1).
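A conservative check for such linear (basic) induction variables might look as follows, on a hypothetical quadruple IR; only definitions of the form i = i + c with loop-invariant c qualify, and anything like i = i ∗ (i + 1) is rejected:

```python
# body: list of (dest, op, left, right) quadruples for the loop body;
# invariant: names whose value does not change in the loop. Integer
# literals count as invariant. This representation is illustrative.
def linear_induction_vars(body, invariant):
    def inv(x):
        return isinstance(x, int) or x in invariant

    cands = set()
    seen = set()
    for dest, op, left, right in body:
        linear = (op == "+" and ((left == dest and inv(right)) or
                                 (right == dest and inv(left))))
        if dest in seen:
            cands.discard(dest)   # multiple definitions: not a simple IV
        elif linear:
            cands.add(dest)
        seen.add(dest)
    return cands
```

For a body containing i = i + 1, k = k ∗ 2, and j = j + c with invariant c, only i and j are recognized as changing linearly.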
Quiz
1. inv(e) means that e is loop invariant. Will we detect all loop-invariant
e? Should we move all such e out of the loop? Could we move all such
e out of the loop?
3. Should we move all e outside the loop when inv(e) holds and e is
side-effect free?
10. Are there circumstances where we can optimize the code by moving
code into a loop as opposed to out of it?
12. Will the inv(e) analysis find all formulas e such that e is invariant
during loop execution? Does this solve the verification problem for
programs?
13. Consider the transformation from a while loop to a repeat-until loop
with an if around it. How many problems does that solve at the same
time?
15. Given that loop-invariant code motion has so many side conditions
(especially for non-SSA), should we do it at all?
16. Why can strength reduction have such a huge impact compared to
other optimizations? Give a natural practical example.
References
[App98] Andrew W. Appel. Modern Compiler Implementation in ML. Cambridge
University Press, Cambridge, England, 1998.
[Muc97] Steven S. Muchnick. Advanced Compiler Design and Implementation.
Morgan Kaufmann, San Francisco, CA, 1997.