Pumping Lemma for Context-Free Languages
Fall 2005 Costas Busch - RPI 1
Take an infinite context-free language
It generates an infinite number of different strings
Example:
$S \to ABE \mid bBd$
$A \to Aa \mid a$
$B \to bSD \mid cc$
$D \to Dd \mid d$
$E \to eE \mid e$
In a derivation of a long enough string, variables are repeated
A possible derivation:
$S \Rightarrow ABE \Rightarrow AaBE \Rightarrow aaBE \Rightarrow aabSDE \Rightarrow aabbBdDE$
$\Rightarrow aabbccdDE \Rightarrow aabbccddE \Rightarrow aabbccddeE \Rightarrow aabbccddee$
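A derivation like this one can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the names `GRAMMAR`, `is_one_step` and `is_valid_derivation` are chosen here for illustration, and uppercase letters are assumed to be the variables. It confirms that each sentential form follows from the previous one by rewriting a single variable.

```python
# Productions of the example grammar; uppercase letters are variables.
GRAMMAR = {
    "S": ["ABE", "bBd"],
    "A": ["Aa", "a"],
    "B": ["bSD", "cc"],
    "D": ["Dd", "d"],
    "E": ["eE", "e"],
}

def is_one_step(src, dst, grammar=GRAMMAR):
    """True if dst results from src by rewriting exactly one variable."""
    for pos, symbol in enumerate(src):
        for rhs in grammar.get(symbol, []):
            if src[:pos] + rhs + src[pos + 1:] == dst:
                return True
    return False

# The sentential forms of the derivation, in order.
DERIVATION = [
    "S", "ABE", "AaBE", "aaBE", "aabSDE", "aabbBdDE",
    "aabbccdDE", "aabbccddE", "aabbccddeE", "aabbccddee",
]

def is_valid_derivation(steps, grammar=GRAMMAR):
    """True if every consecutive pair of forms is a single derivation step."""
    return all(is_one_step(a, b, grammar) for a, b in zip(steps, steps[1:]))

print(is_valid_derivation(DERIVATION))  # True
```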
Derivation tree of $aabbccddee$
[Figure: derivation tree rooted at $S$ with yield $aabbccddee$; the repeated variable $B$ is marked]
$B \Rightarrow bSD \Rightarrow bbBdD \Rightarrow bbBdd$
[Figure: subtree rooted at the upper $B$, with yield $bbBdd$]
$B \Rightarrow^{*} bbBdd$
$S \Rightarrow ABE \Rightarrow AaBE \Rightarrow aaBE \Rightarrow aaBeE \Rightarrow aaBee$
[Figure: top part of the derivation tree, with yield $aaBee$]
$S \Rightarrow^{*} aaBee$
[Figure: subtree rooted at the lower $B$, with yield $cc$]
$B \Rightarrow cc$
Putting all together
[Figure: the full derivation tree of $aabbccddee$, split into its three parts]
$S \Rightarrow^{*} aaBee$, $B \Rightarrow^{*} bbBdd$, $B \Rightarrow cc$
Using $S \Rightarrow^{*} aaBee$ and $B \Rightarrow cc$:
$S \Rightarrow^{*} aaBee \Rightarrow aaccee = aa(bb)^{0}cc(dd)^{0}ee$
$aa(bb)^{0}cc(dd)^{0}ee \in L(G)$
We have removed the middle part
[Figure: derivation tree in which the upper $B$ derives $cc$ directly]
$S \Rightarrow^{*} aa(bb)^{0}cc(dd)^{0}ee$
Using $B \Rightarrow^{*} bbBdd$ twice:
$S \Rightarrow^{*} aaBee \Rightarrow^{*} aabbBddee \Rightarrow^{*} aa(bb)^{2}B(dd)^{2}ee \Rightarrow aa(bb)^{2}cc(dd)^{2}ee$
$aa(bb)^{2}cc(dd)^{2}ee \in L(G)$
We have repeated the middle part two times
[Figure: derivation tree with the subtree for $B \Rightarrow^{*} bbBdd$ stacked 2 times]
$S \Rightarrow^{*} aa(bb)^{2}cc(dd)^{2}ee$
Using $B \Rightarrow^{*} bbBdd$ three times:
$S \Rightarrow^{*} aa(bb)^{3}cc(dd)^{3}ee \in L(G)$
We have repeated the middle part three times
[Figure: derivation tree with the subtree for $B \Rightarrow^{*} bbBdd$ stacked 3 times]
$S \Rightarrow^{*} aa(bb)^{3}cc(dd)^{3}ee$
In general:
$S \Rightarrow^{*} aaBee$, $B \Rightarrow^{*} bbBdd$, $B \Rightarrow cc$
$S \Rightarrow^{*} aa(bb)^{i}cc(dd)^{i}ee \in L(G)$ for any $i \ge 0$
Repeat the middle part $i$ times
[Figure: derivation tree with the subtree for $B \Rightarrow^{*} bbBdd$ stacked $i$ times]
$S \Rightarrow^{*} aa(bb)^{i}cc(dd)^{i}ee$
From the grammar
$S \to ABE \mid bBd$
$A \to Aa \mid a$
$B \to bSD \mid cc$
$D \to Dd \mid d$
$E \to eE \mid e$
and the string $aabbccddee \in L(G)$
we inferred that a family of strings is in $L(G)$:
$S \Rightarrow^{*} aa(bb)^{i}cc(dd)^{i}ee \in L(G)$ for any $i \ge 0$
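The membership of this family in $L(G)$ can also be tested by brute force. The sketch below is an illustration added here, not the method of the slides: `in_language` is a hypothetical helper that does a memoized search over substrings, and it assumes a grammar with no unit-productions and no $\varepsilon$-productions (so every symbol yields at least one terminal, which is what makes the recursion terminate).

```python
from functools import lru_cache

# Productions of the example grammar; uppercase letters are variables.
GRAMMAR = {
    "S": ["ABE", "bBd"],
    "A": ["Aa", "a"],
    "B": ["bSD", "cc"],
    "D": ["Dd", "d"],
    "E": ["eE", "e"],
}

def in_language(w, grammar=GRAMMAR, start="S"):
    """Membership test; assumes no unit-productions and no empty productions."""

    @lru_cache(maxsize=None)
    def symbol_derives(sym, i, j):
        # Can the single symbol sym derive the substring w[i:j]?
        if sym not in grammar:                      # terminal symbol
            return j - i == 1 and w[i] == sym
        return any(seq_derives(rhs, i, j) for rhs in grammar[sym])

    @lru_cache(maxsize=None)
    def seq_derives(rhs, i, j):
        # Can the symbol sequence rhs derive w[i:j]?
        if len(rhs) > j - i:                        # each symbol yields >= 1 terminal
            return False
        if len(rhs) == 1:
            return symbol_derives(rhs, i, j)
        first, rest = rhs[0], rhs[1:]
        # Try every split point that leaves at least one terminal per remaining symbol.
        return any(
            symbol_derives(first, i, k) and seq_derives(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return symbol_derives(start, 0, len(w))

for i in range(4):
    s = "aa" + "bb" * i + "cc" + "dd" * i + "ee"
    print(i, s, in_language(s))
```

Each printed line should report `True`, matching the claim that $aa(bb)^{i}cc(dd)^{i}ee \in L(G)$ for the tested values of $i$.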
Arbitrary Grammar
Consider now an arbitrary infinite context-free language $L$
Let $G$ be a grammar for $L - \{\varepsilon\}$
Take $G$ so that it has no unit-productions and no $\varepsilon$-productions (if it has any, remove them)
Let $r$ be the number of variables
Let $t$ be the maximum size of the right-hand side of any production
Example:
$S \to ABE \mid bBd$
$A \to Aa \mid a$
$B \to bSD \mid cc$
$D \to Dd \mid d$
$E \to eE \mid e$
Here $r = 5$ and $t = 3$
Let $m = t^{r} + 1$
Take a string $w \in L(G)$ with length $|w| \ge m$
Claim: in the derivation tree of $w$ there is a path from the root to a leaf where a variable of $G$ is repeated
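For the example grammar, these quantities are easy to compute. The short Python sketch below is an added illustration; `grammar_parameters` is a name chosen here, and the grammar is encoded as a dictionary from variables to right-hand sides.

```python
# Productions of the example grammar; uppercase letters are variables.
GRAMMAR = {
    "S": ["ABE", "bBd"],
    "A": ["Aa", "a"],
    "B": ["bSD", "cc"],
    "D": ["Dd", "d"],
    "E": ["eE", "e"],
}

def grammar_parameters(grammar):
    """Return (r, t, m): number of variables, longest right-hand side, t**r + 1."""
    r = len(grammar)
    t = max(len(rhs) for alternatives in grammar.values() for rhs in alternatives)
    m = t ** r + 1
    return r, t, m

print(grammar_parameters(GRAMMAR))  # (5, 3, 244)
```

So for this grammar any string of length at least 244 is guaranteed by the claim to have a repeated variable on some path, although much shorter strings such as $aabbccddee$ already do.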
[Figure: derivation tree of $w$, rooted at $S$, with $|w| \ge m$]
We will show: some variable $H$ is repeated on a path
Proof of Claim:
We will show that the tree of $w$ has at least one path with $r + 2$ nodes
Suppose the opposite: every path has at most $r + 1$ nodes, so the tree has at most $r + 1$ levels (levels $0$ through $r$)
Maximum number of nodes per level (with at most $r + 1$ levels):
Level 0: $1$ node
Level 1: $t$ nodes
Level 2: $t^{2}$ nodes
Level $i$: $t^{i}$ nodes
Level $r$: $t^{r}$ nodes
Maximum possible string length = maximum number of nodes at level $r$ = $t^{r}$
Therefore, the maximum length of string $w$ is $|w| \le t^{r}$
However, we took $|w| \ge m = t^{r} + 1$
Contradiction!
Thus, there is a path from the root to a leaf with at least $r + 2$ nodes
[Figure: a path $V_{1} = S, V_{2}, V_{3}, \ldots, V_{r+1}$ ending in a terminal symbol: at least $r + 2$ levels, hence at least $r + 1$ variables]
Since there are at most $r$ different variables, some variable is repeated (pigeonhole principle)
[Figure: the path $V_{1} = S, V_{2}, \ldots, V_{r+1}$ with a variable $H$ occurring twice]
END OF PROOF
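The pigeonhole argument can be observed on the earlier example tree for $aabbccddee$. The following Python sketch is an added illustration: the nested-tuple encoding `TREE` and the helper `repeated_variables` are assumptions made here, not notation from the slides. It collects every variable that occurs twice on some root-to-leaf path.

```python
# A derivation tree as nested tuples (variable, [children]); leaves are terminal strings.
TREE = ("S", [
    ("A", [("A", ["a"]), "a"]),
    ("B", ["b", ("S", ["b", ("B", ["c", "c"]), "d"]), ("D", ["d"])]),
    ("E", ["e", ("E", ["e"])]),
])

def repeated_variables(node, ancestors=()):
    """Return the set of variables that appear twice on some root-to-leaf path."""
    if isinstance(node, str):           # terminal leaf: nothing to report
        return set()
    label, children = node
    found = {label} if label in ancestors else set()
    for child in children:
        found |= repeated_variables(child, ancestors + (label,))
    return found

print(sorted(repeated_variables(TREE)))  # ['A', 'B', 'E', 'S']
```

In this small tree several variables repeat, including the $B$ used in the example; only $D$ does not.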
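Take now a string $w$ with $|w| \ge m$
Some variable $H$ is repeated on a path
[Figure: derivation tree rooted at $S$ with two occurrences of $H$ on one path; the subtree of the upper $H$ is marked]
Take $H$ to be deep, so that every path in the subtree of the upper $H$ has unique variables (apart from the one repetition of $H$ itself)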
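$w = uvxyz$
[Figure: derivation tree rooted at $S$; $u$ and $z$ are the yield to the left and right of the upper $H$, $v$ and $y$ are the yield of the upper $H$ to the left and right of the lower $H$, and $x$ is the yield of the lower $H$]
$u, v, x, y, z$: strings of terminals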
Example ($B$ corresponds to $H$):
$u = aa$
$v = bb$
$x = cc$
$y = dd$
$z = ee$
[Figure: derivation tree of $aabbccddee$ with the parts $u, v, x, y, z$ marked]
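The decomposition can be read off the tree programmatically. The sketch below is an added illustration under assumed names (`yield_of`, `split_around`, and the nested-tuple tree encoding are not from the slides): it computes the yield to the left and right of a chosen subtree, first for the upper $B$ inside the whole tree, then for the lower $B$ inside the upper one.

```python
# Derivation tree of aabbccddee as nested tuples (variable, [children]).
LOWER_B = ("B", ["c", "c"])
UPPER_B = ("B", ["b", ("S", ["b", LOWER_B, "d"]), ("D", ["d"])])
TREE = ("S", [("A", [("A", ["a"]), "a"]), UPPER_B, ("E", ["e", ("E", ["e"])])])

def yield_of(node):
    """Concatenate the terminal leaves below node, left to right."""
    if isinstance(node, str):
        return node
    _, children = node
    return "".join(yield_of(child) for child in children)

def split_around(node, target):
    """Return (left, right) yields of node around the subtree target, or None."""
    if node is target:                  # compare by identity, not by value
        return "", ""
    if isinstance(node, str):
        return None
    _, children = node
    for idx, child in enumerate(children):
        inner = split_around(child, target)
        if inner is not None:
            left = "".join(yield_of(c) for c in children[:idx]) + inner[0]
            right = inner[1] + "".join(yield_of(c) for c in children[idx + 1:])
            return left, right
    return None

u, z = split_around(TREE, UPPER_B)      # yield outside the upper B
v, y = split_around(UPPER_B, LOWER_B)   # yield of the upper B outside the lower B
x = yield_of(LOWER_B)                   # yield of the lower B
print(u, v, x, y, z)  # aa bb cc dd ee
```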
Possible derivations:
$S \Rightarrow^{*} uHz$
$H \Rightarrow^{*} vHy$
$H \Rightarrow^{*} x$
Example ($B$ corresponds to $H$): $u = aa$, $v = bb$, $x = cc$, $y = dd$, $z = ee$
$S \Rightarrow^{*} uHz$, $H \Rightarrow^{*} vHy$, $H \Rightarrow^{*} x$
$S \Rightarrow^{*} aaBee$, $B \Rightarrow^{*} bbBdd$, $B \Rightarrow cc$
Using $S \Rightarrow^{*} uHz$, $H \Rightarrow^{*} vHy$ and $H \Rightarrow^{*} x$:
$i = 0$: $S \Rightarrow^{*} uHz \Rightarrow^{*} uxz = uv^{0}xy^{0}z \in L(G)$
$i = 1$: $S \Rightarrow^{*} uHz \Rightarrow^{*} uvHyz \Rightarrow^{*} uvxyz = uv^{1}xy^{1}z \in L(G)$ (the original $w = uvxyz$)
$i = 2$: $S \Rightarrow^{*} uHz \Rightarrow^{*} uvHyz \Rightarrow^{*} uvvHyyz \Rightarrow^{*} uvvxyyz = uv^{2}xy^{2}z \in L(G)$
$i = 3$: $S \Rightarrow^{*} uHz \Rightarrow^{*} uvHyz \Rightarrow^{*} uvvHyyz \Rightarrow^{*} uvvvHyyyz \Rightarrow^{*} uvvvxyyyz = uv^{3}xy^{3}z \in L(G)$
In general: $S \Rightarrow^{*} uHz \Rightarrow^{*} uvHyz \Rightarrow^{*} uvvHyyz \Rightarrow^{*} \cdots \Rightarrow^{*} uv^{i}Hy^{i}z \Rightarrow^{*} uv^{i}xy^{i}z \in L(G)$
Therefore, if we know that $w = uvxyz \in L(G)$, we also know that $uv^{i}xy^{i}z \in L(G)$ for all $i \ge 0$
Since $L(G) = L - \{\varepsilon\}$, it follows that $uv^{i}xy^{i}z \in L$
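As a small illustration of what pumping does to a concrete string, the Python sketch below (an addition, with `pump` a name chosen here) builds $uv^{i}xy^{i}z$ for the example decomposition $u = aa$, $v = bb$, $x = cc$, $y = dd$, $z = ee$.

```python
def pump(u, v, x, y, z, i):
    """Return the string u v^i x y^i z."""
    return u + v * i + x + y * i + z

# Example decomposition of aabbccddee, with B playing the role of H.
PARTS = ("aa", "bb", "cc", "dd", "ee")

for i in range(4):
    print(i, pump(*PARTS, i))
```

For $i = 1$ this reproduces the original string, and for $i = 0$ it gives $aaccee$.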
Observations:
$|vy| \ge 1$, since there are no unit or $\varepsilon$-productions
$|vxy| \le m$, since no variable is repeated in any path in the subtree of the upper $H$
[Figure: derivation tree rooted at $S$ with the parts $u, v, x, y, z$ and the subtree of the upper $H$ marked]
The Pumping Lemma:
For any infinite context-free language $L$
there exists an integer $m$ such that
for any string $w \in L$ with $|w| \ge m$
we can write $w = uvxyz$
with lengths $|vxy| \le m$ and $|vy| \ge 1$
and it must be that $uv^{i}xy^{i}z \in L$ for all $i \ge 0$
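When the lemma is used in a proof by contradiction, the order of the quantifiers matters: $m$ and the decomposition are given to us, while $w$ and $i$ are ours to choose. As a reading aid (an added restatement, not a new result), the statement can be written with explicit quantifiers:

```latex
\[
\exists m \;\; \forall w \in L \text{ with } |w| \ge m \;\;
\exists u, v, x, y, z :\;
w = uvxyz,\;\; |vxy| \le m,\;\; |vy| \ge 1,\;\;
\text{and } \forall i \ge 0 :\; uv^{i}xy^{i}z \in L
\]
```

To show that a language is not context-free, it therefore suffices to pick one suitable $w$ for the given $m$ and show that every allowed decomposition fails for some $i$.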
Applications of the Pumping Lemma
Non-context-free languages include $\{a^{n}b^{n}c^{n} : n \ge 0\}$
Context-free languages include $\{a^{n}b^{n} : n \ge 0\}$
Theorem: The language $L = \{a^{n}b^{n}c^{n} : n \ge 0\}$ is not context-free
Proof: Use the Pumping Lemma for context-free languages
Assume for contradiction that $L$ is context-free
Since $L$ is context-free and infinite, we can apply the Pumping Lemma
The Pumping Lemma gives a magic number $m$
We may pick any string $w \in L$ with length $|w| \ge m$
We pick: $w = a^{m}b^{m}c^{m}$
We can write $w = uvxyz$ with lengths $|vxy| \le m$ and $|vy| \ge 1$
The Pumping Lemma says: $uv^{i}xy^{i}z \in L$ for all $i \ge 0$
We examine all the possible locations of string $vxy$ in $w$
Case 1: $vxy$ is within $a^{m}$
[Figure: $w = a^{m}b^{m}c^{m}$ with $vxy$ inside the block of $a$'s]
$v$ and $y$ consist of only $a$'s
Repeating $v$ and $y$ adds $k = |vy| \ge 1$ more $a$'s
From the Pumping Lemma: $uv^{2}xy^{2}z \in L$
However: $uv^{2}xy^{2}z = a^{m+k}b^{m}c^{m} \notin L$, since $k \ge 1$
Contradiction!
Case 2: $vxy$ is within $b^{m}$
[Figure: $w = a^{m}b^{m}c^{m}$ with $vxy$ inside the block of $b$'s]
Similar analysis to Case 1
Case 3: $vxy$ is within $c^{m}$
[Figure: $w = a^{m}b^{m}c^{m}$ with $vxy$ inside the block of $c$'s]
Similar analysis to Case 1
Case 4: $vxy$ overlaps $a^{m}$ and $b^{m}$
[Figure: $w = a^{m}b^{m}c^{m}$ with $vxy$ spanning the boundary between the $a$'s and the $b$'s]
Possibility 1: $v$ contains only $a$'s and $y$ contains only $b$'s
Let $k_{1} = |v|$ and $k_{2} = |y|$, with $k_{1} + k_{2} \ge 1$
From the Pumping Lemma: $uv^{2}xy^{2}z \in L$
However: $uv^{2}xy^{2}z = a^{m+k_{1}}b^{m+k_{2}}c^{m} \notin L$
Contradiction!
Case 4, Possibility 2: $v$ contains both $a$'s and $b$'s, and $y$ contains only $b$'s
From the Pumping Lemma: $uv^{2}xy^{2}z \in L$
However: $uv^{2}xy^{2}z = a^{m}b^{k_{1}}a^{k_{2}}b^{m+k}c^{m} \notin L$
where $k_{1} \ge 1$ is the number of $b$'s in $v$, $k_{2} \ge 1$ is the number of $a$'s in $v$, and $k = |y|$; the pumped string has $a$'s after $b$'s
Contradiction!
Case 4, Possibility 3: $v$ contains only $a$'s, and $y$ contains both $a$'s and $b$'s
Similar analysis to Possibility 2
Case 5: $vxy$ overlaps $b^{m}$ and $c^{m}$
[Figure: $w = a^{m}b^{m}c^{m}$ with $vxy$ spanning the boundary between the $b$'s and the $c$'s]
Similar analysis to Case 4
There are no other cases to consider
(since $|vxy| \le m$, string $vxy$ cannot overlap $a^{m}$, $b^{m}$ and $c^{m}$ at the same time)
In all cases we obtained a contradiction
Therefore: the original assumption that $L = \{a^{n}b^{n}c^{n} : n \ge 0\}$ is context-free must be wrong
Conclusion: $L$ is not context-free
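The case analysis can be sanity-checked by exhaustive search for small values of $m$. The Python sketch below is an added illustration, not a substitute for the proof, since it only tests the specific values of $m$ it is run with; the helper names `in_L`, `decompositions`, `pump` and `every_split_fails` are chosen here. It enumerates every decomposition $w = uvxyz$ of $w = a^{m}b^{m}c^{m}$ with $|vxy| \le m$ and $|vy| \ge 1$ and checks that pumping with $i = 0$ or $i = 2$ leaves the language.

```python
def in_L(s):
    """Membership in L = { a^n b^n c^n : n >= 0 }."""
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n

def decompositions(w, m):
    """Yield every (u, v, x, y, z) with w = uvxyz, |vxy| <= m and |vy| >= 1."""
    n = len(w)
    for start in range(n + 1):                            # u = w[:start]
        for end in range(start, min(start + m, n) + 1):   # vxy = w[start:end]
            for p in range(start, end + 1):               # v = w[start:p]
                for q in range(p, end + 1):               # x = w[p:q], y = w[q:end]
                    v, y = w[start:p], w[q:end]
                    if len(v) + len(y) >= 1:
                        yield w[:start], v, w[p:q], y, w[end:]

def pump(u, v, x, y, z, i):
    """Return the string u v^i x y^i z."""
    return u + v * i + x + y * i + z

def every_split_fails(m):
    """True if each allowed decomposition of a^m b^m c^m pumps out of L."""
    w = "a" * m + "b" * m + "c" * m
    return all(
        any(not in_L(pump(*parts, i)) for i in (0, 2))
        for parts in decompositions(w, m)
    )

for m in range(1, 5):
    print(m, every_split_fails(m))
```

For each tested $m$ the check should print `True`, in agreement with Cases 1 through 5.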