Lexical Analysis
Converting NFAs to DFAs (subset construction)
• Idea: Each state in the new DFA will correspond to
some set of states from the NFA. The DFA will be in
state {s0,s1,…} after reading an input if the NFA could be
in any of these states after reading the same input.
• Input: NFA N with state set SN, alphabet Σ, start state
sN, final states FN, transition function TN: SN × (Σ ∪ {ε}) → 2^SN
(each move yields a set of NFA states)
• Output: DFA D with state set SD, alphabet Σ, start state
sD = ε-closure({sN}), final states FD, transition function
TD: SD × Σ → SD
Terminology: ε-closure
ε-closure(T) = T plus all NFA states reachable from any state in T
using only ε-transitions.
[Figure: example NFA with states 1 to 5, including two ε-transitions]
For this NFA:
ε-closure({1,2,5}) = {1,2,5}
ε-closure({4}) = {1,4}
ε-closure({3}) = {1,3,4}
ε-closure({3,5}) = {1,3,4,5}
Illustrating Conversion – An Example
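Start with the NFA for (a | b)*abb:
[Figure: NFA with states 0 to 10; start state 0, final state 10]
ε-moves: 0→1, 0→7, 1→2, 1→4, 3→6, 5→6, 6→1, 6→7
Moves on a: 2→3, 7→8
Moves on b: 4→5, 8→9, 9→10
First we calculate ε-closure(0), the closure of the start state 0:
ε-closure(0) = {0, 1, 2, 4, 7} (all states reachable from 0
on ε-moves)
Let A={0, 1, 2, 4, 7} be a state of new DFA, D.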
Conversion Example – continued (1)
2nd, we calculate a : ε-closure(move(A,a)) and
b : ε-closure(move(A,b))
a : ε-closure(move(A,a)) = ε-closure(move({0,1,2,4,7},a))
move({0,1,2,4,7},a) = {3,8} (since move(2,a)=3 and move(7,a)=8)
From this we have: ε-closure({3,8}) = {1,2,3,4,6,7,8}
(since 3→6→1→4, 6→7, and 1→2 all by ε-moves)
Let B={1,2,3,4,6,7,8} be a new state. Define Dtran[A,a] = B.
b : ε-closure(move(A,b)) = ε-closure(move({0,1,2,4,7},b))
move({0,1,2,4,7},b) = {5} (since move(4,b)=5)
From this we have: ε-closure({5}) = {1,2,4,5,6,7}
(since 5→6→1→4, 6→7, and 1→2 all by ε-moves)
Let C={1,2,4,5,6,7} be a new state. Define Dtran[A,b] = C.
Conversion Example – continued (2)
3rd, we calculate for state B on {a,b}
a : ε-closure(move(B,a)) = ε-closure(move({1,2,3,4,6,7,8},a))
= ε-closure({3,8}) = {1,2,3,4,6,7,8} = B
Define Dtran[B,a] = B.
b : ε-closure(move(B,b)) = ε-closure(move({1,2,3,4,6,7,8},b))
= ε-closure({5,9}) = {1,2,4,5,6,7,9} = D
Define Dtran[B,b] = D.
4th, we calculate for state C on {a,b}
a : ε-closure(move(C,a)) = ε-closure(move({1,2,4,5,6,7},a))
= ε-closure({3,8}) = {1,2,3,4,6,7,8} = B
Define Dtran[C,a] = B.
b : ε-closure(move(C,b)) = ε-closure(move({1,2,4,5,6,7},b))
= ε-closure({5}) = {1,2,4,5,6,7} = C
Define Dtran[C,b] = C.
Conversion Example – continued (3)
5th, we calculate for state D on {a,b}
a : ε-closure(move(D,a)) = ε-closure(move({1,2,4,5,6,7,9},a))
= ε-closure({3,8}) = {1,2,3,4,6,7,8} = B
Define Dtran[D,a] = B.
b : ε-closure(move(D,b)) = ε-closure(move({1,2,4,5,6,7,9},b))
= ε-closure({5,10}) = {1,2,4,5,6,7,10} = E
Define Dtran[D,b] = E.
Finally, we calculate for state E on {a,b}
a : ε-closure(move(E,a)) = ε-closure(move({1,2,4,5,6,7,10},a))
= ε-closure({3,8}) = {1,2,3,4,6,7,8} = B
Define Dtran[E,a] = B.
b : ε-closure(move(E,b)) = ε-closure(move({1,2,4,5,6,7,10},b))
= ε-closure({5}) = {1,2,4,5,6,7} = C
Define Dtran[E,b] = C.
Conversion Example – continued (4)
This gives the transition table Dtran for the DFA:

            Input Symbol
Dstates     a       b
A           B       C
B           B       D
C           B       C
D           B       E
E           B       C

[Figure: transition diagram of this DFA, with start state A and accepting state E (the only subset containing NFA state 10)]
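The table can be exercised directly as a DFA. The following is only a minimal sketch in Python: the names `DTRAN`, `START`, `ACCEPTING`, and `accepts` are invented for illustration, and E is assumed to be the single accepting state because it is the only subset that contains NFA state 10.

```python
# Transition table Dtran from the example, keyed by (state, symbol).
DTRAN = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}
START = "A"
ACCEPTING = {"E"}  # assumption: E is the only subset containing NFA state 10


def accepts(text: str) -> bool:
    """Run the DFA over text; reject on any symbol outside {a, b}."""
    state = START
    for symbol in text:
        if (state, symbol) not in DTRAN:
            return False
        state = DTRAN[(state, symbol)]
    return state in ACCEPTING


if __name__ == "__main__":
    for word in ["abb", "aabb", "babb", "ab", "abba", ""]:
        print(repr(word), accepts(word))
```

Strings ending in abb should be accepted and all others rejected, which matches the language (a | b)*abb.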
Algorithm For Subset Construction
Computing the ε-closure of T:
push all states in T onto stack;
initialize ε-closure(T) to T;
while stack is not empty do begin
    pop t, the top element, off the stack;
    for each state u with an edge from t to u labeled ε do
        if u is not in ε-closure(T) then begin
            add u to ε-closure(T);
            push u onto stack
        end
end
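The stack-based procedure above might be written in Python roughly as follows. This is a sketch under assumptions: the dictionary `EPS` is a hypothetical encoding of the ε-moves of the (a | b)*abb NFA from the earlier example, and `epsilon_closure` is an illustrative name.

```python
# Hypothetical encoding of the ε-moves of the (a|b)*abb NFA:
# EPS[s] lists the states reachable from s by a single ε-move.
EPS = {
    0: [1, 7],
    1: [2, 4],
    3: [6],
    5: [6],
    6: [1, 7],
}


def epsilon_closure(states, eps=EPS):
    """Return all states reachable from `states` using only ε-moves."""
    closure = set(states)          # initialize ε-closure(T) to T
    stack = list(states)           # push all states in T onto the stack
    while stack:
        t = stack.pop()
        for u in eps.get(t, []):   # each ε-edge t -> u
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure


if __name__ == "__main__":
    print(sorted(epsilon_closure({0})))
    print(sorted(epsilon_closure({3, 8})))
    print(sorted(epsilon_closure({5})))
```

Each state is pushed at most once, so the loop does work proportional to the number of states plus ε-edges.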
Algorithm For Subset Construction – (2)
initially, ε-closure(s0) is the only (unmarked) state in Dstates;
while there is an unmarked state T in Dstates do begin
    mark T;
    for each input symbol a do begin
        U := ε-closure(move(T,a));
        if U is not in Dstates then
            add U as an unmarked state to Dstates;
        Dtran[T,a] := U
    end
end
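A possible Python rendering of this loop is sketched below, applied to the (a | b)*abb NFA. The encoding `NFA` (with the empty string standing for ε) and the function names are assumptions made for illustration. A FIFO queue plays the role of the unmarked states so that subsets are discovered in the same order as in the worked example.

```python
from collections import deque

# Hypothetical encoding of the (a|b)*abb NFA from the example.
# Keys are (state, symbol); "" stands for ε.
NFA = {
    (0, ""): {1, 7}, (1, ""): {2, 4}, (3, ""): {6},
    (5, ""): {6}, (6, ""): {1, 7},
    (2, "a"): {3}, (4, "b"): {5},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}
ALPHABET = ["a", "b"]


def epsilon_closure(states):
    closure, stack = set(states), list(states)
    while stack:
        t = stack.pop()
        for u in NFA.get((t, ""), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)


def move(states, symbol):
    """States reachable from `states` on one `symbol` transition."""
    return {u for t in states for u in NFA.get((t, symbol), ())}


def subset_construction(start):
    start_set = epsilon_closure({start})
    dstates = [start_set]            # kept in discovery order
    dtran = {}
    unmarked = deque([start_set])
    while unmarked:
        T = unmarked.popleft()       # "mark" T by taking it off the queue
        for a in ALPHABET:
            U = epsilon_closure(move(T, a))
            if U not in dstates:     # an empty U would become a dead state
                dstates.append(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    return dstates, dtran


if __name__ == "__main__":
    dstates, dtran = subset_construction(0)
    names = {s: chr(ord("A") + i) for i, s in enumerate(dstates)}
    for s in dstates:
        print(names[s], sorted(s), names[dtran[(s, "a")]], names[dtran[(s, "b")]])
```

Using frozensets lets each subset serve as a dictionary key; a production version would likely replace the list membership test with a set lookup.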
Example 2: Subset Construction
NFA N with
• State set SN = {1,2,3,4,5},
• Alphabet Σ = {a,b},
• Start state sN = 1,
• Final states FN = {5},
• Transition function TN: SN × (Σ ∪ {ε}) → 2^SN, given by the table:

State   a     b      ε
1       3     -      2
2       5     5, 4   -
3       -     4      -
4       5     5      -
5       -     -      -

[Figure: transition diagram of NFA N]
Example 2: Subset Construction
Start from ε-closure({1}) = {1,2} and process each new set T in turn:

T        ε-closure(move(T, a))   ε-closure(move(T, b))
{1,2}    {3,5}                   {4,5}
{3,5}    -                       {4}
{4,5}    {5}                     {5}
{4}      {5}                     {5}
{5}      -                       -

{3,5}, {4,5}, and {5} are all final states, since the NFA final state 5 is included in each.

[Figure: resulting DFA with start state {1,2}; its edges follow the table above]
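The table for Example 2, including which subsets are final, can be reproduced with a short Python sketch. The names `DELTA`, `closure`, `step`, and `build` are hypothetical, the transitions are transcribed from the NFA's transition table, and the empty set is treated as the "-" entry rather than as an explicit dead state.

```python
# Example 2 NFA, transcribed from its transition table; "" stands for ε.
DELTA = {
    (1, "a"): {3}, (1, ""): {2},
    (2, "a"): {5}, (2, "b"): {4, 5},
    (3, "b"): {4},
    (4, "a"): {5}, (4, "b"): {5},
}
NFA_FINAL = {5}


def closure(states):
    result, stack = set(states), list(states)
    while stack:
        for u in DELTA.get((stack.pop(), ""), ()):
            if u not in result:
                result.add(u)
                stack.append(u)
    return frozenset(result)


def step(T, symbol):
    """ε-closure(move(T, symbol)); an empty result is the '-' entry."""
    return closure({u for t in T for u in DELTA.get((t, symbol), ())})


def build(start=1, alphabet="ab"):
    rows, todo = {}, [closure({start})]
    while todo:
        T = todo.pop(0)
        if T in rows or not T:        # skip processed sets and the empty set
            continue
        rows[T] = {a: step(T, a) for a in alphabet}
        todo.extend(rows[T].values())
    # A DFA state is final when it contains at least one NFA final state.
    finals = {T for T in rows if T & NFA_FINAL}
    return rows, finals


if __name__ == "__main__":
    rows, finals = build()
    for T, out in rows.items():
        cells = [str(sorted(out[a])) if out[a] else "-" for a in "ab"]
        print(sorted(T), *cells, "(final)" if T in finals else "")
```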
Example 3: Subset Construction
[Figure: NFA with states 1 to 5 (left) and the DFA produced by subset construction (right). The DFA states are labeled {1}, {2}, {1,3,4}, {1,3,4,5}, and {1,4,5}.]
Example 4: Subset Construction
[Figure: NFA with states 1 to 5 (left) and the DFA produced by subset construction (right). The DFA start state is labeled {1,2}.]
Converting DFAs to REs
1. Combine serial links by concatenation
2. Combine parallel links by alternation
3. Remove self-loops by Kleene closure
4. Select a node (other than initial or final) for removal. Replace it
with a set of equivalent links whose path expressions
correspond to the in and out links
5. Repeat steps 1-4 until the graph consists of a single link
between the entry and exit nodes.
Example
Initial DFA (entry node 0, exit node 5):
0 --d--> 1; 1 --a--> 2, 1 --b--> 2, 1 --c--> 2; 2 --d--> 3; 3 --a--> 4; 4 --d--> 5
4 --b--> 6; 6 --b--> 7, 6 --c--> 7; 7 --d--> 3
Parallel edges become alternation:
0 --d--> 1 --(a|b|c)--> 2 --d--> 3 --a--> 4 --d--> 5
4 --b--> 6 --(b|c)--> 7 --d--> 3
Serial edges become concatenation:
0 --d(a|b|c)d--> 3 --a--> 4 --d--> 5
4 --b(b|c)d--> 3
Find paths that can be "shortened" (the cycle 4 → 3 → 4 becomes a self-loop on 4):
0 --d(a|b|c)d--> 3 --a--> 4 --d--> 5
4 --b(b|c)da--> 4
Eliminate self-loops:
0 --d(a|b|c)d--> 3 --a--> 4 --(b(b|c)da)*d--> 5
Serial edges become concatenation:
0 --d(a|b|c)da(b(b|c)da)*d--> 5
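One way to sanity-check the result is to compare the original DFA with the final regular expression on every short string. The Python sketch below does that by brute force. The edge list `EDGES` is an assumption read off the example figure (the direction of the edges through nodes 6 and 7 is inferred from the path expression b(b|c)d), and the function names are illustrative.

```python
import itertools
import re

# Edges of the original eight-node DFA, as read from the example figure.
EDGES = {
    (0, "d"): 1,
    (1, "a"): 2, (1, "b"): 2, (1, "c"): 2,
    (2, "d"): 3,
    (3, "a"): 4,
    (4, "d"): 5, (4, "b"): 6,
    (6, "b"): 7, (6, "c"): 7,
    (7, "d"): 3,
}
START, FINAL = 0, 5

# The regular expression obtained at the end of the example.
PATTERN = re.compile(r"d(a|b|c)da(b(b|c)da)*d")


def dfa_accepts(word):
    state = START
    for ch in word:
        state = EDGES.get((state, ch))
        if state is None:          # missing edge: reject
            return False
    return state == FINAL


def agree_up_to(max_len):
    """Compare DFA and regex on every string over {a,b,c,d} up to max_len."""
    for n in range(max_len + 1):
        for letters in itertools.product("abcd", repeat=n):
            word = "".join(letters)
            if dfa_accepts(word) != bool(PATTERN.fullmatch(word)):
                return False
    return True


if __name__ == "__main__":
    print(agree_up_to(9))
```

Length 9 is the shortest bound that exercises one trip around the loop (for instance dbdabcdad). Agreement on bounded lengths is evidence, not a proof of equivalence.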
Relationship among RE, NFA, DFA
• The set of strings recognized by an NFA can be described by a
Regular Expression.
• The set of strings described by a Regular Expression can be
recognized by an NFA.
• The set of strings recognized by a DFA can be described by a
Regular Expression.
• The set of strings described by a Regular Expression can be
recognized by a DFA.
• DFAs, NFAs, and Regular Expressions all have the same “power”.
They describe “Regular Sets” (“Regular Languages”)
• The DFA may have many more states than the NFA: subset construction on an n-state NFA can produce up to 2^n DFA states.
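The state blow-up can be observed on a standard family of languages that is not part of the examples above: strings over {a,b} whose n-th symbol from the end is a. The Python sketch below (function names are illustrative) builds an NFA with n+1 states for this language and counts the subsets reached by subset construction.

```python
def nfa_nth_from_end(n):
    """NFA (no ε-moves) for strings over {a,b} whose n-th symbol from the end is 'a'.

    States are 0..n; state 0 is the start state and state n is final.
    """
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta


def count_dfa_states(delta, start=0, alphabet="ab"):
    """Count the subsets reached by the subset construction."""
    start_set = frozenset({start})        # no ε-moves, so no closure needed
    seen, todo = {start_set}, [start_set]
    while todo:
        T = todo.pop()
        for a in alphabet:
            U = frozenset(u for t in T for u in delta.get((t, a), ()))
            if U not in seen:
                seen.add(U)
                todo.append(U)
    return len(seen)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n + 1, "NFA states ->", count_dfa_states(nfa_nth_from_end(n)), "DFA states")
```

For this family the number of reachable DFA states doubles each time n grows by one, because the DFA has to remember the last n symbols read.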