
Module-3 notes

Uploaded by

4mt22ci006
Copyright
© © All Rights Reserved

Module-3

Context-Free Grammars


A context-free grammar (CFG) is a formal grammar used to describe the syntax,
or structure, of a formal language. It is a formal notation for expressing
recursive definitions of languages.

Definition of Context-Free Grammars:

A context-free grammar is defined by a quadruple,

G=(V, T, P, S)

where,
• V is a finite set of variables, sometimes also called non-terminals. Each
variable represents a language, i.e., a set of strings.
• T is a finite set of symbols that form the strings of the language being
defined. We call this alphabet the terminals or terminal symbols.
• S is the start symbol, one of the variables in V; it represents the language
being defined.
• P is a finite set of productions or rules that represent the recursive
definition of a language. Each production consists of:
a) A variable that is being defined by the production. This variable is
often called the head of the production.
b) The production symbol →
c) A string of zero or more terminals and variables. This string, called
the body of the production, represents one way to form strings in the
language of the variable of the head.

Example: Consider the grammar for arithmetic expression as follows:


G=(V, T, P, S) with
V={E}
T={+, -, *, /, id}
P={ E→E+E
E→E-E
E→E*E
E→E/E
E→id }
and E is the start symbol.

Derivations using a Grammar:


“The process of obtaining a string of terminals and/or non-terminals from the start
symbol by repeatedly applying productions is called a derivation”.

Let us derive the sentence id+id*id from the grammar for arithmetic expression:

E => E+E
=>id+E
=>id+E*E
=>id+id*E
=>id+id*id
Leftmost and Rightmost Derivations
LMD:
In order to restrict the number of choices we have in deriving a string, it is often
useful to require that at each step we replace the leftmost variable by one of its
production bodies. Such a derivation is called a leftmost derivation, and we
indicate that a derivation is leftmost by using the relations =>lm and =>*lm, for one or
many steps, respectively.

RMD:
Similarly, it is possible to require that at each step the rightmost variable is
replaced by one of its bodies. If so, we call the derivation rightmost and use the
symbols =>rm and =>*rm to indicate one or many rightmost derivation steps,
respectively.

Ex: Consider the string w=id+id*id. Let us derive w using LMD and RMD as
follows:
LMD:
E => E+E
=> id+E
=> id+E*E
=> id+id*E
=> id+id*id
RMD:
E => E+E
=> E+E*E
=> E+E*id
=> E+id*id
=> id+id*id
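The two derivations above can be replayed mechanically. The following is a minimal Python sketch, not part of the original notes: the names GRAMMAR, step and derive are illustrative, and sentential forms are represented as lists of symbols. Each step replaces either the leftmost or the rightmost variable with a chosen production body.

```python
# Arithmetic expression grammar from the notes; variables are the dict keys.
GRAMMAR = {
    "E": [["E", "+", "E"], ["E", "-", "E"], ["E", "*", "E"], ["E", "/", "E"], ["id"]],
}

def step(form, body, leftmost=True):
    """Replace the leftmost (or rightmost) variable of `form` with `body`."""
    positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
    if not positions:
        raise ValueError("no variable left to replace")
    i = positions[0] if leftmost else positions[-1]
    if body not in GRAMMAR[form[i]]:
        raise ValueError("not a production body for " + form[i])
    return form[:i] + body + form[i + 1:]

def derive(bodies, leftmost=True):
    """Apply the given production bodies in order, starting from E."""
    forms = [["E"]]
    for body in bodies:
        forms.append(step(forms[-1], body, leftmost))
    return ["".join(f) for f in forms]

PLUS, TIMES, ID = ["E", "+", "E"], ["E", "*", "E"], ["id"]
LMD = derive([PLUS, ID, TIMES, ID, ID], leftmost=True)
RMD = derive([PLUS, TIMES, ID, ID, ID], leftmost=False)

print(" => ".join(LMD))  # E => E+E => id+E => id+E*E => id+id*E => id+id*id
print(" => ".join(RMD))  # E => E+E => E+E*E => E+E*id => E+id*id => id+id*id
```

Both sequences end in the same sentence; only the order in which the variables are expanded differs.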

The Language of a Grammar:

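The language of a grammar G=(V, T, P, S), denoted L(G), is the set of terminal
strings that can be derived from the start symbol:
L(G) = { w in T* | S =>* w }
If a language L is the language of some context-free grammar, then L is said to
be a context-free language, or CFL.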

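Sentential Forms:
A string α in (V ∪ T)* such that S =>* α is called a sentential form. For
example, E+E and id+E*E are sentential forms of the arithmetic expression
grammar. A sentential form that consists only of terminals is a sentence of the
language.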

Examples:
Design Context free grammar for the given language:
1. Construct the CFG for the language having any number of a's over the set
∑= {a}.
Ans :
S→aS | ε

Therefore, G=(V,T,S,P)
where V={S}, T={a}, P={S→aS | ε}, S is the start symbol.
2. Construct a CFG for the regular expression (0+1)*
Ans: S→0S | 1S | ε
Therefore, G=(V,T,S,P)
where V={S}, T={0,1}, P={S→0S | 1S | ε}, S is the start symbol.
3. Construct a CFG for the language L = {wcw^R | w ∈ {a, b}*}. Also
derive the string “abbcbba”.
Ans:
S→aSa | bSb | c

Derivation:
S => aSa //S→aSa
=> abSba //S→bSb
=> abbSbba //S→bSb
=> abbcbba //S→c

4. Construct a CFG for the language L = {a^n b^(2n) | n>=1}.

Ans:
S→aSbb | abb

5. L = { a^n b^n | n is a positive integer }

Ans:
S→aSb | ab

Therefore G=(V, T, P, S) where
V = {S}, T = {a, b}, P = {S→aSb, S→ab} and S is the start symbol.

6. Write a CFG for balanced parentheses.

Ans:
S→(S) | [S] | {S} | SS | ε

7. L={ww^R : w ∈ {a,b}*}

Ans: S→aSa | bSb | ε

8. The CFG for palindromes of 0’s and 1’s is

Ans:
S→0S0 | 1S1 | 0 | 1 | ε

9. L={0^i 1^j | i≠j, i>=0, j>=0}

Ans:
S→0S1 | A | B
A→0A | 0
B→1B | 1

10. L={ a^n b^m c^n | n>=0, m>=0}
Ans:
S→aSc | B
B→bB | ε
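11. L={ w ∈ {a,b}*: na(w)=nb(w)}
Ans: S→aSb | bSa | SS | ε

12. L={ w ∈ {a,b}*: na(w)>nb(w)}

Ans: S→EaE | EaS
E→aEb | bEa | EE | ε
Here E generates the strings with equal numbers of a's and b's, as in example 11.

13. L={ w ∈ {a,b}*: na(w)<nb(w)}

Ans: S→EbE | EbS
E→aEb | bEa | EE | ε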


Note: Refer notes for More problems and its solutions.
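A quick way to sanity-check a grammar designed for a language is to enumerate the short strings it generates and compare them with the intended language. The sketch below is only an illustration under stated assumptions: symbols are single characters, uppercase letters are variables, and the grammar has no ε-productions (so a sentential form never shrinks and long forms can be pruned). The function name language_up_to is hypothetical.

```python
from collections import deque

def language_up_to(productions, start, max_len):
    """Return all terminal strings of length <= max_len derivable from `start`.

    Assumes single-character symbols, uppercase variables and no
    epsilon-productions, so any form longer than max_len can be discarded.
    """
    seen, found = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Index of the leftmost variable, or None if the form is a sentence.
        idx = next((i for i, c in enumerate(form) if c.isupper()), None)
        if idx is None:
            found.add(form)
            continue
        for body in productions[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda s: (len(s), s))

anbn = language_up_to({"S": ["aSb", "ab"]}, "S", 6)
anb2n = language_up_to({"S": ["aSbb", "abb"]}, "S", 6)
print(anbn)   # ['ab', 'aabb', 'aaabbb']
print(anb2n)  # ['abb', 'aabbbb']
```

Every string printed for the second grammar has exactly twice as many b's as a's, which is the property required of a^n b^(2n).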

Parse Trees
There is a tree representation for derivations that has proved extremely useful.
This tree shows us clearly how the symbols of a terminal string are grouped into
substrings, each of which belongs to the language of one of the variables of the
grammar.

Constructing Parse Trees


Let us consider a grammar G=(V, T, P, S). The parse trees for G are trees with
the following conditions:
1. Root of the tree is labelled by start symbol S of the grammar
2. Each interior node is labeled by a variable in V.
3. Each leaf is labeled by either a variable, a terminal, or ε. However, if the
leaf is labelled ε, then it must be the only child of its parent.
4. If an interior node is labeled A, and its children are labeled
X1, X2, ..., Xk
respectively, from the left, then A→X1 X2 ... Xk is a production in P.
Note that the only time one of the X’s can be ε is if that is the label of
the only child, and A→ε is a production of G.

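Example: Consider a grammar with the following productions (these are the
productions used in the derivation below),

S→ASB | c
A→aA | ε
B→bB | ε
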
A derivation sequence of acb,
S=>ASB
=>aASB
=>aSB
=>acB
=>acbB
=>acb

The parse tree of acb, following the derivation above, is,

          S
        / | \
       A  S  B
      / \ |  / \
     a  A c b   B
        |       |
        ε       ε

The Yield of a Parse Tree


If we look at the leaves of any parse tree and concatenate them from the left, we
get a string, called the yield of the tree, which is always a string that is derived
from the root variable. Of special importance are those parse trees such that:
1. The yield is a terminal string. That is, all leaves are labeled either with a
terminal or with ε.
2. The root is labeled by the start symbol.
These are the parse trees whose yields are strings in the language of the
underlying grammar.
Ex: In the above parse tree, the yield is a ε c b ε = acb.
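The yield computation can be sketched in a few lines of Python. This is an illustration only: the tree below is assumed from the derivation of acb shown earlier (S=>ASB, A=>aA, A=>ε, S=>c, B=>bB, B=>ε), and the representation of a node as a (label, children) pair is a choice made for this sketch.

```python
EPS = "ε"

# A node is (label, [children]); a leaf has an empty child list.
tree = ("S", [
    ("A", [("a", []), ("A", [(EPS, [])])]),
    ("S", [("c", [])]),
    ("B", [("b", []), ("B", [(EPS, [])])]),
])

def leaves(node):
    """Return the leaf labels of the tree, read from left to right."""
    label, children = node
    if not children:
        return [label]
    out = []
    for child in children:
        out.extend(leaves(child))
    return out

def tree_yield(node):
    """Concatenate the leaves; ε contributes nothing to the string."""
    return "".join(sym for sym in leaves(node) if sym != EPS)

print(leaves(tree))      # ['a', 'ε', 'c', 'b', 'ε']
print(tree_yield(tree))  # acb
```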

Exercise on parse tree:

1. Consider the grammar S→ (L) | a

L→L, S | S

i. What are the terminals, nonterminals and the start symbol?

ii. Find the parse tree for the following sentence

a. (a,a)

b. (a, (a, a))

c. (a, ((a,a),(a,a)))

iii. Construct LMD and RMD for each.

Ambiguity in Grammars and Languages:


Applications of CFG’s often rely on the grammar to provide the structure of files.
When a grammar fails to provide unique structures, it is sometimes possible to
redesign the grammar to make the structure unique for each string in the language.
Unfortunately, sometimes we cannot do so. That is, there are some CFL’s that are
“inherently ambiguous”; every grammar for the language puts more than one
structure on some strings in the language.

Ambiguous Grammars:
Q. What do you mean by ambiguous grammar? Explain with
example.
A CFG is ambiguous if it produces more than one parse tree for some string in the
language, i.e. there exists at least one string in the language that is the yield of two or
more parse trees. Equivalently, a grammar is ambiguous if some string has two or
more distinct leftmost derivations, or two or more distinct rightmost derivations.

Example:
1. A good example for ambiguous grammar is the grammar for arithmetic
expressions:

E → E+E | E*E | E-E | E/E | id | (E)


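For the expression id+id*id we can generate 2 parse trees, and correspondingly
2 leftmost (or 2 rightmost) derivations, as follows:

LMD 1 (the + is at the root of the parse tree):
E => E+E => id+E => id+E*E => id+id*E => id+id*id

LMD 2 (the * is at the root of the parse tree):
E => E*E => E+E*E => id+E*E => id+id*E => id+id*id

Since we got 2 different parse trees and LMDs for the same string, we say that
the given grammar is ambiguous.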
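Ambiguity of this particular grammar can also be checked by counting parse trees. The sketch below is a small dynamic program written for the grammar E → E+E | E*E | E-E | E/E | id | (E) only; the function name count_parse_trees is hypothetical and the approach is not a general ambiguity test (no algorithm decides ambiguity for arbitrary CFGs).

```python
from functools import lru_cache

OPS = {"+", "-", "*", "/"}

def count_parse_trees(tokens):
    """Count parse trees for `tokens` under E -> E op E | id | (E)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees that derive tokens[i:j] from E.
        n = 0
        if j - i == 1 and tokens[i] == "id":
            n += 1                                  # E -> id
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            n += count(i + 1, j - 1)                # E -> (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in OPS:
                n += count(i, k) * count(k + 1, j)  # E -> E op E, split at k
        return n

    return count(0, len(tokens))

print(count_parse_trees(["id", "+", "id", "*", "id"]))  # 2
```

A count greater than 1 for any string is enough to show that the grammar is ambiguous.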

Removing Ambiguity From Grammars:

For most parsers, the grammar must be unambiguous, because:

• An unambiguous grammar gives a unique parse tree for each sentence.
• Ambiguity should be eliminated from the grammar during the design phase
of the compiler.
• An ambiguous grammar should be rewritten to eliminate the ambiguity.
• To disambiguate a grammar, we choose one preferred parse tree for each
sentence (generated by the ambiguous grammar) and restrict the grammar to
this choice.

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Example 1:
There are two causes of ambiguity in the grammar of arithmetic expression,

E→ E+E | E-E |E*E |E/E |(E) | id

1. The precedence of operators is not respected.


2. A sequence of identical operators can group either from the left or from
the right.

Ambiguous grammars (because of ambiguous operators) can be disambiguated
according to the precedence and associativity rules. We can eliminate ambiguity
of this grammar using the following precedence and associativity rules.

Precedence: from highest to lowest


*, / (left to right)
+, - (left to right)
Therefore the unambiguous grammar for arithmetic expressions is,
E→E+T | E-T | T
T→T*F | T/F | F
F→(E) | id
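To see how precedence and associativity fix a single structure, here is a hedged Python sketch of a parser for the layered grammar E→E+T | E-T | T, T→T*F | T/F | F, F→(E) | id. It is an assumption of this sketch that the left-recursive rules are implemented as loops (a common recursive-descent technique); the function names and the tuple-based tree format are illustrative.

```python
def parse(tokens):
    """Return the unique parse structure as nested (op, left, right) tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {peek()!r}")
        pos += 1

    def factor():                 # F -> (E) | id
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("id")
        return "id"

    def term():                   # T -> T*F | T/F | F (left associative)
        node = factor()
        while peek() in ("*", "/"):
            op = peek()
            eat(op)
            node = (op, node, factor())
        return node

    def expr():                   # E -> E+T | E-T | T (left associative)
        node = term()
        while peek() in ("+", "-"):
            op = peek()
            eat(op)
            node = (op, node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse(["id", "+", "id", "*", "id"]))  # ('+', 'id', ('*', 'id', 'id'))
print(parse(["id", "-", "id", "-", "id"]))  # ('-', ('-', 'id', 'id'), 'id')
```

The first result shows * binding tighter than +, and the second shows that a sequence of identical operators groups from the left.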

Exercise Questions: Prove that the following grammars are ambiguous.


a) S→ AB | aaB
A→a | Aa
B→ b
b) S→ABA
A→aA |ɛ
B→bB | ɛ

Inherent Ambiguity
A context free language L is said to be inherently ambiguous if all its grammars
are ambiguous. If even one grammar for L is unambiguous, then L is an
unambiguous language.
Consider the Language,

The grammar for this language is,


Since every grammar for this language gives more than one parse tree for some
of its strings, the language is an example of an inherently ambiguous language.

PUSHDOWN AUTOMATA
Context-free languages have a type of automaton that defines them. This
automaton, called a “pushdown automaton”, is an extension of the nondeterministic finite
automaton with ε-transitions, which is one of the ways to define the regular languages.

The pushdown automaton is essentially an ε-NFA with the addition of a stack. The
stack can be read, pushed and popped only at the top, just like the “stack” data structure.

We define two different versions of the pushdown automaton: one that accepts by
entering an accepting state, as finite automata do, and another that accepts by
emptying its stack, regardless of the state it is in. We show that these two variations accept
exactly the context-free languages, i.e. grammars can be converted to equivalent pushdown
automata and vice versa.

Model of Pushdown Automata

❖ The pushdown automaton is in essence a nondeterministic finite automaton with
ε-transitions permitted and one additional capability: a stack on which it can store a
string of “stack symbols”.
❖ The presence of a stack means that, unlike a finite automaton, the pushdown
automaton can remember an unbounded amount of information.
❖ The model of PDA is as shown below:

[Figure 6.1: A pushdown automaton. The input feeds a finite state control, which
produces accept/reject and has access to a stack.]

❖ As shown in the Figure 6.1, PDA has finite state control that reads inputs, one symbol
at a time. The pushdown automaton is allowed to observe the symbol at the top of
the stack and to base its transition on its current state, input symbol, and the symbol
at the top of stack.

FORMAL DEFINITION OF PUSHDOWN AUTOMATA

❖ The formal definition of a pushdown automaton (PDA) involves seven components. A
PDA P is defined as follows:
P = (Q, Σ, Γ, δ, q0, Z0, F)
❖ Where,
• Q: a finite set of states, like the states of a finite automaton.
• Σ: a finite set of input symbols, also analogous to the corresponding
component of a finite automaton.
• Γ: a finite stack alphabet. It is the set of symbols that we are allowed to
push onto the stack.
• δ: the transition function. As for a finite automaton, δ governs the
behavior of the automaton.
δ: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*
Formally, δ takes as argument a triple δ(q, a, X), where:
1. q is the state in Q
2. a is either an input symbol in Σ or a = ε, the empty string, which
is assumed not to be an input symbol.
3. X is a stack symbol, which is a member of Γ.
The output of δ is a finite set of pairs (p, γ), where p is the new state
and γ is the string of stack symbols that replaces X at the top of stack.
• q0: the start state. The PDA is in this state before making any transitions.
• Z0: the start symbol. Initially, the PDA’s stack consists of one instance of
this symbol, and nothing else.
• F: the set of accepting states, or final states.

A Graphical Notation for PDA’s:


The transition function δ will explain the behavior of a PDA. It has following
components:

Example: Let us design a PDA P to accept the language L={ww^R : w ∈ {0,1}*}


We shall use a stack symbol Z0 to mark the bottom of the stack. We need to have
this symbol present so that, after we pop w off the stack and realize that we have
seen wwR on the input, we still have something on the stack to permit us to make
a transition to the accepting state, q2. Thus, our PDA for Lwwr can be described
as:

where δ is defined by the following rules:


Transition diagram is,

Figure 2: PDA for L={ww^R : w ∈ {0,1}*}

Instantaneous Descriptions of a PDA:


The PDA goes from configuration to configuration, in response to input symbols
(or sometimes ε ) but unlike the finite automaton, where the state is the only thing
that we need to know about the automaton, the PDA’s configuration involves
both the state and the contents of the stack. We shall represent the configuration
of a PDA by a triple (q,w,γ), where:
1. q is the state,
2. w is the remaining input, and
3. γ is the stack contents

Such a triple is called an instantaneous description, or ID, of the pushdown
automaton.
Example:
Let us consider the action of the PDA of Example L={wwR } on the input 1111.
Since q0 is the start state and Z0 is the start symbol, the initial ID is (q0,1111,Z0).
The IDs of the PDA are:
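A sequence of IDs can be generated by simulating the PDA. The sketch below is an illustration, not the transition table from the notes (which is not reproduced in this text): it assumes the usual textbook rules for ww^R (push in q0, guess the middle with an ε-move to q1, match and pop in q1, then move to q2 on Z0). The stack is a string with its top at the left, "Z" stands for Z0, "" stands for ε, and all function names are hypothetical.

```python
from collections import deque

def build_delta():
    """Assumed transition rules for L = {ww^R : w in {0,1}*}."""
    d = {}
    for x in "01":
        d[("q0", x, "Z")] = {("q0", x + "Z")}   # push the first symbol
        for y in "01":
            d[("q0", x, y)] = {("q0", x + y)}   # keep pushing w
        d[("q0", "", x)] = {("q1", x)}          # guess that the middle is here
        d[("q1", x, x)] = {("q1", "")}          # match input against the stack
    d[("q0", "", "Z")] = {("q1", "Z")}
    d[("q1", "", "Z")] = {("q2", "Z")}          # bottom marker exposed: accept
    return d

def moves(delta, id_):
    """Yield every ID reachable from `id_` in one move."""
    state, w, stack = id_
    if not stack:
        return
    top, rest = stack[0], stack[1:]
    for p, gamma in delta.get((state, "", top), ()):        # ε-moves
        yield (p, w, gamma + rest)
    if w:
        for p, gamma in delta.get((state, w[0], top), ()):  # consume one symbol
            yield (p, w[1:], gamma + rest)

def accepting_run(delta, w, start="q0", z0="Z", finals=("q2",)):
    """Breadth-first search for a sequence of IDs that accepts by final state."""
    first = (start, w, z0)
    parent = {first: None}
    queue = deque([first])
    while queue:
        cur = queue.popleft()
        if cur[0] in finals and cur[1] == "":
            run = []
            while cur is not None:
                run.append(cur)
                cur = parent[cur]
            return run[::-1]
        for nxt in moves(delta, cur):
            if nxt not in parent:
                parent[nxt] = cur
                queue.append(nxt)
    return None

for id_ in accepting_run(build_delta(), "1111"):
    print(id_)
```

Under these assumed rules the accepting run for 1111 is (q0,1111,Z0), (q0,111,1Z0), (q0,11,11Z0), (q1,11,11Z0), (q1,1,1Z0), (q1,ε,Z0), (q2,ε,Z0). The search also explores the wrong guesses of the middle, which all die out.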

The Languages of a PDA:


❖ We have assumed that a PDA accepts its input by consuming it and entering an
accepting state. We call this approach “acceptance by final state”.
❖ There is a second approach to define the language of PDA that has important
applications. We may also define for any PDA the language “accepted by empty
stack”, i.e. the set of strings that cause the PDA to empty the stack, starting from the
initial ID.
❖ These two methods are equivalent, in the sense that a language L has a PDA that
accepts it by final state if and only if L has a PDA that accepts it by empty stack.

EQUIVALENCE OF ACCEPTANCE BY FINAL STATE AND EMPTY STACK

From Empty Stack to Final State:


From Final State to Empty Stack
Deterministic Pushdown Automata

Definition:

Example for Deterministic PDA is,

The PDA to accept L={wCw^R : w ∈ {0,1}*}


Example for Non-deterministic PDA is,

The PDA to accept L={ww^R : w ∈ {a,b}*}

Regular Languages and Deterministic PDA’s:


2. Obtain a PDA to accept a string of balanced parentheses. The parentheses
to be considered are (, ), [, ]. Ex: (( )), [( )( )], ( ) ( )
Ans: The transition diagram for the given language is,
Therefore PDA P=(Q, ∑, Γ, δ, q0, Z0, F)
Where,
Q={q0, q1}
∑={(, ), [, ]}
Γ={(, [, Z0}
δ is as shown in the transition diagram.
q0 is the initial state
Z0 is the initial stack symbol
F={q1}

3. Construct a pushdown automaton to accept the language L={a^n b^n : n>=1}.
Also show the IDs for the string w=aaabbb.

Given w=aaabbb. The IDs are:

4. Construct a PDA to accept the language L={ w: w ∈ {a,b}* and
na(w)=nb(w)}.
Let us trace the string w=abbbaa as follows:

Note: For more problems refer class work
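For problem 3 above, the IDs for w=aaabbb can be traced with a small deterministic simulator. The transition table in this sketch is an assumption (a common design: push one a per input a in q0, pop one a per input b in q1, and move to the final state q2 when only Z0 remains); the class notes may use different state names. "Z" stands for Z0, "" stands for ε, and the stack top is at the left.

```python
# Assumed deterministic transitions for L = {a^n b^n : n >= 1}.
DELTA = {
    ("q0", "a", "Z"): ("q0", "aZ"),   # first a: push a
    ("q0", "a", "a"): ("q0", "aa"),   # further a's: push a
    ("q0", "b", "a"): ("q1", ""),     # first b: pop a, switch to q1
    ("q1", "b", "a"): ("q1", ""),     # further b's: pop a
    ("q1", "",  "Z"): ("q2", "Z"),    # all a's matched: go to the final state
}
FINAL = {"q2"}

def trace(w):
    """Return the list of IDs for input w and whether w is accepted."""
    state, stack = "q0", "Z"
    ids = [(state, w, stack)]
    while stack:
        top = stack[0]
        if w and (state, w[0], top) in DELTA:
            state, push = DELTA[(state, w[0], top)]
            w = w[1:]
        elif (state, "", top) in DELTA:
            state, push = DELTA[(state, "", top)]
        else:
            break                      # no move is defined: the PDA halts
        stack = push + stack[1:]
        ids.append((state, w, stack))
    return ids, (state in FINAL and w == "")

ids, accepted = trace("aaabbb")
for id_ in ids:
    print(id_)
print(accepted)  # True
```

With these rules the run is (q0,aaabbb,Z0), (q0,aabbb,aZ0), (q0,abbb,aaZ0), (q0,bbb,aaaZ0), (q1,bb,aaZ0), (q1,b,aZ0), (q1,ε,Z0), (q2,ε,Z0).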


a) From Grammars to Pushdown Automata, by the empty stack method
(suitable for grammars not in GNF):

Let G=(V, T, P, S) be a CFG. Construct the PDA P that accepts L(G) by
empty stack as follows:
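The construction itself is not reproduced in this text. A commonly used textbook construction, assumed here, is a one-state PDA P=({q}, T, V ∪ T, δ, q, S) with δ(q, ε, A) = {(q, β) | A→β is in P} for each variable A, and δ(q, a, a) = {(q, ε)} for each terminal a. The Python sketch below builds δ this way and searches for an accepting run; the function names are hypothetical, symbols are single characters, and the pruning step is valid only for grammars without ε-productions such as the demo grammar.

```python
def cfg_to_pda(productions, terminals):
    """Build delta for a one-state PDA that accepts L(G) by empty stack."""
    delta = {}
    for head, bodies in productions.items():
        # delta(q, ε, A) contains (q, β) for every production A -> β
        delta[("q", "", head)] = {("q", body) for body in bodies}
    for a in terminals:
        # delta(q, a, a) = {(q, ε)}: match a terminal on the stack with the input
        delta[("q", a, a)] = {("q", "")}
    return delta

def accepts_by_empty_stack(delta, w, start_symbol):
    """Depth-first search over IDs (state, remaining input, stack)."""
    todo, seen = [("q", w, start_symbol)], set()
    while todo:
        id_ = todo.pop()
        if id_ in seen:
            continue
        seen.add(id_)
        state, rest, stack = id_
        if not rest and not stack:
            return True                # input consumed and stack empty
        # Without ε-productions every stack symbol needs at least one input
        # symbol, so longer stacks can never be emptied in time.
        if not stack or len(stack) > len(rest):
            continue
        top, below = stack[0], stack[1:]
        for p, gamma in delta.get((state, "", top), ()):
            todo.append((p, rest, gamma + below))
        if rest:
            for p, gamma in delta.get((state, rest[0], top), ()):
                todo.append((p, rest[1:], gamma + below))
    return False

# Demo grammar: S -> aSa | bSb | c, i.e. L = {wcw^R : w in {a,b}*}
delta = cfg_to_pda({"S": ["aSa", "bSb", "c"]}, "abc")
print(accepts_by_empty_stack(delta, "abbcbba", "S"))  # True
print(accepts_by_empty_stack(delta, "abcab", "S"))    # False
```

Each accepting run of this PDA mirrors a leftmost derivation of the input: an ε-move expands the variable on top of the stack, and an input move matches a terminal.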

Method 2:

Converting CFG to PDA by final State (Suitable for CFG in GNF form)
Algorithm:

Step1: Check whether grammar is in Greibach Normal Form(GNF)

[For GNF, all productions must be of the form, A→aα, where aϵT and
αϵV*]

Step2: Push Start symbol of the grammar to the stack and change state from q0
to q1. The transition is:

δ(q0, ϵ, Z0)=(q1, SZ0)

Step3: For each production of the form, A→aα, where aϵT and αϵV*, add

transitions,

δ(q1, a, A)=(q1, α)

Step4: Finally add a transition to accept the string by final state,

δ(q1, ϵ, Z0)=(qf, Z0)

Ex: Convert the following grammar to PDA by final state:

S→aSSS | a

Ans:

Step1: The given grammar is in GNF.

Step2: δ(q0, ϵ, Z0)=(q1, SZ0)

Step3:
S→aSSS gives (q1, SSS) in δ(q1, a, S)
S→a gives (q1, ϵ) in δ(q1, a, S)
Hence δ(q1, a, S)={(q1, SSS), (q1, ϵ)}

Step4: δ(q1, ϵ, Z0)=(qf, Z0)

Therefore, P=({ q0, q1, qf }, {a}, {S, Z0 }, δ, q0, Z0, {qf})
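Steps 2 to 4 of the algorithm can be sketched in Python and checked on the example grammar S→aSSS | a. This is an illustration under stated assumptions: symbols are single characters, "Z" stands for Z0, "" stands for ϵ, the stack top is at the left, and the function names gnf_to_pda and accepts are hypothetical.

```python
def gnf_to_pda(productions, start):
    """Build delta from a GNF grammar; each body is a terminal followed by variables."""
    delta = {("q0", "", "Z"): {("q1", start + "Z")}}            # Step 2
    for head, bodies in productions.items():
        for body in bodies:
            a, alpha = body[0], body[1:]
            # Step 3: A -> a alpha adds (q1, alpha) to delta(q1, a, A)
            delta.setdefault(("q1", a, head), set()).add(("q1", alpha))
    delta[("q1", "", "Z")] = {("qf", "Z")}                      # Step 4
    return delta

def accepts(delta, w, finals=("qf",)):
    """Search all runs; accept if some run consumes w and ends in a final state."""
    todo, seen = [("q0", w, "Z")], set()
    while todo:
        id_ = todo.pop()
        if id_ in seen:
            continue
        seen.add(id_)
        state, rest, stack = id_
        if state in finals and not rest:
            return True
        if not stack:
            continue
        top, below = stack[0], stack[1:]
        for p, gamma in delta.get((state, "", top), ()):
            todo.append((p, rest, gamma + below))
        if rest:
            for p, gamma in delta.get((state, rest[0], top), ()):
                todo.append((p, rest[1:], gamma + below))
    return False

delta = gnf_to_pda({"S": ["aSSS", "a"]}, "S")
print(sorted(delta[("q1", "a", "S")]))  # [('q1', ''), ('q1', 'SSS')]
print(accepts(delta, "aaaa"))           # True
print(accepts(delta, "aa"))             # False
```

The grammar generates exactly the strings of a's whose length is 1, 4, 7, and so on, which is why aaaa is accepted and aa is rejected.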

******************************************************
