Module-3 notes

Uploaded by

4mt22ci006

Module-3

Context – Free Grammars

A context-free grammar (CFG) is a formal grammar used to describe the syntax,
or structure, of a formal language. It is a formal notation for expressing
recursive definitions of languages.

Definition of Context-Free Grammars:

A context-free grammar is defined by a quadruple,

G=(V, T, P, S)

where,
• V is a finite set of variables, sometimes also called non-terminals. Each
variable represents a language, i.e., a set of strings.
• T is a finite set of symbols that form the strings of the language being
defined. We call this alphabet the terminals or terminal symbols.
• P is a finite set of productions or rules that represent the recursive
definition of a language. Each production consists of:
a) A variable that is being defined by the production. This variable is
often called the head of the production.
b) The production symbol →
c) A string of zero or more terminals and variables. This string, called
the body of the production, represents one way to form strings in the
language of the variable of the head.
• S is the start symbol, one of the variables in V; it represents the language
being defined.

Example: Consider the grammar for arithmetic expression as follows:

G= (V, T, P, S) with
V={E}
T={+, -, *, /, id}
P={ E→E+E
E→E-E
E→E*E
E→E/E
E→ id }
and E is the start symbol.

Derivations using a Grammar:

“The process of obtaining a string of terminals and/or non-terminals from the start
symbol by repeatedly applying productions is called derivation”.

Let us derive the sentence id+id*id from the grammar for arithmetic expression:

E => E+E
=>id+E
=>id+E*E
=>id+id*E
=>id+id*id

Leftmost and Rightmost Derivations

LMD:
In order to restrict the number of choices we have in deriving a string, it is often
useful to require that at each step we replace the leftmost variable by one of its
production bodies. Such a derivation is called a leftmost derivation, and we
indicate that a derivation is leftmost by using the relations =>lm and =>*lm, for one or
many steps, respectively.

RMD:
Similarly, it is possible to require that at each step the rightmost variable is
replaced by one of its bodies. If so, we call the derivation rightmost and use the
symbols =>rm and =>*rm to indicate one or many rightmost derivation steps,
respectively.

Ex: Consider the string w=id+id*id. Let us derive w using LMD and RMD as
follows:
LMD:
E => E+E
=> id+E
=> id+E*E
=> id+id*E
=> id+id*id
RMD:
E => E+E
=> E+E*E
=> E+E*id
=> E+id*id
=> id+id*id
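
The two derivations of id+id*id can also be replayed mechanically. The sketch below is only an illustration: the helper name derive and the list-of-symbols representation are assumptions, not part of the notes. It rewrites the leftmost (or rightmost) variable at every step and records each sentential form.

```python
# Variables of the arithmetic expression grammar; everything else is a terminal.
VARIABLES = {"E"}

def derive(start, steps, leftmost=True):
    """Apply each (head, body) production in `steps` to the leftmost
    (or rightmost) variable and return every sentential form as a string."""
    form = [start]
    history = ["".join(form)]
    for head, body in steps:
        positions = [i for i, sym in enumerate(form) if sym in VARIABLES]
        i = positions[0] if leftmost else positions[-1]
        assert form[i] == head, "production head must match the chosen variable"
        form[i:i + 1] = body          # replace the variable by the production body
        history.append("".join(form))
    return history

PLUS = ("E", ["E", "+", "E"])
TIMES = ("E", ["E", "*", "E"])
ID = ("E", ["id"])

lmd = derive("E", [PLUS, ID, TIMES, ID, ID], leftmost=True)
rmd = derive("E", [PLUS, TIMES, ID, ID, ID], leftmost=False)
print(" => ".join(lmd))
print(" => ".join(rmd))
```

Both runs end in id+id*id; only the order in which the variables are replaced differs.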

The Language of a Grammar:

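If G=(V, T, P, S) is a CFG, the language of G, denoted L(G), is the set of terminal
strings that have derivations from the start symbol: L(G) = { w in T* | S =>* w }.
If a language L is the language of some context-free grammar, then L is said to
be a context-free language or CFL.

Sentential Forms:

Any string α of variables and terminals such that S =>* α is called a sentential
form. If the derivation is leftmost, α is a left-sentential form; if it is rightmost,
α is a right-sentential form. For example, id+E*E is a left-sentential form of the
arithmetic expression grammar.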

Examples:
Design a context-free grammar for each of the given languages:
1. Construct the CFG for the language having any number of a's over the set
∑= {a}.
Ans :
S→aS | ε

Therefore, G=(V, T, P, S)
where V={S}, T={a}, P={S→aS | ε}, S is the start symbol.
2. Construct a CFG for the regular expression (0+1)*
Ans: S→0S | 1S | ε
Therefore, G=(V, T, P, S)
where V={S}, T={0,1}, P={S→0S | 1S | ε}, S is the start symbol.
3. Construct a CFG for the language L = {wcw^R | w ∈ {a, b}*}. Also
derive the string “abbcbba”
Ans:
S→aSa | bSb | c

Derivation:
S => aSa //S→aSa
=> abSba //S→bSb
=> abbSbba //S→bSb
=> abbcbba //S→c

4. Construct a CFG for the language L = {a^n b^2n | n>=1}.

Ans:
S→aSbb | abb

5. L = { a^n b^n | n is a positive integer }

Ans:
S→aSb | ab

Therefore G=(V, T, P, S) where,

V = { S }, T = { a, b }, P = { S→aSb, S→ab } and S is the start symbol.

6. Write a CFG for balanced parentheses.

Ans:
S→(S) | [S] | {S} | SS | ε
7. L={ww^R : w ∈ {a,b}*}

Ans: S→aSa | bSb | ε

8. The CFG for palindromes of 0’s and 1’s is

Ans:
S→0S0 | 1S1 | 0 | 1 | ε

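9. L={0^i 1^j | i≠j, i>=0, j>=0}

Ans:
S→0S1 | A | B
A→0A | 0
B→1B | 1
10. L={ a^n b^m c^n | n>=0, m>=0}
Ans:
S→aSc | B
B→bB | ε
11. L={ w ∈ {a,b}*: na(w)=nb(w)}
Ans: S→aSb | bSa | SS | ε

12. L={ w ∈ {a,b}*: na(w)>nb(w)}

Ans: S→EaS | EaE
E→aEb | bEa | EE | ε
Here E generates the strings with equal numbers of a's and b's, and each use of
an S production contributes one extra a. For example, aba is derived as
S => EaE => aE => abEa => aba.

13. L={ w ∈ {a,b}*: na(w)<nb(w)}

Ans: S→EbS | EbE
E→aEb | bEa | EE | ε

Note: Refer to the class notes for more problems and their solutions.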
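
A grammar designed by hand is easy to get slightly wrong, so it helps to list the short strings it generates and compare them with the intended language. The sketch below is a brute-force check and rests on stated assumptions: the helper name generate is hypothetical, variables are single upper-case letters, "" stands for ε, and every production body that contains a variable also contains a terminal (which guarantees that the search terminates).

```python
from collections import deque

def generate(productions, start, max_len):
    """Return all terminal strings of length <= max_len derivable from `start`.
    Sentential forms are explored breadth-first, always expanding the
    leftmost variable."""
    seen, results = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, c in enumerate(form) if c.isupper()), None)
        if idx is None:                 # no variable left: a terminal string
            results.add(form)
            continue
        for body in productions[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            # Terminals are never removed again, so longer forms can be dropped.
            if sum(not c.isupper() for c in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return results

anbn = generate({"S": ["aSb", "ab"]}, "S", 6)        # example 5
binary = generate({"S": ["0S", "1S", ""]}, "S", 2)   # example 2
print(sorted(anbn, key=len))
print(sorted(binary, key=lambda s: (len(s), s)))
```

If a string that should be in the language is missing from the output, or an unexpected string appears, the grammar needs another look.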

Parse Trees
There is a tree representation for derivations that has proved extremely useful.
This tree shows us clearly how the symbols of a terminal string are grouped into
substrings, each of which belongs to the language of one of the variables of the
grammar.

Constructing Parse Trees

Let us consider a grammar G=(V, T, P, S). The parse trees for G are trees with
the following conditions:
1. The root of the tree is labelled by the start symbol S of the grammar.
2. Each interior node is labelled by a variable in V.
3. Each leaf is labelled by either a variable, a terminal, or ε. However, if the
leaf is labelled ε, then it must be the only child of its parent.
4. If an interior node is labelled A, and its children are labelled
X1, X2, ..., Xk
respectively, from the left, then A→X1 X2 ... Xk is a production in P.
Note that the only time one of the X’s can be ε is if that is the label of
the only child, and A→ε is a production of G.

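Example: Consider a grammar with the following productions (those used in the
derivation below):
S→ASB | c
A→aA | ε
B→bB | ε

A derivation sequence of acb,
S=>ASB
=>aASB
=>aSB
=>acB
=>acbB
=>acb

The parse tree of acb is,

          S
       /  |  \
      A   S   B
     / \  |  / \
    a   A c b   B
        |       |
        ε       ε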

The Yield of a Parse Tree

If we look at the leaves of any parse tree and concatenate them from the left, we
get a string, called the yield of the tree, which is always a string that is derived
from the root variable. Of special importance are those parse trees such that:
1. The yield is a terminal string. That is, all leaves are labelled either with a
terminal or with ε.
2. The root is labelled by the start symbol.
These are the parse trees whose yields are strings in the language of the
underlying grammar.
Ex: In the above parse tree, the yield is a ε c b ε = acb.
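
Computing a yield is a left-to-right walk over the leaves. The sketch below is illustrative only: the tuple representation and the names leaf, leaves and tree_yield are assumptions, and the tree shape is reconstructed from the derivation of acb given above (S=>ASB, A=>aA, A=>ε, S=>c, B=>bB, B=>ε).

```python
# A node is (label, children); a leaf has an empty list of children.
def leaf(label):
    return (label, [])

# Parse tree for acb, reconstructed from the derivation sequence (assumption).
tree = ("S", [
    ("A", [leaf("a"), ("A", [leaf("ε")])]),
    ("S", [leaf("c")]),
    ("B", [leaf("b"), ("B", [leaf("ε")])]),
])

def leaves(node):
    """Return the leaf labels of the tree from left to right."""
    label, children = node
    if not children:
        return [label]
    out = []
    for child in children:
        out.extend(leaves(child))
    return out

def tree_yield(node):
    """Concatenate the leaves, dropping ε because it is the empty string."""
    return "".join(sym for sym in leaves(node) if sym != "ε")

print(leaves(tree))       # leaf labels, left to right
print(tree_yield(tree))   # the yield of the tree
```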

Exercise on parse tree:

1. Consider the grammar S→ (L) | a

L→L, S | S

i. What are the terminals, nonterminals and the start symbol?

ii. Find the parse tree for the following sentence

a. (a,a)

b. (a, (a, a))

c. (a, ((a,a),(a,a)))

iii. Construct LMD and RMD for each.

Ambiguity in Grammars and Languages:

Applications of CFG’s often rely on the grammar to provide the structure of files.
When a grammar fails to provide unique structures, it is sometimes possible to
redesign the grammar to make the structure unique for each string in the language.
Unfortunately, sometimes we cannot do so. That is, there are some CFL’s that are
“inherently ambiguous”; every grammar for the language puts more than one
structure on some strings in the language.

Ambiguous Grammars:
Q. What do you mean by an ambiguous grammar? Explain with an
example.
A CFG is ambiguous if it produces more than one parse tree for some string in the
language, i.e., there exists at least one string in the language that is the yield of
two or more parse trees. Equivalently, a grammar is ambiguous if some string has
two or more distinct leftmost derivations, or two or more distinct rightmost
derivations.

Example:
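1. A good example of an ambiguous grammar is the grammar for arithmetic
expressions:

E → E+E | E*E | E-E | E/E | id | (E)

For the expression id+id*id we can generate 2 parse trees, and correspondingly
2 leftmost (or 2 rightmost) derivations. The two leftmost derivations are:

LMD 1: E => E+E => id+E => id+E*E => id+id*E => id+id*id
LMD 2: E => E*E => E+E*E => id+E*E => id+id*E => id+id*id

Since we get 2 different parse trees and LMDs for the same string, the given
grammar is ambiguous.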
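
For short inputs the number of parse trees can be counted mechanically. The sketch below is a hedged illustration: count_parse_trees is a hypothetical helper, the grammar is restricted to E → E op E | id (the parenthesised alternative is left out for brevity), and the input is assumed to be already split into tokens. Each operator position is tried as the root of the tree, and the counts for the left and right parts are multiplied.

```python
from functools import lru_cache

OPERATORS = {"+", "-", "*", "/"}

def count_parse_trees(tokens):
    """Count parse trees for `tokens` under E -> E op E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):       # tokens[k] is the root operator
            if tokens[k] in OPERATORS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

Any count above 1 for some token sequence is enough to show that the grammar is ambiguous.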

Removing Ambiguity From Grammars:

For most parsers, the grammar must be unambiguous, because:

• An unambiguous grammar gives a unique parse tree for every sentence.
• Ambiguity in the grammar should be eliminated during the design phase
of the compiler.
• An ambiguous grammar should be rewritten to eliminate the ambiguity.
• To disambiguate a grammar, we choose one preferred parse tree for each
sentence generated by the ambiguous grammar and restrict the grammar to
that choice.

Sometimes an ambiguous grammar can be rewritten to eliminate the
ambiguity.

Example 1:
There are two causes of ambiguity in the grammar of arithmetic expressions,

E→ E+E | E-E | E*E | E/E | (E) | id

1. The precedence of operators is not respected.

2. A sequence of identical operators can group either from the left or from
the right.

Grammars that are ambiguous because of their operators can be disambiguated
according to precedence and associativity rules. We can eliminate the ambiguity
of this grammar using the following precedence and associativity rules.

Precedence: from highest to lowest

*, / (left to right)
+, - (left to right)
Therefore the unambiguous grammar for arithmetic expressions is,
E→E+T | E-T | T
T→T*F | T/F | F
F→(E) | id
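
The effect of the layered grammar can be seen in a small parser. The sketch below is one possible hand-written parser and not the only way to do it; it assumes the usual three-level expression grammar in which E can also derive a single T and T a single F, and it replaces the left-recursive productions by loops, which keeps the left-to-right grouping. The function name parse and the tuple form (operator, left, right) are assumptions made for illustration.

```python
def parse(tokens):
    """Parse a token list and return a nested tuple (op, left, right) or 'id'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {peek()!r}")
        pos += 1

    def factor():                       # F -> (E) | id
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("id")
        return "id"

    def term():                         # T -> T*F | T/F | F, written as a loop
        node = factor()
        while peek() in ("*", "/"):
            op = peek()
            eat(op)
            node = (op, node, factor())
        return node

    def expr():                         # E -> E+T | E-T | T, written as a loop
        node = term()
        while peek() in ("+", "-"):
            op = peek()
            eat(op)
            node = (op, node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree

print(parse(["id", "+", "id", "*", "id"]))
print(parse(["id", "-", "id", "-", "id"]))
```

The first call groups the multiplication below the addition, and the second groups the two subtractions from the left, so each input gets exactly one structure.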

Exercise Questions: Prove that the following grammars are ambiguous.

a) S→ AB | aaB
A→a | Aa
B→ b
b) S→ABA
A→aA |ɛ
B→bB | ɛ

Inherent Ambiguity
A context free language L is said to be inherently ambiguous if all its grammars
are ambiguous. If even one grammar for L is unambiguous, then L is an
unambiguous language.
Consider the Language,

The grammar for this language is,

Since every grammar for this language produces more than one parse tree for
some string of the language, it is an example of an inherently ambiguous language.

PUSHDOWN AUTOMATA
The context-free languages have a type of automaton that defines them. This
automaton, called a “pushdown automaton”, is an extension of the nondeterministic finite
automaton with ε-transitions, which is one of the ways to define the regular languages.

The pushdown automaton is essentially an ε-NFA with the addition of a stack. The
stack can be read, pushed and popped only at the top, just like the “stack” data structure.

We define two different versions of the pushdown automaton: one that accepts by
entering an accepting state, as finite automata do, and another that accepts by
emptying its stack, regardless of the state it is in. These two variations accept
exactly the context-free languages, i.e., grammars can be converted to equivalent pushdown
automata and vice versa.

Model of Pushdown Automata

❖ The pushdown automaton is in essence a nondeterministic finite automaton with
ε-transitions permitted and one additional capability: a stack on which it can store a
string of “stack symbols”.
❖ The presence of a stack means that, unlike a finite automaton, the pushdown
automaton can remember an unbounded amount of information.
❖ The model of a PDA is as shown below:

[Figure 6.1: A pushdown automaton with a stack. The input feeds the finite state control, which uses the stack and outputs accept/reject.]

❖ As shown in Figure 6.1, a PDA has a finite state control that reads the input one symbol
at a time. The pushdown automaton is allowed to observe the symbol at the top of
the stack and to base its transition on its current state, the input symbol, and the symbol
at the top of the stack.

FORMAL DEFINITION OF PUSHDOWN AUTOMATA

❖ The formal definition of a pushdown automaton (PDA) involves seven components. A
PDA P is defined as follows:
P = (Q, Σ, Γ, δ, q0, Z0, F)
❖ Where,
• Q: a finite set of states, like the states of a finite automaton.
• Σ: a finite set of input symbols, also analogous to the corresponding
component of a finite automaton.
• Γ: a finite stack alphabet. It is the set of symbols that we are allowed to
push onto the stack.
• δ: the transition function. As for a finite automaton, δ governs the
behavior of the automaton.
δ: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*
Formally, δ takes as argument a triple δ(q, a, X), where:
1. q is a state in Q
2. a is either an input symbol in Σ or a = ε, the empty string, which
is assumed not to be an input symbol.
3. X is a stack symbol, which is a member of Γ.
The output of δ is a finite set of pairs (p, γ), where p is the new state
and γ is the string of stack symbols that replaces X at the top of the stack.
• q0: the start state. The PDA is in this state before making any transitions.
• Z0: the start symbol. Initially, the PDA’s stack consists of one instance of
this symbol, and nothing else.
• F: the set of accepting states, or final states.

A Graphical Notation for PDA’s:

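The transition function δ describes the behavior of a PDA, and a transition diagram
presents it graphically. The diagram has the following components:
a) The nodes correspond to the states of the PDA.
b) An arrow labelled Start indicates the start state, and doubly circled states are
accepting, as for finite automata.
c) An arc from state q to state p labelled a, X/α means that δ(q, a, X) contains the
pair (p, α); it shows the input used and the old and new tops of the stack.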

Example: Let us design a PDA P to accept the language L={ww^R : w ∈ {0,1}*}

We shall use a stack symbol Z0 to mark the bottom of the stack. We need to have
this symbol present so that, after we pop w off the stack and realize that we have
seen ww^R on the input, we still have something on the stack to permit us to make
a transition to the accepting state, q2. Thus, our PDA for L can be described
as:

where δ is defined by the following rules:

Transition diagram is,

Figure 2: PDA for L={ww^R : w ∈ {0,1}*}

Instantaneous Descriptions of a PDA:

The PDA goes from configuration to configuration, in response to input symbols
(or sometimes ε ) but unlike the finite automaton, where the state is the only thing
that we need to know about the automaton, the PDA’s configuration involves
both the state and the contents of the stack. We shall represent the configuration
of a PDA by a triple (q,w,γ), where:
1. q is the state,
2. w is the remaining input, and
3. γ is the stack contents

Such a triple is called an instantaneous description, or ID, of the pushdown
automaton.
Example:
Let us consider the action of the PDA for L={ww^R} on the input 1111.
Since q0 is the start state and Z0 is the start symbol, the initial ID is (q0,1111,Z0).
The IDs of the PDA are:
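
One way to obtain such a sequence of IDs is to search over all moves of the PDA. The sketch below is an illustration under stated assumptions: the transition table DELTA follows the usual textbook construction for ww^R (push symbols in q0, guess the middle with an ε-move to q1, pop matching symbols in q1, then move to q2 on Z0) and may differ in detail from the rules used in class; "Z" stands for Z0, "" stands for ε, and the stack is written as a string with its top at the left. The helper names moves and accepting_run are hypothetical.

```python
from collections import deque

# (state, input symbol or "", stack top) -> set of (new state, string replacing the top)
DELTA = {
    ("q0", "0", "Z"): {("q0", "0Z")}, ("q0", "1", "Z"): {("q0", "1Z")},
    ("q0", "0", "0"): {("q0", "00")}, ("q0", "0", "1"): {("q0", "01")},
    ("q0", "1", "0"): {("q0", "10")}, ("q0", "1", "1"): {("q0", "11")},
    ("q0", "", "Z"): {("q1", "Z")}, ("q0", "", "0"): {("q1", "0")},
    ("q0", "", "1"): {("q1", "1")},
    ("q1", "0", "0"): {("q1", "")}, ("q1", "1", "1"): {("q1", "")},
    ("q1", "", "Z"): {("q2", "Z")},
}

def moves(config):
    """Yield every ID reachable from `config` in one move."""
    state, rest, stack = config
    if not stack:
        return
    top, below = stack[0], stack[1:]
    for p, gamma in DELTA.get((state, "", top), ()):           # ε-moves
        yield (p, rest, gamma + below)
    if rest:
        for p, gamma in DELTA.get((state, rest[0], top), ()):  # moves reading input
            yield (p, rest[1:], gamma + below)

def accepting_run(w, start="q0", z0="Z", finals=("q2",)):
    """Breadth-first search for a sequence of IDs that accepts w by final state."""
    first = (start, w, z0)
    parent = {first: None}
    queue = deque([first])
    while queue:
        config = queue.popleft()
        if config[1] == "" and config[0] in finals:
            run = []
            while config is not None:
                run.append(config)
                config = parent[config]
            return run[::-1]
        for nxt in moves(config):
            if nxt not in parent:
                parent[nxt] = config
                queue.append(nxt)
    return None

for config in accepting_run("1111"):
    print(config)
```

For the input 1111 the search returns a run that pushes two 1's, guesses the middle, pops two 1's and then enters q2; for an input such as 011 it returns None because no sequence of moves reaches q2 with the input consumed.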

The Languages of a PDA:

❖ We have assumed that a PDA accepts its input by consuming it and entering an
accepting state. We call this approach “acceptance by final state”.
❖ There is a second approach to define the language of PDA that has important
applications. We may also define for any PDA the language “accepted by empty
stack”, i.e. the set of strings that cause the PDA to empty the stack, starting from the
initial ID.
❖ These two methods are equivalent, in the sense that a language L has a PDA that
accepts it by final state if and only if L has a PDA that accepts it by empty stack.

EQUIVALENCE OF ACCEPTANCE BY FINAL STATE AND EMPTY STACK

From Empty Stack to Final State:

From Final State to Empty Stack:

Deterministic Pushdown Automata

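Definition:

A PDA P = (Q, Σ, Γ, δ, q0, Z0, F) is deterministic if and only if:
1. δ(q, a, X) has at most one member for any q in Q, a in Σ or a = ε, and X in Γ.
2. If δ(q, a, X) is nonempty for some a in Σ, then δ(q, ε, X) must be empty.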

Example for Deterministic PDA is,

The PDA to accept L={wcw^R : w ∈ {0,1}*}

Example for Non-deterministic PDA is,

The PDA to accept L={ww^R : w ∈ {a,b}*}

Regular Languages and Deterministic PDA’s:

2. Obtain a PDA to accept strings of balanced parentheses. The parentheses
to be considered are (, ), [, ]. Ex: (( )), [( )( )], ( ) ( )
Ans: The transition diagram for the given language is,
Therefore PDA P=(Q, ∑, Γ, δ, q0, Z0, F)
Where,
Q={q0, q1}
∑={(, ), [, ]}
Γ={(, [, Z0}
δ is as shown in the transition diagram.
q0 is the initial state
Z0 is the initial stack symbol
F={q1}

3. Construct a pushdown automaton to accept the language L={a^n b^n : n>=1}.

Also show the IDs for the string w=aaabbb.

Given w=aaabbb. The IDs are:

4. Construct a PDA to accept the language L={ w : w ∈ {a,b}* and
na(w)=nb(w)}.
Let us trace the string w=abbbaa as follows:

Note: For more problems, refer to the class work.
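
The IDs asked for in problem 3 can be produced by running a deterministic PDA step by step. The sketch below is only one possible design and its transition table is an assumption: push an a for every a read in q0, pop an a for every b (moving to q1 on the first b), and take an ε-move to the final state q2 when only Z0 remains. "Z" stands for Z0, "" stands for ε, and the stack is a string with its top at the left. The name trace is hypothetical.

```python
# (state, input symbol or "", stack top) -> (new state, string replacing the top)
DELTA = {
    ("q0", "a", "Z"): ("q0", "aZ"),
    ("q0", "a", "a"): ("q0", "aa"),
    ("q0", "b", "a"): ("q1", ""),
    ("q1", "b", "a"): ("q1", ""),
    ("q1", "", "Z"): ("q2", "Z"),
}
FINAL = {"q2"}

def trace(w):
    """Return the list of IDs visited on input w and whether w is accepted."""
    state, rest, stack = "q0", w, "Z"
    ids = [(state, rest, stack)]
    while stack:
        top = stack[0]
        if rest and (state, rest[0], top) in DELTA:     # move that reads input
            state, push = DELTA[(state, rest[0], top)]
            rest = rest[1:]
        elif (state, "", top) in DELTA:                 # ε-move
            state, push = DELTA[(state, "", top)]
        else:                                           # no move is possible
            break
        stack = push + stack[1:]
        ids.append((state, rest, stack))
    return ids, (rest == "" and state in FINAL)

ids, accepted = trace("aaabbb")
for config in ids:
    print(config)
print("accepted:", accepted)
```

Because at most one move applies in every ID, the run is a single chain of IDs rather than a search.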

a) From Grammars to Pushdown Automata - by empty stack method
(suitable for grammars not in GNF):

Let G =(V, T, P, S) be a CFG. Construct the PDA P that accepts L(G) by
empty stack as follows:
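
A common form of this construction uses a single state q and the grammar's start symbol as the initial stack symbol: for every variable A there is an ε-move δ(q, ε, A) containing (q, β) for each production A→β, and for every terminal a there is a move δ(q, a, a) = {(q, ε)}. The sketch below assumes that form, which may differ in detail from the steps used in class. The helper names are hypothetical, symbols are single characters, and the acceptance check additionally assumes a grammar without ε-productions, so that any stack longer than the remaining input can be discarded.

```python
from collections import deque

def cfg_to_pda(productions, terminals):
    """Build delta for a one-state PDA; the state is left implicit.
    delta maps (input symbol or "", stack top) -> set of strings replacing the top."""
    delta = {}
    for head, bodies in productions.items():
        delta[("", head)] = set(bodies)   # expand a variable without reading input
    for a in terminals:
        delta[(a, a)] = {""}              # match a terminal on the stack with the input
    return delta

def accepts_by_empty_stack(delta, start, w):
    """Search the IDs (remaining input, stack) for one with both parts empty."""
    seen = {(w, start)}
    queue = deque(seen)
    while queue:
        rest, stack = queue.popleft()
        if not rest and not stack:
            return True
        # Without ε-productions every stack symbol needs at least one input symbol.
        if not stack or len(stack) > len(rest):
            continue
        top, below = stack[0], stack[1:]
        options = [(rest, g) for g in delta.get(("", top), ())]
        if rest:
            options += [(rest[1:], g) for g in delta.get((rest[0], top), ())]
        for new_rest, g in options:
            nxt = (new_rest, g + below)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

delta = cfg_to_pda({"S": ["aSb", "ab"]}, "ab")
print(accepts_by_empty_stack(delta, "S", "aabb"))
print(accepts_by_empty_stack(delta, "S", "aab"))
```

The stack of this PDA always holds the part of a left-sentential form that has not yet been matched against the input, so an accepting run corresponds to a leftmost derivation of w.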

Method 2:

Converting a CFG to a PDA by final state (suitable for a CFG in GNF)
Algorithm:

Step1: Check whether the grammar is in Greibach Normal Form (GNF).

[For GNF, all productions must be of the form A→aα, where a ∈ T and
α ∈ V*]

Step2: Push the start symbol of the grammar onto the stack and change state from q0
to q1. The transition is:

δ(q0, ϵ, Z0)=(q1, SZ0)

Step3: For each production of the form A→aα, where a ∈ T and α ∈ V*, add the
transition,

δ(q1, a, A)=(q1, α)

Step4: Finally add a transition to accept the string by final state,

δ(q1, ϵ, Z0)=(qf, Z0)

Ex: Convert the following grammar to PDA by final state:

S→aSSS | a

Ans:

Step1: The given grammar is in GNF.

Step2: δ(q0, ϵ, Z0)=(q1, SZ0)

Step3:
S→aSSS δ(q1, a, S)=(q1, SSS)
S→a δ(q1, a, S)=(q1, ϵ)

Step4: δ(q1, ϵ, Z0)=(qf, Z0)

Therefore, P=({ q0, q1, qf }, {a}, {S, Z0 }, δ, q0, Z0, {qf})
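
Steps 2 to 4 translate directly into code. The sketch below is a minimal illustration with assumed names (gnf_to_pda, step, accepts): it builds the transitions for S→aSSS | a, where both productions give moves out of q1, and then simulates the PDA by tracking the set of possible (state, stack) pairs. "Z" stands for Z0, "" stands for ε, and the stack is a string with its top at the left.

```python
def gnf_to_pda(productions, start):
    """productions: list of (head, body), each body a terminal followed by variables."""
    delta = {("q0", "", "Z"): {("q1", start + "Z")}}            # Step 2: push S
    for head, body in productions:                              # Step 3: A -> a alpha
        a, alpha = body[0], body[1:]
        delta.setdefault(("q1", a, head), set()).add(("q1", alpha))
    delta[("q1", "", "Z")] = {("qf", "Z")}                      # Step 4: accept
    return delta

def step(delta, configs, symbol):
    """Apply every move on `symbol` ("" for ε) to a set of (state, stack) pairs."""
    out = set()
    for state, stack in configs:
        if stack:
            for p, gamma in delta.get((state, symbol, stack[0]), ()):
                out.add((p, gamma + stack[1:]))
    return out

def accepts(delta, w):
    configs = step(delta, {("q0", "Z")}, "")   # the only move from q0 pushes S
    for ch in w:
        configs = step(delta, configs, ch)     # one GNF production per input symbol
    configs = step(delta, configs, "")         # move to qf if only Z0 is left
    return any(state == "qf" for state, _ in configs)

delta = gnf_to_pda([("S", "aSSS"), ("S", "a")], "S")
print(sorted(delta[("q1", "a", "S")]))
print([n for n in range(8) if accepts(delta, "a" * n)])
```

The accepted lengths are exactly those of the form 3k+1, which matches the grammar: each use of S→aSSS adds one a and two more occurrences of S, and each remaining S ends as a single a.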

******************************************************
