Syntax Analysis

The parser obtains tokens from the lexical analyzer and verifies that the string of tokens can be generated by the grammar of the source program. It constructs a parse tree and passes it to the rest of the compiler. There are three main types of parsers: universal, top-down, and bottom-up. Top-down parsers build the parse tree from the top down, while bottom-up parsers build from the leaves up. Context-free grammars are used to formally describe the syntax or structure of a language. They consist of terminals, non-terminals, production rules, and a start symbol. Derivations apply production rules to generate strings from the start symbol. Parse trees provide a graphical representation of derivations. Ambiguous grammars produce more than one parse tree for some sentence.

Uploaded by

Nakib Ahsan
Copyright
© All Rights Reserved

Syntax Analysis

Md Mehrab Hossain Opi


Role of the Parser
 The parser
 Obtains a string of tokens from the lexical analyzer.
 Verifies that the string can be generated by the grammar of the
source program.
 Reports any syntax errors.
 Recovers from commonly occurring errors.
Role of the Parser
 The parser constructs a parse tree and passes it to
the rest of the compiler.
source program → [Lexical Analyzer] —token→ [Parser] —parse tree→ [Rest of Front End] → intermediate representation

(The Parser requests each token from the Lexical Analyzer via "get next token"; both components consult the Symbol Table.)

Fig 1: Position of Parser in Compiler Model.


Role of the Parser
 There are three general types of parsers for grammars
 Universal
 Top-down
 Bottom-up
 Universal methods, like the CYK or Earley algorithms, can
parse any grammar.
 Too slow for use in a compiler.
Role of the Parser
 Top-down
 Builds the parse tree from the top (root) to the bottom (leaves).
 Bottom-up
 Starts from the leaves and works its way up to the root.
 In both cases, the input is scanned from left to right.
 The most efficient top-down and bottom-up methods work
only for sub-classes of grammars.
Syntax Error Handling
 Goal of error handler
 Report the presence of errors clearly and accurately.
 Recover from each error quickly enough to detect subsequent
errors.
 Add minimal overhead to the processing of correct programs.
Error-Recovery Strategies
 Common recovery strategies
 Panic-Mode Recovery
 Phrase-Level Recovery
 Error-Productions
 Global-Correction.
Panic-Mode Recovery
 On discovering an error
 The parser discards input symbols one at a time
 Until one of a designated set of synchronizing tokens is found.
 Synchronizing tokens are usually delimiters
 Semicolon or }, whose role is clear and unambiguous.
 Simple, and guaranteed not to go into an infinite loop.
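As a rough illustration (not from these slides), the skip-until-a-synchronizing-token step can be sketched in a few lines of Python; the token list, error position, and synchronizing set below are all made up for the example:

```python
# A minimal sketch of panic-mode recovery, assuming the token stream is
# a list of strings. Names and tokens are illustrative only.

SYNC_TOKENS = {";", "}"}  # designated synchronizing tokens (delimiters)

def panic_mode_recover(tokens, pos):
    """On an error at tokens[pos], discard input symbols one at a time
    until a synchronizing token is found; return the position just
    after it so parsing can resume."""
    while pos < len(tokens) and tokens[pos] not in SYNC_TOKENS:
        pos += 1          # discard one input symbol
    return pos + 1        # skip past the synchronizing token itself

# e.g. an error detected at '@' inside "x = @ # ! ; y = 1"
tokens = ["x", "=", "@", "#", "!", ";", "y", "=", "1"]
resume = panic_mode_recover(tokens, 2)
print(tokens[resume:])    # parsing resumes at ['y', '=', '1']
```

Because the scan only ever moves forward and stops at or past the end of input, this cannot loop forever, which is the guarantee the slide refers to.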
Phrase-Level Recovery
 On discovering an error
 Perform local correction on the remaining input.
 Replace a prefix of the remaining input so parsing can continue.
 The replacement must not lead to an infinite loop.
 Cannot cope when the actual error occurred before the point of
detection.
Error Productions
 Anticipate common errors.
 Augment the grammar with productions that generate the
erroneous constructs.
Global Correction
 There are algorithms for choosing a minimal sequence
of changes to obtain a globally least-cost correction.
 Given an incorrect string x and a grammar G
 These algorithms find a parse tree for a related string y,
such that the number of insertions, deletions, and changes of
tokens required to transform x into y is as small as possible.
 Too costly to implement in terms of time and space.
Context-Free Grammars
 A formal notation to describe the syntax or structure
of a formal language.
 Formally, a CFG consists of
 A finite set of Terminals
 A finite set of Non-terminals
 A finite set of production rules
 A start symbol.
Context-Free Grammars
 Terminals
 The basic symbols from which strings are formed.
 "Token name" is a synonym for terminal.
 Terminals are the first components of the tokens output
by the lexical analyzer.
 Non-terminals
 Syntactic variables that denote sets of strings.
 Help define the language generated by the grammar.
 Impose a hierarchical structure on the language.
Context-Free Grammars
 Production rules
 Specify the manner in which the terminals and non-terminals
can be combined.
 Each production consists of
 A non-terminal called the head or left side of the production
 The symbol →
 A body or right side consisting of zero or more terminals and
non-terminals.
 One non-terminal is distinguished as the start symbol.
Notational Conventions
 Terminals
 Lowercase letters early in the alphabet: a, b, c.
 Operator symbols such as +, -, *, etc.
 Punctuation symbols: parentheses, comma, etc.
 Digits 0, 1, …, 9.
 Boldface strings such as id or if.
Notational Conventions
 Non-terminals
 Uppercase letters early in the alphabet: A, B, C.
 The letter S, normally the start symbol.
 Lowercase, italic names such as expr or stmt.
 Uppercase letters late in the alphabet (X, Y, Z)
represent grammar symbols (terminals or non-terminals).
 Lowercase letters late in the alphabet (x, y, z)
represent strings of terminals.
 Greek letters (α, β, γ) represent strings of grammar symbols.
Notational Convention
 A set of productions with a common head A,
A → α1, A → α2, …, A → αk,
can be written as A → α1 | α2 | … | αk.
 Unless stated otherwise, the head of the first
production is the start symbol.
Example
 We will use this grammar throughout:

expression → expression + term
expression → expression – term
expression → term
term → term * factor
term → term / factor
term → factor
factor → ( expression )
factor → id
Example
 Using the notational convention

E → E + T | E – T | T
T → T * F | T / F | F
F → ( E ) | id
Derivations
 Begin with the start symbol.
 At each step, replace a non-terminal by the body of
one of its productions.
 Consider the grammar

E → E+E | E*E | -E | (E) | id


Derivations
 For the derivation step E ⇒ -E, we say
 E derives -E.
 A sequence of replacements is called a derivation.

E ⇒ -E    [ E → -E ]
  ⇒ -(E)  [ E → (E) ]
  ⇒ -(id) [ E → id ]

 This is a derivation of -(id) from E.
 It proves that -(id) is an instance of an expression.
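The derivation above can be carried out mechanically; the following Python sketch (the tuple-of-symbols representation and helper name are our own, not from the slides) performs one replacement per step:

```python
# A sketch of a derivation as successive replacement of one
# non-terminal by a production body, following the grammar
# E -> E+E | E*E | -E | (E) | id. Representation is illustrative.

def apply_production(sentential, index, body):
    """Replace the symbol at `index` (a non-terminal) by `body`."""
    return sentential[:index] + body + sentential[index + 1:]

step0 = ("E",)
step1 = apply_production(step0, 0, ("-", "E"))       # E    => -E
step2 = apply_production(step1, 1, ("(", "E", ")"))  # -E   => -(E)
step3 = apply_production(step2, 2, ("id",))          # -(E) => -(id)
print(step3)  # ('-', '(', 'id', ')')
```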
Derivations
 For a sequence of derivation steps
α1 ⇒ α2 ⇒ . . . ⇒ αn
 We say α1 derives αn in zero or more steps.
 We write α1 ⇒* αn.
 Similarly, α1 ⇒+ αn means α1 derives αn in one or more steps.


Derivations
 If S ⇒* α, where S is the start symbol of grammar G,
then α is a sentential form of G.
 A sentence of G is a sentential form with no non-
terminals.
 The language generated by a grammar is its set of
sentences.
Derivations
 At each step of a derivation we make two choices
 Which non-terminal to replace.
 Which production of that non-terminal to use.
 Leftmost derivations
 The leftmost non-terminal is always chosen.
 Written α ⇒lm β.
 Rightmost derivations
 The rightmost non-terminal is always chosen.
 Written α ⇒rm β.
 Also called canonical derivations.
Parse Tree
 A graphical representation of a derivation.
 Each interior node represents the application of a
production.
 Interior node is labeled with the non-terminal A in the
head of the production.
 The children of the node are labeled from left to right,
by the symbols in the body of the production.
Parse Tree
Parse tree for the derivation of -(id + id):

         E
        / \
       -   E
         / | \
        (  E  )
         / | \
        E  +  E
        |     |
        id    id
Ambiguity
 A grammar that produces more than one parse tree
for some sentence is said to be ambiguous.
 Consider the two leftmost derivations for the sentence

id + id * id
Ambiguity

      E                      E
    / | \                  / | \
   E  +  E                E  *  E
   |    /|\              /|\    |
   id  E * E            E + E   id
       |   |            |   |
       id  id           id  id
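One way to see why the two trees matter: if the ids stood for numbers, the two groupings would evaluate differently. A small illustrative sketch (the tuple encoding and the stand-in values 1, 2, 3 are assumptions for the demo, not part of the slides):

```python
# The two parse trees for id+id*id correspond to different groupings.
# Represent each tree as nested tuples (op, left, right) and evaluate
# with stand-in values for the ids to show the trees disagree.
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(tree):
    """Evaluate a parse tree given as a number or (op, left, right)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left), evaluate(right))
    return tree

# id + id * id with ids 1, 2, 3:
tree_a = ("+", 1, ("*", 2, 3))   # '*' grouped below '+'  ->  1+(2*3)
tree_b = ("*", ("+", 1, 2), 3)   # '+' grouped below '*'  ->  (1+2)*3
print(evaluate(tree_a), evaluate(tree_b))  # 7 9
```

This is exactly why ambiguity is a problem for compilers: the parse tree chosen determines the meaning of the program.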
CFG vs Regular Expression
 CFGs are more powerful than regular expressions.
 Every construct that can be described by a regular
expression can also be described by a grammar.
 But not vice versa.
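The classic example of the "not vice versa" direction is nesting: the grammar P → ( P ) P | ε generates balanced parentheses, which no regular expression can describe, since recognition requires unbounded counting. A minimal checker sketch (function name and counter-based approach are illustrative; the counter recognizes the same language the grammar generates):

```python
# Recognizer for the language of balanced parentheses, generated by
# the CFG  P -> ( P ) P | eps.  A regular expression cannot do this,
# because matching requires counting arbitrarily deep nesting.

def balanced(s):
    """Return True iff s is a balanced string over ( and )."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:        # closing with nothing open
                return False
        else:
            return False         # alphabet is only ( and )
    return depth == 0

print(balanced("(()())"), balanced("(()"))  # True False
```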
Lexical vs Syntactic Analysis
 Why use both regular expressions and CFGs?
 The separation modularizes the front end of a compiler
into two manageable-sized components.
 Lexical rules are quite simple
 No need for the power of CFGs.
 REs provide a more concise and easier-to-understand
notation for tokens than grammars.
Eliminating Ambiguity
 Rewriting an ambiguous grammar can sometimes resolve the
ambiguity.
 Consider the grammar

stmt → if expr then stmt
     | if expr then stmt else stmt
     | other

Here, other stands for any other statement.
Eliminating Ambiguity
 The grammar is ambiguous.
 Consider the sentence

if E1 then if E2 then S1 else S2

 One parse tree matches the else with the inner if:

stmt
└── if expr (E1) then stmt
                      └── if expr (E2) then stmt (S1) else stmt (S2)
Eliminating Ambiguity
 Another parse tree for the same sentence matches the else
with the outer if:

stmt
└── if expr (E1) then stmt else stmt (S2)
                      └── if expr (E2) then stmt (S1)
Dangling else
 Which parse tree should we consider the correct one?
 The first parse tree is preferred in programming
languages.
 The rule is: "Match each else with the closest preceding
unmatched then."
Eliminating Ambiguity
 We can convert the grammar into an unambiguous one:

stmt → matched_stmt
     | open_stmt
matched_stmt → if expr then matched_stmt else matched_stmt
             | other
open_stmt → if expr then stmt
          | if expr then matched_stmt else open_stmt

 The idea: a statement appearing between a then and its
matching else must be matched, i.e., must not end with an
unmatched then.
Left Recursion
 A grammar is left recursive if it has a non-terminal A
such that there is a derivation A ⇒+ Aα for some string α.
 Immediate left recursion occurs when there is a
production A → Aα.
 Top-down parsing methods cannot handle left recursion.
 How do we resolve it?
Immediate Left Recursion Elimination
 A production pair A → Aα | β can be replaced with

A → βA'
A' → αA' | ε

 To eliminate any number of immediate left recursions
 First group the productions as

A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn

 No βi begins with an A.
 Replace the A-productions by

A → β1A' | β2A' | … | βnA'
A' → α1A' | α2A' | … | αmA' | ε
Immediate Left Recursion Elimination
 Consider the example:

E → E + T        E  → T E'
E → E – T        E' → + T E' | – T E' | ε
E → T            T  → F T'
T → T * F        T' → * F T' | / F T' | ε
T → T / F        F  → ( E ) | id
T → F
F → ( E ) | id
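The transformation can be sketched in code. This is an illustrative Python function (the names and the tuple-of-symbols representation are assumptions, not from the slides) that eliminates immediate left recursion for a single non-terminal; the empty tuple () plays the role of ε:

```python
# Immediate left-recursion elimination:  A -> Aα | β  becomes
# A -> βA',  A' -> αA' | ε.  Bodies are tuples of symbols; () is ε.

def eliminate_immediate(head, bodies):
    """Split the head's productions into left-recursive ones (A -> A α)
    and the rest (A -> β), then rewrite per the transformation."""
    alphas = [b[1:] for b in bodies if b and b[0] == head]
    betas = [b for b in bodies if not b or b[0] != head]
    if not alphas:
        return {head: bodies}              # nothing to do
    new = head + "'"
    return {
        head: [beta + (new,) for beta in betas],
        new: [alpha + (new,) for alpha in alphas] + [()],  # () is ε
    }

# E -> E+T | E-T | T   becomes   E -> TE',  E' -> +TE' | -TE' | ε
print(eliminate_immediate("E", [("E", "+", "T"), ("E", "-", "T"), ("T",)]))
```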
Left Recursion Problem
 Look at the following grammar:

S → Aa | b
A → Ac | Sd | ε

 The non-terminal S is left recursive because

S ⇒ Aa ⇒ Sda

 But it is not immediately left recursive.
 How do we eliminate this?
Elimination of Left Recursion
 Algorithm to remove left recursion.

Input: Grammar G with no cycles or ε-productions.
Output: An equivalent grammar with no left recursion.
Method:
1. Arrange the non-terminals in some order A1, A2, …, An.
2. for (each i from 1 to n) {
3.   for (each j from 1 to i-1) {
4.     replace each production of the form Ai → Aj γ by the
       productions Ai → δ1 γ | δ2 γ | … | δk γ, where
       Aj → δ1 | δ2 | … | δk are all the current Aj-productions
5.   }
6.   eliminate the immediate left recursion among the Ai-productions
7. }
Elimination of Left Recursion
 Let's go back to our previous grammar:

S → Aa | b
A → Ac | Sd | ε

 We have non-terminals S and A.
 Let's order them as S, A.
 There is no immediate left recursion in the S-productions,
so nothing happens on the first iteration of the outer loop.
 For i = 2, substitute for S in A → Sd:

A → Ac | Aad | bd | ε

 Now eliminate the immediate left recursion:

A → bdA' | A'
A' → cA' | adA' | ε

Elimination of Left Recursion
 Finally we get

S → Aa | b
A → bdA' | A'
A' → cA' | adA' | ε
Left Factoring
 A grammar transformation
 Useful for producing a grammar suitable for predictive, or top-
down, parsing.
 Consider the grammar

stmt → if expr then stmt else stmt
     | if expr then stmt

 We cannot decide which production to choose upon
seeing if.
Left Factoring
 In general, suppose A → αβ1 | αβ2, where α is non-empty.
 We do not know which production to use to expand A upon
seeing an input beginning with α.
 However, deferring the decision by expanding A to αA' helps.
 Rewriting the grammar we get

A → αA'
A' → β1 | β2

 Now we can expand A to αA' upon finding input that
begins with α, and choose between β1 and β2 afterwards.
Left Factoring
 Algorithm to left factor a grammar

Input: Grammar G
Output: An equivalent left-factored grammar.
Method:
For each non-terminal A, find the longest prefix α common to two or more of its
alternatives. If α ≠ ε, replace all of the A-productions A → αβ1 | αβ2 | … | αβn | γ,
where γ represents all alternatives that do not begin with α, by

A → αA' | γ
A' → β1 | β2 | … | βn

Repeatedly apply this transformation until no two alternatives for a non-terminal have a
common prefix.
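One round of this algorithm can be sketched as follows; the representation (production bodies as tuples of symbols), the function name, and the single-step scope are all illustrative choices, not from the slides:

```python
# One round of left factoring: find the longest prefix common to two
# or more alternatives of a non-terminal and pull it out into a new
# non-terminal.  Bodies are tuples of symbols; () plays the role of ε.
from os.path import commonprefix  # works element-wise on sequences

def left_factor_once(head, bodies):
    """Factor the longest common prefix of the head's alternatives."""
    best = ()
    for i in range(len(bodies)):
        for j in range(i + 1, len(bodies)):
            p = tuple(commonprefix([bodies[i], bodies[j]]))
            if len(p) > len(best):
                best = p
    if not best:
        return {head: bodies}              # no common prefix: unchanged
    new = head + "'"
    factored = [b[len(best):] for b in bodies if b[:len(best)] == best]
    rest = [b for b in bodies if b[:len(best)] != best]
    return {head: [best + (new,)] + rest, new: factored}

# S -> iEtS | iEtSeS | a   becomes   S -> iEtSS' | a,  S' -> ε | eS
print(left_factor_once("S",
      [("i", "E", "t", "S"), ("i", "E", "t", "S", "e", "S"), ("a",)]))
```

Applying this repeatedly, as the algorithm states, terminates when no non-terminal has two alternatives sharing a prefix.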
Left Factoring Example
 Consider the dangling-else example:

S → iEtS | iEtSeS | a
E → b

 Here i, t, and e stand for if, then, and else.
 E and S stand for conditional expression and
statement.
 Left-factored, we get:

S → iEtSS' | a
S' → eS | ε
E → b
To be Continued.
