Chapter 2

Simple Syntax-Directed Translation

Muhammad Kamal Hossen


Associate Professor
Dept. of CSE, CUET
E-mail: khossen@[Link]
Introduction
• The analysis phase of a compiler breaks up a source
program into constituent pieces & produces an internal
representation for it, called intermediate code.
• The synthesis phase translates the intermediate code into
the target program.
• Analysis is organized around the "syntax" of the language to
be compiled.
• The syntax of a programming language describes the proper
form of its programs.
• The semantics of the language defines what its programs mean;
that is, what each program does when it executes.
• For syntax: CFG or BNF
A model of a compiler front end

[Figure: source program → lexical analyzer → (tokens) → parser → (syntax tree) → intermediate-code generator → three-address code]
A model of a compiler front end
• Lexical Analyzer: allows a translator to handle multi-character
constructs like identifiers, which are written as sequences of
characters, but are treated as units called tokens during syntax analysis;
 In count + 1, the identifier count is treated as a unit.
 The lexical analyzer allows numbers, identifiers, and "white space"
(blanks, tabs, and newlines) to appear within expressions.

• Intermediate-code generation:
 abstract syntax trees, or simply syntax trees, represent the hierarchical
syntactic structure of the source program.
 three-address instructions

• Parser: produces a syntax tree, which is further translated into
three-address code.
 Some compilers combine parsing and intermediate-code generation
into one component.
Intermediate code for "do i = i + 1; while ( a[i] < v );"

[Figure: the abstract syntax tree for this statement and the corresponding three-address code]
Syntax Definition
• "context-free grammar," or "grammar“ specify the syntax of a language.
• A grammar naturally describes hierarchical structure of language constructs.
• For example, an if-else statement in Java can have the form
if ( expression ) statement else statement
• An if-else statement is the concatenation of the keyword if, an opening
parenthesis, an expression, a closing parenthesis, a statement, the keyword
else, and another statement.
• Using the variable expr to denote an expression and the variable stmt to
denote a statement, this structuring rule can be expressed as
stmt → if ( expr ) stmt else stmt
• in which the arrow may be read as "can have the form."
• Such a rule is called a production.
• Lexical elements such as the keyword if and the parentheses are called terminals.
• Variables such as expr & stmt represent sequences of terminals and are called
non-terminals.
Definition of Grammars
• A context-free grammar (CFG) has four components:
• 1. A set of terminal symbols, sometimes referred to as
"tokens."
o The terminals are the elementary symbols of the language
defined by the grammar.
• 2. A set of nonterminals, sometimes called "syntactic
variables."
o Each non-terminal represents a set of strings of terminals.
• 3. A set of productions,
o each production consists of a non-terminal, called the head
or left side of the production, an arrow, and a sequence of
terminals and/or non-terminals, called the body or right
side of the production.
Definition of Grammars
• The intuitive intent of a production is to
specify one of the written forms of a
construct; if the head non-terminal represents
a construct, then the body represents a
written form of the construct.
• 4. A designation of one of the non-terminals
as the start symbol
CFG = {T, NT, P, S}
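The four components can be written down directly as data. Below is a minimal sketch (the Python representation is an illustration of my own, not from the slides), using the grammar for lists of digits that appears on the next slide:

```python
# A minimal sketch: the four components of a CFG, here for lists of
# digits separated by + and - (representation chosen for illustration).
grammar = {
    "terminals":    set("+-0123456789"),
    "nonterminals": {"list", "digit"},
    "productions": [
        ("list", ["list", "+", "digit"]),   # (head, body) pairs
        ("list", ["list", "-", "digit"]),
        ("list", ["digit"]),
    ] + [("digit", [d]) for d in "0123456789"],
    "start": "list",
}

print(len(grammar["productions"]), "productions; start symbol:", grammar["start"])
```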
Example

list → list + digit | list - digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

• Terminals: + - 0 1 2 3 4 5 6 7 8 9
 A string of terminals is a sequence of zero or more terminals.
 The string of zero terminals is called the empty string (ε).
• Non-terminals: list digit
• Start symbol: list
Derivations
• A grammar derives strings by beginning with
the start symbol & repeatedly replacing a non-
terminal by the body of a production for that
non-terminal.
• The terminal strings that can be derived from
the start symbol form the language defined by
the grammar.

Example
(1) list → list + digit
(2) list → list - digit
(3) list → digit
(4) digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

• 9-5+2 is a list:
a) 9 is a list by production (3), since 9 is a digit.
b) 9-5 is a list by production (2), since 9 is a list & 5 is a
digit.
c) 9-5+2 is a list by production (1), since 9-5 is a list and 2
is a digit.
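The same reasoning can be replayed mechanically. The following sketch (my own illustration, assuming productions (1)-(4) above) checks that each sentential form follows from the previous one by replacing a single nonterminal with a production body:

```python
# A minimal sketch (assuming productions (1)-(4) above): verify a
# derivation of 9-5+2 step by step.
PRODUCTIONS = {
    "list":  [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[d] for d in "0123456789"],
}

def is_one_step(before, after):
    """True if `after` results from `before` by applying one production."""
    for i, sym in enumerate(before):
        for body in PRODUCTIONS.get(sym, []):
            if before[:i] + body + before[i + 1:] == after:
                return True
    return False

derivation = [
    ["list"],
    ["list", "+", "digit"],                 # production (1)
    ["list", "-", "digit", "+", "digit"],   # production (2)
    ["digit", "-", "digit", "+", "digit"],  # production (3)
    ["9", "-", "digit", "+", "digit"],      # production (4)
    ["9", "-", "5", "+", "digit"],          # production (4)
    ["9", "-", "5", "+", "2"],              # production (4)
]
assert all(is_one_step(a, b) for a, b in zip(derivation, derivation[1:]))
print("9-5+2 is derivable from list")
```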
Parsing
• Parsing is the problem of taking a string of terminals
and figuring out how to derive it from the start symbol
of the grammar.
• If the string cannot be derived from the start symbol of the
grammar, the parser reports syntax errors within the
string.
• Parsing is one of the most fundamental problems in all
of compiling.
• A source program has multi-character lexemes that are
grouped by the lexical analyzer into tokens, whose first
components are the terminals processed by the parser.
Parse Trees
• A parse tree pictorially shows how the start
symbol of a grammar derives a string in the
language.
• If non-terminal A has a production A → XYZ,
then a parse tree may have an interior node with
three children labeled X, Y, & Z, from left to right:

Properties of the Parse Tree
1. The root is labeled by the start symbol.
2. Each leaf is labeled by a terminal or by ε.
3. Each interior node is labeled by a non-terminal.
4. If A is the non-terminal labeling some interior node and X1,
X2, ..., Xn are the labels of the children of that node from
left to right, then there must be a production A → X1X2···Xn.
• Here, X1, X2, ..., Xn each stand for a symbol that is either
a terminal or a non-terminal.
• As a special case, if A → ε is a production, then a node
labeled A may have a single child labeled ε.
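A parse-tree node needs little more than a label and an ordered list of children. Below is a minimal sketch (my own representation, not from the slides), building the interior node for a production A → XYZ:

```python
# A minimal sketch of a parse-tree node: a label plus an ordered list
# of children (empty for a leaf).
class Node:
    def __init__(self, label, children=None):
        self.label = label              # terminal, non-terminal, or epsilon
        self.children = children or []  # ordered left to right

    def frontier(self):
        """Concatenate the leaf labels from left to right (the derived string)."""
        if not self.children:
            return self.label
        return "".join(child.frontier() for child in self.children)

# Interior node for a production A -> X Y Z:
a = Node("A", [Node("X"), Node("Y"), Node("Z")])
print(a.frontier())   # XYZ
```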
Tree Terminology
 A tree consists of one or more nodes. Nodes may have
labels, which in this book typically will be grammar
symbols.
 When we draw a tree, we often represent the nodes by
these labels only.
 Exactly one node is the root.
 All nodes except the root have a unique parent; the root
has no parent.
 When we draw trees, we place the parent of a node
above that node and draw an edge between them.
 The root is then the highest (top) node.
Tree Terminology
 If node N is the parent of node M, then M is a
child of N. The children of one node are called
siblings. They have an order, from the left, and
when we draw trees, we order the children of a
given node in this manner.
 A node with no children is called a leaf. Other
nodes - those with one or more children - are
interior nodes.
 A descendant of a node N is either N itself, a child
of N, a child of a child of N, and so on, for any
number of levels. We say node N is an ancestor of
node M if M is a descendant of N.
Example: 9-5+2
(1) list → list + digit
(2) list → list - digit
(3) list → digit
(4) digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

[Figure: parse tree for 9-5+2, with root list]
Ambiguity
• Ambiguous grammar: a grammar that can have
more than one parse tree generating a given
string of terminals.
• Since a string with more than one parse tree
usually has more than one meaning,
 we need to design unambiguous
grammars for compiling applications,
 or use ambiguous grammars with
additional rules to resolve the ambiguities.
• 9-5+2 has more than one parse tree with this
grammar.
• Two ways: (9-5)+2 and 9-(5+2).

[Figure: the two parse trees, one grouping the string as (9-5)+2 and the other as 9-(5+2)]
Associativity of Operators
• By convention, 9+5+2 is equivalent to (9+5)+2
• 9-5-2 is equivalent to (9-5)-2.
• When an operand like 5 has operators to its left and
right, conventions are needed for deciding which
operator applies to that operand.
• operator + associates to the left, because an operand
with plus signs on both sides of it belongs to the
operator to its left.
• In most programming languages: addition, subtraction,
multiplication, and division are left-associative.
Associativity of Operators
• Exponentiation is right-associative.
• The assignment operator = in C and its
descendants is right-associative;
 the expression a=b=c is treated in the
same way as the expression a=(b=c)
• right → letter = right | letter
• letter → a | b | ... | z
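Because the production right → letter = right is right-recursive, a straightforward recursive parse of a=b=c naturally groups it as a=(b=c). A minimal sketch of my own (single-letter names, as in the grammar above):

```python
# A minimal sketch of parsing the right-recursive grammar
#   right  -> letter = right | letter
#   letter -> a | b | ... | z
# The right recursion makes = group to the right: a=b=c parses as a=(b=c).
def parse_right(s, i=0):
    """Return (tree, next_index); trees are nested ('=', left, right) tuples."""
    letter = s[i]
    assert letter.isalpha() and letter.islower(), "expected a letter"
    if i + 1 < len(s) and s[i + 1] == "=":
        rest, j = parse_right(s, i + 2)      # parse the right operand recursively
        return ("=", letter, rest), j
    return letter, i + 1

tree, _ = parse_right("a=b=c")
print(tree)   # ('=', 'a', ('=', 'b', 'c'))
```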
Parse Tree for 9-5-2 and a=b=c

[Figure: parse tree for 9-5-2 (left-associative) and parse tree for a=b=c (right-associative)]
Precedence of Operators
• 9+5*2
 Two possible interpretations:
 (9 + 5) * 2 or, 9 + (5 * 2)
• The associativity rules for + and * apply to
occurrences of the same operator, so they do not
resolve this ambiguity.
• Rules defining the relative precedence of
operators are needed when more than one kind
of operator is present.
Precedence of Operators
• We say that * has higher precedence than + if
* takes its operands before + does.
• In ordinary arithmetic, multiplication &
division have higher precedence than addition
& subtraction
• 9+5*2 is equivalent to 9+(5*2)
• 9*5+2 is equivalent to (9*5)+2
Example 2.6: A grammar for arithmetic expressions can be
constructed from a table showing the associativity and precedence
of operators. Operators on the same line have the same
associativity and precedence:

left-associative: + -
left-associative: * /
 We create two nonterminals expr and term for the two levels
of precedence, and an extra nonterminal factor for generating
basic units in expressions.
 The basic units in expressions are presently digits and
parenthesized expressions.

factor → digit | ( expr )
 Binary operators * and / have the highest precedence.
 Since these operators associate to the left, the productions are
similar to those for lists that associate to the left.

term → term * factor
     | term / factor
     | factor

 Similarly, expr generates lists of terms separated by the additive
operators:

expr → expr + term
     | expr - term
     | term
Resulting Grammars
expr → expr + term | expr - term | term
term → term * factor | term / factor | factor
factor → digit | ( expr )
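Below is a minimal sketch of my own (not the book's parser) that parses and evaluates strings of this grammar. The left-recursive productions are realized as loops, which accept the same strings and preserve left associativity, while the call structure expr → term → factor gives * and / higher precedence than + and -:

```python
# A minimal sketch: parse and evaluate strings of the grammar
#   expr   -> expr + term | expr - term | term
#   term   -> term * factor | term / factor | factor
#   factor -> digit | ( expr )
# Left recursion is handled with loops; the input contains no blanks.
def evaluate(s):
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else None

    def expr():
        nonlocal pos
        value = term()
        while peek() in ("+", "-"):          # left-associative: fold as we go
            op = s[pos]; pos += 1
            value = value + term() if op == "+" else value - term()
        return value

    def term():
        nonlocal pos
        value = factor()
        while peek() in ("*", "/"):
            op = s[pos]; pos += 1
            value = value * factor() if op == "*" else value / factor()
        return value

    def factor():
        nonlocal pos
        if peek() == "(":
            pos += 1
            value = expr()
            assert peek() == ")", "missing )"
            pos += 1
            return value
        digit = s[pos]; pos += 1
        assert digit.isdigit(), "expected a digit"
        return int(digit)

    return expr()

print(evaluate("9-5+2"))    # 6  -> (9-5)+2, left associativity
print(evaluate("9+5*2"))    # 19 -> 9+(5*2), * binds tighter than +
```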
Syntax-Directed Translation
• Syntax-directed translation is done by attaching
rules or program fragments to productions in a
grammar.
• For example, consider an expression expr
generated by the production
expr → expr1 + term
• expr is the sum of the two subexpressions expr1
& term.
• (The subscript in expr1 is used only to distinguish
the instance of expr in the production body from
the head of the production.)
• We can translate expr by exploiting its
structure:
translate expr1 ;
translate term;
handle +;
Attributes
• An attribute is any quantity associated with a
programming construct.
• Examples: data types of expressions, the number of
instructions in the generated code, or the location of
the first instruction in the generated code for a
construct, among many other possibilities.
• Since we use grammar symbols (nonterminals and
terminals) to represent programming constructs, we
extend the notion of attributes from constructs to the
symbols that represent them.

(Syntax- directed) translation schemes
• A translation scheme is a notation for attaching
program fragments to the productions of a
grammar.
• The program fragments are executed when the
production is used during syntax analysis.
• The combined result of all these fragment
executions, in the order induced by the syntax
analysis, produces the translation of the program
to which this analysis/synthesis process is
applied.
Postfix Notation
• The postfix notation for an expression E:
1. If E is a variable or constant, then the
postfix notation for E is E itself.
2. If E is an expression of the form E1 op E2,
where op is any binary operator, then the
postfix notation for E is E1' E2' op, where E1' &
E2' are the postfix notations for E1 & E2.
3. If E is a parenthesized expression of the
form (E1), then the postfix notation for E is
the same as the postfix notation for E1.
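The three rules map directly onto a recursive function. In the sketch below (my own representation, not from the slides), an expression is either a string or a nested (op, E1, E2) tuple; parentheses only record grouping and leave no node behind, which is exactly rule (3):

```python
# A minimal sketch of rules (1)-(3): an expression is a string (variable
# or constant) or a tuple (op, E1, E2) for a binary operation.
def postfix(e):
    if isinstance(e, str):          # rule (1): E is a variable or constant
        return e
    op, e1, e2 = e                  # rule (2): E = E1 op E2
    return postfix(e1) + postfix(e2) + op

print(postfix(("+", ("-", "9", "5"), "2")))   # (9-5)+2  ->  95-2+
print(postfix(("-", "9", ("+", "5", "2"))))   # 9-(5+2)  ->  952+-
```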
The postfix notation for (9-5)+2 is 95-2+
The translations of 9, 5, and 2 are the
constants themselves, by rule (1).
The translation of 9-5 is 95- by rule (2)
The translation of (9-5) is the same by rule (3)
• Having translated the parenthesized
subexpression, we may apply rule (2) to the entire
expression, with (9-5) in the role of E1 and 2 in the
role of E2, to get the result 95-2+.
The postfix notation for 9-(5+2) is 952+-
• 5+2 is first translated into 52+, and this expression
becomes the second argument of the minus sign.
Postfix Notation
• No parentheses are needed in postfix notation, because the
position and arity (number of arguments) of the operators
permit only one decoding of a postfix expression.
 Trick
 repeatedly scan the postfix string from the left, until
you find an operator.
 Then, look to the left for the proper number of
operands, and group this operator with its operands.
 Evaluate the operator on the operands, and replace
them by the result.
 Then repeat the process, continuing to the right and
searching for another operator.

• 952+-3*
• Scanning from the left, we first encounter the
plus sign.
• Looking to its left we find operands 5 and 2.
• Their sum, 7, replaces 52+, and we have the
string 97-3*.
• Now the leftmost operator is the minus sign; its
operands are 9 and 7, and replacing them by the
result of the subtraction leaves the string 23*.
• Last, the multiplication sign applies to 2 and 3,
giving the result 6.
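The scan-and-replace trick is what a stack does: push operands, and when an operator appears, pop its two operands, apply it, and push the result. A minimal sketch of my own, assuming single-digit operands as in the example:

```python
# A minimal sketch of postfix evaluation with a stack (single-digit operands).
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def eval_postfix(s):
    stack = []
    for ch in s:
        if ch.isdigit():
            stack.append(int(ch))       # operand: push it
        else:
            right = stack.pop()         # operator: pop its two operands,
            left = stack.pop()          # apply it, and push the result
            stack.append(OPS[ch](left, right))
    return stack.pop()

print(eval_postfix("952+-3*"))   # 9-(5+2) = 2, then 2*3 = 6
```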
Synthesized Attributes
• The idea of associating quantities with programming
constructs (for example, values and types with expressions)
can be expressed in terms of grammars.
• We associate attributes with nonterminals and terminals.
• Then, we attach rules to the productions of the grammar;
 these rules describe how the attributes are computed
at those nodes of the parse tree where the production
in question is used to relate a node to its children.
 A syntax-directed definition associates
• 1. With each grammar symbol, a set of attributes, and
• 2. With each production, a set of semantic rules for
computing the values of the attributes associated with the
symbols appearing in the production.
Synthesized Attributes
• Attributes can be evaluated as follows.
 For a given input string x, construct a parse tree for
x.
 Then, apply the semantic rules to evaluate
attributes at each node in the parse tree:
 Suppose a node N in a parse tree is labeled by the
grammar symbol X.
 We write X.a to denote the value of attribute a of X at
that node.
• A parse tree showing the attribute values at each node
is called an annotated parse tree.
Annotated parse tree for 9-5+2 with an attribute
t associated with the nonterminals expr & term

[Figure: annotated parse tree showing the attribute values at each node]
Synthesized Vs. Inherited Attribute
• Synthesized attribute: An attribute is said to
be synthesized if its value at a parse-tree node
N is determined from attribute values at the
children of N & at N itself.
• Inherited attribute: have their value at a
parse-tree node determined from attribute
values at the node itself, its parent, and its
siblings in the parse tree.
Example 2.10 : syntax-directed definition for translating
expressions consisting of digits separated by plus or minus signs
into postfix notation.

Each nonterminal has a string-valued attribute t that represents the
postfix notation for the expression generated by that nonterminal
in a parse tree. The symbol || in the semantic rule is the operator
for concatenation.

[Figure: the syntax-directed definition for infix-to-postfix translation]
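A minimal sketch of my own (not the book's code) of this translation: each parsing function returns the attribute t, and for expr → expr1 + term the new value of expr.t is expr1.t || term.t || '+' (and likewise for -); the left recursion is again realized as a loop:

```python
# A minimal sketch of the infix-to-postfix translation for digits
# separated by + and -: each nonterminal carries a string attribute t.
def to_postfix(s):
    pos = 0

    def term():                    # term -> 0 | 1 | ... | 9
        nonlocal pos
        digit = s[pos]; pos += 1
        assert digit.isdigit(), "expected a digit"
        return digit               # term.t is the digit itself

    def expr():                    # expr -> expr + term | expr - term | term
        nonlocal pos
        t = term()                 # expr.t starts out as term.t
        while pos < len(s) and s[pos] in "+-":
            op = s[pos]; pos += 1
            t = t + term() + op    # expr.t = expr1.t || term.t || op
        return t

    return expr()

print(to_postfix("9-5+2"))   # 95-2+
```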
Parsing
• Parsing is the process of determining how a
string of terminals can be generated by a
grammar.
• A parser must be capable of constructing the
tree in principle, or else the translation cannot
be guaranteed correct.
• Yacc, a parser-generator tool, can implement the
translation scheme without modification.
Parsing
• For any CFG there is a parser that takes at most O(n³)
time to parse a string of n terminals.
• But cubic time is generally too expensive.
• For real programming languages, design a
grammar that can be parsed quickly (Linear-time
algorithms)
• Programming-language parsers almost always
make a single left-to-right scan over the input,
looking ahead one terminal at a time, and
constructing pieces of the parse tree as they go.
Parsing Methodology
• Mostly two classes:
 Top-down
 Bottom-up

• Top-down: construction starts at the root & proceeds
towards the leaves
• Bottom-up: construction starts at the leaves & proceeds
towards the root
 Top-down parsing is more popular; efficient parsers
can be constructed more easily by hand.
 Bottom-up parsing can handle a larger class of
grammars & translation schemes, so software tools
for generating parsers directly from grammars often
use bottom-up methods.
Top-down parsing
• Starting with the root (stmt), & repeatedly
performing the following two steps:
1. At node N, labeled with nonterminal A,
select one of the productions for A &
construct children at N for the symbols in the
production body.
2. Find the next node at which a subtree is to
be constructed, typically the leftmost
unexpanded nonterminal of the tree

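A minimal sketch of my own of predictive top-down parsing with one lookahead token, assuming a small statement grammar in the spirit of the example on the next slide (the grammar below is an assumption for illustration, not taken verbatim from the slides):

```python
# A minimal sketch of top-down (predictive) parsing for an assumed grammar:
#   stmt    -> expr ; | if ( expr ) stmt
#            | for ( optexpr ; optexpr ; optexpr ) stmt | other
#   optexpr -> expr | (empty)
# At each nonterminal the parser selects a production by inspecting the
# lookahead token, then matches the body left to right.
def parse(tokens):
    tokens = tokens + ["$"]          # end-of-input marker
    pos = 0

    def lookahead():
        return tokens[pos]

    def match(t):
        nonlocal pos
        assert lookahead() == t, f"expected {t!r}, found {lookahead()!r}"
        pos += 1

    def stmt():
        if lookahead() == "if":
            match("if"); match("("); match("expr"); match(")"); stmt()
        elif lookahead() == "for":
            match("for"); match("(")
            optexpr(); match(";"); optexpr(); match(";"); optexpr()
            match(")"); stmt()
        elif lookahead() == "other":
            match("other")
        else:
            match("expr"); match(";")

    def optexpr():
        if lookahead() == "expr":    # otherwise derive the empty string
            match("expr")

    stmt()
    match("$")
    print("parsed OK")

parse(["for", "(", ";", "expr", ";", "expr", ")", "other"])
```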
Example
• Lookahead symbol: the current terminal being
scanned in the input.
• Initially, the lookahead symbol is the first,
leftmost, terminal of the input string.

[Figure: top-down construction of a parse tree for the input  for ( ; expr ; expr ) other]
Top-down parsing

[Figure: top-down parsing while scanning the input from left to right]
Thank You
