Compiler Design - Chapter 4 - Syntax Directed Translation

The document discusses syntax directed translations and attribute grammars. It introduces syntax directed definitions which associate attributes with grammar symbols and semantic rules to compute attribute values. Synthesized attributes are computed from child attributes while inherited attributes are computed from parent/sibling attributes. The evaluation order of rules is determined by the attribute dependency graph. Examples show how attribute grammars can be used to evaluate expressions and distribute type information in declarations.

Chapter 4 – Syntax Directed Translation

● Introduction
● Syntax Directed Definitions
● Form of a Syntax-Directed Definition

● Synthesized Attributes

● Dependency Graphs

● Inherited Attributes

● Evaluation Order

● Construction of Syntax Trees for Expressions


● A Syntax-directed Definition for Constructing Syntax Trees
● Directed Acyclic Graphs for Expressions

● Some Classes of Non-circular Attributed Grammars

● S-Attributed grammars

● L-Attributed grammars
Introduction
● We associate information with a programming construct by
attaching attributes to the grammar symbols representing the
construct.
● Values for attributes are computed by “Semantic Rules”
associated with the grammar productions.
● There are two notations for associating semantic rules with
productions:
– syntax directed definitions, and
– translation schemes.
● Syntax-directed definitions are high-level specifications for
translations.
– They hide many implementation details and free the user from having
to specify explicitly the order in which translation takes place.
● Translation schemes indicate the order in which
semantic rules are to be evaluated
– so they allow some implementation details to be shown.
● We use both notations for specifying semantic
checking, particularly the determination of types, and
for generating intermediate code.
● Conceptually, with both syntax-directed definitions and
translation schemes:
– we parse the input token stream,
– build the parse tree, and
– then traverse the tree as needed to evaluate the semantic
rules at the parse-tree nodes.
● Evaluation of the semantic rules


– may generate code,
– save information in the symbol table,
– issue error messages, or
– perform any other activities.
● The translation of the token stream is the result
obtained by evaluating the semantic rules.
● Parsing an input without acting on the result is useless.
● In practice, an input is parsed so that code is
generated, information is displayed, etc.
● These actions are carried out by the semantic
actions associated with the rules of the
grammar.
● To execute these actions, the parse tree may or
may not be constructed first.
● Performance-wise, the best solution is to
execute the actions while the parse tree is being produced.
Syntax Directed Definitions
● A syntax directed definition is a generalization of the CFG
in which each grammar symbol has an associated set of
synthesized and inherited attributes.
● An attribute can represent anything we choose:
– a string, a number, a type, a memory location, etc.
● The value of an attribute at a parse-tree node is defined by
a semantic rule associated with the production used at that
node.
– The value of a synthesized attribute is computed from the
values of attributes at the children of that node in the parse tree.
– And the value of an inherited attribute is computed from the
values of attributes at the siblings and parent of that node in the
parse tree.
Syntax Directed Definitions
● Semantic rules calculate the values of attributes and hence set up
dependencies between attributes, which are represented by a
graph.
● The dependency graph makes it possible to find an evaluation order for
the semantic rules.
● Evaluation of the semantic rules defines the values of the
attributes at the nodes in the parse tree for the input string.
● A semantic rule may also have side effects, e.g., printing a
value or updating a global variable.
● A parse tree showing the values of the attributes is called an
annotated parse tree.
● The process of computing the attribute values at the nodes is
called annotating or decorating the parse-tree.
Form of a Syntax-Directed Definition

In a syntax directed definition, each grammar production A → α has
associated with it a set of semantic rules of the form:
b := f (c1, c2, ... , ck)
where f is a function, and either
1. b is a synthesized attribute of A and c1, c2, ... , ck are attributes belonging to the
grammar symbols of the production, or
2. b is an inherited attribute of one of the grammar symbols on the right side of
the production, and c1, c2, ... , ck are attributes belonging to the grammar symbols
of the production.
● In either case, we say that attribute b depends on attributes c1, c2, ... , ck.
● An attribute grammar is a syntax-directed definition in which the
functions in semantic rules cannot have side effects.
● Functions in semantic rules will often be written as expressions.
● Semantic rules whose only purpose is to create a side effect are written as
procedure calls or program fragments.
Form of a Syntax-Directed Definition
Example 1: The following is the syntax-directed definition for a desk
calculator program.
● This definition associates an integer valued synthesized attribute
called val with each of the nonterminals E, T, and F.
● For each E, T, and F-production, the semantic rule computes the
value of attributes val for the nonterminal on the left side from the
values of val for the nonterminals on the right side.
PRODUCTION       SEMANTIC RULES
L → E n          print(E.val)
E → E1 + T       E.val := E1.val + T.val
E → T            E.val := T.val
T → T1 * F       T.val := T1.val * F.val
T → F            T.val := F.val
F → ( E )        F.val := E.val
F → digit        F.val := digit.lexval
● The token digit has a synthesized attribute lexval whose value is
assumed to be supplied by the lexical analyzer.

● The rule associated with the production L → E n for the starting
nonterminal L is just a procedure call that prints as output the value
of the arithmetic expression generated by E; we can think of this
rule as defining a dummy attribute for nonterminal L.
● In a syntax-directed definition, terminals are assumed to have
synthesized attributes only, as the definition does not provide any
semantic rules for terminals.
● Values for attributes of terminals are usually supplied by the lexical
analyzer.
● Furthermore, the start symbol is assumed not to have any inherited
attributes, unless otherwise stated.
Synthesized Attributes

● A syntax-directed definition that uses
synthesized attributes exclusively is said to be
an S-attributed definition.
● A parse tree for an S-attributed definition can
always be annotated by evaluating the semantic
rules for the attributes at each node bottom up,
from the leaves to the root.
Synthesized Attributes

Example 2: The S-attributed definition in Example 1
specifies a desk calculator that reads an input line
containing an arithmetic expression
– involving digits, parentheses, and the operators + and *,
– followed by a newline character n, and prints the value of the
expression.
● For example, given the expression 3*5+4 followed by a
newline, the program prints the value 19.
● The following figure contains an annotated parse tree for
the input 3*5+4n.
● The output, printed at the root of the tree, is the value of
E.val at the first child of the root.
● To see how the attribute values are computed,
– consider the leftmost bottommost interior node, which
corresponds to the use of the production F → digit.
– The semantic rule corresponding to this production, F.val :=
digit.lexval, defines the attribute F.val at that node to have the value 3 because
the value of digit.lexval at the child of this node is 3.
– Similarly, at the parent of this F-node, the attribute T.val has
the value 3.
● The attribute values for the other nonterminals are
computed in a similar manner.
Inherited Attributes
● An inherited attribute is one whose value at a node in a parse-
tree is defined in terms of attributes at the parent and/or siblings
of that node.
● Inherited attributes are convenient for expressing the dependence
of a programming language construct on the context in which it
appears.
● For example, we can use an inherited attribute to keep track of
whether
– an identifier appears on the left or right side of an assignment,
– in order to decide whether the address or the value of the identifier is
needed.
● Let us consider an example in which an inherited attribute
distributes type information to the various identifiers in a
declaration.
Inherited Attributes

Example 3: A declaration generated by the nonterminal D in
the syntax-directed definition in the table below consists of
the keyword int or real, followed by a list of identifiers.
● The nonterminal T has a synthesized attribute type, whose
value is determined by the keyword in the declaration.
PRODUCTION       SEMANTIC RULES
D → T L          L.in := T.type
T → int          T.type := integer
T → real         T.type := real
L → L1 , id      L1.in := L.in
                 addtype(id.entry, L.in)
L → id           addtype(id.entry, L.in)
● The semantic rule L.in := T.type, associated with
production D → T L,
– Sets inherited attribute L.in to the type in the
declaration.
– The rules then pass this type down the parse tree
using the inherited attribute L.in.
– Rules associated with the productions for L call
procedure addtype to add the type of each identifier
to its entry in the symbol table (pointed to by
attribute entry).
● The following figure shows the annotated parse-
tree for the sentence real id1 , id2, id3.
– The value of L.in at the three L-nodes gives the type of the identifiers id1, id2 and
id3.
– The values are determined by computing the value
of the attribute T.type at the left child of the root and
then evaluating L.in top-down at the three L-nodes
in the right subtree of the root.
– At each L-node we also call the procedure addtype to
insert into the symbol table the fact that the identifier
at the right child of this node has type real.
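As a rough sketch of how the inherited attribute L.in travels down the tree, the following Python fragment passes the type as a function argument. The tuple encoding of L-nodes, the dictionary used as a symbol table, and the function names decl_D and visit_L are assumptions made for illustration only.

```python
symbol_table = {}  # hypothetical symbol table: identifier -> type

def addtype(entry, typ):
    """Record the type of an identifier in its symbol-table entry."""
    symbol_table[entry] = typ

def visit_L(node, l_in):
    """node is ("list", L1, id) for L -> L1 , id, or ("id", id) for L -> id.
    l_in plays the role of the inherited attribute L.in."""
    if node[0] == "list":
        _, l1, ident = node
        visit_L(l1, l_in)       # L1.in := L.in
        addtype(ident, l_in)    # addtype(id.entry, L.in)
    else:
        addtype(node[1], l_in)  # addtype(id.entry, L.in)

def decl_D(keyword, l_node):
    """D -> T L: compute T.type, then hand it to L as L.in."""
    t_type = "integer" if keyword == "int" else "real"  # T.type
    visit_L(l_node, t_type)                             # L.in := T.type

# Parse tree shape for: real id1 , id2 , id3
l_tree = ("list", ("list", ("id", "id1"), "id2"), "id3")
decl_D("real", l_tree)
print(symbol_table)
```

Passing the inherited attribute as a parameter and returning synthesized attributes as results is a common way to realize such definitions in a recursive traversal.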
Dependency Graphs
● If an attribute b at a node in a parse tree depends on attribute
c, then the semantic rule for b at that node must be evaluated
after the semantic rule that defines c.
● The interdependence among the inherited and synthesized
attributes at the nodes in a parse tree can be depicted by a
directed graph called a dependency graph.
● Before constructing a dependency graph for a parse tree,
– we put each semantic rule into the form b := f (c1, c2, ..., ck), by
introducing a dummy synthesized attribute b for each semantic rule
that consists of a procedure call.
● The graph has a node for each attribute and an edge to the
node for b from the node for c if attribute b depends on
attribute c.
Dependency Graphs

● The algorithm for the construction of the dependency
graph is as follows:

for each node n in the parse tree do
    for each attribute a of the grammar symbol at n do
        construct a node in the dependency graph for a;
for each node n in the parse tree do
    for each semantic rule b := f(c1, c2, . . . , ck)
            associated with the production used at n do
        for i := 1 to k do
            construct an edge from the node for ci to
            the node for b;
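The two passes of the algorithm can be sketched in Python as follows. The input format is an assumption: each parse-tree node is a dictionary listing the attribute occurrences at that node ("attrs") and the semantic rules of the production used there ("rules", as pairs of b and its list of c's). Attribute occurrences are assumed to have unique names.

```python
def build_dependency_graph(parse_nodes):
    """Return a dict mapping each attribute to the set of attributes
    that depend on it (edges go from c_i to b)."""
    graph = {}
    # Pass 1: one graph node per attribute occurrence in the parse tree.
    for n in parse_nodes:
        for a in n["attrs"]:
            graph.setdefault(a, set())
    # Pass 2: one edge c_i -> b for every rule b := f(c1, ..., ck).
    for n in parse_nodes:
        for b, cs in n["rules"]:
            for c in cs:
                graph[c].add(b)
    return graph

# Parse tree fragment for A -> X Y with the rule A.a := f(X.x, Y.y).
nodes = [
    {"attrs": ["A.a"], "rules": [("A.a", ["X.x", "Y.y"])]},
    {"attrs": ["X.x"], "rules": []},
    {"attrs": ["Y.y"], "rules": []},
]
graph = build_dependency_graph(nodes)
print(graph)
```

All graph nodes are created before any edge is added, because a rule at one parse-tree node may mention attributes that belong to its children.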
Dependency Graphs

● For example, suppose A.a := f (X.x, Y.y) is a semantic
rule for the production A → X Y.
– This rule defines a synthesized attribute A.a that depends
on the attributes X.x and Y.y.
– If this production is used in the parse tree, then there will be
three nodes A.a, X.x, and Y.y in the dependency graph with
● an edge to A.a from X.x since A.a depends on X.x, and
● an edge to A.a from Y.y since A.a also depends on Y.y.
– If the production A → X Y has the semantic rule
X.i := g(A.a, Y.y) associated with it, then there will be
● an edge to X.i from A.a and also
● an edge to X.i from Y.y, since X.i depends on both A.a and Y.y.
Dependency Graphs

Example 4: The figure below shows the dependency
graph for the parse tree in Example 3.
● Nodes in the dependency graphs are marked by numbers;
these numbers will be used below.
– There is an edge to node 5 for L.in from node 4 for T.type
because the inherited attribute L.in depends on the attribute
T.type according to the semantic rule L.in := T.type for the
production D → T L.
– The two downward edges into nodes 7 and 9 arise because L1.in
depends on L.in according to the semantic rule L1.in := L.in for
the production L → L1 , id.
– Each of the semantic rules addtype(id.entry, L.in) associated with
the L-productions leads to the creation of a dummy attribute.
– Nodes 6, 8, and 10 are constructed for these dummy attributes.
Evaluation Order
● A topological sort of a directed acyclic graph is any
ordering m1, m2, ..., mk of the nodes of the graph such that
edges go from nodes earlier in the ordering to later nodes;
– that is, if mi → mj is an edge from mi to mj, then mi appears
before mj in the ordering.
● Any topological sort of a dependency graph gives a valid
order in which the semantic rules associated with the
nodes in a parse tree can be evaluated.
– That is, in the topological sort, the attributes c1, c2, ...,
ck on which a semantic rule b := f (c1, c2, ..., ck) depends are available at a node
before f is evaluated.
– Evaluation of the semantic rules in this order yields the
translation of the input string.
Evaluation Order

Example 5: Each of the edges in the dependency
graph in Example 4 goes from a lower-numbered
node to a higher-numbered node.
● Hence, a topological sort of the dependency
graph is obtained by writing down the nodes in
the order of their numbers.
● From this topological sort, we obtain the
following program.
● We write a_n (for example a4, a5) for the attribute associated with the
node numbered n in the dependency graph.
a4 := real;
a5 := a4;
addtype(id3.entry, a5);
a7 := a5;
addtype(id2.entry, a7);
a9 := a7;
addtype(id1.entry, a9);
● Evaluating these semantic rules stores the type real
in the symbol-table entry for each identifier.
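A topological sort of this dependency graph can be checked mechanically. The sketch below uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later). The edge list is reconstructed from the text of Examples 4 and 5; since the figure is not reproduced here, the assumption that nodes 1, 2 and 3 stand for id1.entry, id2.entry and id3.entry should be treated as a guess.

```python
from graphlib import TopologicalSorter

# preds[n] is the set of nodes that node n depends on.
# Assumed numbering: 1-3 id.entry, 4 T.type, 5/7/9 L.in,
# 6/8/10 dummy attributes for the addtype calls.
preds = {
    1: set(), 2: set(), 3: set(), 4: set(),
    5: {4},          # a5 := a4
    6: {3, 5},       # addtype(id3.entry, a5)
    7: {5},          # a7 := a5
    8: {2, 7},       # addtype(id2.entry, a7)
    9: {7},          # a9 := a7
    10: {1, 9},      # addtype(id1.entry, a9)
}

order = list(TopologicalSorter(preds).static_order())
position = {node: i for i, node in enumerate(order)}

# Every node appears after all of the nodes it depends on.
valid = all(position[p] < position[n] for n, ps in preds.items() for p in ps)

# Writing the nodes down in numerical order is also a valid ordering,
# because every edge goes from a lower number to a higher one.
numeric_ok = all(p < n for n, ps in preds.items() for p in ps)
print(order, valid, numeric_ok)
```

The sorter may return a different ordering from the numerical one; any ordering that respects the edges is an acceptable evaluation order.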
Evaluation Order
● Several methods have been proposed for evaluating semantic
rules:
1. Parse tree methods: At compile time, these methods obtain the
evaluation order from a topological sort of the dependency graph
constructed from the parse tree for each input.
● These methods fail to find an evaluation order only if the dependency graph
for the particular parse tree under consideration has a cycle.
2. Rule-based methods: The order in which the attributes associated
with a production are evaluated is predetermined at compiler-construction time.
● For these methods, the dependency graph need not be constructed.
3. Oblivious methods: The evaluation order is chosen without
considering the semantic rules.
● This restricts the class of syntax-directed definitions that can be used.
● Rule-based and oblivious methods need not explicitly construct
the dependency graph at compile time.
Construction of Syntax Tree
● The syntax-directed definitions can be used to
specify the construction of syntax trees and
other graphical representations of language
constructs.
● The use of syntax trees as an intermediate
representation allows translation to be
decoupled from parsing.
● The C compiler constructs a syntax tree for
declarations.
Syntax Tree
● An (abstract) syntax tree is a condensed form of parse
tree useful for representing language constructs.
● The production S → if B then S1 else S2 might appear in
a syntax tree as:
[Figure: a node labeled if-then-else with three children B, S1, and S2]
● In a syntax tree, operators and keywords do not appear
as leaves, but rather are associated with the interior
node that would be the parent of those leaves in the
parse tree.
Syntax Tree
● Another simplification found in syntax trees is that chains of single
productions may be collapsed.
● Syntax-directed translation can be based on syntax trees as well as
parse trees.
● The approach is the same in each case; we attach attributes to the
nodes as in a parse tree.

[Figure: annotated parse tree for 3 * 5 + 4 n (left) and syntax tree for 3 * 5 + 4 n (right)]


Construction of Syntax Trees for
Expressions
● The construction of a syntax tree for an expression is similar to
the translation of the expression into postfix form.
● We construct subtrees of the subexpressions by creating a node
for each operator and operand.
– The children of an operator node are the roots of the nodes representing
the subexpressions constituting the operands of that operator.
● Each node in a syntax tree can be implemented as a record with
several fields.
– In the node for an operator, one field identifies the operator and the
remaining fields contain pointers to the nodes for the operands.
– The operator is often called the label of the node.
– When used for translation, the nodes in a syntax tree may have
additional fields to hold the values (or pointers to values) of attributes
attached to the node.
Construction of Syntax Trees for
Expressions
● We use the following functions to create the nodes of
the syntax trees for expressions with binary operators.
● Each function returns a pointer to a newly created
node.
1. mknode(op, left, right) – creates an operator node with label
op and two fields containing pointers to left and right.
2. mkleaf(id, entry) – creates an identifier node with label id
and a field containing entry, a pointer to the symbol-table
entry for the identifier.
3. mkleaf(num, val) – creates a number node with label num
and a field containing val, the value of the number.
Construction of Syntax Trees for
Expressions
Example 6: The following sequence of function calls creates the
syntax tree for the expression a – 4 + c.
● In this sequence, p1, p2, . . . , p4 are pointers to nodes, and
entrya and entryc are pointers to the symbol-table entries for
identifiers a and c, respectively.
p1 := mkleaf(id, entrya);
p2 := mkleaf(num, 4);
p3 := mknode( '-', p1, p2);
p4 := mkleaf(id, entryc);
p5 := mknode( '+', p3, p4);
● The syntax tree for the expression a – 4 + c is constructed
bottom up.
– The function calls mkleaf(id, entrya) and mkleaf(num, 4) construct the
leaves for a and 4; the pointers to these nodes are saved using p1
and p2.
– The call mknode('-', p1, p2) then constructs the interior node with the
leaves for a and 4 as children.
– After two more steps, p5 is left pointing to the root.
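The call sequence of Example 6 can be reproduced with a small record type. In this sketch the Node dataclass, its field names, and the strings "entry_a" and "entry_c" standing in for symbol-table pointers are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class Node:
    label: str                           # operator, "id", or "num"
    left: Optional["Node"] = None        # pointer to left operand
    right: Optional["Node"] = None       # pointer to right operand
    value: Union[int, str, None] = None  # entry or numeric value for leaves

def mknode(op, left, right):
    """Create an operator node with label op and two child pointers."""
    return Node(op, left, right)

def mkleaf(label, value):
    """Create a leaf labeled "id" or "num" holding an entry or a value."""
    return Node(label, value=value)

# Bottom-up construction of the syntax tree for a - 4 + c.
p1 = mkleaf("id", "entry_a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "entry_c")
p5 = mknode("+", p3, p4)   # p5 points to the root
print(p5.label, p5.left.label, p5.right.value)
```

The root is the + node because + and - associate to the left, so a - 4 is built first and becomes the left operand.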
● The resulting syntax tree for a – 4 + c is:
A Syntax-directed Definition for
Constructing Syntax Trees
● Consider the S-attributed definition for constructing
– a syntax tree for an expression containing the operators + and –.
PRODUCTION       SEMANTIC RULES
E → E1 + T       E.nptr := mknode('+', E1.nptr, T.nptr)
E → E1 - T       E.nptr := mknode('-', E1.nptr, T.nptr)
E → T            E.nptr := T.nptr
T → ( E )        T.nptr := E.nptr
T → id           T.nptr := mkleaf(id, id.entry)
T → num          T.nptr := mkleaf(num, num.val)
– The syntax-directed definition for constructing a syntax tree for an
expression
● uses productions to schedule calls to the functions mknode and mkleaf, and
● the synthesized attribute nptr for E and T keeps track of the pointers returned
by the function calls.
A Syntax-directed Definition for
Constructing Syntax Trees
Example: An annotated parse tree depicting the
construction of a syntax tree for the expression a – 4 + c is:

– The parse tree is shown dotted, and
– the nonterminals E and T use the synthesized attribute nptr
to hold a pointer to the syntax tree node for the expression
represented by the nonterminal.
● The semantic rules associated with the productions T → id and
T → num define
– attribute T.nptr to be a pointer to a new leaf for an identifier and a
number, respectively.
– Attributes id.entry and num.val are lexical values assumed to be
returned by the lexical analyzer with the tokens id and num.
● In the above figure, when an expression E is a single term,
corresponding to a use of the production E → T,
– the attribute E.nptr gets the value of T.nptr.
– When the semantic rule E.nptr := mknode('-', E1.nptr, T.nptr) associated
with the production E → E1 - T is invoked, previous rules have set
E1.nptr and T.nptr to be pointers to the leaves for a and 4, respectively.
Directed Acyclic Graphs for Expressions

● A directed acyclic graph (dag) for an expression
identifies the common subexpressions in the
expression.
● Like a syntax tree, a dag has
– a node for every subexpression of the expression;
– an interior node represents an operator and its children
represent its operands.
● The difference is that
– a node in a dag representing a common subexpression has
more than one “parent”;
– in a syntax tree, the common subexpression would be represented as a
duplicated subtree.
Directed Acyclic Graphs for Expressions

● Consider a dag for the expression
a + a * ( b – c ) + ( b – c ) * d
– The leaf for a has two parents because a is common to the
two subexpressions a and a * ( b – c ).
– Likewise, both occurrences of the common subexpression
b – c are represented by the same node, which also has two
parents.
Directed Acyclic Graphs for Expressions

● The syntax-directed definition for construction of a dag
instead of a syntax tree is obtained by modifying the
operations for constructing nodes.
– i.e. a dag is obtained if the function constructing a node first
checks to see whether an identical node already exists.
● In many applications, nodes are implemented as
records in an array.
– Each record has a label field that determines the nature of
the node.
– We refer to a node by its index or position in the array.
– The integer index of a node is often called a value number.
Directed Acyclic Graphs for Expressions

● For example, using value numbers, we can say
node 3 has label +, its left child is node 1, and its
right child is node 2.
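A minimal sketch of the value-number idea, assuming nodes are stored as tuples in a Python list and looked up through a dictionary before a new record is created. The function names node and leaf and the tuple layout are illustrative choices, not the representation used in the slides.

```python
nodes = [None]   # index 0 is unused so that value numbers start at 1
index = {}       # (label, left, right) -> value number

def node(label, left=None, right=None):
    """Return the value number of the node with this signature,
    creating a new record only if no identical node exists yet."""
    sig = (label, left, right)
    if sig not in index:
        nodes.append(sig)
        index[sig] = len(nodes) - 1
    return index[sig]

def leaf(label, value):
    """Leaves keep their entry or value in the 'left' slot."""
    return node(label, value)

# Dag for i := i + 10; both occurrences of i share one leaf.
lhs = leaf("id", "i")            # value number 1
rhs_i = leaf("id", "i")          # found again, still 1
ten = leaf("num", 10)            # value number 2
plus = node("+", rhs_i, ten)     # value number 3
assign = node(":=", lhs, plus)   # value number 4
print(lhs, rhs_i, ten, plus, assign)
```

The lookup before allocation is what turns the tree construction into a dag construction: an identical node is reused instead of duplicated.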

[Figure: nodes in a dag for i := i + 10 allocated from an array]


Directed Acyclic Graphs for Expressions

● An attribute grammar is a special form of context-free
grammar
– where some additional information (attributes) is attached to
one or more of its nonterminals in order to provide context-sensitive
information.
● Each attribute has a well-defined domain of values,
– such as integer, float, character, string, and expressions.
● An attribute grammar is a medium for providing semantics for
a context-free grammar;
– it can help to specify the syntax and semantics of a
programming language.
● An attribute grammar (when viewed as a parse tree) can
pass values or information among the nodes of a tree.
Some Classes of Non-circular Attributed
Grammars
S-Attributed grammars
● We say that an attributed grammar is S-
Attributed when all of its attributes are
synthesized; i.e. it doesn't have inherited
attributes.
● Synthesized attributes can be evaluated by a
bottom-up parser as the input is being parsed.
● A new stack (the value stack) is maintained to
store the values of the attributes, in addition to
the state stack.
Some Classes of Non-circular Attributed
Grammars
Example: E → E1 + E2 { E.v := E1.v + E2.v }
Val(newTop) := Val(oldTop – 2) + Val(oldTop)
$$ = $1 + $3 (in Yacc)
● We assume that the synthesized attributes are evaluated just
before each reduction.
● Before the reduction, the attributes of E2 are in Val(Top) and the attributes
of E1 are in Val(Top – 2); Val(Top – 1) is the slot for the + token.
● After the reduction, E is put at the top of the State stack and its
attribute values are put at the top of Val, the Value stack.
● The semantic actions that reference the attributes of the
grammar are in fact translated by the compiler generator
(such as Yacc) into code that references the value stack.
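The effect of this reduction on the value stack can be simulated directly. In the sketch below the value stack is a Python list with the top at the end, and None marks the slot of the + token; both choices, as well as the function name reduce_add, are assumptions for illustration and not how Yacc is implemented.

```python
def reduce_add(val):
    """Simulate the reduction by E -> E1 + E2 on the value stack."""
    e2 = val.pop()        # Val(oldTop):     attribute of E2
    val.pop()             # Val(oldTop - 1): slot of the + token
    e1 = val.pop()        # Val(oldTop - 2): attribute of E1
    val.append(e1 + e2)   # Val(newTop) := E1.v + E2.v
    return val

stack = [3, None, 4]      # values for E1, +, E2
reduce_add(stack)
print(stack)              # [7]
```

Three slots are popped and one is pushed, which is why the new top sits where the attribute of E1 used to be.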
Some Classes of Non-circular Attributed
Grammars
L-Attributed grammars
● It is difficult to carry out all the tasks of a compiler using
synthesized attributes alone.
● The L-attributed (L stands for left) class of grammars
allows a limited kind of inherited attributes.
Definition: A grammar is L-Attributed if and only if for
each rule X0 → X1 X2 . . . Xj . . . XN, all inherited attributes
of Xj depend only on:
1. Attributes of X1, X2, . . . , Xj-1
2. Inherited attributes of X0
● Of course all S-attributed grammars are L-attributed.
Some Classes of Non-circular Attributed
Grammars
L-Attributed grammars
Example:
A → L M { L.h := f1(A.h), M.h := f2(L.s), A.s := f3(M.s) }
● This production does not contradict the rules of
L-attributed grammars.
● Therefore, the corresponding grammar may be
L-attributed if all of the other productions follow
the rule of L-attributed grammars.
Some Classes of Non-circular Attributed
Grammars
L-Attributed grammars
Example:
● A → Q R { R.h := f4(A.h), Q.h := f5(R.s), A.s := f6(Q.s) }
● The grammar containing this production is not
L-Attributed, since Q.h depends on R.s, an attribute of a symbol to the right of Q, which
violates the rule for L-Attributed
grammars.
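The definition can be turned into a small check for a single production. This sketch assumes a simplified input format: for each right-side symbol with an inherited attribute, a list of (symbol, kind) pairs it depends on, where kind is "inh" or "syn". It also assumes that the symbols on the right side are distinct and differ from the left side. The function name is hypothetical.

```python
def production_is_l_attributed(lhs, rhs, inherited_rules):
    """Check the L-attributed condition for one production lhs -> rhs.
    inherited_rules maps a right-side symbol Xj to the (symbol, kind)
    pairs that its inherited attribute depends on."""
    for target, deps in inherited_rules.items():
        j = rhs.index(target)
        for symbol, kind in deps:
            if symbol == lhs:
                if kind != "inh":          # only inherited attributes of X0
                    return False
            elif symbol not in rhs[:j]:    # only symbols to the left of Xj
                return False
    return True

# A -> L M with L.h := f1(A.h), M.h := f2(L.s)
ok = production_is_l_attributed(
    "A", ["L", "M"], {"L": [("A", "inh")], "M": [("L", "syn")]})

# A -> Q R with R.h := f4(A.h), Q.h := f5(R.s)
bad = production_is_l_attributed(
    "A", ["Q", "R"], {"R": [("A", "inh")], "Q": [("R", "syn")]})
print(ok, bad)  # True False
```

Rules for synthesized attributes of the left side, such as A.s := f3(M.s), are not checked because the definition places no restriction on them.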
Thank You!
