SDT and SDD in Compiler Design
Syntax-Directed Translation (SDT) is a compiler design technique that integrates parsing with
the translation of programming languages by associating semantic actions with grammar
rules. SDT helps in converting source code into intermediate representations, machine code,
or other forms required by the compiler. It combines syntax and semantics by attaching rules
to the grammar that specify how attributes of grammar symbols are computed during
parsing.
Role of Syntax-Directed Definitions (SDD) in SDT:
1. Definition and Structure:
o An SDD defines the semantics of a language by associating attributes with
grammar symbols and rules for computing these attributes.
o Attributes can be synthesized (computed from child nodes) or inherited
(computed from parent or sibling nodes).
2. Foundation of SDT:
o SDDs form the theoretical foundation of SDT. They specify how semantic
actions in SDT are defined and computed during parsing.
o For example, in a grammar rule like E → E1 + T, an SDD might define how to
compute the value of E as the sum of E1 and T.
3. Attribute Evaluation:
o SDT uses SDDs to evaluate attributes in the correct order. For instance,
synthesized attributes are typically computed in a bottom-up traversal, while
inherited attributes are computed in a top-down manner.
4. Driving Translation:
o SDDs guide the generation of intermediate representations like syntax trees or three-address code by describing how to evaluate and propagate semantic information through grammar rules.
5. Example: For the rule E → E1 + T, an SDD might have:
o Synthesized attribute E.val = E1.val + T.val.
o The parser evaluates this rule using SDT during parsing to compute and
propagate values.
Therefore, SDDs define the semantic logic that SDT uses during parsing, ensuring
accurate and efficient translation of source code into its desired representation.
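As an illustration of item 5 above, here is a minimal sketch (not taken from the source) of evaluating the synthesized attribute E.val bottom-up; the Node class and the pre-built parse tree for 2 + 3 are hypothetical simplifications.

# Hypothetical parse-tree node; 'val' holds the synthesized attribute.
class Node:
    def __init__(self, symbol, children=None, val=None):
        self.symbol = symbol
        self.children = children or []
        self.val = val                      # synthesized attribute

def evaluate(node):
    """Bottom-up evaluation: children first, then the rule for the parent."""
    for child in node.children:
        evaluate(child)
    if node.symbol == "E" and len(node.children) == 3:   # E -> E1 + T
        e1, _plus, t = node.children
        node.val = e1.val + t.val                          # E.val = E1.val + T.val

# Simplified parse tree for "2 + 3": the leaves already carry lexical values.
tree = Node("E", [Node("E", val=2), Node("+"), Node("T", val=3)])
evaluate(tree)
print(tree.val)   # 5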
Evaluation Orders for Syntax-Directed Definitions (SDDs):
The evaluation order of attributes in an SDD determines how attributes are
computed during parsing. The two primary evaluation orders are:
1. Dependency Order:
• Definition: Attributes are evaluated based on their dependencies. For an attribute to
be computed, all attributes it depends on must be evaluated first.
• Process:
o Construct a dependency graph where nodes represent attributes, and edges
indicate dependencies.
o Perform a topological sort of the dependency graph to determine the order of evaluation (see the sketch after this list).
• Significance in Translation:
o Ensures correctness by respecting dependencies.
o Supports both synthesized and inherited attributes.
o Can be used in both top-down and bottom-up parsing, provided the graph
has no cycles.
2. Postorder Traversal:
• Definition: Attributes are evaluated in a postorder (bottom-up) traversal of the parse
tree.
• Process:
o Visit child nodes before evaluating a parent node.
o Compute synthesized attributes for each grammar symbol as its subtree is traversed.
• Significance in Translation:
o Naturally aligns with bottom-up parsing methods like LR parsing.
o Efficient for synthesized-only SDDs since inherited attributes are not needed.
o Simplifies attribute computation when dependencies flow strictly from
children to parent.
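A small sketch of the dependency-order approach referenced above, assuming the attribute names of the E → T E' grammar used later in these notes; the deps dictionary is a hypothetical dependency graph, and graphlib is Python's standard topological-sort module (Python 3.9+).

from graphlib import TopologicalSorter

# Hypothetical dependency graph for one annotated parse tree:
# each attribute maps to the set of attributes it depends on.
deps = {
    "T.val":  set(),          # provided by the lexer
    "E'.inh": {"T.val"},      # inherited: passed down from the parent
    "E'.syn": {"E'.inh"},     # synthesized: computed after E'.inh
    "E.val":  {"E'.syn"},     # root attribute
}

# A topological order is a valid evaluation order (it fails if there is a cycle).
order = list(TopologicalSorter(deps).static_order())
print(order)   # e.g. ['T.val', "E'.inh", "E'.syn", 'E.val']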
Significance of Evaluation Orders in Translation:
1. Ensuring Correctness:
o Proper evaluation order prevents undefined attributes by ensuring
dependencies are resolved before computation.
2. Handling Different Attribute Types:
o Dependency Order is necessary when inherited attributes are present, as
they depend on information from parent or sibling nodes.
o Postorder Traversal is efficient for synthesized attributes, which depend only
on child nodes.
3. Implementation Flexibility:
o Dependency-based evaluation supports flexibility in traversal patterns.
o Postorder traversal is simpler and aligns with many parsing algorithms.
4. Parser Compatibility:
o Top-down Parsers: Benefit from dependency order for resolving inherited
attributes.
o Bottom-up Parsers: Naturally perform postorder evaluation, making them
suitable for synthesized-only SDDs.
5. Efficient Attribute Computation:
o Ensures attributes are computed in minimal steps, avoiding recomputation or
circular dependencies.
Syntax-Directed Translation Scheme (SDT): An SDT embeds semantic actions within grammar productions, combining syntax analysis with semantic evaluation and guiding the translation process step by step.
Structure of an SDT:
1. Grammar Rule: Each production in the grammar has associated semantic actions.
2. Semantic Actions: Represented as code snippets (e.g., in C, Java) inserted within
grammar rules. These can manipulate attributes, generate intermediate code, or
perform calculations.
3. Execution Timing: Semantic actions are executed based on the position in the
grammar rule:
o Before parsing the rule (pre-action).
o During the parsing of a symbol (in-action).
o After parsing the rule (post-action).
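A rough sketch of such a scheme in action, assuming the classic infix-to-postfix translation scheme implemented by recursive descent; the semantic actions are the print statements embedded at specific positions in each production, and the function names and token format are hypothetical.

# SDT sketch: translate "9+5+2" to postfix by executing { print } actions
# at fixed positions inside the productions.

def parse_expr(tokens, pos=0):
    # expr -> term rest
    pos = parse_term(tokens, pos)
    return parse_rest(tokens, pos)

def parse_rest(tokens, pos):
    # rest -> '+' term { print('+') } rest  |  epsilon
    if pos < len(tokens) and tokens[pos] == '+':
        pos = parse_term(tokens, pos + 1)
        print('+', end=' ')          # post-action: runs after the term is parsed
        return parse_rest(tokens, pos)
    return pos

def parse_term(tokens, pos):
    # term -> digit { print(digit) }
    print(tokens[pos], end=' ')      # action attached to the terminal
    return pos + 1

parse_expr(list("9+5+2"))            # prints: 9 5 + 2 +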
Syntax-Directed Definition (SDD):
An SDD is a formal framework that defines the attributes of grammar symbols and
the rules for computing these attributes. It focuses on associating semantic rules
with grammar productions, describing attribute evaluation in a structured and
declarative manner.
Differences Between SDT and SDD:
Aspect | Syntax-Directed Translation Scheme (SDT) | Syntax-Directed Definition (SDD)
Focus | Practical embedding of semantic actions in grammar rules. | Abstract association of attributes and rules.
Representation | Procedural; semantic actions are written inline with grammar rules. | Declarative; attributes and rules are defined separately from the grammar.
Execution | Semantic actions are executed explicitly during parsing. | Attribute evaluation order is determined implicitly.
Example Representation | E → E1 + T { E.val = E1.val + T.val; } | E.val = E1.val + T.val
Implementation | Directly executable within a parser. | Used as a specification; needs additional mechanisms for execution.
L-attributed Syntax-Directed Definitions (SDDs):
1. Definition:
o An SDD is L-attributed if every attribute is either synthesized, or an inherited attribute that depends only on:
▪ Attributes of symbols to its left in the same production.
▪ Inherited attributes of the parent node.
▪ Other previously computed attributes in the same node.
2. Traversal Order:
o Attributes can be evaluated in a depth-first, left-to-right traversal of the parse
tree.
3. Rules:
o An inherited attribute of a symbol in a production can depend only on:
▪ Attributes of symbols to its left in the same production.
▪ Attributes of the parent node.
o Synthesized attributes are computed from inherited attributes and attributes
of child nodes.
Example of an L-attributed SDD:
Consider a grammar for expressions:
E → T E'
E' → + T E' | ε
T → num
Attributes:
• E.val is the value of the expression.
• T.val is the value of the terminal num.
• E'.inh is an inherited attribute used to pass intermediate values during parsing.
• E'.syn is a synthesized attribute used to compute final results.
Semantic Rules:
E → T E' { E'.inh = T.val; E.val = E'.syn; }
E' → + T E1' { E1'.inh = E'.inh + T.val; E'.syn = E1'.syn; }
E' → ε { E'.syn = E'.inh; }
T → num { T.val = num.lexval; }
In this example, E'.inh passes the cumulative sum during parsing, and E'.syn holds the
result.
Efficient Implementation of L-attributed SDDs:
1. Top-Down Parsing:
o Use a recursive descent parser or LL parser to traverse the grammar in a left-
to-right manner.
o Pass inherited attributes through function arguments.
o Compute synthesized attributes from the return values of recursive calls (a sketch follows this list).
2. Single Pass:
o Traverse the parse tree once, evaluating inherited attributes as nodes are
visited and synthesized attributes as subtrees are completed.
o Avoid recomputation by storing intermediate results in node attributes.
3. Attribute Storage:
o Store attributes directly in the parse tree nodes or use a separate data
structure indexed by grammar symbols.
4. Non-Recursive Implementation:
o Use a stack to simulate the left-to-right traversal.
o Push inherited attributes onto the stack before processing a node and pop
them after computation.
5. Applicability in Compilers:
o L-attributed SDDs are suitable for generating symbol tables, type checking,
and code generation in a single pass.
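The recursive-descent strategy from item 1 can be sketched for the E → T E' grammar above: inherited attributes are passed as function arguments and synthesized attributes are returned. The token list of integers and '+' symbols is a simplifying assumption, not part of the source.

def parse_E(tokens, pos):
    t_val, pos = parse_T(tokens, pos)
    e_val, pos = parse_Eprime(tokens, pos, inh=t_val)       # E'.inh = T.val
    return e_val, pos                                        # E.val = E'.syn

def parse_Eprime(tokens, pos, inh):
    if pos < len(tokens) and tokens[pos] == '+':
        t_val, pos = parse_T(tokens, pos + 1)
        return parse_Eprime(tokens, pos, inh=inh + t_val)    # E1'.inh = E'.inh + T.val
    return inh, pos                                          # E' -> ε: E'.syn = E'.inh

def parse_T(tokens, pos):
    return tokens[pos], pos + 1                              # T.val = num.lexval

val, _ = parse_E([3, '+', 4, '+', 5], 0)
print(val)   # 12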
Role of Syntax Trees in Intermediate Code Generation
Syntax trees are hierarchical representations of the source code that mirror its
grammatical structure. They are used to generate intermediate code systematically.
1. Tree Structure:
o Internal nodes represent operators.
o Leaf nodes represent operands (variables, constants, etc.).
2. Traversal for Code Generation:
o A postorder (bottom-up) traversal of the syntax tree computes values for
subexpressions and generates intermediate instructions step by step.
Example: Expression Translation
Input Expression:
a = b + c * d
Syntax Tree Representation:
      =
     / \
    a   +
       / \
      b   *
         / \
        c   d
Step-by-Step Code Generation (Three-Address Code):
1. Visit the subtree for c * d: t1 = c * d
2. Visit the subtree for b + t1: t2 = b + t1
3. Assign t2 to a: a = t2
Generated Intermediate Code:
t1 = c * d
t2 = b + t1
a = t2
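A small sketch of this postorder code-generation scheme, assuming a hypothetical Node class for the syntax tree; it reproduces the three instructions listed above.

class Node:
    def __init__(self, op, left=None, right=None):
        self.op, self.left, self.right = op, left, right

temp_count = 0
code = []

def gen(node):
    """Postorder traversal: emit code for the children, then for the node itself."""
    global temp_count
    if node.left is None and node.right is None:       # leaf: variable or constant
        return node.op
    l, r = gen(node.left), gen(node.right)
    if node.op == '=':                                  # assignment: no new temporary
        code.append(f"{l} = {r}")
        return l
    temp_count += 1
    t = f"t{temp_count}"
    code.append(f"{t} = {l} {node.op} {r}")
    return t

tree = Node('=', Node('a'), Node('+', Node('b'), Node('*', Node('c'), Node('d'))))
gen(tree)
print("\n".join(code))
# t1 = c * d
# t2 = b + t1
# a = t2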
Significance of Syntax Trees in This Process:
1. Systematic Traversal:
o Syntax trees ensure that subexpressions are processed in the correct order,
respecting operator precedence and associativity.
2. Reusability:
o Common subexpressions can be identified and computed once, avoiding
redundant computations.
3. Optimization Opportunities:
o Nodes in the syntax tree can be replaced or combined to reflect optimizations
like constant folding (2 + 3 → 5) or common subexpression elimination.
4. Abstraction Layer:
o Provides a structured, language-agnostic framework for generating
intermediate representations.
By leveraging syntax trees, compilers achieve a structured and efficient method for
translating source programs into intermediate code, simplifying both optimization
and machine code generation.
Features of TAC:
1. Simplicity:
o Breaks complex expressions into smaller, manageable instructions.
2. Platform Independence:
o Serves as an abstract representation, independent of the target architecture.
3. Flexibility:
o Easy to perform optimizations like constant folding, dead code elimination,
and strength reduction.
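As a small illustration of one such optimization, here is a sketch of constant folding applied to TAC, assuming a hypothetical tuple representation (dest, arg1, op, arg2) for each instruction.

import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def fold_constants(instrs):
    folded = []
    for dest, a1, op, a2 in instrs:
        if isinstance(a1, int) and isinstance(a2, int):
            folded.append((dest, OPS[op](a1, a2), None, None))  # replace with a constant
        else:
            folded.append((dest, a1, op, a2))
    return folded

print(fold_constants([('t1', 2, '+', 3), ('t2', 'b', '*', 't1')]))
# [('t1', 5, None, None), ('t2', 'b', '*', 't1')]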
Example of TAC
Input Expression:
a = b + c * d - e
Step-by-Step TAC Generation:
1. Evaluate c * d and store the result in a temporary variable: t1 = c * d
2. Add b to t1: t2 = b + t1
3. Subtract e from t2: t3 = t2 - e
4. Assign the result to a: a = t3
Final TAC:
t1 = c * d
t2 = b + t1
t3 = t2 - e
a = t3
Syntax Tree Variants for Intermediate Code Generation:
1. Syntax Tree:
o Definition: A tree representation of the source expression in which internal nodes are operators and leaf nodes are operands.
o Example:
▪ For a = (b + c) * d, the syntax tree is:
      =
     / \
    a   *
       / \
      +   d
     / \
    b   c
2. Directed Acyclic Graph (DAG):
o Definition: A compact representation of the syntax tree that eliminates
duplicate nodes for common subexpressions.
o Features:
▪ Nodes are shared for repeated subexpressions.
▪ Reduces the size of the representation and avoids redundant
computation.
o Use Case in Intermediate Code Generation:
▪ Useful for optimizing code by identifying and reusing computed
results of common subexpressions.
o Example:
▪ For a = b + c * d and e = c * d + b, the DAG eliminates redundancy (see the sketch at the end of this section):
      +
     / \
    b   *
       / \
      c   d
▪ The computation c * d is shared across both expressions.
3. Annotated Syntax Tree:
o Definition: A syntax tree where nodes are enriched with additional
information (e.g., attributes like types, intermediate results, or symbol table
references).
o Features:
▪ Integrates semantic information directly into the tree structure.
o Use Case in Intermediate Code Generation:
▪ Facilitates type checking, semantic analysis, and efficient translation
into intermediate code.
o Example:
▪ For a = b + c, each node may store:
▪ The type of the operation (+ is integer addition).
▪ Intermediate results for attributes like val.
4. Syntax-Directed Translation Tree (SDT Tree):
o Definition: A syntax tree where semantic actions are directly associated with
nodes or productions.
o Features:
▪ Combines semantic evaluation with syntax tree traversal.
o Use Case in Intermediate Code Generation:
▪ Guides the generation of intermediate code by embedding rules into
the tree traversal process.
o Example:
▪ For a = b + c, an SDT tree might include:
▪ Semantic rules for generating TAC during the traversal.
Each of these representations addresses unique challenges, ensuring robust and optimized translation from source code to intermediate representation.
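A rough sketch of how a DAG can be built so that a repeated subexpression such as c * d is shared, using a hypothetical hash-consing table keyed by (operator, left, right).

nodes = {}          # (op, left_id, right_id) -> node id
def make_node(op, left=None, right=None):
    key = (op, left, right)
    if key not in nodes:                 # create each distinct node only once
        nodes[key] = len(nodes)
    return nodes[key]

def leaf(name):
    return make_node(name)

# a = b + c * d   and   e = c * d + b
t1 = make_node('*', leaf('c'), leaf('d'))      # c * d for the first expression
p1 = make_node('+', leaf('b'), t1)
t2 = make_node('*', leaf('c'), leaf('d'))      # c * d again, for the second expression
p2 = make_node('+', t2, leaf('b'))
print(t1 == t2)     # True: the c * d node is created once and shared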
Advantages of Stack Allocation:
1. Efficient Memory Use:
o No Fragmentation: Stack allocation does not suffer from fragmentation because the stack grows and shrinks in a predictable, last-in, first-out manner: memory is always allocated and freed at the top of the stack.
2. Space Reusability:
o Once a function call ends, its stack frame is discarded, making the memory it
occupied reusable for subsequent function calls or local variables.
3. Fast Allocation and Deallocation:
o Constant-time operations: Both pushing (allocating memory) and popping
(deallocating memory) from the stack are O(1) operations, making them
extremely fast.
4. Security and Scope:
o Limited Lifetime: The space allocated for function calls is automatically
cleaned up when the function ends, ensuring that variables are only
accessible within their valid scope (function context).
Example:
Consider the following program snippet:
void foo() {
    int x = 5;
    int y = 10;
    // function body
}

int main() {
    foo();
    // function body
}
Stack Allocation:
1. On calling main():
o A stack frame for main is created.
o Space is allocated for local variables (if any) in main.
2. On calling foo() from main():
o A new stack frame is created for foo().
o Local variables x and y are allocated within the stack frame of foo().
3. On returning from foo():
o The stack frame for foo() is popped off the stack.
o The memory for x and y is reclaimed, and the stack pointer is adjusted.
4. On returning from main():
o The stack frame for main() is popped, and the program exits.
Stack allocation is a key technique for managing memory during program execution,
especially for handling function calls and local variables. Its simplicity and efficiency make it
ideal for managing space in most programs, providing quick memory allocation and
deallocation, avoiding fragmentation, and ensuring that memory is used only during its valid
scope. The stack’s LIFO nature provides a clear and effective way to manage space without
the need for complex memory management schemes.
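A toy simulation (not an actual compiler runtime) of the stack frames pushed and popped for the main()/foo() example above, with each frame modeled as a plain dictionary.

stack = []                                   # the run-time stack, top at the end

def call(name, locals_):
    stack.append({"function": name, "locals": dict(locals_)})   # push a frame
    print("after calling", name, "->", [f["function"] for f in stack])

def ret():
    frame = stack.pop()                      # pop the frame; its locals are reclaimed
    print("after returning from", frame["function"], "->",
          [f["function"] for f in stack])

call("main", {})
call("foo", {"x": 5, "y": 10})
ret()    # foo's frame (x, y) is discarded in O(1)
ret()    # main's frame is discarded; the stack is empty again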
Access to Nonlocal Data on the Stack:
3. Access Mechanism:
o Stack Traversal: When a function accesses a nonlocal variable, the run-time system walks up the stack (typically by following access links, also called static links) to find the variable in the appropriate enclosing stack frame.
o Frame Pointer (FP): Modern systems use the frame pointer to point to the
base of the current stack frame. By walking through the stack using the frame
pointer, the compiler can locate nonlocal data stored in previous stack
frames.
4. Accessing Nonlocal Data for Nested Functions:
o In languages that support nested functions or closures (e.g., Python,
JavaScript), a nested function may need to access variables from its enclosing
function (nonlocal variables). This is typically handled by capturing the stack
pointer of the enclosing function, allowing the nested function to access
those variables even after the enclosing function has returned.
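A toy sketch of the lookup mechanism described in items 3 and 4, modeling each stack frame as an object with an access link to the frame of the enclosing function; the class and field names are hypothetical.

class Frame:
    def __init__(self, locals_, access_link=None):
        self.locals = locals_
        self.access_link = access_link       # frame of the enclosing function

def lookup(frame, name):
    """Walk outward through enclosing frames until the name is found."""
    while frame is not None:
        if name in frame.locals:
            return frame.locals[name]
        frame = frame.access_link
    raise NameError(name)

outer_frame = Frame({"x": 10})
inner_frame = Frame({}, access_link=outer_frame)   # inner() nested inside outer()
print(lookup(inner_frame, "x"))   # 10: found in the enclosing frame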
Significance in the Run-time Environment:
1. Function Call Hierarchies:
o Nonlocal data access is vital in nested function calls, where functions at a
deeper level need to access variables in outer levels (parent or global
functions).
2. Closures:
o In languages supporting closures (e.g., Python, JavaScript), nonlocal data
access is used when functions "capture" variables from their enclosing scope.
These captured variables remain accessible even after the enclosing function
has returned.
3. Efficient Stack Use:
o Using the stack for nonlocal data access ensures that the run-time
environment remains efficient in terms of memory usage, leveraging the call
stack to track data across different scopes without needing to allocate
additional memory locations.
4. Scope Resolution:
o The stack-based memory model allows lexical scoping, where variables are
accessed based on the structure of the program (e.g., function nesting),
rather than just runtime conditions. This provides the benefit of predictable
and efficient variable access during execution.
Example with Closures (in Python-like Pseudocode):
def outer():
    x = 10          # Nonlocal variable
    def inner():
        print(x)    # Accesses 'x' from the enclosing function 'outer'
    inner()

outer()
• How nonlocal access works:
o The function inner can access x, even though x is not in inner's local scope,
because the runtime environment will use the stack to find x in the stack
frame of outer().
o This is an example of how nonlocal variables are "captured" for use inside
nested functions.
Access to nonlocal data on the stack enables functions to interact with variables that are not
directly in their local scope but exist in higher-level stack frames. This is essential for
handling nested function calls and closures. The runtime environment manages this by using
mechanisms like the frame pointer and stack traversal to locate nonlocal variables, ensuring
efficient memory use and correct variable scope resolution during program execution.
o Size Limitations: The heap is much larger than the stack, but improper
memory management can lead to issues such as memory leaks or heap
fragmentation.
Key Differences Between Heap and Stack Memory:
Aspect | Stack | Heap
Access Method | Direct access to variables | Through pointers or references
ACD