Principles of Compiler Design
(SENG471)
Chapter One
Introduction
Preliminaries Required
• Basic knowledge of programming languages.
• Basic knowledge of automata and context-free grammars.
Textbook:
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman,
“Compilers: Principles, Techniques, and Tools”
Addison-Wesley, 2007.
Objective
At the end of this session students will be able to:
Understand the basic concepts and principles of Compiler Design
Understand the term compiler, its functions and how it works.
Be familiar with the different classifications of compilers.
Be familiar with cousins of the compiler: Linkers, Loaders, Interpreters, Assemblers
Understand the need for studying compiler design and construction
Understand the phases of compilation and the steps of compilation
Understand the history of compilers. (Assignment For G1 Students)
Be familiar with the different compiler tools (Assignment For G2 Students)
Introduction
Definition
A compiler is an executable program that can read a program in one high-level language and
translate it into an equivalent executable program in machine language.
More generally, a compiler is a computer program that translates a program written in a source
language into an equivalent program in a target language.
A source program/code is a program/code written in the source language, which is usually a
high-level language.
A target program/code is a program/code written in the target language, which often is a
machine language or an intermediate code (Object Code).
[Diagram: source program → Compiler → target program, with error messages reported along the way.]
Contd.
As a discipline, compiler design involves multiple computer science and
engineering courses, such as:
Programming Languages
Data structures and Algorithms
Theory of Computation (Automata and formal language theory)
Assembly language
Software Engineering
Computer Architecture
Operating Systems and
Discrete Mathematics
Why Study Theory of Compiler?
Curiosity
Prerequisite for developing advanced compilers, an area that remains active as
new computer architectures emerge
To improve capabilities of existing compiler/interpreter
To write more efficient code in a high-level language
Useful to develop software tools that parse computer codes or strings
E.g., editors, debuggers, interpreters, preprocessors, …
Important to understand how compilers work in order to program more effectively
To provide a solid foundation in parsing theory for parser writing
To make compiler design an excellent “capstone” project
To apply almost all of the major computer science fields, such as: automata theory,
computer programming, programming language design theory, assembly language,
computer architecture, data structures, algorithms, and software engineering.
Classification of Compilers
Compilers can be viewed from many perspectives:
By construction: Single Pass, Multiple Pass, Load & Go
By function: Debugging, Optimizing
o Classifying compilers by number of passes has its background in the
hardware resource limitations of computers.
o Compiling involves performing lots of work, and early computers did not have
enough memory to contain one program that did all of this work.
So compilers were split up into smaller programs which each
made a pass over the source (or some representation of it),
performing some of the required analysis and translations.
However, all utilize the same basic tasks to accomplish their actions.
Contd.
1. Single (One) Pass Compilers:- a compiler that passes through the source code of
each compilation unit only once.
Also called narrow compilers.
The ability to compile in a single pass has classically been seen as a benefit because
it simplifies the job of writing a compiler.
Single-pass compilers generally perform compilations faster than multi-pass
compilers.
Due to the resource limitations of early systems, many early languages were
specifically designed so that they could be compiled in a single pass (e.g., Pascal).
Disadvantage of single pass Compilers:
It is not possible to perform many of the sophisticated optimizations needed to
generate high quality code.
It can be difficult to count exactly how many passes an optimizing compiler
makes.
Contd.
2. Multi-Pass Compilers:- a compiler that processes the source code
or abstract syntax tree of a program several times.
Also called wide compilers.
Phases are separate "programs", which run sequentially.
By splitting the compiler up into small programs, correct programs are
more likely to be produced:
Proving the correctness of a set of small programs often requires less
effort than proving the correctness of a larger, single, equivalent
program.
Many programming languages cannot be compiled with a single-pass
compiler; for example, most modern languages, such as Java, require a
multi-pass compiler.
Contd.
3. Load and Go Compilers:- a compiler that generates machine code and then
immediately executes it.
Compilers usually produce either absolute code that is executed
immediately upon conclusion of the compilation or object code that is
transformed by a linking loader into absolute code.
These compiler organizations will be called Load & Go and
Link/Load.
Both Load & Go and Link/Load compilers use a number of passes to
translate the source program into absolute code.
A pass reads some form of the source program and transforms it into
another intermediate representation.
Contd.
4. Optimizing Compilers:- a compiler that tries to minimize or maximize
some attributes of an executable computer program.
The most common requirement is to minimize the time taken to execute a
program; a less common one is to minimize the amount of memory occupied.
The growth of portable computers has created a market for minimizing the
power consumed by a program.
Compiler optimization is generally implemented using a sequence of
optimizing transformations, algorithms which take a program and
transform it to produce a semantically equivalent output program that
uses fewer resources.
Types of optimization include peephole optimization, local optimization,
global optimization, loop optimization, machine-code optimization, etc.
Cousins of Compilers
A. Assembler:- is a translator that converts programs written in assembly language
into machine code.
It translates mnemonic operation codes to their machine-language equivalents and
assigns machine addresses to symbolic labels.
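As a toy sketch of these two tasks, consider the classic two-pass structure: one pass assigns addresses to labels, the next translates mnemonics. The mnemonics and opcode numbers below are invented for illustration and do not belong to any real machine.

```python
# Toy two-pass assembler sketch. Mnemonics and opcodes are made up.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "JMP": 0x04, "HALT": 0xFF}

def assemble(lines):
    # Pass 1: assign machine addresses to symbolic labels.
    labels, addr = {}, 0
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = addr      # label names the next instruction
        else:
            addr += 1
    # Pass 2: translate mnemonics to opcodes and label operands to addresses.
    code = []
    for line in lines:
        if line.endswith(":"):
            continue
        parts = line.split()
        operand = None
        if len(parts) > 1:
            operand = labels.get(parts[1], parts[1])
        code.append((OPCODES[parts[0]], operand))
    return code

print(assemble(["LOAD x", "loop:", "ADD y", "JMP loop"]))
# [(1, 'x'), (2, 'y'), (4, 1)]
```

Note how `loop` resolves to address 1, the address of the instruction following the label.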
B. Interpreter:- is a computer program that translates high level
instructions/programs into machine code as they are encountered.
It produces the output of statements as they are interpreted
It generally uses one of the following strategies for program execution:
i. execute the source code directly
ii. translate source code into some efficient intermediate representation and
immediately execute this
iii. explicitly execute stored precompiled code made by a compiler which is part of the
interpreter system
Contd.
[Diagram: a language-processing system]
source program → preprocessor → modified source program → compiler →
target assembly program → assembler → relocatable machine code →
linker/loader (together with library files) → target machine code
C. Linker:- is a program that takes one or more objects generated by a
compiler and combines them into a single executable program.
D. Loader:- is the part of an operating system that is responsible for loading
programs from executables (i.e., executable files) into memory, preparing
them for execution and then executing them.
Compiler vs. Interpreter
Ideal concept:
Compiler: Source code → Compiler → Executable; then Executable + Input data → Output data
Interpreter: Source code + Input data → Interpreter → Output data
Most languages are usually thought of as using either one or the other:
Compilers: FORTRAN, COBOL, C, C++, Pascal, PL/1
Interpreters: Lisp, scheme, BASIC, APL, Perl, Python, Smalltalk
BUT: not always implemented this way
Basic Compiler Design
Write a huge program that takes as input another program in the source
language for the compiler, and gives as output an executable that we can
run.
To allow the code to be modified easily, we usually use a modular design
(decomposition) methodology to design a compiler.
Two design strategies:
1. Write a “front end” of the compiler (i.e. the lexer, parser, semantic
analyzer, and assembly tree generator), and write a separate back end for
each platform that you want to support
2. Write an efficient, highly optimized back end, and write a different front
end for each of several languages, such as Fortran, C, C++, and Java.
Source code → Front End → Intermediate code → Back End → Target code
The Analysis-Synthesis Model of Compilation
There are two parts to compilation: analysis & synthesis.
During analysis, the operations implied by the source program are
determined and recorded in a hierarchical structure called a tree.
During synthesis, those recorded operations are used to produce the translated code.
Front End (Analysis):
1. Lexical Analysis: breaks up the source program into constituent pieces
2. Syntax Analysis
3. Semantic Analysis: creates an intermediate representation of the source program
Back End (Synthesis):
4. Code Generation: constructs the target program from the intermediate representation
5. Optimization: takes the tree structure and translates the operations into the target program
Analysis
In compiling, analysis has three phases:
1. Linear analysis: stream of characters read from left-to-right and grouped into
tokens; known as lexical analysis or scanning
Converting input text into stream of known objects called tokens.
It simplifies parsing process
2. Hierarchical analysis: tokens grouped hierarchically with collective meaning;
known as parsing or syntax analysis
Translating code to rules of grammar.
Building representation of code.
3. Semantic analysis: check if the program components fit together meaningfully
Checks source program for semantic errors
Gathers type information for subsequent code generation (type checking)
Identifies operators and operands of expressions and statements
Phases of Compilation
General Structure of a Compiler:
Stream of characters → scanner → Stream of tokens → parser →
Parse/syntax tree → semantic analyzer → Annotated tree →
intermediate code generator → Intermediate code → code optimizer →
Intermediate code → code generator → Target code → code optimizer →
Target code
Phase I: Lexical Analysis
The low-level text processing portion of the compiler
The source file, a stream of characters, is broken into larger chunks called
token.
For example:
void main()
{
int x;
x=3;
}
It will be broken into 13 tokens, as below:
void main ( ) { int x ; x = 3 ; }
The lexical analyzer (scanner) reads a stream of characters and puts them
together into some meaningful (with respect to the source language) units
called tokens.
Typically, spaces, tabs, end-of-line characters and comments are ignored by
the lexical analyzer.
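The scanning step just described can be sketched in a few lines. The token categories and patterns below are illustrative assumptions for a tiny C-like subset, not a complete lexer:

```python
import re

# Token patterns for a tiny C-like subset (illustrative only).
# Order matters: keywords are tried before general identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:void|int)\b"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("NUM",     r"\d+"),
    ("SYMBOL",  r"[(){};=]"),
    ("SKIP",    r"\s+"),      # spaces, tabs, newlines are ignored
]

def tokenize(source):
    tokens, pos = [], 0
    while pos < len(source):
        for name, pattern in TOKEN_SPEC:
            m = re.match(pattern, source[pos:])
            if m:
                if name != "SKIP":
                    tokens.append(m.group())
                pos += m.end()
                break
        else:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
    return tokens

print(tokenize("void main() { int x; x=3; }"))
# ['void', 'main', '(', ')', '{', 'int', 'x', ';', 'x', '=', '3', ';', '}']
```

Running it on the example program yields exactly the 13 tokens listed above.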
Phase II: Parsing (Syntax Analysis)
A parser gets a stream of tokens from the scanner, and determines if the syntax
(structure) of the program is correct according to the (context-free) grammar of
the source language.
Then, it produces a data structure, called a parse tree or an abstract syntax tree,
which describes the syntactic structure of the program.
The parser ensures that the sequence of tokens returned by the lexical
analyzer forms a syntactically correct program
It also builds a structured representation of the program called an abstract
syntax tree that is easier for the type checker to analyze than a stream of
tokens
It catches the syntax errors as the statement below:
if if (x > 3) then x = x + 1
Context-free grammars will be used (as the input) by the parser generator to
describe the syntax of the compiling language.
Parse Tree
The output of parsing, giving a top-down description of program syntax
Root node is entire program and leaves are tokens that were identified during lexical
analysis
Constructed by repeated application of rules in Context Free Grammar (CFG)
Syntax structures are analyzed by DPDA (Deterministic Push Down Automata)
Example: parse tree for position:=initial + rate*60
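A hand-written recursive-descent parser for assignments like the one above can be sketched as follows. This is an illustrative sketch, not a generated parser: the tuple tree shape is an assumption, and the grammar (shown in the comment) covers only `:=`, `+`, and `*`, with `*` binding tighter than `+`:

```python
import re

# Minimal recursive-descent parser for the grammar:
#   assign -> ID ':=' expr
#   expr   -> term ('+' term)*
#   term   -> factor ('*' factor)*
#   factor -> ID | NUM
def parse_assign(text):
    tokens = re.findall(r":=|[+*]|\w+", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected=None):
        nonlocal pos
        tok = tokens[pos]
        if expected and tok != expected:
            raise SyntaxError(f"expected {expected}, got {tok}")
        pos += 1
        return tok

    def factor():
        return eat()                      # leaf: identifier or number

    def term():
        node = factor()
        while peek() == "*":
            eat("*")
            node = ("*", node, factor())  # left-associative chain of *
        return node

    def expr():
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node

    target = eat()
    eat(":=")
    return (":=", target, expr())

print(parse_assign("position := initial + rate * 60"))
# (':=', 'position', ('+', 'initial', ('*', 'rate', '60')))
```

The tree mirrors the parse tree in the slide: `*` appears deeper than `+` because a `term` is parsed before `+` is considered.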
Phase III: Semantic Analysis
It gets the parse tree from the parser together with information about some
syntactic elements
It determines whether the semantics (meaning) of the program is correct.
It detects semantic errors in the program, such as using variables before they are
declared, or assigning an integer value to a Boolean variable, …
This part deals with static semantics:
semantics of programs that can be checked by reading off from the
program text only, i.e.,
the parts of the language syntax which cannot be described by a
context-free grammar.
Mostly, a semantic analyzer does type checking (i.e. Gathers type information
for subsequent code generation.)
It modifies the parse tree in order to get (statically) semantically correct code.
Contd.
The main tool used by the semantic analyzer is a symbol table
Symbol table:- is a data structure with a record for each identifier and its
attributes
Attributes include storage allocation, type, scope, etc
All the compiler phases insert and modify the symbol table
Discovery of meaning in a program using the symbol table
Do static semantics check
Simplify the structure of the parse tree ( from parse tree to abstract syntax tree
(AST) )
Static semantics check
Making sure identifiers are declared before use
Type checking for assignments and operators
Checking the types and number of parameters to subroutines
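A minimal sketch of such a symbol table, assuming one table per scope chained to its enclosing scope; the class and attribute names are invented for illustration, and a real table records far more attributes (storage allocation, scope level, parameter lists, …):

```python
# Hypothetical chained symbol table: one instance per scope,
# linked to the enclosing scope for outward lookup.
class SymbolTable:
    def __init__(self, parent=None):
        self.parent = parent          # enclosing scope, or None for globals
        self.symbols = {}             # identifier -> attribute record

    def declare(self, name, attrs):
        if name in self.symbols:      # same identifier twice in one scope
            raise NameError(f"redeclaration of {name!r}")
        self.symbols[name] = attrs

    def lookup(self, name):
        scope = self
        while scope is not None:      # walk outward through enclosing scopes
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"{name!r} used before declaration")

globals_ = SymbolTable()
globals_.declare("x", {"type": "int"})
inner = SymbolTable(parent=globals_)  # entering a nested block
print(inner.lookup("x"))
# {'type': 'int'}  (found in the enclosing scope)
```

The "declared before use" check above is exactly a failed `lookup`, and redeclaration in the same scope is a failed `declare`.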
Phase IV: Intermediate Code Generation
An intermediate code generator
takes a parse tree from the semantic analyzer
generates a program in the intermediate language.
In some compilers, a source program is translated into an intermediate code
first and then the intermediate code is translated into the target language.
In other compilers, a source program is translated directly into the target
language.
Compiler makes a second pass over the parse tree to produce the translated
code
If there are no compile-time errors, the semantic analyzer translates the
abstract syntax tree into the abstract assembly tree.
Contd.
Using an intermediate code is beneficial when compilers that translate a
single source language to many target languages are required.
The front-end of a compiler:- scanner to intermediate code generator, can
be shared by all such compilers.
Different back-ends:- code optimizer and code generator, are required for
each target language.
One of the most popular intermediate codes is three-address code.
A three-address code instruction is of the form x = y op z.
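As a sketch, three-address code for the earlier example `position := initial + rate * 60` can be produced by flattening an expression tree bottom-up. The tuple tree shape and the temporary names `t1`, `t2`, … are illustrative conventions, not a fixed format:

```python
import itertools

# Flatten an expression tree into three-address code, where each
# generated instruction has the form x = y op z.
def gen_tac(tree):
    code, counter = [], itertools.count(1)
    def walk(node):
        if isinstance(node, str):          # leaf: identifier or constant
            return node
        op, left, right = node
        l, r = walk(left), walk(right)
        t = f"t{next(counter)}"            # fresh temporary for this result
        code.append(f"{t} = {l} {op} {r}")
        return t
    return code, walk(tree)

code, result = gen_tac(('+', 'initial', ('*', 'rate', '60')))
code.append(f"position = {result}")
print("\n".join(code))
# t1 = rate * 60
# t2 = initial + t1
# position = t2
```

Each instruction has at most one operator on the right-hand side, which is what makes the form "three-address".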
Phase V: Assembly Code Generation
The code generator converts the abstract assembly tree into the actual assembly
code.
To do code generation,
the generator covers the abstract assembly tree with tiles (each tile
represents a small portion of an abstract assembly tree), and
outputs the actual assembly code associated with the tiles that were used
to cover the tree.
Phase VI: Machine Code Generation and Linking
The final phase of compilation converts the assembly code into machine code
and links in (by a linker) the appropriate language libraries.
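A drastically simplified sketch of the tiling idea: here every tile covers just a single tree node and emits one instruction, whereas real tilers match larger multi-node patterns. The instruction names (MOV/ADD/MUL) and the naive register allocation are invented for illustration:

```python
import itertools

# Toy "tiling" code generator: one tile per tree node, one invented
# instruction per tile. Not a real instruction set.
def emit_asm(tree):
    out, regs = [], itertools.count(1)
    def cover(node):
        if isinstance(node, str):              # leaf tile: load the operand
            r = f"r{next(regs)}"
            out.append(f"MOV {node}, {r}")
            return r
        op, left, right = node
        rl, rr = cover(left), cover(right)     # cover subtrees first
        out.append(f"{'ADD' if op == '+' else 'MUL'} {rl}, {rr}")
        return rr                              # result left in second register
    cover(tree)
    return out

for line in emit_asm(('+', 'initial', ('*', 'rate', '60'))):
    print(line)
# MOV initial, r1
# MOV rate, r2
# MOV 60, r3
# MUL r2, r3
# ADD r1, r3
```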
Code Optimization
Replacing an inefficient sequence of instructions with a better sequence of
instructions.
Sometimes called code improvement.
Code optimization can be done:
after semantic analysis
performed on a parse tree
after intermediate code generation
performed on an intermediate code
after code generation
performed on a target code
Two types of optimization:
1. Local
2. Global
Local Optimization
The compiler looks at a very small block of instructions and tries to determine
how it can improve the efficiency of this local code block
Relatively easy; included as part of most compilers
Examples of possible local optimizations
1. Constant evaluation
2. Strength reduction
3. Eliminating unnecessary operations
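The first two of these can be sketched as rewrites on a toy three-address instruction, represented here as a tuple `(dst, lhs, op, rhs)`; the representation and the handled cases are assumptions made for this sketch:

```python
# Illustrative local optimizations on a toy instruction format.

def constant_evaluation(inst):
    """Fold 'x = c1 op c2' at compile time when both operands are literals."""
    dst, lhs, op, rhs = inst
    if lhs.isdigit() and rhs.isdigit():
        value = {"+": int(lhs) + int(rhs), "*": int(lhs) * int(rhs)}[op]
        return (dst, str(value))          # reduced to a simple copy
    return inst

def strength_reduction(inst):
    """Replace an expensive operation with a cheaper one, e.g. x * 2 -> x + x."""
    dst, lhs, op, rhs = inst
    if op == "*" and rhs == "2":
        return (dst, lhs, "+", lhs)
    return inst

print(constant_evaluation(("x", "3", "+", "4")))   # ('x', '7')
print(strength_reduction(("y", "a", "*", "2")))    # ('y', 'a', '+', 'a')
```

Both rewrites preserve the meaning of the instruction while making it cheaper, which is the defining property of an optimization pass.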
Global Optimization
The compiler looks at large segments of the program to decide how to improve
performance
Much more difficult; usually omitted from all but the most sophisticated and
expensive production-level “optimizing compilers”
Optimization cannot make an inefficient algorithm efficient
The Phases of a Compiler

Phase                                   | Output                              | Sample
----------------------------------------|-------------------------------------|---------------------------
Programmer (source code producer)       | Source string                       | A=B+C;
Scanner (performs lexical analysis)     | Token string, and symbol table      | ‘A’, ‘=’, ‘B’, ‘+’, ‘C’, ‘;’
                                        | with names                          |
Parser (performs syntax analysis based  | Parse tree or abstract syntax tree  | tree with ‘=’ at the root,
on the grammar of the programming       |                                     | children A and ‘+’; ‘+’ has
language)                               |                                     | children B and C
Semantic analyzer (type checking, etc.) | Annotated parse tree or abstract    |
                                        | syntax tree                         |
Intermediate code generator             | Three-address code, quads, or RTL   | int2fp B t1
                                        |                                     | + t1 C t2
                                        |                                     | := t2 A
Optimizer                               | Three-address code, quads, or RTL   | int2fp B t1
                                        |                                     | + t1 #2.3 A
Code generator                          | Assembly code                       | MOVF #2.3,r1
                                        |                                     | ADDF2 r1,r2
                                        |                                     | MOVF r2,A
Summary of Phases of Compiler
[Figure: diagram summarizing the phases of a compiler]
Compiler Construction Tools
Software development tools are available to implement one or more compiler phases:
Scanner generators
Parser generators
Syntax-directed translation engines
Automatic code generators
Data-flow engines
Scanner generators for C/C++: Flex, Lex.
Parser generators for C/C++: Bison, YACC.
Available scanner generators for Java:
JLex, a scanner generator for Java, very similar to Lex.
JFlex, flex for Java.
Available parser generators for Java:
CUP, a parser generator for Java, very similar to YACC.
BYACC/J, a different version of Berkeley YACC for Java. It is an extension of
the standard YACC (a -j flag has been added to generate Java code).
Other compiler tools:
JavaCC, a parser generator for Java, including a scanner generator and a parser
generator. Input specifications are different from those suitable for Lex/YACC.
Also, unlike YACC, JavaCC generates a top-down parser.
ANTLR, a set of language translation tools (formerly PCCTS). Includes
scanner/parser generators for C, C++, and Java.