Intro-To-compilation 1

The document provides an overview of the compilation process, detailing the roles of source code, object code, and compilers. It explains the structure of a compiler, including the front end and back end, and discusses the importance of lexical analysis, parsing, and intermediate representations. Additionally, it contrasts compilers and interpreters, highlighting their advantages and disadvantages in programming language execution.

Uploaded by

emirhanmetin534

Introduction to Compilation

Some Terms
• Source
– The language the program was written in
• Object
– The machine language equivalent of the program after compilation
• Compiler
– A software program that translates the source code into object code
– An assembler is a special case of a compiler, where the source is written in assembly language
Compiler
• Read and analyze entire program
• As a Discipline, Involves Multiple CSE Areas
– Programming Languages and Algorithms
– Software Engineering & Theory / Foundations
– Computer Architecture & Operating Systems
• But, Has Surprisingly Simplistic Intent:

What’s in a Compiler?

Standard Compiler Structure
Source code (character stream)
-> Lexical analysis -> Token stream
-> Parsing -> Abstract syntax tree
-> Intermediate code generation -> Intermediate code
-> Optimization -> Intermediate code
-> Code generation -> Assembly code
• The early stages form the front end (machine-independent)
• The later stages form the back end (machine-dependent)
Structure of a Compiler
• First approximation
– Front end: analysis
• Read source program and understand its structure and meaning
– Back end: synthesis
• Generate equivalent target language program

Source -> Front End -> Back End -> Target

Implications
• Must recognize legal programs (& complain about illegal ones)
• Must generate correct code
• Must manage storage of all variables
• Must agree with OS & linker on target format
• Need some sort of Intermediate Representation (IR)
• Front end maps source into IR
• Back end maps IR to target machine code
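The split described above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's standard `ast` module stands in for the scanner and parser, the three-address IR format is invented for the example, and the output "assembly" targets a hypothetical two-operand machine. The names `front_end`, `back_end` and `OPS` are not from the slides.

```python
import ast

# Assumed mapping from Python AST operator nodes to made-up IR opcodes.
OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}

def front_end(source):
    """Analyze a source expression and lower it to a three-address IR."""
    # ast.parse raises SyntaxError for illegal input: the "complain" part.
    tree = ast.parse(source, mode="eval").body
    ir = []
    counter = 0

    def lower(node):
        nonlocal counter
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = lower(node.left)
            right = lower(node.right)
            counter += 1
            temp = f"t{counter}"  # fresh temporary for this result
            ir.append((OPS[type(node.op)], temp, left, right))
            return temp
        raise SyntaxError("unsupported construct in this sketch")

    result = lower(tree)
    return ir, result

def back_end(ir):
    """Map each IR instruction to text for a hypothetical two-operand machine."""
    asm = []
    for op, dest, left, right in ir:
        asm.append(f"mov {dest}, {left}")
        asm.append(f"{op} {dest}, {right}")
    return asm

ir, result = front_end("a + b * 2")
for line in back_end(ir):
    print(line)
```

Only `back_end` knows anything about the target, so retargeting this sketch would mean replacing that one function while the front end and the IR stay as they are.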

Source -> Front End -> Back End -> Target

Front End
source -> Scanner -> tokens -> Parser -> IR

• Split into two parts
– Scanner: Responsible for converting character stream to token stream
• Also strips out white space, comments
– Parser: Reads token stream; generates IR
• Both of these can be generated automatically
– Source language specified by a formal grammar
– Tools read the grammar and generate scanner & parser (either table-driven or hard coded)

Lex – A Lexical Analyzer Generator
• A Unix utility from the mid-1970s
• A compiler that takes as source a specification of the tokens/patterns of a language
• Generates a C lexical analyzer program
• Pictorially:

Tokens?

Tokens
• Token stream: Each significant lexical chunk of the program is represented by a token
– Operators & Punctuation: {}[]!+-=*;: …
– Keywords: if while return goto
– Identifiers: id & actual name
– Constants: kind & value; int, floating-point, character, string, …

Scanner Example
• Input text
    // this statement does very little
    if (x >= y) y = 42;
• Token Stream
    IF LPAREN ID(x) GEQ ID(y) RPAREN ID(y) BECOMES INT(42) SCOLON
– Note: tokens are atomic items, not character strings
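A hand-written scanner for just this example might look like the sketch below. It is a minimal illustration, not the output of Lex: the token names follow the slide, while the `Token` class, the `TOKEN_SPEC` table, and the regular expressions are assumptions made for the example and cover only the characters that appear in the input above.

```python
import re
from typing import List, NamedTuple, Optional

class Token(NamedTuple):
    kind: str                       # e.g. "IF", "ID", "INT"
    value: Optional[object] = None  # name for ID, number for INT

    def __str__(self):
        return self.kind if self.value is None else f"{self.kind}({self.value})"

KEYWORDS = {"if": "IF", "while": "WHILE", "return": "RETURN", "goto": "GOTO"}

# Order matters: ">=" must be tried before "=".
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("SPACE", r"\s+"),
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("GEQ", r">="),
    ("BECOMES", r"="),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SCOLON", r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def scan(text: str) -> List[Token]:
    """Convert a character stream into a token stream."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind in ("COMMENT", "SPACE"):
            continue  # white space and comments are stripped here
        if kind == "INT":
            tokens.append(Token("INT", int(lexeme)))
        elif kind == "ID":
            # Keywords look like identifiers, so they are separated after matching.
            tokens.append(Token(KEYWORDS[lexeme]) if lexeme in KEYWORDS
                          else Token("ID", lexeme))
        else:
            tokens.append(Token(kind))
    return tokens

source = "// this statement does very little\nif (x >= y) y = 42;"
print(" ".join(str(t) for t in scan(source)))
```

Each token carries a kind and, where needed, a value, which is what the note about tokens being atomic items refers to: later phases compare kinds such as `GEQ` rather than re-reading the characters `>=`.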

Programming Steps for Compilation
• Create/Edit source
• Compile source
• Link object modules together
• Test executable
• If errors, start over
• Stop
Compilation process:
• Invoke compiler on source program to generate machine language equivalent
• Compiler translates source to object
• Saves object output as disk file(s)
• Large systems may have many source programs
• Each has to be compiled
Link object modules together
• Combine the object modules to form an executable
• Takes multiple object modules as input
• The linker takes the object module(s) and creates the executable for you
– Linker resolves references to other object modules
– Handles calls to external libraries
– Creates an executable
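To make "resolves references" concrete, here is a toy sketch of the idea. Everything in it is an assumption made for illustration: real object files use formats such as ELF or COFF, whereas here a module is just a dictionary with a `code` list and a `defines` table, and a reference to an external symbol is written as a string starting with `@`.

```python
def link(modules):
    """Combine toy object modules and resolve cross-module references."""
    # Pass 1: lay the modules out one after another and record
    # the final address of every symbol they define.
    symbol_table = {}
    base = 0
    for module in modules:
        for name, offset in module["defines"].items():
            if name in symbol_table:
                raise ValueError(f"duplicate symbol: {name}")
            symbol_table[name] = base + offset
        base += len(module["code"])

    # Pass 2: copy the code, replacing each "@name" reference
    # with the address found in pass 1.
    image = []
    for module in modules:
        for word in module["code"]:
            if isinstance(word, str) and word.startswith("@"):
                name = word[1:]
                if name not in symbol_table:
                    raise ValueError(f"undefined symbol: {name}")
                image.append(symbol_table[name])
            else:
                image.append(word)
    return image, symbol_table

# Hypothetical modules: main calls gcd, which lives in a library module.
main_module = {"code": ["call", "@gcd", "halt"], "defines": {"main": 0}}
library_module = {"code": ["sub", "ret"], "defines": {"gcd": 0}}

image, symbols = link([main_module, library_module])
print(image)
print(symbols)
```

The two passes are needed because a module may refer to a symbol defined in a module that appears later in the list, so no reference can be patched until every definition has been placed.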
Linking
• Libraries of subroutines
From Source Code to Executable Code

    program gcd(input, output);
    var i, j: integer;
    begin
      read(i, j);
      while i <> j do
        if i > j then i := i - j
        else j := j - i;
      writeln(i)
    end.

Compilation

Machine code Generation

Assemblers

Reviewing the Entire Process

Why Study Compilers? (1)
• Compiler techniques are everywhere
– Parsing (little languages, interpreters)
– Database engines
– AI: domain-specific languages
– Text processing
• TeX/LaTeX -> dvi -> PostScript -> PDF
– Hardware: VHDL; model-checking tools
– Mathematics (Mathematica, MATLAB)

Why Study Compilers? (2)
• Fascinating blend of theory and engineering
– Direct applications of theory to practice
• Parsing, scanning, static analysis
– Some very difficult problems (NP-hard or worse)
• Resource allocation, “optimization”, etc.
• Need to come up with good-enough solutions

Why Study Compilers? (3)
• Ideas from many parts of CSE
– AI: Greedy algorithms, heuristic search
– Algorithms: graph algorithms, dynamic programming, approximation algorithms
– Theory: Grammars, DFAs and PDAs, pattern matching, fixed-point algorithms
– Systems: Allocation & naming, synchronization, locality
– Architecture: pipelines & hierarchy management, instruction set use

Software Language Levels
• Machine Language (Binary)
• Assembly Language
– Assembler converts assembly into machine language
• High Level Languages (C, Perl, Shell)
– Compiled: C
– Interpreted: Perl, Shell
Programming Languages Offer …
• Abstractions
• At different levels
– From low
• Good for machines
– To high
• Good for humans
• Three Approaches
– Interpreted
– Compiled
– Mixed
Interpretation
• No linking
• No object code generated
• Source statements executed line by line
Steps in interpretation
• Read a source line
• Parse line
• Do what the line says
– Allocate space for variables
– Execute arithmetic operations, etc.
• Go back to step 1
• Similar to instruction cycle
Interpreter
• Interpreter
– Execution engine
– Program execution interleaved with analysis
    running = true;
    while (running) {
        analyze next statement;
        execute that statement;
    }
– May involve repeated analysis of some statements (loops, functions)
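The loop above can be made runnable with a small sketch. The four-instruction language here (`set`, `add`, `print`, `jumplt`) is invented purely for the example, as are the names `run` and `PROGRAM`; the point is only to show analysis interleaved with execution, and how a loop causes the same line to be analyzed more than once.

```python
def run(source):
    """Interpret a tiny made-up language one line at a time.

    Returns the printed values and the number of line analyses performed.
    """
    lines = source.splitlines()
    variables = {}
    output = []
    analyses = 0
    pc = 0  # index of the next line to execute
    while pc < len(lines):
        # Analysis: the line is split into words every time it is reached.
        words = lines[pc].split()
        analyses += 1
        pc += 1
        if not words:
            continue
        # Execution: do what the line says.
        op = words[0]
        if op == "set":        # set NAME INT
            variables[words[1]] = int(words[2])
        elif op == "add":      # add NAME INT
            variables[words[1]] += int(words[2])
        elif op == "print":    # print NAME
            output.append(variables[words[1]])
        elif op == "jumplt":   # jumplt NAME INT LINE: jump if NAME < INT
            if variables[words[1]] < int(words[2]):
                pc = int(words[3])  # LINE is a 0-based line index
        else:
            raise SyntaxError(f"unknown statement: {lines[pc - 1]!r}")
    return output, analyses

PROGRAM = "set i 0\nadd i 1\nprint i\njumplt i 3 1"

values, analyses = run(PROGRAM)
print(values)    # values printed by the program
print(analyses)  # how many times a line was analyzed
```

The program has four lines, but the loop body is re-analyzed on every iteration, so the analysis count is larger than the line count. A compiler would analyze each line once, ahead of time.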

Interpreters & Compilers
• Interpreter
– A program that reads a source program and produces the results of executing that program

• Compiler
– A program that translates a program from one language (the source) to another (the target)

Common Issues
• Compilers and interpreters both must read the input, a stream of characters, and “understand” it: analysis

w h i l e ( k < l e n g t h ) { <nl> <tab> i f ( a [ k ] > 0 ) <nl> <tab> <tab> { n P o s + + ; } <nl> <tab> }

Compilation Advantages
• Faster execution
• Single file to execute
• Compiler can do better diagnosis of syntax and semantic errors, since it has more information than an interpreter (which only sees one line at a time)
• Compiler can optimize code
Compilation Disadvantages
• Harder to debug
• Takes longer to change source code, recompile and relink
Interpreter Advantages
• Easier to debug
• Faster development time
Interpreter Disadvantages
• Slower execution times
• No optimization
• Need all of source code available
• Source code larger than executable for large systems
Mixed

Hybrid approaches
• Well-known example: Java
– Compile Java source to byte codes, the Java Virtual Machine language (.class files)
– Execution
• Interpret byte codes directly, or
• Compile some or all byte codes to native code, particularly for execution hot spots, using a Just-In-Time (JIT) compiler
• Variation: VS.NET
– Compilers generate MSIL
– All IL compiled to native code before execution
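The same compile-then-interpret split can be observed in CPython, which is used here only as an analogy to the Java case, not as a description of the JVM. The sketch relies on Python's built-in `compile` and `exec` and the standard `dis` module; the variable names and the sample statement are made up for the example.

```python
import dis

source = "total = sum(n * n for n in range(4))"

# Step 1: compile source text to a code object holding bytecode.
code = compile(source, "<example>", "exec")

# Step 2: the virtual machine executes that bytecode.
namespace = {}
exec(code, namespace)
print(namespace["total"])

# The bytecode can be listed; the exact instructions vary between Python versions.
instructions = [ins.opname for ins in dis.get_instructions(code)]
print(len(instructions), "bytecode instructions")
```

As with `.class` files, the compiled form is machine-independent: it targets a virtual machine rather than a physical processor.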
