VLIW Processor

Uploaded by

diya32755

VLIW PROCESSORS

Department of E &TC, MITCOE, Pu

Introduction
o Very long instruction word (VLIW) refers to a processor
architecture designed to take advantage of instruction-level
parallelism (ILP).
o Each instruction of a VLIW processor consists of multiple
independent operations grouped together.
o A VLIW processor has multiple independent functional units.
o Each operation in the instruction is aligned to a
functional unit.
o All functional units share a common large register file.
o This type of processor architecture is intended to allow
higher performance without the inherent complexity of
some other approaches.
Different Approaches
Other approaches to improving performance in processor architectures:
o Pipelining
Breaking up instructions into sub-steps so that several instructions
can be partially executed at the same time

o Superscalar architectures
Dispatching individual instructions to be executed completely
independently in different parts of the processor

o Out-of-order execution
Executing instructions in an order different from program order
Instruction Level Parallelism (ILP)
o Instruction-level parallelism (ILP) is a measure of how
many of the operations in a computer program can be
performed simultaneously.

o The overlap among instructions is called instruction-level
parallelism.

o Ordinary programs are typically written under a sequential
execution model where instructions execute one after the
other and in the order specified by the programmer.

o The goal of compiler and processor designers implementing ILP
is to identify and take advantage of as much ILP as possible.
What is ILP? (Example)
Consider the following program:
op1 e = a + b
op2 f = c + d
op3 m = e * f

o Operation 3 depends on the results of operations 1 and 2, so
it cannot be calculated until both of them are completed.
o However, operations 1 and 2 do not depend on any other
operation, so they can be calculated simultaneously.
o If we assume that each operation can be completed in one
unit of time, then these three operations can be completed
in a total of two units of time,
o giving an ILP of 3/2.
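The 3/2 figure can be reproduced with a short sketch. It assumes, as the slide does, that every operation takes one unit of time and that there are enough functional units to run all independent operations together. The names `DEPS`, `critical_path_length` and `ilp` are illustrative only.

```python
from fractions import Fraction

# Each operation maps to the operations whose results it needs.
DEPS = {
    "op1": [],              # e = a + b
    "op2": [],              # f = c + d
    "op3": ["op1", "op2"],  # m = e * f
}

def critical_path_length(deps):
    """Time units needed when every operation takes one unit and
    independent operations run at the same time."""
    finish = {}

    def finish_time(op):
        if op not in finish:
            # An operation finishes one unit after its latest input is ready.
            finish[op] = 1 + max((finish_time(d) for d in deps[op]), default=0)
        return finish[op]

    return max(finish_time(op) for op in deps)

def ilp(deps):
    """Number of operations divided by the time units needed."""
    return Fraction(len(deps), critical_path_length(deps))

print(ilp(DEPS))  # prints 3/2
```

Making op2 depend on op1 as well would stretch the chain to three units and drop the ILP to 1.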
VLIW Compiler
o The compiler is responsible for static scheduling of instructions
in a VLIW processor.

o The compiler finds out which operations in the program can be
executed in parallel.

o It groups these operations together into a single instruction,
which is the very long instruction word.

o The compiler ensures that an operation is not issued before its
operands are ready.
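A minimal sketch of this kind of static scheduling is greedy list scheduling. It is not the algorithm of any particular VLIW compiler: it assumes one-cycle latency for every operation, and the function name `schedule` and the unit names "alu" and "mul" are made up for the illustration.

```python
def schedule(ops, units):
    """Greedy list scheduling into VLIW bundles.

    ops   : list of (name, unit_kind, [names it depends on]) in program order
    units : dict mapping unit_kind to the number of such functional units
    Returns a list of bundles; each bundle lists the operations issued
    together in one cycle. Every operation is assumed to take one cycle.
    """
    issued_in = {}        # operation name -> cycle in which it was issued
    pending = list(ops)
    bundles = []
    while pending:
        cycle = len(bundles)
        free = dict(units)  # functional units still free in this cycle
        bundle, remaining = [], []
        for name, kind, deps in pending:
            # Operands are ready only if produced in an earlier cycle.
            ready = all(d in issued_in and issued_in[d] < cycle for d in deps)
            if ready and free.get(kind, 0) > 0:
                free[kind] -= 1
                issued_in[name] = cycle
                bundle.append(name)
            else:
                remaining.append((name, kind, deps))
        if not bundle:
            raise ValueError("nothing can be issued: missing unit or cyclic dependency")
        bundles.append(bundle)
        pending = remaining
    return bundles

program = [
    ("op1", "alu", []),              # e = a + b
    ("op2", "alu", []),              # f = c + d
    ("op3", "mul", ["op1", "op2"]),  # m = e * f
]
print(schedule(program, {"alu": 2, "mul": 1}))
```

With two ALUs the first two operations share a bundle; with only one ALU the same program needs three bundles.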
VLIW Instruction
o One VLIW instruction word encodes multiple operations, which
allows them to be initiated in a single clock cycle.

o The operands and the operation to be performed by each
functional unit are specified in the instruction itself.

o One instruction has an operation field for each execution
unit of the device.

o So the length of the instruction increases with the number of
execution units.

o To accommodate these operation fields, VLIW instructions are
usually at least 64 bits wide, and on some architectures are much
wider, up to 1024 bits.
VLIW Instruction

Add r1,r2,r3; Sub r4,r5,r6; Ld r7,data; St r8,data

[Figure: the four operation fields feed two ALUs and two load/store units, all connected to the register file]
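The four-operation instruction above can be modelled as a word with one slot per functional unit. The sketch below is hypothetical: the slot layout (two ALU slots, two load/store slots), the 32-bit slot width and the textual "nop" filler are assumptions for illustration, not the encoding of any real VLIW machine.

```python
# Assumed slot layout: two ALU slots followed by two load/store slots.
SLOT_KINDS = ("alu", "alu", "mem", "mem")
SLOT_BITS = 32  # assumed width of one operation field

def make_bundle(ops):
    """Place each (unit_kind, text) operation in the first free slot of
    its kind; slots without useful work keep the filler 'nop'."""
    slots = ["nop"] * len(SLOT_KINDS)
    for kind, text in ops:
        for i, slot_kind in enumerate(SLOT_KINDS):
            if slot_kind == kind and slots[i] == "nop":
                slots[i] = text
                break
        else:
            raise ValueError(f"no free {kind} slot for {text!r}")
    return slots

full = make_bundle([
    ("alu", "add r1,r2,r3"),
    ("alu", "sub r4,r5,r6"),
    ("mem", "ld r7,data"),
    ("mem", "st r8,data"),
])
partial = make_bundle([("alu", "add r1,r2,r3")])
width_bits = SLOT_BITS * len(SLOT_KINDS)

print(full)
print(partial)
print(width_bits)
```

Under these assumptions the word is 128 bits wide whether or not every slot does useful work, which is why the instruction length grows with the number of execution units.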
ILP in VLIW
o Consider the computation of y = a1x1 + a2x2 + a3x3

On a sequential processor:
cycle 1: load a1
cycle 2: load x1
cycle 3: load a2
cycle 4: load x2
cycle 5: multiply z1 a1 x1
cycle 6: multiply z2 a2 x2
cycle 7: add y z1 z2
cycle 8: load a3
cycle 9: load x3
cycle 10: multiply z3 a3 x3
cycle 11: add y y z3
requires 11 cycles.

On the VLIW processor with 2 load/store units, 1 multiply unit and 1 add unit:
cycle 1: load a1; load x1
cycle 2: load a2; load x2; multiply z1 a1 x1
cycle 3: load a3; load x3; multiply z2 a2 x2
cycle 4: multiply z3 a3 x3; add y z1 z2
cycle 5: add y y z3
requires 5 cycles.
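The five-cycle VLIW schedule can be checked mechanically. The sketch below assumes one-cycle latency for every operation, so a result is usable from the next cycle on. The names `UNITS`, `SCHEDULE` and `check` are illustrative.

```python
# Functional units of the VLIW processor in the example.
UNITS = {"load": 2, "mul": 1, "add": 1}

# One inner list per cycle; each entry is (unit kind, destination, sources).
SCHEDULE = [
    [("load", "a1", []), ("load", "x1", [])],
    [("load", "a2", []), ("load", "x2", []), ("mul", "z1", ["a1", "x1"])],
    [("load", "a3", []), ("load", "x3", []), ("mul", "z2", ["a2", "x2"])],
    [("mul", "z3", ["a3", "x3"]), ("add", "y", ["z1", "z2"])],
    [("add", "y", ["y", "z3"])],
]

def check(schedule, units):
    """Verify unit limits and operand readiness; return the cycle count."""
    available = set()  # values produced in earlier cycles
    for cycle, bundle in enumerate(schedule, start=1):
        used = {}
        for kind, dest, sources in bundle:
            used[kind] = used.get(kind, 0) + 1
            if used[kind] > units[kind]:
                raise ValueError(f"cycle {cycle}: too many {kind} operations")
            missing = [s for s in sources if s not in available]
            if missing:
                raise ValueError(f"cycle {cycle}: {dest} needs {missing}")
        # Results become visible only after the whole bundle has issued.
        available.update(dest for _, dest, _ in bundle)
    return len(schedule)

vliw_cycles = check(SCHEDULE, UNITS)
sequential_cycles = sum(len(bundle) for bundle in SCHEDULE)  # one operation per cycle
print(vliw_cycles, sequential_cycles)  # prints 5 11
```

Moving `multiply z1 a1 x1` into cycle 1 makes `check` raise an error, because a1 and x1 are still being loaded in that cycle.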

Block Diagram
[Figure: Conceptual Instruction Execution]
Working
o Long instruction words are fetched from memory.
o A common multi-ported register file supplies the operands
and stores the results.
o Parallel random access to the register file is possible through
the read/write crossbar.
o Execution in the functional units is carried out concurrently
with the load/store of data between RAM and the
register file.
o There may be one or multiple register files for fixed-point (FX)
and floating-point (FP) data.
o The processor relies on the compiler to find parallelism and to
schedule dependency-free program code.
Difference Between VLIW & Superscalar Architecture
VLIW vs. Superscalar Architecture
o Instruction formulation
• Superscalar:
- Receives conventional instructions conceived for sequential
processors.

• VLIW:
- Receives long instruction words, each comprising a field (or
opcode) for each execution unit.
- Instruction word length depends on the number of execution units
and on the code length needed to control each unit (such as opcode
length and register fields).
- Typical word length is 64 to 1024 bits, much longer than the
conventional machine word length.
VLIW vs. Superscalar Architecture
o Instruction scheduling

• Superscalar:
- Done dynamically at run time by the hardware.
- Data dependencies are checked and resolved in
hardware.
- Needs a look-ahead hardware window for instruction
fetch.

• VLIW:
- Done statically at compile time by the compiler.
- Data dependencies are checked by the compiler.
- Unfilled operation fields in a VLIW instruction waste memory
space and instruction bandwidth.
Comparison: CISC, RISC, VLIW
Characteristic           CISC                        RISC                         VLIW
Instruction Size         Varies                      One size, usually 32 bits    One size
Instruction Semantics    Varies from simple to       Almost always one simple     Many simple, independent
                         complex; possibly many      operation                    operations
                         dependent operations
                         per instruction
Registers                Few, sometimes special      Many, general-purpose        Many, general-purpose
Hardware Design          Exploit microcode           Exploit implementations      Exploit implementations
                         implementations             with one pipeline and        with multiple pipelines,
                                                     no microcode                 no microcode and no
                                                                                  complex dispatch logic
Advantages of VLIW
o Dependencies are determined by the compiler and used to
schedule operations according to functional unit latencies.
o Functional units are assigned by the compiler and correspond
to positions within the instruction packet.
o Reduces hardware complexity.
• Tasks such as decoding, data dependency detection and
instruction issue become simple.
• Potentially allows a higher clock rate.
• Potentially allows lower power consumption.
Disadvantages of VLIW
o Higher complexity of the compiler.
o Compatibility across implementations: compiler optimization
needs to consider technology-dependent parameters such as
latencies and the load-use time of the cache.
o Unscheduled events (e.g. a cache miss) stall the entire processor.
o Code density: unfilled operation fields in a VLIW instruction waste
memory space and instruction bandwidth, i.e. slot utilization is low.
o Code expansion: causes high power consumption.
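Slot utilization can be estimated by counting filled operation fields. The sketch below reuses the five-cycle schedule for y = a1x1 + a2x2 + a3x3 and assumes each instruction word has four slots (2 load/store, 1 multiply, 1 add), with every empty slot padded by a no-op. The name `slot_utilization` is illustrative.

```python
def slot_utilization(ops_per_instruction, slots_per_instruction):
    """Fraction of operation slots that hold useful work."""
    used = sum(ops_per_instruction)
    total = len(ops_per_instruction) * slots_per_instruction
    return used / total

# Useful operations issued in cycles 1..5 of the VLIW schedule.
OPS_PER_CYCLE = [2, 3, 3, 2, 1]

utilization = slot_utilization(OPS_PER_CYCLE, slots_per_instruction=4)
print(f"{utilization:.0%} of slots used")
```

Here 11 of the 20 slots carry real operations; the other 9 are no-ops that still occupy memory and fetch bandwidth.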
Applications
o VLIW architecture is suitable for digital signal processing
applications.
o Processing of media data, such as compression/decompression
of image and speech data.
Examples of VLIW processors
o VLIW mini-supercomputers:
Multiflow TRACE 7/300, 14/300, 28/300
Multiflow TRACE /500
Cydrome Cydra 5
IBM Yorktown VLIW Computer
o Single-chip VLIW processors:
Intel iWarp, Philips LIFE chips
o Single-chip VLIW media (throughput) processors:
TriMedia, Chromatic, MicroUnity
o DSP processors (TI TMS320C6x)
