Chapter 5
Instruction Level Parallelism and
Superscalar Processors
1
Outline
Overview
Introduction
Comparison
Limitations to ILP
Instruction Issue Policies
Superscalar Execution Review
Superscalar Implementation
2
Overview
Pipelining exploits the potential parallelism among
instructions:
◦ Instruction Level Parallelism (ILP)
The parallelism among instructions
Refers to the degree to which, on average, the instructions of a
program can be executed in parallel
Exists when instructions in a sequence are independent and thus can be
executed in parallel by overlapping
3
Overview…
There are two methods for increasing the potential
amount of ILP:
◦ Increase the depth of the pipeline
Enables the overlap of more instructions
Amount of parallelism being exploited is higher
◦ Replicate the internal components of the computer
Enables the launching of multiple instructions in every pipeline
stage
This technique is called Multiple Issue
4
Overview…
Two ways to implement a multiple-issue processor
◦ Static multiple issue processors
Also called VLIW (Very Long Instruction Word) Processor
Many decisions are made by the compiler before execution
Focus of Chapter 6, next chapter
◦ Dynamic multiple issue processors
Also called Superscalar processors
Many decisions are made during execution by the processor
Focus of this chapter, Chapter 5
The major difference between the two:
The division of work between the compiler and the hardware
5
Introduction
Superscalar Architecture /Superscalar Processor
◦ In a superscalar architecture (SSA), several scalar
instructions can be initiated simultaneously and executed
independently
◦ Pipelining allows also several instructions to be executed at
the same time, but they have to be in different pipeline
stages at a given moment
◦ SSA includes all features of pipelining but, in addition, there
can be several instructions executing simultaneously in the
same pipeline stage
6
Introduction…
Superscalar Architecture /Superscalar Processor
◦ RISC machines lend themselves readily to superscalar techniques
But the approach can be used on either RISC or CISC architectures
◦ The superscalar approach is now the standard method for
implementing
High-performance microprocessors
7
Introduction…
General Superscalar Organization
◦ There are multiple functional units
Each of which is implemented as a pipeline
◦ In the diagram, the following operations can be executed at the same time:
Two integer operations
Two floating point operations and
One memory (load or store) operation
8
Introduction…
How does a superscalar processor work?
◦ A SS processor fetches multiple instructions at a time, and
attempts to find nearby instructions that are independent of
each other and therefore can be executed in parallel
◦ Based on the dependency analysis, the processor may issue
and execute instructions in an order that differs from that of
the original machine code
◦ The processor may eliminate some unnecessary dependencies
by the use of additional registers and renaming of register
references
9
Comparison
Ordinary Pipeline (Base Pipeline) Vs
Superscalar
◦ Ordinary Pipeline (Base Machine):
Issues one instruction per clock cycle
Performs one pipeline stage per clock cycle
Although several instructions are executing
concurrently
Only one instruction is in its execution stage at any
one time
In the figure the pipeline has 4 stages:
Instruction fetch, Operation decode, Operation
execution, Result write back
◦ Superscalar Implementation:
Two instructions are executed concurrently in each
pipeline stage (in the figure)
Superscalar of degree 2
Duplication of hardware is required
10
Superscalar Recap
Allows several instructions to be issued and completed per
clock cycle
Consists of a number of pipelines that are working in parallel
Depending on the number and kind of parallel functional units
available, a certain number of instructions can be executed in
parallel
11
Limitations to ILP
The situations which prevent instructions from being executed in
parallel by SSA are very similar to those which prevent
efficient execution on an ordinary pipeline:
◦ Resource conflicts
◦ Control (procedural) dependency
◦ Data dependencies
Their consequences on SSA are more severe than those on
simple pipelines, because the potential parallelism in
SSA is greater and thus a larger amount of performance will
be lost
12
Limitations to ILP…
Resource Conflicts
◦ Several instructions compete for the same hardware resource at
the same time
Example:
Memories, caches, buses, register file ports and functional units (e.g., ALU,
adder)
Two arithmetic instructions need the same floating-point unit for execution
Similar to structural hazards in pipeline
◦ Can be solved partly by introducing several hardware units for the
same functions --- duplication of resources
E.g., have two floating-point units
The hardware units can also be pipelined to support several operations
at the same time
13
Limitations to ILP…
Procedural Dependency
◦ The presence of branches in an instruction sequence complicates the
pipeline operation
Cannot execute instructions after a branch until the branch is
executed
The instruction following the branch is said to have a procedural
dependency on the branch instruction
Similar to control hazards in pipeline
◦ If instructions are of variable length, they cannot be fetched and issued
in parallel, since an instruction has to be decoded in order to identify
the following one
Another type of procedural dependency
Therefore, superscalar techniques are more efficiently applicable to RISCs,
with fixed instruction length and format
14
Limitations to ILP…
Data Conflicts
◦ Caused by data dependencies between instructions in the
program
Similar to data hazards in pipeline
◦ To address the problem and to increase the degree of parallel
execution, SSA provides great freedom in the order in which
instructions are issued and executed
◦ Therefore data dependencies have to be considered and dealt
with much more carefully
15
Limitations to ILP…
Data Conflicts…
◦ Due to data dependencies, only some of the instructions are candidates
for parallel execution
◦ In order to find instructions to be issued in parallel, the processor has to
select from a sufficiently large instruction sequence
There are usually a lot of data dependencies in a short instruction
sequence
◦ Window of execution is defined as the set of instructions that is
considered for execution at a certain moment
◦ The number of instructions in the window should be as large as possible.
However, this is limited by:
Capacity to fetch instructions at a high rate
The problem of branches
The cost of hardware needed to analyze data dependencies
16
Limitations to ILP…
Data Dependencies
◦ All instructions in the window of execution may begin
execution, subject to data dependence and resource
constraints
◦ Three types of data dependencies can be identified:
True data dependency
Output dependency
Anti-dependency
17
Limitations to ILP…
True Data Dependency
◦ Also called write-read dependency/flow dependency
◦ Exists when the output of one instruction is required as an input to a
subsequent instruction:
MUL R4,R3,R1 (R4 := R3 * R1)
...
ADD R2,R4,R5 (R2 := R4 + R5)
Can fetch and decode second instruction in parallel with first
Can NOT execute second instruction until first is finished
◦ They have to be detected and handled by hardware
The addition above cannot be executed before the result of the multiplication is
available
The simplest solution is to stall the adder until the multiplier has finished
To avoid leaving the adder idle, the hardware can find other instructions
which can be executed by the adder
18
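The RAW check described above can be sketched in a few lines. This is an illustrative toy model, not any real pipeline's issue logic; the instruction encoding as a `(dest, src1, src2)` tuple is an assumption made for the example:

```python
# Toy model: an instruction is (dest, src1, src2); registers are strings.
# Illustrative only -- real issue logic compares register tags in hardware.
def has_true_dependency(first, second):
    """True if `second` reads the register that `first` writes (RAW)."""
    dest, _, _ = first
    _, src1, src2 = second
    return dest in (src1, src2)

mul = ("R4", "R3", "R1")   # MUL R4,R3,R1
add = ("R2", "R4", "R5")   # ADD R2,R4,R5 -- reads R4, written by MUL
```

When the check reports a hit, the hardware must either stall the consumer or forward the producer's result once it becomes available.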
Limitations to ILP…
True Data Dependency…
◦ They are intrinsic features of the user’s program, and cannot
be eliminated by compiler or hardware techniques
◦ There are often a lot of true data dependencies in a small
region of a program
Increasing the window size can reduce the impact of these
dependencies
19
Limitations to ILP…
True Data Dependency…
◦ Another Example:
ADD r1, r2 (r1 := r1+r2;)
MOVE r3,r1 (r3 := r1;)
Can fetch and decode second instruction in parallel with first
Can NOT execute second instruction until first is finished
◦ Exercise: Consider the following code and draw a conclusion about the
relationship between data dependency and region of code
L2 move r3,r7
load r8,(r3)
add r3,r3,#4
load r9,(r3)
ble r8,r9,L2
20
Limitations to ILP…
Output Dependency
◦ Also called write-write dependency
◦ Occurs if two instructions are writing into the same location
If the second instruction writes before the first one, an
error occurs:
MUL R4,R3,R1 (R4 := R3 * R1)
...
ADD R4,R2,R5 (R4 := R2 + R5)
21
Limitations to ILP…
Output Dependency…
◦ Another Example:
I1: R3:= R3 + R5
I2: R4:= R3 + 1
I3: R3:= R5 + 1
I4: R7:= R3 + R4
◦ What is the relationship between
I1 & I2 ?
True data dependency
I3 & I4 ?
True data dependency
◦ What about I1 & I3 ?
No true data dependency
If I3 completes before I1
The wrong value of the contents of R3 will be fetched for the execution of I4
I3 must complete after I1 to produce the correct output
22
Limitations to ILP…
Anti-Dependency
◦ Also called read-write dependency
◦ Exists if an instruction uses a location as an operand while a following one is
writing into that location
◦ The constraint is similar to that of true data dependency, but reversed
The second instruction destroys a value that the first instruction uses
◦ Example
If the first one is still using the location when the second one writes into it, an error
occurs:
MUL R4,R3,R1 (R4 := R3 * R1)
...
ADD R3,R2,R5 (R3 := R2 + R5)
23
Limitations to ILP…
Anti-Dependency…
◦ Another Example:
I1: R3:= R3 + R5
I2: R4:= R3 + 1
I3: R3:= R5 + 1
I4: R7:= R3 + R4
I3 cannot complete before I2 starts as I2 needs a value in R3 and I3
changes R3
24
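All three kinds of dependency in the I1–I4 sequence can be detected mechanically. A hedged sketch, assuming a toy encoding of each instruction as `(dest, sources)` (not a real hardware detector):

```python
# Classify the dependency from an earlier instruction to a later one.
# Each instruction is (dest, sources); registers are plain strings.
def classify(earlier, later):
    kinds = []
    if earlier[0] in later[1]:
        kinds.append("true (RAW)")    # later reads what earlier writes
    if earlier[0] == later[0]:
        kinds.append("output (WAW)")  # both write the same register
    if later[0] in earlier[1]:
        kinds.append("anti (WAR)")    # later overwrites what earlier reads
    return kinds

# The slides' example sequence:
I1 = ("R3", ("R3", "R5"))   # R3 := R3 + R5
I2 = ("R4", ("R3",))        # R4 := R3 + 1
I3 = ("R3", ("R5",))        # R3 := R5 + 1
I4 = ("R7", ("R3", "R4"))   # R7 := R3 + R4
```

Note that one pair of instructions can carry more than one kind of dependency: I1 and I3 both write R3 (output dependency), and I1 also reads R3 that I3 overwrites (anti-dependency).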
Limitations to ILP…
Output and Anti-Dependencies
◦ Output dependencies and anti-dependencies are not intrinsic features of the
executed program
They are not real data dependencies but storage conflicts
They are due to the competition of several instructions for the same
register
◦ They are only the consequence of the manner in which the programmer or
the compiler is using registers (or memory locations)
◦ In the previous examples the conflicts are produced only because:
The output dependency:
R4 is used by both instructions to store the result (due to, for example,
optimization of register usage)
The anti-dependency:
R3 is used by the second instruction to store the result
25
Limitations to ILP…
Output and Anti-Dependencies …
◦ Output dependencies and anti-dependencies can usually be eliminated
by using additional registers
◦ This technique is called register renaming
Output dependency example:
MUL R4,R3,R1 (R4 := R3 * R1)
...
ADD R4,R2,R5 (R4 := R2 + R5)
Anti-dependency example:
MUL R4,R3,R1 (R4 := R3 * R1)
...
ADD R3,R2,R5 (R3 := R2 + R5)
26
Limitations to ILP…
Output and Anti-Dependencies …
◦ Register renaming another example:
I1: R3b:=R3a + R5a
I2: R4b:=R3b + 1
I3: R3c:=R5a + 1
I4: R7b:=R3c + R4b
◦ A name without a subscript refers to the logical register in the instruction
◦ A name with a subscript is the hardware register allocated to it
◦ Note R3a, R3b, R3c
◦ Creating R3c avoids:
Anti-dependency on the second instruction
Output dependency on the first instruction
Doesn't interfere with the correct value being accessed by I4
27
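The renaming on this slide can be sketched as follows. This is a minimal model assuming an unbounded pool of physical registers named P0, P1, …; real hardware uses a finite pool with a free list, and the naming scheme here is purely illustrative:

```python
from itertools import count

def rename(instructions, num_logical=8):
    """Give every write a fresh physical register; reads use the latest map."""
    fresh = count()
    # Initial mapping: each logical register starts in some physical register.
    mapping = {f"R{i}": f"P{next(fresh)}" for i in range(num_logical)}
    renamed = []
    for dest, sources in instructions:
        new_sources = tuple(mapping[s] for s in sources)  # read current names
        mapping[dest] = f"P{next(fresh)}"                 # fresh reg per write
        renamed.append((mapping[dest], new_sources))
    return renamed

# The slide's sequence I1..I4, encoded as (dest, sources) pairs:
seq = [("R3", ("R3", "R5")),   # I1: R3 := R3 + R5
       ("R4", ("R3",)),        # I2: R4 := R3 + 1
       ("R3", ("R5",)),        # I3: R3 := R5 + 1
       ("R7", ("R3", "R4"))]   # I4: R7 := R3 + R4
```

After renaming, I1 and I3 write different physical registers, so their output and anti-dependencies disappear, while I2 still reads I1's result and I4 still reads I3's result.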
Limitations to ILP…
Effect of Dependencies
28
Instruction Issue Policies
In SS processors
◦ Instructions can be executed in an order different from the strictly sequential
one, with the requirement that the results must be the same
To optimize utilization of the various pipeline elements
Three types of ordering are important:
◦ The order in which instructions are fetched
◦ The order in which instructions are executed
◦ The order in which instructions update the contents of register /memory
locations
Instruction Issue:
◦ Refers to the process of initiating instruction execution in the processor’s
functional units
◦ Occurs when an instruction moves from the decode stage of the pipeline to the first
execute stage of the pipeline
29
Instruction Issue Policies…
Instruction Issue Policy
◦ Refers to the protocol used to issue instructions
Superscalar instruction issue policies fall into the following
categories:
◦ In-Order Issue with In-Order Completion
IOI with IOC
◦ In-Order Issue with Out-of-Order Completion
IOI with OOC
◦ Out-of-Order Issue with Out-of-Order Completion
OOI with OOC
30
Instruction Issue Policies…
In-Order Issue with In-Order Completion
◦ The simplest instruction issue policy
◦ Instructions are issued in exact program order, and completed in the
same order (with parallel issue and completion, of course!)
An instruction cannot be issued before the previous one has been issued
In order issue
An instruction cannot be completed before the previous one has been
completed
In order completion
◦ To guarantee in-order completion:
Issuing an instruction will stall temporarily when
There is a conflict, or
A unit requires more than one cycle to execute
31
Instruction Issue Policies…
In-Order Issue with In-Order Completion…
◦ Example:
Assume a superscalar pipeline with the following capabilities:
Can issue and decode two instructions per cycle
Has three functional units
Two single-cycle integer units and
One two-cycle floating-point unit
Can complete and write back two results per cycle
Also, assume an instruction sequence with the characteristics given
below:
I1 – needs two execute cycles (floating-point)
I2 –
I3 –
I4 – needs the same functional unit as I3
I5 – needs data value produced by I4
I6 – needs the same functional unit as I5
32
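The stall pattern of this example can be reproduced with a toy scheduler. This is a deliberate simplification: it models only the two-wide in-order issue, the functional-unit conflicts, the RAW dependency, and the in-order-completion constraint; it ignores the two-results-per-cycle write-back limit, and the unit names are illustrative:

```python
ISSUE_WIDTH = 2  # two instructions decoded/issued per cycle

def schedule_in_order(program, units):
    """In-order issue, in-order completion: issue stalls at the first
    instruction that cannot start this cycle; completion times are forced
    to be non-decreasing in program order."""
    issue_time, done_time = {}, {}
    unit_free = {u: 1 for u in units}  # cycle each unit is next free
    cycle, i = 1, 0
    while i < len(program):
        issued = 0
        while issued < ISSUE_WIDTH and i < len(program):
            name, unit, lat, deps = program[i]
            ready = max([1] + [done_time[d] + 1 for d in deps])
            start = max(cycle, ready, unit_free[unit])
            if start > cycle:
                break                     # stall: strict in-order issue
            issue_time[name] = start
            finish = start + lat - 1
            if done_time:                 # in-order completion
                finish = max(finish, max(done_time.values()))
            done_time[name] = finish
            unit_free[unit] = start + lat
            i += 1
            issued += 1
        cycle += 1
    return issue_time, done_time

# The slide's instruction sequence, as (name, unit, latency, dependencies):
program = [
    ("I1", "fp",   2, []),       # two-cycle floating-point
    ("I2", "int1", 1, []),
    ("I3", "int2", 1, []),
    ("I4", "int2", 1, []),       # needs the same unit as I3
    ("I5", "int1", 1, ["I4"]),   # needs the value produced by I4
    ("I6", "int1", 1, []),       # needs the same unit as I5
]
```

Running it shows I1/I2 issuing together, I4 stalling behind I3's unit, I5 waiting on I4's result, and I6 stuck behind I5 even though it is independent.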
Instruction Issue Policies…
In-Order Issue with In-Order Completion…
◦ The processor detects and handles (by stalling) true data
dependencies and resource conflicts
◦ As instructions are issued and completed in their strict order
The exploited parallelism is very much dependent on the
way the program has been written or compiled
Example:
If I3 and I6 switch positions,
the pairs I4/I6 and I3/I5 can be executed in parallel (see the
following slides)
To exploit such a parallelism improvement, the compiler needs
to perform elaborate data-flow analysis
33
Instruction Issue Policies…
In-Order Issue with In-Order Completion…
◦ Example
I1 – needs two execute cycles (floating-point)
I2 –
I6 – needs the same functional unit as I5
I4 – needs the same functional unit as I3
I5 – needs data value produced by I4
I3 –
34
Instruction Issue Policies…
In-Order Issue with In-Order Completion…
◦ The basic idea of SSA is not to rely on compiler-based
techniques
◦ SSA allows the hardware alone to detect instructions which
can be executed in parallel and to do that accordingly
◦ IOI with IOC is not very efficient, but it simplifies the
hardware
35
Instruction Issue Policies…
In-Order Issue with Out-of-Order Completion
◦ With out-of-order completion, a later instruction may complete before a
previous one
◦ Requires
More complex instruction issue logic than in-order completion
Attention to the machine state when an interrupt occurs
◦ Used to improve the performance of instructions that require multiple
cycles
Example: Long-latency operations such as division
I1 – needs two cycles
I2 –
I3 –
I4 – conflicts with I3
I5 – depends on I4
I6 – conflicts with I5
36
Instruction Issue Policies…
Out-of-Order Issue with Out-of-Order Completion
◦ With in-order issue
The processor will only decode instructions up to the point of a
dependency or conflict
No additional instructions are decoded until the conflict is resolved
The processor cannot look ahead of the point of conflict to subsequent
instructions
That may be independent of those already in the pipeline and
That may be usefully introduced into the pipeline
37
Instruction Issue Policies…
Out-of-Order Issue with Out-of-Order Completion…
◦ Out-of-order issue takes a set of decoded instructions, issues
any instruction, in any order, as long as the program
execution is correct
Decouples the decode pipeline from the execution pipeline, by introducing an
instruction window
When a functional unit becomes available, an instruction can be executed
Since instructions have been decoded, the processor can look ahead
38
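An out-of-order issue sketch, using a toy encoding of each instruction as (name, unit, latency, dependencies). It is a simplification with illustrative unit names and no write-back limit; each cycle, any instruction in the window whose operands are ready and whose unit is free may issue, regardless of program order:

```python
def schedule_out_of_order(program, units, width=2):
    """Out-of-order issue, out-of-order completion, from a full window."""
    issue, done = {}, {}
    unit_free = {u: 1 for u in units}  # cycle each unit is next free
    window = list(program)             # the whole program fits the window
    cycle = 1
    while window:
        issued = 0
        for instr in list(window):
            if issued == width:
                break
            name, unit, lat, deps = instr
            operands_ready = all(d in done and done[d] < cycle for d in deps)
            if operands_ready and unit_free[unit] <= cycle:
                issue[name] = cycle
                done[name] = cycle + lat - 1   # completes whenever it is done
                unit_free[unit] = cycle + lat
                window.remove(instr)
                issued += 1
        cycle += 1
    return issue, done

# Same toy sequence as the in-order example:
program = [
    ("I1", "fp",   2, []),
    ("I2", "int1", 1, []),
    ("I3", "int2", 1, []),
    ("I4", "int2", 1, []),       # unit conflict with I3
    ("I5", "int1", 1, ["I4"]),   # true dependency on I4
    ("I6", "int1", 1, []),       # unit conflict with I5
]
```

Here the independent I6 issues ahead of the stalled I4, so the sequence finishes a cycle earlier than under strict in-order issue, while the true dependency I4 → I5 is still respected.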
Instruction Issue Policies…
Out-of-Order Issue with Out-of-Order Completion…
◦ Example
Instructions have a similar relationship as indicated in the previous slide
I1 – needs two execute cycles
I2 –
I3 –
I4 – conflicts with I3
I5 – depends on I4
I6 – conflict with I5
39
Superscalar Execution Review
The instruction fetch process, which includes branch prediction
◦ Used to form a dynamic stream of instructions
◦ This stream is examined for dependencies, and the processor may remove artificial
dependencies
The processor then dispatches the instructions into a window of execution
◦ In the window
Instructions are structured according to their true data dependencies
No longer form a sequential stream
The processor performs the execution stage of each instruction
◦ In an order determined by
The true data dependencies
The hardware resource availability
Finally, instructions are conceptually put back into sequential order and their
results are recorded
◦ Referred to as committing or retiring the instruction
40
Superscalar Execution Review…
Committing or retiring instructions is needed for the following reasons:
◦ Use of parallel, multiple pipelines
Instructions may complete in an order different from that shown in
the static program
◦ Use of branch prediction and speculative execution
Some instructions may be abandoned
Permanent storage and program-visible registers:
◦ Cannot be updated immediately when instructions complete
execution
◦ Results are held in some sort of temporary storage
Usable by dependent instructions
Made permanent when it is determined that the sequential model
would have executed the instruction
41
Superscalar Execution Review…
Figure: Superscalar Execution
42
Superscalar Implementation
Instruction fetch strategies that simultaneously fetch multiple instructions
◦ Often by predicting the outcomes and
◦ Fetching beyond conditional branch instructions
This requires the use of
Multiple pipeline fetch and decode stages
Branch prediction logic
Logic for determining true dependencies involving register values and
mechanism for communicating these values to where they are needed
during execution
Mechanism for initiating, or issuing, multiple instructions in parallel
Resources for parallel execution of multiple instructions, including
◦ Multiple pipelined functional units
◦ Memory hierarchies capable of simultaneously servicing multiple memory
references
Mechanism for committing the process state in correct order
43
Superscalar Execution Review…
Figure
◦ In the figure two floating point and two integer operations can be issued and
executed simultaneously
◦ Each unit is also pipelined and can execute several operations in different
pipeline stages
44
Superscalar Execution Review…
Figure…
◦ Another view of superscalar processor organization
45
Summary
The following techniques are the main features of superscalar processors:
◦ Several pipelined units which are working in parallel
◦ Out-of-order issue and out-of-order completion
◦ Register renaming
All of the above techniques are aimed to enhance performance
Experiments have shown:
◦ Adding additional functional units alone is not very efficient
◦ Out-of-order issue is extremely important; it allows the processor to look ahead
for independent instructions
◦ Register renaming can improve performance by more than 30%; in
this case performance is limited only by true dependencies
◦ It is important to provide a fetching/decoding capacity so that the
window of execution is sufficiently large
46
Reading Assignment
Superpipelining processor
◦ Comparison with
Ordinary pipeline
Superscalar
◦ Advantage of superpipelined
Superpipelined superscalar processors
◦ Characteristics
Instruction Level Parallelism Vs Machine Parallelism
CPI Vs IPC
Clock cycles Per Instruction
Instructions Per Clock cycle
47
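For the CPI vs IPC item: the two metrics are reciprocals of each other, so a machine that sustains 1.6 instructions per clock has a CPI of 0.625. A trivial sketch (the numbers are illustrative, not measured):

```python
def ipc(instructions, cycles):
    """Instructions per clock cycle -- higher is better for superscalars."""
    return instructions / cycles

def cpi(instructions, cycles):
    """Clock cycles per instruction -- the reciprocal of IPC."""
    return cycles / instructions
```

A superscalar of degree 2 has an ideal IPC of 2 (CPI of 0.5); dependencies and resource conflicts keep the sustained value below that.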