4 - W4-Intro To Parallel and Distributed Computing
CS443
Introduction to Parallel and Distributed Computing
(WEEK 4)
LECTURE - 7 & 8
GPU architecture and programming
Lecturer
Muhammad Sheraz Tariq
Sheraz.t@scocs.edu.pk
Course Grading Policy
Quizzes = 20%
Mid-Term = 30%
Course Outline
1. Identify the key characteristics that distinguish a GPU from a CPU.
2. Understand how a GPU differs from a CPU.
3. Discuss how the graphics pipeline is structured, relating its stages to parallel
programming concepts.
4. Discuss some of the issues that arise in managing tasks distributed across the
shader processors.
Remaining part of lecture 6
A system that fails is not adequately providing the services it was designed for.
If we consider a distributed system as a collection of servers that communicate with one another
and with their clients, not adequately providing services means that servers, communication
channels, or possibly both, are not doing what they are supposed to do.
However, a malfunctioning server itself may not always be the fault we are looking for.
If such a server depends on other servers to provide its services adequately, the cause of an
error may lie elsewhere.
Fault Tolerance – Failure Models
To get a better grasp on how serious a failure actually is, several classification schemes have
been developed.
One such scheme is shown in Table-1, and is based on schemes described in Cristian (1991) and
Hadzilacos and Toueg (1993).
Fault Tolerance – Failure Models
Table-1: Different types of failures (after Cristian, 1991; Hadzilacos and Toueg, 1993)

  Crash failure               A server halts, but was working correctly until it halted
  Omission failure            A server fails to respond to incoming requests
    Receive omission          A server fails to receive incoming messages
    Send omission             A server fails to send messages
  Timing failure              A server's response lies outside the specified time interval
  Response failure            A server's response is incorrect
    Value failure             The value of the response is wrong
    State transition failure  The server deviates from the correct flow of control
  Arbitrary failure           A server may produce arbitrary responses at arbitrary times
Fault Tolerance – Crash Failure
A crash failure occurs when a server prematurely halts, but was working correctly until it stopped.
An important aspect of crash failures is that once the server has halted, nothing is heard from it
anymore.
A typical example of a crash failure is an operating system that comes to a grinding halt, and for
which there is only one solution: reboot it.
Many personal computer systems suffer from crash failures so often that people have come to expect
them to be normal.
So, moving the reset button from the back of a cabinet to the front was done for good reason.
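Since a crashed server is simply never heard from again, the standard way for a client to detect a suspected crash is a timeout on the reply. A minimal sketch in Python (the names `flaky_server` and `ping` are illustrative, not from the lecture):

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def flaky_server(healthy: bool) -> str:
    """Simulated server: replies normally, or goes silent after a crash."""
    if not healthy:
        time.sleep(2.0)  # a crashed server is never heard from again
    return "pong"

def ping(healthy: bool, timeout: float = 0.5) -> str:
    """Treat silence past the timeout as a suspected crash failure."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(flaky_server, healthy)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return "suspected crash"
    finally:
        pool.shutdown(wait=False)

print(ping(healthy=True))    # pong
print(ping(healthy=False))   # suspected crash
```

Note that a timeout can only ever yield a *suspicion* of a crash: in an asynchronous system, a very slow server is indistinguishable from a halted one.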
Fault Tolerance – Omission Failure
An omission failure occurs when a server fails to respond to a request.
Other types of omission failures, not related to communication, may be caused by software errors
such as infinite loops or improper memory management, in which case the server is said to "hang."
Fault Tolerance – Timing Failure
Timing failures occur when the response lies outside a specified real-time interval.
As we discussed earlier, providing data too soon may easily cause trouble for a
recipient if there is not enough buffer space to hold all the incoming data.
More common, however, is that a server responds too late, in which case a performance failure is said
to occur.
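A timing failure can be checked by measuring the response time against the specified interval. A small sketch (the deadline value and function names are illustrative assumptions):

```python
import time

DEADLINE = 0.2  # seconds; the specified real-time interval (assumed value)

def classify_response(server_delay: float) -> str:
    """Call a simulated server and classify the outcome against DEADLINE."""
    start = time.monotonic()
    time.sleep(server_delay)  # stands in for the remote call
    elapsed = time.monotonic() - start
    if elapsed <= DEADLINE:
        return "in time"
    return "performance failure"  # reply is correct, but arrives too late

print(classify_response(0.05))  # in time
print(classify_response(0.35))  # performance failure
```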
Fault Tolerance – Response Failure
A serious type of failure is a response failure, in which the server's response is simply incorrect.
In the case of a value failure, a server simply provides the wrong reply to a request.
For example, a search engine that systematically returns Web pages not related to any of the search
terms used has failed.
Fault Tolerance – Response Failure
A state transition failure happens when the server reacts unexpectedly to an incoming
request.
For example, if a server receives a message it cannot recognize, a state transition failure occurs if no
measures have been taken to handle such messages.
In particular, a faulty server may incorrectly take default actions it should never have initiated.
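One way to avoid this class of failure is to make the server reject unrecognized messages explicitly instead of silently taking a default action. A hedged sketch with a dictionary dispatcher (all message types and handler names are hypothetical):

```python
# Hypothetical message handlers for a toy account server.
def handle_deposit(amount):
    return f"deposited {amount}"

def handle_withdraw(amount):
    return f"withdrew {amount}"

HANDLERS = {"deposit": handle_deposit, "withdraw": handle_withdraw}

def serve(message: dict) -> str:
    handler = HANDLERS.get(message.get("type"))
    if handler is None:
        # Reject rather than guess: an unplanned default action here is
        # exactly the state transition failure described above.
        return "error: unrecognized message type"
    return handler(message["amount"])

print(serve({"type": "deposit", "amount": 10}))   # deposited 10
print(serve({"type": "transfer", "amount": 10}))  # error: unrecognized message type
```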
Fault Tolerance – Arbitrary Failure
The most serious are arbitrary failures, also known as Byzantine failures.
In effect, when arbitrary failures occur, clients should be prepared for the worst.
In particular, it may happen that a server is producing output it should never have produced, but
which cannot be detected as being incorrect. Worse yet, a faulty server may even be maliciously
working together with other servers to produce intentionally wrong answers.
This situation illustrates why security is also considered an important requirement when talking about
dependable systems.
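Arbitrary failures are typically masked by replicating the server and voting on the replies. Tolerating k truly Byzantine servers in general requires at least 3k+1 replicas and an agreement protocol, but a minimal majority-vote sketch (names illustrative) conveys the basic idea:

```python
from collections import Counter

def vote(replies):
    """Return the strict-majority answer among replica replies, or None
    when no majority exists (the client must then treat the result as suspect)."""
    answer, count = Counter(replies).most_common(1)[0]
    return answer if count > len(replies) // 2 else None

# Three replicas answer a query; one Byzantine server replies maliciously.
print(vote([42, 42, 99]))  # 42   (the faulty reply is outvoted)
print(vote([42, 17, 99]))  # None (no majority; result cannot be trusted)
```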
Fault Tolerance – Fail-Stop Failure
Crash failures as defined above are the most benign way for a server to halt.
They are also referred to as fail-stop failures. In effect, a fail-stop server simply stops producing
output in such a way that its halting can be detected by other processes.
In the best case, the server may have been friendly enough to announce that it was about to crash;
otherwise, it simply stops.
Fault Tolerance – Fail-Safe Failure
Finally, there are also occasions in which the server is producing random output, but this output can
be recognized by other processes as plain junk.
The server is then exhibiting arbitrary failures, but in a benign way; such failures are also referred
to as being fail-safe.
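Output can be made recognizable as junk by framing it with a checksum, so that receivers drop corrupted messages instead of acting on them. A small sketch using CRC32 (the framing scheme is an illustrative assumption, not from the lecture):

```python
import zlib

def frame(payload: bytes) -> bytes:
    """Sender appends a CRC32 so receivers can recognize junk output."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def accept(message: bytes):
    """Receiver: return the payload, or None if the message is plain junk."""
    payload, crc = message[:-4], message[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != crc:
        return None  # drop corrupted output: fail-safe behaviour
    return payload

good = frame(b"result=7")
print(accept(good))  # b'result=7'

corrupted = good[:-1] + bytes([good[-1] ^ 0xFF])  # flip bits in the checksum
print(accept(corrupted))  # None
```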
GPU computing
GPU computing is the use of a GPU (graphics processing unit) as a co-processor to accelerate CPUs
for general-purpose scientific and engineering computing.
The GPU accelerates applications running on the CPU by offloading some of the compute-intensive and
time-consuming portions of the code.
The rest of the application still runs on the CPU.
From a user's perspective, the application runs faster because it's using the massively parallel
processing power of the GPU to boost performance.
This is known as "heterogeneous" or "hybrid" computing.
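A rough sketch of this heterogeneous model in plain Python, where a thread pool stands in for the GPU's many cores (all names are illustrative): the host code drives the application and hands only the data-parallel, compute-intensive kernel to the pool.

```python
from concurrent.futures import ThreadPoolExecutor

def saxpy_kernel(args):
    """y = a*x + y for one chunk; each element is independent of the others,
    which is exactly what makes the work offloadable to many cores."""
    a, xs, ys = args
    return [a * x + y for x, y in zip(xs, ys)]

def saxpy(a, x, y, workers=4):
    """Host side: split the data, dispatch chunks, gather the results."""
    n = len(x)
    step = max(1, n // workers)
    chunks = [(a, x[i:i + step], y[i:i + step]) for i in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        out = []
        for part in pool.map(saxpy_kernel, chunks):
            out.extend(part)
    return out

print(saxpy(2.0, [1, 2, 3, 4], [10, 20, 30, 40]))  # [12.0, 24.0, 36.0, 48.0]
```

On a real system the kernel would be written in something like CUDA or OpenCL and launched over thousands of GPU threads, but the division of labour between host and accelerator is the same.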
GPU computing
A typical CPU has four to eight cores, while a GPU consists of hundreds of smaller cores.
This massively parallel architecture is what gives the GPU its high compute performance.
There are a number of GPU-accelerated applications that provide an easy way to access high-
performance computing (HPC).
How is a GPU different from a CPU?
- Extremely parallel: different pixels and elements of the image can be operated on independently.
- Hundreds of cores execute at the same time to take advantage of this fundamental parallelism.
- Originally, software developers could set parameters (textures, light reflection colors, blend
  modes), but the function was completely controlled by the hardware.
- Pipeline stages are now programs running on processor cores inside the GPU, instead of
  fixed-function ASICs.
- Vertex shaders = programs running on vertex processors; fragment shaders = programs running on
  fragment processors.
- Different types of shader cores are combined into a single unified shader core.
- Dynamic task scheduling balances the load on all cores:
  - Frames with many "edges" (vertices) require more vertex shader work.
  - Frames with large primitives require more pixel shader work.
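The dynamic scheduling idea above can be sketched as a shared work queue from which idle workers pull whatever task comes next, regardless of shader type, so the load balances itself for any mix of vertex and pixel work. This is a hypothetical software simulation; a real GPU performs this scheduling in hardware.

```python
import queue
import threading

def run_frame(tasks, n_workers=4):
    """Workers (stand-ins for unified shader cores) pull tasks until empty."""
    work = queue.Queue()
    for t in tasks:
        work.put(t)
    done, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                kind, payload = work.get_nowait()
            except queue.Empty:
                return  # no work left for this core
            result = f"{kind}:{payload}"  # stands in for running the shader
            with lock:
                done.append(result)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done

# A frame with a mix of vertex-heavy and pixel-heavy work.
frame = [("vertex", i) for i in range(3)] + [("pixel", i) for i in range(5)]
print(len(run_frame(frame)))  # 8
```

Because every worker draws from the same queue, a vertex-heavy frame simply sees more cores spend their time on vertex tasks, with no static partitioning of cores by shader type.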
Solution: Unified Shader
Pixel shaders, geometry shaders, and vertex shaders run on the same core - a unified shader core
Shader cores are programmed using graphics APIs like OpenGL and Direct3D
References
1. A. S. Tanenbaum and M. van Steen, Distributed Systems: Principles and Paradigms, 2nd ed.,
Prentice Hall, 2007.
2. K. Hwang, J. Dongarra, and G. C. Fox, Distributed and Cloud Computing: Clusters, Grids, Clouds,
and the Future Internet, 1st ed., Elsevier.