Lecture 3 Multi-core computing
[Figure: Multi-core computing. A single chip with four cores: core 1, core 2, core 3, core 4]
Programming for multi-core
Programmers must use threads or processes.
Distributed memory:
In this model, each processor has its own
(small) local memory, and its content is not
replicated anywhere else.
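As a rough illustration of "use threads or processes", the sketch below splits a summation across worker threads with Python's standard `concurrent.futures` module. The helper names `chunked` and `parallel_sum` are hypothetical and chosen only for this example; they do not come from the lecture.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, parts):
    """Split data into `parts` contiguous slices of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum(data, workers=4):
    """Sum `data` by handing each worker thread its own slice."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(sum, chunked(data, workers))
        # Combine the per-thread partial results into the final answer.
        return sum(partial_sums)


if __name__ == "__main__":
    print(parallel_sum(list(range(1000))))  # 499500
```

One caveat on this design choice: in standard CPython builds the global interpreter lock keeps CPU-bound threads from running Python bytecode on several cores at once, so for compute-heavy work `ProcessPoolExecutor` (same interface, separate processes with separate memory) is the usual substitute.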
Microprocessor Design
Taking the idea of superscalar operations to
the next level, it is possible to put multiple
microprocessor cores onto a single chip, and
have the cores operate in parallel with one
another.
Symmetric Multi-core Processor (SMP)
A symmetric multi-core processor is one that has
multiple cores on a single chip, and all of those cores are
identical.
Example: Intel Core 2:
The Intel Core 2 is an example of a symmetric
multi-core processor.
The Core 2 can have either 2 cores on chip ("Core 2
Duo") or 4 cores on chip ("Core 2 Quad").
The cores in the Core 2 chip are identical, and
each can function independently of the others.
A mixture of scheduling software and hardware
is required to farm tasks out to each core.
Symmetric Multi-core Processor
Server / Supercomputer
Asymmetric Multi-core Processor
An asymmetric multi-core processor is one that has multiple
cores on a single chip, but those cores might be different
designs. For instance, there could be 2 general purpose cores
and 2 vector cores on a single chip.
▪ Example: Cell Processor
▪ IBM's Cell processor, used in the Sony PlayStation 3 video game
console, is an asymmetric multi-core processor. The Cell has 9
processor cores on board: one general-purpose processor and 8
data-processing cores.
▪ The one general-purpose core, known as the Power Processor Element
(PPE), controls the communication between the other cores and
distributes computing tasks to them for processing. The
other 8 cores are known as Synergistic Processor Elements (SPEs)
and are specially designed for high floating-point throughput,
especially with vector operations.
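The division of labour described above (one controlling core that hands tasks to specialised cores) can be sketched as a task farm. This is only an analogy written with ordinary Python threads and queues; it is not Cell SDK code, and the names `worker`, `farm_out`, and `NUM_WORKERS` are hypothetical.

```python
import queue
import threading

NUM_WORKERS = 8  # mirrors the Cell's 8 SPEs; purely illustrative


def worker(tasks, results):
    """Worker role (SPE-like): take a vector task, compute, report back."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        task_id, vector = item
        # Stand-in for vector arithmetic: the sum of squares.
        results.put((task_id, sum(x * x for x in vector)))


def farm_out(vectors):
    """Controller role (PPE-like): distribute tasks, collect results in order."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for task_id, vector in enumerate(vectors):
        tasks.put((task_id, vector))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Results arrive in completion order, so re-sort them by task id.
    collected = dict(results.get() for _ in vectors)
    return [collected[i] for i in range(len(vectors))]


if __name__ == "__main__":
    print(farm_out([[1, 2], [3], [0, 0, 4]]))  # [5, 9, 16]
```

Only the controller touches both queues from the outside, which loosely mirrors the PPE's role of coordinating communication while the workers just compute.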
Asymmetric Multi-core Processor
▪ In an asymmetric multi-core processor, the chip
has multiple cores onboard, but the cores might be
different designs.
▪ Cores of different designs have different capabilities.
Example: IBM Cell Processor
An example of an asymmetric multi-core processor is the IBM Cell
processor.
The IBM Cell processor has 1 Power Processor Element (PPE)
that controls the chip, and 8 Synergistic Processor Elements
(SPEs) that are designed for high mathematical throughput.
[Figure: block diagram of the IBM Cell processor]
Notice that in the diagram the SPE cores connect only to the PPE, and not to
each other. Notice also that the PPE core is much larger than the
individual SPE cores.
Asymmetric Multi-core Processor (ASMP)
– Cell Processor
• Applications
Supercomputing:
▪ IBM's latest supercomputer, IBM Roadrunner, is a hybrid
that combines general-purpose CISC Opteron processors
with Cell processors.
Home cinema
▪ Toshiba is considering producing HDTVs using Cell.
They have already presented a system that decodes
48 standard-definition MPEG-2 streams.
This lets a viewer choose a channel from dozens of
thumbnail videos displayed on the screen at the
same time.
Video Processing Card
▪ Some companies, such as
Leadtek, have plans to
release a PCI-E card based
upon the Cell to allow for
"faster than real time"
transcoding of H.264,
MPEG-2 and MPEG-4
video.
Console Video Games
▪ The first major commercial
application of Cell was in Sony's
PlayStation 3 game console.
▪ The console's Cell processor is clocked at 3.2 GHz.
Only seven of its eight SPEs are enabled, which
allows Sony to increase the yield of processor
manufacture. Of those seven SPEs, only six are
accessible to developers, because one is reserved
by the OS.
Challenges resulting from multi-core
Relies on effective exploitation of multiple-thread parallelism
  ▪ Need for a parallel computing model and a parallel programming model
Magnifies the memory wall
  ▪ Memory bandwidth
    ▪ Way to get data out of memory banks
    ▪ Way to get data into multi-core processor array
  ▪ Memory latency
  ▪ Fragments L3 cache
Pins become a bottleneck
  ▪ Rate of pin growth projected to slow and flatten
  ▪ Rate of bandwidth per pin (pair) projected to grow slowly
Requires mechanisms for efficient inter-processor coordination
  ▪ Synchronization
  ▪ Mutual exclusion
  ▪ Context switching
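To make the coordination mechanisms listed above concrete, here is a minimal sketch of mutual exclusion using a lock from Python's standard `threading` module. The class `SafeCounter` and the function `count_in_parallel` are hypothetical names used only for illustration.

```python
import threading


class SafeCounter:
    """A counter whose updates are protected by a mutual-exclusion lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may execute the read-modify-write below.
        with self._lock:
            self.value += 1


def count_in_parallel(num_threads, increments_per_thread):
    """Have several threads increment one shared counter, then return it."""
    counter = SafeCounter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # synchronization point: wait until every thread is done
    return counter.value


if __name__ == "__main__":
    print(count_in_parallel(4, 10000))  # 40000
```

Without the lock, two threads could read the same old value and both write back the same new one, losing an update; the `join` calls are a simple form of synchronization that guarantees all increments have finished before the result is read.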
Advantages of Multi-core
Cache coherency circuitry can operate at a much
higher clock rate than is possible if the signals
have to travel off-chip.
Because signals between different CPUs travel shorter
distances, those signals degrade less.
These higher quality signals allow more data to be
sent in a given time period since individual signals
can be shorter and do not need to be repeated as
often.
A dual-core processor uses slightly less power
than two coupled single-core processors.
Disadvantages of Multi-core
Ability of multi-core processors to increase application
performance depends on the use of multiple threads within
applications.
Most current video games will run faster on a 3 GHz
single-core processor than on a 2 GHz dual-core processor
(of the same core architecture).
Two processing cores that share the same system bus and
memory bandwidth see a limited real-world performance
advantage.
If a single core is close to being memory bandwidth
limited, going to dual-core might only give 30% to 70%
improvement.
If memory bandwidth is not a problem, a 90%
improvement can be expected.
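The figures above can be combined in a back-of-the-envelope calculation. The sketch below assumes, as a simplification not stated in the lecture, that performance is proportional to clock frequency times the multi-core speedup, and it uses Amdahl's law to relate that speedup to the fraction of the program that runs in parallel. The function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup when only part of the work can be parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_performance(clock_ghz, speedup=1.0):
    """Assumed model: performance scales with clock rate times speedup."""
    return clock_ghz * speedup


if __name__ == "__main__":
    single_core = relative_performance(3.0)
    # Dual-core improvements quoted above: 30%, 70%, and 90%.
    for improvement in (0.30, 0.70, 0.90):
        dual_core = relative_performance(2.0, 1.0 + improvement)
        verdict = "faster" if dual_core > single_core else "slower"
        print(f"{improvement:.0%} improvement: 2 GHz dual-core is {verdict} "
              f"than 3 GHz single-core")
```

Under this model the 2 GHz dual-core chip overtakes the 3 GHz single-core chip only when the second core adds more than 50% (2 × 1.5 = 3). By Amdahl's law that corresponds to more than two thirds of the program running in parallel, which is consistent with largely single-threaded games running faster on the single-core part.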
Conclusion
All computers are now parallel computers!
Multi-core processors represent an important new trend in computer
architecture.
Decreased power consumption and heat generation.
Minimized wire lengths and interconnect latencies.
They enable true thread-level parallelism with great energy efficiency and
scalability.
To utilize their full potential, applications will need to move from a single-threaded
to a multi-threaded model.
Parallel programming techniques are likely to gain importance.
• The difficult problem is not building multi-core hardware, but programming it in a way
that lets mainstream applications benefit from the continued exponential growth in
CPU performance.
• The software industry needs to get back into the state where existing
applications run faster on new hardware.