2023 CSC14120 Lecture00 CourseIntroduction
Course Introduction
CPU vs GPU
CPU (multicore)
- Has a few cores; each core is powerful and complex
- Focuses on execution speed
GPU (many-core)
- Has many cores; each core is weak and simple
- Focuses on throughput
Challenges in parallel programming
3 Ways to Accelerate Applications
Libraries: Easy, High-Quality
• Ease of use: enables GPU acceleration without in-depth
knowledge of GPU programming
• “Drop-in”: Many GPU-accelerated libraries follow
standard APIs, thus enabling acceleration with minimal
code changes
• Quality: Libraries offer high-quality implementations of
functions encountered in a broad range of applications
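To make the "ease of use" point concrete, here is a minimal sketch using Thrust, a library shipped with the CUDA Toolkit whose interface resembles the C++ standard algorithms. The helper name add_and_sum and the input values are hypothetical, chosen only for illustration; no hand-written GPU kernel is needed.

```cuda
// Sketch: element-wise addition and a sum on the GPU using only library calls.
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>
#include <thrust/transform.h>

// Hypothetical helper: computes c[i] = a[i] + b[i] on the GPU
// and returns the sum of all elements of c.
float add_and_sum(int n, float a_value, float b_value) {
    // device_vector allocates GPU memory and fills it with the given value.
    thrust::device_vector<float> a(n, a_value);
    thrust::device_vector<float> b(n, b_value);
    thrust::device_vector<float> c(n);

    // Element-wise addition, executed on the GPU.
    thrust::transform(a.begin(), a.end(), b.begin(), c.begin(),
                      thrust::plus<float>());

    // Parallel reduction on the GPU; the result is returned to the host.
    return thrust::reduce(c.begin(), c.end(), 0.0f, thrust::plus<float>());
}

int main() {
    const int n = 1024;
    float total = add_and_sum(n, 1.0f, 2.0f);
    std::printf("sum = %.1f (expected %.1f)\n", total, 3.0f * n);
    return 0;
}
```

The calls mirror std::transform and a standard accumulation, which is the sense in which such a library can be close to "drop-in" for existing C++ code.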
NVIDIA GPU Accelerated Libraries
https://developer.nvidia.com/gpu-accelerated-libraries
Compiler Directives: Easy, Portable
• Ease of use: Compiler takes care of details of parallelism
management and data movement
• Portable: The code is generic, not specific to any type of
hardware, and directives can be used in multiple languages
• Uncertain: Performance of code can vary across compiler
versions
Compiler Directives: OpenACC
https://ulhpc-tutorials.readthedocs.io/en/latest/gpu/openacc/basics/
Programming Languages: Most Performant and Flexible
• Performance: The programmer has the most control over
parallelism and data movement
• Flexible: The computation does not need to fit into a
limited set of library patterns or directive types
• Verbose: The programmer often needs to spell out more
details
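The trade-off above can be seen in a minimal CUDA sketch of vector addition. The names add_vectors and run_add_vectors, the block size of 256, and the test data are assumptions made for illustration. Note how allocation, copies in both directions, and the thread-to-element mapping are all written out by hand.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: each thread computes one element of c.
__global__ void add_vectors(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];                  // guard the last block
}

// Hypothetical helper: runs the kernel on n elements and
// returns true if every result matches the expected value.
bool run_add_vectors(int n) {
    size_t bytes = n * sizeof(float);
    std::vector<float> h_a(n), h_b(n), h_c(n);
    for (int i = 0; i < n; ++i) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

    // Explicit device allocation and host-to-device copies.
    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    cudaMalloc((void **)&d_a, bytes);
    cudaMalloc((void **)&d_b, bytes);
    cudaMalloc((void **)&d_c, bytes);
    cudaMemcpy(d_a, h_a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b.data(), bytes, cudaMemcpyHostToDevice);

    // Explicit choice of parallelism: enough 256-thread blocks to cover n.
    int block = 256;
    int grid = (n + block - 1) / block;
    add_vectors<<<grid, block>>>(d_a, d_b, d_c, n);

    // Copy the result back; this call waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h_c.data(), d_c, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i)
        if (h_c[i] != h_a[i] + h_b[i]) return false;
    return true;
}

int main() {
    std::printf("%s\n", run_add_vectors(1000) ? "PASSED" : "FAILED");
    return 0;
}
```

Every step here is under the programmer's control, which is where both the performance potential and the verbosity come from.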
Course topics:
• Introduction to CUDA; example: vector addition, convolution, … (3 weeks)
• GPU parallel execution in CUDA; example: reduction, … (4 weeks)
• Types of GPU memories in CUDA; example: reduction, convolution, … (3 weeks)
• Example: scan, histogram, sort (4 weeks)
• Optimizing a CUDA program; additional topics in parallel programming (1 week)
After successfully completing the course, the student will be able to:
• Parallelize common tasks to run on GPU using CUDA
• Apply knowledge of GPU parallel execution in CUDA to speed up a CUDA program
• Apply knowledge of GPU memories in CUDA to speed up a CUDA program
• Apply the optimization process to optimize a CUDA program
• Apply teamwork skills to complete the final project
Course assessment
• Individual exercises throughout the course: 50% of the
grade
• Group final project: 50% of the grade, 2 students /
group
Remember: the main goal is to learn, truly learn.
If you violate this rule, you will receive a score of 0 for the course.
Advice
• In this course, we will focus on parallel programming on
GPU (Graphics Processing Unit)
• Don’t worry if you don’t have a GPU ;-)
• We will use Google Colab for this course.
Setup coding environment
• Where to find a machine with a CUDA-enabled GPU?
  • Google Colab: it’s free and ready to run CUDA programs ☺
  • Even if you have your own GPU, you should use Google Colab because
    the teacher will use it to run and grade your programs
• Code, compile, and run:
  • Write and save your code (.cu file) on your local machine with your favorite editor
    (if the editor does not recognize .cu files and does not highlight the syntax,
    the simple fix is to set the language/syntax to C/C++)
  • Open a notebook in Colab (you must sign in to your Google account), select “Runtime,
    Change runtime type”, set “Hardware accelerator” to GPU, and upload the .cu file
  • In a Colab cell, compile: !nvcc file-name.cu -o run-file-name
    • If we don’t specify run-file-name, it will default to a.out
  • In a Colab cell, run: !./run-file-name
• Demo …
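As a quick way to check that the Colab GPU runtime is active before compiling real exercises, a small device-query program along the following lines could be used. This is a sketch, not part of the course material; the file name check_gpu.cu and the helper count_cuda_devices are hypothetical.

```cuda
// check_gpu.cu (hypothetical file name): lists the CUDA devices visible to the runtime.
#include <cstdio>
#include <cuda_runtime.h>

// Returns the number of CUDA-capable devices, or 0 if the query fails.
int count_cuda_devices() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 0;
    }
    return count;
}

int main() {
    int count = count_cuda_devices();
    if (count == 0) {
        std::printf("No CUDA-capable GPU found; check that the runtime type is set to GPU.\n");
        return 1;
    }
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        std::printf("Device %d: %s\n", d, prop.name);
        std::printf("  Compute capability: %d.%d\n", prop.major, prop.minor);
        std::printf("  Multiprocessors:    %d\n", prop.multiProcessorCount);
        std::printf("  Global memory:      %zu MiB\n", prop.totalGlobalMem >> 20);
    }
    return 0;
}
```

Following the steps above, this would be compiled in a Colab cell with !nvcc check_gpu.cu -o check_gpu and run with !./check_gpu.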
RESOURCES
• Wen-mei W. Hwu, David B. Kirk, and Izzat El Hajj.
Programming Massively Parallel Processors: A Hands-on
Approach. Morgan Kaufmann, 2022.
• David B. Kirk and Wen-mei W. Hwu. Programming Massively
Parallel Processors. Morgan Kaufmann, 2016.
• John Cheng, Max Grossman, and Ty
McKercher. Professional CUDA C Programming. John Wiley
& Sons, 2014.
• Lê Hoài Bắc, Vũ Thanh Hưng, Trần Trung Kiên. Lập trình
song song trên GPU [Parallel Programming on GPU]. NXB KH & KT, 2015.
• NVIDIA. Intro to Parallel Programming. Udacity.
• NVIDIA. CUDA Toolkit Documentation.
Reference
• [1] Slides from Illinois-NVIDIA GPU Teaching Kit
• [2] Wen-mei W. Hwu, David B. Kirk, and Izzat El Hajj.
Programming Massively Parallel Processors: A Hands-on
Approach. Morgan Kaufmann, 2022.