DPC-Lab2-Parallelism in Intel's Modern Architecture

Lab manual for Distributed and Parallel Computing course.

Uploaded by

Fawnizu Hussin

EDB4603 (DPC) Jan’22 – Lab 1

Part 1: Vectorization
1. For this lab exercise, we will go through several code optimizations that take
advantage of modern processors’ SIMD instructions (vectorization).
2. We will use Coursera’s online course “Fundamentals of Parallelism on
Intel Architecture”, available at https://www.coursera.org/learn/parallelism-ia
3. This lab exercise is based on the Week 2 materials of “Fundamentals of
Parallelism on Intel Architecture”.
4. Follow the instructions on the ULearn page to download and set up the Ubuntu
Virtual Machine (UVM) on your personal computer. The UVM has been set up
with the necessary tools and libraries required for the lab exercises.

Exercise 1.1 – Vectorization of Stencil Code


1. Download the stencil code at https://www.coursera.org/learn/parallelism-ia/supplement/2d9MO/code-download and copy it to your UVM’s home directory.
2. Review this video to understand what the stencil code does:
https://www.coursera.org/learn/parallelism-ia/lecture/dUuoy/2-8-1-stencil-introduction
3. Using the demonstration video “2.8 Stencil” as reference, vectorize the stencil
code by considering the following optimization options.
a. Vectorize the main computation loop in stencil.cc by including the
necessary #pragmas to inform the compiler that there is no vector
dependence in the inner loop.
b. Further improve the code’s execution by taking advantage of the vector ISA
extensions available on your UVM. Hint: check the latest ISA extensions
available on your UVM and inform the compiler to compile for that
target microarchitecture.

4. Remember to use the Intel compilers and the -qopt-report option to verify
vectorization or diagnose the cause of non-vectorization. In your report, you
should include the following:
a. Details of the relevant specifications of the computer running the UVM,
such as (but not limited to) the CPU name and the vector ISA extensions
supported, followed by each optimization step.
b. For each optimization step, report what was done, the estimated
performance gain from the qopt-report, and the actual average execution
time of the stencil code.
c. Your explanation of the results and observations on average memory
bandwidth and computation speed.
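The vectorization steps above can be sketched as follows. This is an illustrative sketch only: the function name ApplyStencil, the array names, and the 3x3 edge-detection stencil coefficients are assumptions for illustration, not the exact contents of the downloaded stencil.cc.

```cpp
#include <cassert>

// Illustrative 3x3 stencil sweep (names and coefficients are assumptions).
// Compile with the Intel compiler to get the vectorization report, e.g.:
//   icpc -O2 -xHost -qopenmp-simd -qopt-report=5 stencil.cc
// -xHost targets the highest vector ISA of the build machine (step 3b);
// the exact invocation may differ on your UVM.
void ApplyStencil(const float* in, float* out, int width, int height) {
  for (int i = 1; i < height - 1; i++) {
    // Step 3a: assert to the compiler that the inner loop carries no
    // vector dependence, so it can be vectorized.
#pragma omp simd
    for (int j = 1; j < width - 1; j++) {
      out[i*width + j] =
        -in[(i-1)*width + (j-1)] - in[(i-1)*width +  j  ] - in[(i-1)*width + (j+1)]
        -in[ i   *width + (j-1)] + 8.0f*in[i*width + j]   - in[ i   *width + (j+1)]
        -in[(i+1)*width + (j-1)] - in[(i+1)*width +  j  ] - in[(i+1)*width + (j+1)];
    }
  }
}
```

With `#pragma omp simd`, remember the `-qopenmp-simd` flag; the Intel-specific `#pragma ivdep` is an alternative that needs no extra flag.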

Exercise 1.2 – Vectorization of Numerical Integration Code

1. Download the numerical integration code at https://www.coursera.org/learn/parallelism-ia/supplement/LFvRn/code-download and copy it to your UVM’s home directory.

2. Review this video to understand what the numerical integration code does:
https://www.coursera.org/learn/parallelism-ia/lecture/o94Je/numerical-integration-introduction
3. Using the demonstration video “2.9 Integral Vectorization” as reference, vectorize
the numerical integration code by considering the following optimization options.
a. Make the function called by the ComputeIntegral function in worker.cc
vectorizable.
b. After part (a), the compiler still assumes the main for loop in worker.cc
has a vector dependence because of the use of a global variable. Use the
necessary #pragma to declare a reduction over the global variable in
order to enable vectorization.
c. Further improve the code’s execution by taking advantage of the latest
vector ISA supported by the CPU architecture running the UVM.

4. Remember to use the Intel compilers and the -qopt-report option to verify
vectorization or diagnose the cause of non-vectorization. In your report, you
should include the following:
a. Details of the relevant specifications of the computer running the UVM,
such as (but not limited to) the CPU name and the vector ISA extensions
supported, followed by each optimization step.
b. For each optimization step, report what was done, the estimated
performance gain from the qopt-report, and the actual average execution
time of the integral code.
c. Your explanation of the results and observations on average memory
bandwidth and computation speed.
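The two vectorization steps above can be sketched as follows. The names BlackBoxFunction and ComputeIntegral, the midpoint rule, and the sqrt integrand are illustrative assumptions; what matters is that the called function is marked for SIMD (step a) and that the accumulation into the global variable is declared as a reduction (step b).

```cpp
#include <cmath>

// Step a: mark the called function so the compiler generates a vector
// variant usable from a vectorized loop. (Illustrative integrand; the
// real function is in the downloaded worker.cc.)
#pragma omp declare simd
double BlackBoxFunction(const double x) {
  return sqrt(x);
}

double sum = 0.0;  // global accumulator, as described in the exercise

double ComputeIntegral(const double a, const double b, const long n) {
  const double dx = (b - a) / n;
  sum = 0.0;
  // Step b: without the reduction clause the compiler assumes a vector
  // dependence through the global; reduction(+:sum) enables vectorization.
#pragma omp simd reduction(+:sum)
  for (long i = 0; i < n; i++) {
    const double x = a + (i + 0.5) * dx;  // midpoint rule
    sum += BlackBoxFunction(x) * dx;
  }
  return sum;
}
```

For step (c), recompile with `-xHost` (or an explicit `-x<ISA>` target) so the reduction loop uses the widest vector registers your CPU supports.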

Part 2: Multithreading and Scalar Optimizations


1. For this lab exercise, we will go through several code optimizations that take
advantage of modern processors’ multithreading capabilities.
2. This lab exercise is based on the Week 3 and Week 4 materials on
“Multithreading” and “Memory Organization”, respectively.

Exercise 2.1 – Multithreading of Stencil Code


1. Use the vectorized version of your code from Exercise 1.1.
2. Using the demonstration video “Stencil Demonstration” in Week 3, enhance the
stencil code by parallelizing it across multiple threads.
a. Use “#pragma omp parallel for” to parallelize the inner loop in the
ApplyStencil function and note the performance gain/drop due to this
parallelization.
b. Investigate what happens if the outer loop is parallelized using
“#pragma omp parallel for” instead.
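A minimal sketch of the outer-loop choice from step (b), using OpenMP (the function and array names are assumptions, as in Exercise 1.1). Parallelizing the outer loop forks threads once for the whole sweep and gives each thread a contiguous block of rows, while leaving the inner loop free to vectorize; parallelizing the inner loop instead pays the fork/join cost once per row, which is why a performance drop is typically observed there.

```cpp
#include <cassert>

// Outer-loop parallelization (step 2b): one fork/join for the whole sweep;
// each thread processes a chunk of rows, and the inner loop stays vectorized.
// Compile with icpc -qopenmp (or g++ -fopenmp); without the flag the
// pragmas are ignored and the code runs serially with the same results.
void ApplyStencilParallel(const float* in, float* out, int width, int height) {
#pragma omp parallel for
  for (int i = 1; i < height - 1; i++) {
#pragma omp simd
    for (int j = 1; j < width - 1; j++) {
      out[i*width + j] =
        -in[(i-1)*width + (j-1)] - in[(i-1)*width +  j  ] - in[(i-1)*width + (j+1)]
        -in[ i   *width + (j-1)] + 8.0f*in[i*width + j]   - in[ i   *width + (j+1)]
        -in[(i+1)*width + (j-1)] - in[(i+1)*width +  j  ] - in[(i+1)*width + (j+1)];
    }
  }
}
// The inner-loop variant (step 2a) would instead place
// "#pragma omp parallel for" on the j loop, forking and joining threads
// for every row of the image.
```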

Requirements
1. You are required to complete a report for Parts 1 and 2.
(a) The report must be typed.
(b) The report is to be submitted individually.
(c) Each part must include a step-by-step explanation followed by the
corresponding compilation results and performance comparison.
2. The report must be submitted on ULearn within FOUR (4) days of the final lab
session for this exercise.
