Multi-Core Architectures and Programming - Lecture Notes, Study Material and Important Questions, Answers
All topics with neat figures (diagrams) and explanations, plus important questions and answers. Study online or download as PDF.
Subject: Multi-Core Architectures and Programming
An Introduction to Parallel Programming by Peter S. Pacheco

Chapter 1 Why Parallel Computing
1. Why Parallel Computing?
2. Why We Need Ever-Increasing Performance
3. Why We’re Building Parallel Systems
4. Why We Need to Write Parallel Programs
5. How Do We Write Parallel Programs?
6. Concurrent, Parallel, Distributed
Chapter 2 Parallel Hardware and Parallel Software
7. Parallel Hardware and Parallel Software
8. Some Background: von Neumann Architecture, Processes, Multitasking, and Threads
9. Modifications to the von Neumann Model
10. Parallel Hardware
11. Parallel Software
12. Input and Output
13. Performance of Parallel Programs
14. Parallel Program Design with Example
15. Writing and Running Parallel Programs
16. Assumptions in Parallel Programming
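Topic 13 measures how well a parallel program uses its cores. The two standard quantities from this chapter are speedup and efficiency:

    % Speedup on p cores, with T_serial and T_parallel the wall-clock times:
    S = \frac{T_{\mathrm{serial}}}{T_{\mathrm{parallel}}}
    % Efficiency: speedup per core (E = 1 would be perfect linear scaling):
    E = \frac{S}{p} = \frac{T_{\mathrm{serial}}}{p \cdot T_{\mathrm{parallel}}}

For example, if a serial run takes 24 s and an 8-core run takes 4 s, then S = 6 and E = 0.75.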
Chapter 3 Distributed-Memory Programming with MPI
17. Distributed-Memory Programming with MPI
18. The Trapezoidal Rule in MPI
19. Dealing with I/O
20. Collective Communication
21. MPI Derived Datatypes
22. Performance Evaluation of MPI Programs
23. A Parallel Sorting Algorithm
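Topics 18 and 20 meet in the book's running example: each process integrates its own subinterval with the trapezoidal rule, and a collective MPI_Reduce sums the pieces onto rank 0. A minimal sketch of that pattern, assuming the integrand x² on [0,1] and a process count that divides n evenly (both assumptions, not the book's exact code):

    #include <mpi.h>
    #include <stdio.h>

    static double f(double x) { return x * x; }   /* assumed integrand */

    /* Serial trapezoidal rule over [left, right] with n trapezoids of width h. */
    static double trap(double left, double right, int n, double h) {
        double sum = (f(left) + f(right)) / 2.0;
        for (int i = 1; i < n; i++)
            sum += f(left + i * h);
        return sum * h;
    }

    int main(int argc, char *argv[]) {
        int rank, size;
        int n = 1024;                      /* total trapezoids (assumed)  */
        double a = 0.0, b = 1.0;           /* interval (assumed)          */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double h = (b - a) / n;
        int local_n = n / size;            /* assumes size divides n evenly */
        double local_a = a + rank * local_n * h;
        double local_sum = trap(local_a, local_a + local_n * h, local_n, h);

        /* Collective communication: sum every process's piece onto rank 0. */
        double total = 0.0;
        MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("integral of x^2 on [0,1] ~= %f\n", total);

        MPI_Finalize();
        return 0;
    }

Compile with mpicc and run with, e.g., mpiexec -n 4 ./trap, assuming an MPI installation is available.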
Chapter 4 Shared-Memory Programming with Pthreads
24. Shared-Memory Programming with Pthreads
25. Processes, Threads, and Pthreads
26. Pthreads - Hello, World Program
27. Matrix-Vector Multiplication
28. Critical Sections
29. Busy-Waiting
30. Mutexes
31. Producer-Consumer Synchronization and Semaphores
32. Barriers and Condition Variables
33. Read-Write Locks
34. Caches, Cache Coherence, and False Sharing
35. Thread-Safety

Chapter 5 Shared-Memory Programming with OpenMP
36. Shared-Memory Programming with OpenMP
37. The Trapezoidal Rule
38. Scope of Variables
39. The Reduction Clause
40. The parallel for Directive
41. More About Loops in OpenMP: Sorting
42. Scheduling Loops
43. Producers and Consumers
44. Caches, Cache Coherence, and False Sharing
45. Thread-Safety
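Topics 37-40 revolve around one idiom: a parallel for loop whose per-thread partial sums are merged by a reduction clause. A minimal sketch of the same trapezoidal-rule computation, with the integrand and bounds again assumed:

    #include <stdio.h>

    static double f(double x) { return x * x; }   /* assumed integrand */

    int main(void) {
        const int n = 1024;                       /* trapezoids (assumed) */
        const double a = 0.0, b = 1.0;
        const double h = (b - a) / n;
        double sum = (f(a) + f(b)) / 2.0;

        /* Each thread gets a private copy of sum initialized to zero;
           the copies are added back into sum when the loop ends (topic 39). */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i < n; i++)
            sum += f(a + i * h);

        printf("integral ~= %f\n", sum * h);
        return 0;
    }

Compile with an OpenMP flag such as gcc -fopenmp; without it the pragma is ignored and the loop runs serially.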
Chapter 6 Parallel Program Development
46. Parallel Program Development
47. Two n-Body Solvers
48. Parallelizing the basic solver using OpenMP
49. Parallelizing the reduced solver using OpenMP
50. Evaluating the OpenMP codes
51. Parallelizing the solvers using Pthreads
52. Parallelizing the basic solver using MPI
53. Parallelizing the reduced solver using MPI
54. Performance of the MPI solvers
55. Tree Search
56. Recursive depth-first search
57. Nonrecursive depth-first search
58. Data structures for the serial implementations
59. Performance of the serial implementations
60. Parallelizing tree search
61. A static parallelization of tree search using Pthreads
62. A dynamic parallelization of tree search using Pthreads
63. Evaluating the Pthreads tree-search programs
64. Parallelizing the tree-search programs using OpenMP
65. Performance of the OpenMP implementations
66. Implementation of tree search using MPI and static partitioning
67. Implementation of tree search using MPI and dynamic partitioning
68. Which API?
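Topic 56's recursive depth-first search is the serial baseline the tree-search material builds on. A stripped-down sketch over a hypothetical first-child/next-sibling node type (the book's version also carries tour and cost bookkeeping):

    #include <stdio.h>

    struct node {
        int value;
        struct node *first_child;    /* leftmost child, or NULL        */
        struct node *next_sibling;   /* next child of the same parent  */
    };

    /* Visit a node, then recurse into each child in turn.  Here the
       "work" is a print; in the tree-search application it would be
       partial-tour bookkeeping and best-cost checks. */
    static void dfs(const struct node *t) {
        if (t == NULL) return;
        printf("visiting %d\n", t->value);
        for (const struct node *c = t->first_child; c != NULL; c = c->next_sibling)
            dfs(c);
    }

    int main(void) {
        struct node leaf2 = {3, NULL, NULL};
        struct node leaf1 = {2, NULL, &leaf2};
        struct node root  = {1, &leaf1, NULL};
        dfs(&root);                  /* prints 1, 2, 3 */
        return 0;
    }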
Multicore Application Programming: For Windows, Linux, and Oracle Solaris by Darryl Gove

Chapter 1 Hardware, Processes, and Threads
69. Hardware, Processes, and Threads
70. Examining the Insides of a Computer
71. The Motivation for Multicore Processors
72. Supporting Multiple Threads on a Single Chip
73. Increasing Instruction Issue Rate with Pipelined Processor Cores
74. Using Caches to Hold Recently Used Data
75. Using Virtual Memory to Store Data
76. Translating from Virtual Addresses to Physical Addresses
77. The Characteristics of Multiprocessor Systems
78. How Latency and Bandwidth Impact Performance
79. The Translation of Source Code to Assembly Language
80. The Performance of 32-Bit versus 64-Bit Code
81. Ensuring the Correct Order of Memory Operations
82. The Differences Between Processes and Threads
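Topic 82's key distinction: threads share one address space, while a forked child process gets a copy of its parent's. A minimal POSIX sketch of the process side (illustrative, not from the book):

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int counter = 0;
        pid_t pid = fork();          /* duplicate this process */
        if (pid == 0) {
            counter++;               /* modifies the child's COPY only */
            printf("child:  counter = %d\n", counter);   /* prints 1 */
            return 0;
        }
        wait(NULL);                  /* reap the child */
        printf("parent: counter = %d\n", counter);       /* still 0 */
        return 0;
    }

The child's increment is invisible to the parent; with two threads instead, both would see the same counter.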
Chapter 2 Coding for Performance
83. Coding for Performance
84. Defining Performance
85. Understanding Algorithmic Complexity
86. Why Algorithmic Complexity Is Important
87. Using Algorithmic Complexity with Care
88. How Structure Impacts Performance
89. Performance and Convenience Trade-Offs in Source Code and Build Structures
90. Using Libraries to Structure Applications
91. The Impact of Data Structures on Performance
92. The Role of the Compiler
93. The Two Types of Compiler Optimization
94. Selecting Appropriate Compiler Options
95. How Cross-File Optimization Can Be Used to Improve Performance
96. Using Profile Feedback
97. How Potential Pointer Aliasing Can Inhibit Compiler Optimizations
98. Identifying Where Time Is Spent Using Profiling
99. Commonly Available Profiling Tools
100. How Not to Optimize
101. Performance by Design
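Topic 97: when the compiler cannot prove two pointers never refer to the same memory, it must assume they might, which forces defensive reloads and blocks optimizations such as vectorization. C99's restrict qualifier states the no-overlap guarantee explicitly; a sketch with a hypothetical scale function:

    #include <stdio.h>

    /* Without restrict the compiler must assume dst and src might
       overlap, so it cannot freely reorder or vectorize the loop. */
    void scale(double *restrict dst, const double *restrict src,
               double k, int n) {
        for (int i = 0; i < n; i++)
            dst[i] = k * src[i];     /* promise: dst and src never alias */
    }

    int main(void) {
        double in[4] = {1, 2, 3, 4}, out[4];
        scale(out, in, 2.0, 4);
        printf("%f %f %f %f\n", out[0], out[1], out[2], out[3]);
        return 0;
    }

The promise is the programmer's to keep: calling scale(a, a, 2.0, n) would then be undefined behavior.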
Chapter 3 Identifying Opportunities for Parallelism
102. Identifying Opportunities for Parallelism
103. Using Multiple Processes to Improve System Productivity
104. Multiple Users Utilizing a Single System
105. Improving Machine Efficiency Through Consolidation
106. Using Containers to Isolate Applications Sharing a Single System
107. Hosting Multiple Operating Systems Using Hypervisors
108. Using Parallelism to Improve the Performance of a Single Task
109. One Approach to Visualizing Parallel Applications
110. How Parallelism Can Change the Choice of Algorithms
111. Amdahl’s Law
112. Determining the Maximum Practical Threads
113. How Synchronization Costs Reduce Scaling
114. Parallelization Patterns
115. Data Parallelism Using SIMD Instructions
116. Parallelization Using Processes or Threads
117. Multiple Independent Tasks
118. Multiple Loosely Coupled Tasks
119. Multiple Copies of the Same Task
120. Single Task Split Over Multiple Threads
121. Using a Pipeline of Tasks to Work on a Single Item
122. Division of Work into a Client and a Server
123. Splitting Responsibility into a Producer and a Consumer
124. Combining Parallelization Strategies
125. How Dependencies Influence the Ability to Run Code in Parallel
126. Antidependencies and Output Dependencies
127. Using Speculation to Break Dependencies
128. Critical Paths
129. Identifying Parallelization Opportunities
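Topic 111, Amdahl's Law, caps the speedup from parallelizing a fraction P of the runtime across N threads:

    % Amdahl's Law: (1 - P) is the serial fraction, P/N the parallel part.
    S(N) = \frac{1}{(1 - P) + P/N}
    % However many threads are added, the serial fraction bounds the gain:
    \lim_{N \to \infty} S(N) = \frac{1}{1 - P}

With P = 0.9 and N = 8, S = 1/(0.1 + 0.9/8) ≈ 4.7; even unlimited threads cannot push it past 1/(1 - P) = 10.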
Chapter 4 Synchronization and Data Sharing
130. Synchronization and Data Sharing
131. Data Races
132. Using Tools to Detect Data Races
133. Avoiding Data Races
134. Synchronization Primitives
135. Mutexes and Critical Regions
136. Spin Locks
137. Semaphores
138. Readers-Writer Locks
139. Barriers
140. Atomic Operations and Lock-Free Code
141. Deadlocks and Livelocks
142. Communication Between Threads and Processes
143. Storing Thread-Private Data
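Topics 131-135 in one picture: an unprotected read-modify-write on shared data is a data race, and a mutex around the critical region removes it. A minimal Pthreads sketch (illustrative; the chapter also covers the Windows primitives):

    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);    /* critical region: without the */
            counter++;                    /* lock, counter++ is a data    */
            pthread_mutex_unlock(&lock);  /* race between the two threads */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);   /* 200000, reliably */
        return 0;
    }

Remove the lock/unlock pair and the final count becomes unpredictable, since the threads can overwrite each other's updates.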
Chapter 5 Using POSIX Threads
144. Using POSIX Threads
145. Creating Threads
146. Compiling Multithreaded Code
147. Process Termination
148. Sharing Data Between Threads
149. Variables and Memory
150. Multiprocess Programming
151. Sockets
152. Reentrant Code and Compiler Flags
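Topics 145-146: pthread_create takes a start routine plus one void* argument, and multithreaded code is usually built with a flag such as -pthread (gcc/clang; other compilers differ). A sketch that hands each thread its own id:

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    static void *hello(void *arg) {
        long id = (long)arg;               /* recover the id passed below */
        printf("hello from thread %ld\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t tid[NTHREADS];
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&tid[i], NULL, hello, (void *)i);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);    /* wait for every thread */
        return 0;
    }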
Chapter 6 Windows Threading
153. Windows Threading
154. Creating Native Windows Threads
155. Terminating Threads
156. Creating and Resuming Suspended Threads
157. Using Handles to Kernel Resources
158. Methods of Synchronization and Resource Sharing
159. An Example of Requiring Synchronization Between Threads
160. Protecting Access to Code with Critical Sections
161. Protecting Regions of Code with Mutexes
162. Slim Reader/Writer Locks
163. Signaling Event Completion to Other Threads or Processes
164. Wide String Handling in Windows
165. Creating Processes
166. Sharing Memory Between Processes
167. Inheriting Handles in Child Processes
168. Naming Mutexes and Sharing Them Between Processes
169. Communicating with Pipes
170. Communicating Using Sockets
171. Atomic Updates of Variables
172. Allocating Thread-Local Storage
173. Setting Thread Priority
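Topic 154's native-Windows counterpart of the Pthreads example above: CreateThread returns a kernel handle (topic 157) that the parent waits on and then closes. A minimal sketch:

    #include <windows.h>
    #include <stdio.h>

    /* Windows thread start routines return a DWORD and use the
       WINAPI calling convention. */
    static DWORD WINAPI hello(LPVOID arg) {
        printf("hello from thread %d\n", (int)(INT_PTR)arg);
        return 0;
    }

    int main(void) {
        HANDLE h = CreateThread(NULL, 0, hello, (LPVOID)(INT_PTR)1, 0, NULL);
        if (h == NULL) return 1;
        WaitForSingleObject(h, INFINITE);  /* the "join": wait for exit */
        CloseHandle(h);                    /* release the kernel handle */
        return 0;
    }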
Chapter 7 Using Automatic Parallelization and OpenMP
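The chapter's subject in miniature: a loop whose iterations are independent, so it can be spread across threads either by the compiler's automatic parallelizer (for example GCC's -ftree-parallelize-loops=4 or Oracle Solaris Studio's -xautopar) or by an explicit OpenMP directive. A sketch:

    #include <stdio.h>

    #define N 1000000
    static double a[N], b[N], c[N];

    int main(void) {
        for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

        /* No iteration reads what another iteration writes, so the loop
           can be split across threads -- by the auto-parallelizer, or by
           this directive when compiled with OpenMP support. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        printf("c[N-1] = %f\n", c[N - 1]);
        return 0;
    }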
1. Using Automatic Parallelization and OpenMP - Answer (click here)
2. Using Automatic Parallelization to Produce a Parallel Application - Answer (click here)
3. Identifying and Parallelizing Reductions - Answer (click here)
4. Automatic Parallelization of Codes Containing Calls - Answer (click here)
5. Assisting Compiler in Automatically Parallelizing Code - Answer (click here)
6. Using OpenMP to Produce a Parallel Application - Answer (click here)
7. Using OpenMP to Parallelize Loops - Answer (click here)
8. Runtime Behavior of an OpenMP Application - Answer (click here)
9. Variable Scoping Inside OpenMP Parallel Regions - Answer (click here)
10. Parallelizing Reductions Using OpenMP - Answer (click here)
11. Accessing Private Data Outside the Parallel Region - Answer (click here)
12. Improving Work Distribution Using Scheduling - Answer (click here)
13. Using Parallel Sections to Perform Independent Work - Answer (click here)
14. Nested Parallelism - Answer (click here)
15. Using OpenMP for Dynamically Defined Parallel Tasks - Answer (click here)
16. Keeping Data Private to Threads - Answer (click here)
17. Controlling the OpenMP Runtime Environment - Answer (click here)
18. Waiting for Work to Complete - Answer (click here)
19. Restricting the Threads That Execute a Region of Code - Answer (click here)
20. Ensuring That Code in a Parallel Region Is Executed in Order - Answer (click here)
21. Collapsing Loops to Improve Workload Balance - Answer (click here)
22. Enforcing Memory Consistency - Answer (click here)
23. An Example of Parallelization - Answer (click here)
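The OpenMP topics in this chapter, in particular reductions and loop scheduling, can be tied together in one small program. The sketch below is illustrative rather than code from the text; the array size N and the sample data are arbitrary choices. It sums an array with a parallel for loop: the reduction clause gives each thread a private partial sum that OpenMP combines when the loop ends, and schedule(static) splits the iterations into equal chunks.

    #include <stdio.h>

    #define N 1000000            /* illustrative array size */

    int main(void) {
        static double a[N];      /* static: too large for the stack */
        double sum = 0.0;

        for (int i = 0; i < N; i++)
            a[i] = 1.0 / (i + 1);        /* sample data */

        /* Each thread accumulates a private partial sum; OpenMP
           combines the partials into 'sum' at the end of the loop. */
        #pragma omp parallel for reduction(+:sum) schedule(static)
        for (int i = 0; i < N; i++)
            sum += a[i];

        printf("sum = %f\n", sum);
        return 0;
    }

Built with an OpenMP-capable compiler (for example, gcc -fopenmp), the loop runs on as many threads as the runtime provides; the OMP_NUM_THREADS environment variable controls the count. Without the OpenMP flag the pragma is ignored and the program simply runs serially, which is one reason the directive approach is convenient.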
Chapter 8 Hand-Coded Synchronization and Sharing
1. Hand-Coded Synchronization and Sharing - Answer (click here)
2. Atomic Operations - Answer (click here)
3. Using Compare and Swap Instructions to Form More Complex Atomic Operations - Answer (click here)
4. Enforcing Memory Ordering to Ensure Correct Operation - Answer (click here)
5. Compiler Support of Memory-Ordering Directives - Answer (click here)
6. Reordering of Operations by the Compiler - Answer (click here)
7. Volatile Variables - Answer (click here)
8. Operating System–Provided Atomics - Answer (click here)
9. Lockless Algorithms - Answer (click here)
10. Dekker’s Algorithm - Answer (click here)
11. Producer-Consumer with a Circular Buffer - Answer (click here)
12. Scaling to Multiple Consumers or Producers - Answer (click here)
13. Scaling the Producer-Consumer to Multiple Threads - Answer (click here)
14. Modifying the Producer-Consumer Code to Use Atomics - Answer (click here)
15. The ABA Problem - Answer (click here)
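The atomic-operations and compare-and-swap topics above lend themselves to a short sketch. The code below uses the portable C11 <stdatomic.h> interface rather than any particular code from the text, and atomic_store_max is an illustrative name: it builds a more complex atomic operation, an atomic maximum, out of a compare-and-swap retry loop, which is the construction topic 3 in the list above describes.

    #include <stdatomic.h>
    #include <stdio.h>

    /* Build a more complex atomic operation (atomic maximum) from
       compare-and-swap. The loop retries whenever another thread
       changes the value between our read and our exchange. */
    static void atomic_store_max(_Atomic long *target, long value) {
        long current = atomic_load(target);
        while (current < value &&
               !atomic_compare_exchange_weak(target, &current, value)) {
            /* A failed CAS refreshes 'current'; re-test and retry. */
        }
    }

    int main(void) {
        _Atomic long max = 0;
        atomic_store_max(&max, 42);
        atomic_store_max(&max, 17);   /* no effect: 17 < 42 */
        printf("max = %ld\n", (long)atomic_load(&max));
        return 0;
    }

The same retry-loop shape underlies lockless producer-consumer code; the ABA problem listed above arises when such a CAS succeeds only because the value merely looks unchanged after being modified and restored.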
Chapter 9 Scaling with Multicore Processors
1. Scaling with Multicore Processors - Answer (click here)
2. Constraints to Application Scaling - Answer (click here)
3. Hardware Constraints to Scaling - Answer (click here)
4. Bandwidth Sharing Between Cores - Answer (click here)
5. False Sharing - Answer (click here)
6. Cache Conflict and Capacity - Answer (click here)
7. Pipeline Resource Starvation - Answer (click here)
8. Operating System Constraints to Scaling - Answer (click here)
9. Multicore Processors and Scaling - Answer (click here)
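False sharing, one of the hardware constraints listed above, is easy to demonstrate. The sketch below is illustrative and assumes a 64-byte cache line, a common but not universal size: _Alignas(64) gives each counter its own line, so the two threads never contend; remove the alignment and both counters can land on one line, which then ping-pongs between the cores on every write.

    #include <pthread.h>
    #include <stdio.h>

    #define ITERS 100000000L      /* enough iterations to time */

    struct padded_counter {
        _Alignas(64) volatile long value;  /* one counter per assumed
                                              64-byte cache line */
    };

    static struct padded_counter counters[2];

    static void *worker(void *arg) {
        struct padded_counter *c = arg;
        for (long i = 0; i < ITERS; i++)
            c->value++;           /* each thread writes only its own counter */
        return NULL;
    }

    int main(void) {
        pthread_t t0, t1;
        pthread_create(&t0, NULL, worker, &counters[0]);
        pthread_create(&t1, NULL, worker, &counters[1]);
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);
        printf("%ld %ld\n", counters[0].value, counters[1].value);
        return 0;
    }

Timing the padded and unpadded versions (built with, say, gcc -O2 -pthread) typically shows the padded program scaling to two cores while the unpadded one runs little faster than a single thread, even though the threads share no data logically.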
Chapter 10 Other Parallelization Technologies
1. Other Parallelization Technologies - Answer (click here)
2. GPU-Based Computing - Answer (click here)
3. Language Extensions - Answer (click here)
4. Alternative Languages - Answer (click here)
5. Clustering Technologies - Answer (click here)
6. Transactional Memory - Answer (click here)
7. Vectorization - Answer (click here)
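Of the technologies above, vectorization is the one an ordinary C compiler can often apply by itself, provided the loop's iterations are independent and its pointers provably do not alias (compare the pointer-aliasing topic in the compiler chapter). A minimal sketch, with saxpy as an illustrative function name:

    #include <stddef.h>

    /* SAXPY-style loop: 'restrict' promises the compiler that x and y
       do not overlap, so it may emit SIMD instructions that process
       several elements per iteration. */
    void saxpy(size_t n, float a,
               const float *restrict x, float *restrict y) {
        for (size_t i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

Recent gcc versions auto-vectorize such loops at -O3, and the -fopt-info-vec option reports which loops were transformed; other compilers offer similar vectorization reports.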