In [https://computing.llnl.gov/tutorials/parallel_comp/#ExamplesArray Blaise Barney's Notes on Array Processing], an example of Array Processing is discussed. The example "demonstrates calculations on 2-dimensional array elements; a function is evaluated on each array element."
Using the pseudo-code provided, I wrote a program that creates a 2-dimensional array of size n (provided by the user) and populates it with random numbers. The code is available at the link below:
[https://github.com/jsidhu26/DPS915 Link to Array Processing code]
From the call graph file, it is evident that the generateRandom() function is the hotspot: it accounts for 100% of the execution time. The function consists of two for loops, one nested in the other, which gives it a time complexity of O(n^2).
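To make the O(n^2) cost concrete, here is a minimal sketch of what a nested-loop fill could look like. It is not the code from the repository: the name fillRandom, its parameters, and the choice of std::mt19937 are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for generateRandom(); the real function lives in the
// linked repository and may differ. Fills an n x n array with values in [lo, hi].
std::vector<std::vector<int>> fillRandom(std::size_t n, int lo, int hi, unsigned seed) {
    assert(lo <= hi);  // uniform_int_distribution requires lo <= hi
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(lo, hi);
    std::vector<std::vector<int>> a(n, std::vector<int>(n));
    for (std::size_t i = 0; i < n; ++i) {        // n rows
        for (std::size_t j = 0; j < n; ++j) {    // n columns per row
            a[i][j] = dist(gen);                 // executed n * n times in total
        }
    }
    return a;
}
```

Doubling n quadruples the number of assignments, which is why the fill dominates the profile as the problem size grows.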
The computation for each element of the array is independent of the other elements, so this function is a good candidate for parallelization. Additionally, the array can be divided evenly into sub-arrays, with one process assigned to each sub-array.
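The sub-array idea can be sketched on the CPU with std::thread, assuming the array is split into contiguous blocks of rows. The names fillRows and fillParallel, the per-worker seeding, and the value range 0 to 99 are all hypothetical and are not taken from the repository; this is one possible decomposition, not necessarily the one used in the assignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <thread>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Fill rows [rowBegin, rowEnd) of a. Each worker owns its own generator,
// so no state is shared between workers and no locking is needed.
void fillRows(Matrix& a, std::size_t rowBegin, std::size_t rowEnd, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 99);
    for (std::size_t i = rowBegin; i < rowEnd; ++i) {
        for (int& x : a[i]) x = dist(gen);
    }
}

// Hypothetical driver: split the n rows into roughly equal blocks and
// hand one block to each worker thread.
Matrix fillParallel(std::size_t n, std::size_t numWorkers, unsigned seed) {
    assert(numWorkers > 0);
    Matrix a(n, std::vector<int>(n, -1));  // -1 marks "not yet written"
    const std::size_t chunk = (n + numWorkers - 1) / numWorkers;  // ceil(n / numWorkers)
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < numWorkers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // more workers than rows: nothing left to assign
        workers.emplace_back(fillRows, std::ref(a), begin, end,
                             seed + static_cast<unsigned>(w));
    }
    for (auto& t : workers) t.join();  // wait for every block to finish
    return a;
}
```

Because each worker writes only to its own rows, the threads never touch the same element. On GCC or Clang under Linux, the program may need to be compiled with the -pthread flag.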
For Assignment 2, we decided to parallelize the application selected by Bruno.
In that code, the calculateDimensions() function dominates the run time: the flat profile shows that it accounts for 97.67% of the execution time.
==== Identifying Parallelize-able Code ====
=== Assignment 3 ===
To optimize our code, we used shared memory inside the kernel. This reduced the run time for each problem size.