For this assignment, I selected one application that I wished to parallelize. I profiled it to find the hotspot of the application and determine if it was feasible to speed up using the GPU.
My application that I selected is a PI approximation
method. PI can be approximated in a number of ways however I chose to use the dartboard algorithm. Although not the fastest, the algorithm is very feasible to parallelize. The main idea behind it can be compared to a dartboard – you throw a random number (n) of darts at the board and note down the darts that have landed within it and those that have not. The image below demonstrates this concept:
When the target number is less than a million, the program runs in under a second. Every million items after that though took one more cumulative second then the previous. 1M = 1.104212 secs, 2M = 2.912168 secs, 3M = 5.026149 secs, 4M = 7.468770 secs, and 5M = 10.654563 secs. As you can see, while this is fairly efficient, the program could still use some help spreading the load around.
This seems like a pretty good candidate for a program to be parallelized as It calculates tons of numbers sequentially when ideally it could just do a bunch of them at a time
=== Assignment 2 ===
=== Assignment 3 ===