The image below illustrates the concept of estimating Pi from random points within a square:
The number of tasks equals the number of darts thrown. Every task is executed on the GPU and performs the calculation that checks whether its point lies inside the circle. The tasks can run independently, because none of them requires any information from the other tasks. Finally, the host gathers the synchronized data from the device and calculates the value of Pi.
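The final step on the host can be written as a short formula. This is a sketch of the standard area-ratio argument, and the symbols are introduced here only for illustration: $N$ is the number of darts thrown, $N_{\text{in}}$ is the number that land inside the circle, and $r$ is the radius of a circle inscribed in a square of side $2r$.

\[
\frac{N_{\text{in}}}{N} \approx \frac{\text{area of circle}}{\text{area of square}} = \frac{\pi r^{2}}{(2r)^{2}} = \frac{\pi}{4}
\qquad\Longrightarrow\qquad
\pi \approx 4\,\frac{N_{\text{in}}}{N}
\]

The same ratio holds if only one quadrant is sampled (a quarter circle inside a unit square), which is often the more convenient setup in code.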
This loop is the hot spot of the previous program.
During this assignment we restructured our program to make it better suited to parallelization. We rewrote the program, replacing the "for loop" from the previous program with a kernel that executes the task on the device.
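The loop-to-kernel conversion described above might look roughly like the following. This is a minimal sketch and not the original assignment code: the names `throw_darts` and `estimate_pi`, the use of cuRAND to generate points on the device, the block size of 256, and the sampling of a single quadrant are all assumptions made for illustration. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Hypothetical kernel: one thread handles one dart.
// The body corresponds to one iteration of the former serial for loop.
__global__ void throw_darts(int *inside, int n, unsigned long long seed)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;                  // guard the last, partially filled block

    curandState state;
    curand_init(seed, i, 0, &state);     // separate random subsequence per thread

    float x = curand_uniform(&state);    // point in the unit square
    float y = curand_uniform(&state);

    // 1 if the point lies inside the quarter circle of radius 1, else 0
    inside[i] = (x * x + y * y <= 1.0f) ? 1 : 0;
}

// Hypothetical host helper: launches the kernel, gathers the flags, computes Pi.
double estimate_pi(int n, unsigned long long seed)
{
    int *d_inside = nullptr;
    cudaMalloc(&d_inside, n * sizeof(int));

    int threads = 256;                           // assumed block size
    int blocks = (n + threads - 1) / threads;    // enough blocks to cover n darts
    throw_darts<<<blocks, threads>>>(d_inside, n, seed);

    // cudaMemcpy waits for the kernel to finish before copying
    std::vector<int> h_inside(n);
    cudaMemcpy(h_inside.data(), d_inside, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_inside);

    long long hits = 0;
    for (int flag : h_inside) hits += flag;      // host-side gather
    return 4.0 * static_cast<double>(hits) / n;
}

int main()
{
    int n = 1 << 20;                             // number of darts (assumed)
    printf("pi ~= %f\n", estimate_pi(n, 1234ULL));
    return 0;
}
```

Writing one flag per thread keeps the tasks fully independent, matching the description of the host gathering the results. A device-side reduction or `atomicAdd` counter would cut the amount of data copied back, at the cost of some coordination between threads.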