Kernal Blas

56 bytes added, 08:59, 4 April 2018
Assignment 2
However, the parallelized results appear to stay accurate across iterations. <br>
The calculation time also stays roughly constant as the iteration count changes. <br>
 
[[File:Cudamalloc.png]] <br>
[[File:prof.png]] <br>
 
Profiling the code shows that '''memcpy''' accounts for most of the run time. Even with <br>
only 10 iterations, the time remains at 300 milliseconds. <br>
Once the iteration count passes 25 million, a small memory leak appears, which leads to inaccurate results. <br><br>
 
 
To optimize the code, we must find a way to reduce the time memcpy takes.<br>
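
Before changing anything, it may help to measure the copy on its own. The following is a minimal sketch, not the assignment's code: the helper name <code>timeHostToDeviceCopy</code> and the buffer contents are hypothetical, and it only assumes the standard CUDA runtime event API. It times a single host-to-device <code>cudaMemcpy</code> after the runtime has already been initialized, so one-time startup cost is not counted as copy time.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the assignment code): returns the elapsed
// time in milliseconds of one host-to-device copy of n floats, or -1.0f
// if an allocation or the copy fails.
float timeHostToDeviceCopy(size_t n) {
    size_t bytes = n * sizeof(float);

    float* h = static_cast<float*>(std::malloc(bytes));
    if (h == nullptr) return -1.0f;
    for (size_t i = 0; i < n; ++i) h[i] = 1.0f;  // placeholder data

    // The CUDA runtime initializes lazily on the first runtime call, so
    // allocating before the timed region keeps that cost out of the result.
    float* d = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&d), bytes) != cudaSuccess) {
        std::free(h);
        return -1.0f;
    }

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Record events on the default stream around the copy only.
    cudaEventRecord(start);
    cudaError_t err = cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until the stop event has completed

    float ms = -1.0f;
    if (err == cudaSuccess) {
        cudaEventElapsedTime(&ms, start, stop);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    std::free(h);
    return ms;
}
```

Calling this helper from <code>main</code> with the same element count the program uses, and comparing the result with the profiler's memcpy figure, would indicate how much of the reported time is the transfer itself.<br>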