CDOT Wiki β

AAA Adrina Arsa Andriy

2,052 bytes added, 14:26, 4 December 2014
Assignment 3
<u>Regular</u>
 
Guess a letter: a
Search Time: - took - <u>0.012000 secs</u>
=== Assignment 3 ===
For assignment 3 we made several changes to speed up the program, and we observed a speedup of around 50%.
 
[[Image:Hang man graph.png|thumb|800px|center]]
 
To achieve this speedup we removed thread divergence from the kernels and removed some unnecessary memory copies.
 
Removing thread divergence initially gave a speedup of around 10%. We expected this gain to be small, since the majority of our run time was spent in the memory copy phase. To speed the process up slightly more, we also used shared memory within the kernel. With all the kernel updates applied, we measured a speedup of slightly over 10%.
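
As an illustration of the two kernel changes described above, here is a minimal sketch rather than our actual kernel. It assumes the search counts how often a guessed letter occurs in a character buffer; the names <code>countMatches</code> and <code>countOnDevice</code> and the block size of 256 are hypothetical. The comparison result is stored directly instead of being selected by an if/else on the data, and the per-block sum is done in shared memory.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Assumed block size for this sketch; must be a power of two for the reduction.
constexpr int NTPB = 256;

// Counts how many characters in text[0..n) equal guess.
// Each block writes one partial count to partial[blockIdx.x].
__global__ void countMatches(const char* text, int n, char guess, int* partial) {
    __shared__ int s[NTPB];
    int t = threadIdx.x;
    int i = blockIdx.x * NTPB + t;

    // Store the comparison result (0 or 1) directly; the only condition
    // left is the bounds check, which does not depend on the data.
    s[t] = (i < n) ? (text[i] == guess) : 0;
    __syncthreads();

    // Tree reduction of the block's counts in shared memory.
    for (int stride = NTPB / 2; stride > 0; stride >>= 1) {
        if (t < stride) s[t] += s[t + stride];
        __syncthreads();
    }
    if (t == 0) partial[blockIdx.x] = s[0];
}

// Host helper: copies the text to the device, runs the kernel,
// and adds up the per-block partial counts.
int countOnDevice(const char* h_text, char guess) {
    int n = (int)strlen(h_text);
    if (n == 0) return 0;
    int nblks = (n + NTPB - 1) / NTPB;

    char* d_text = nullptr;
    int* d_partial = nullptr;
    cudaMalloc(&d_text, n);
    cudaMalloc(&d_partial, nblks * sizeof(int));
    cudaMemcpy(d_text, h_text, n, cudaMemcpyHostToDevice);

    countMatches<<<nblks, NTPB>>>(d_text, n, guess, d_partial);

    std::vector<int> h_partial(nblks);
    cudaMemcpy(h_partial.data(), d_partial, nblks * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_text);
    cudaFree(d_partial);

    int total = 0;
    for (int c : h_partial) total += c;
    return total;
}

int main() {
    printf("matches: %d\n", countOnDevice("hangman banana", 'a'));
    return 0;
}
```

The gain from a change like this is limited when, as in our case, most of the run time goes to memory copies rather than to the kernel itself.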
 
When we reviewed the memory copies used in the previous version, we found that several of them did not need to be done at all, and that others could be moved to more efficient places. After removing and relocating these copies, we measured a speedup of just over 40%.
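
The sketch below shows one common form of this optimization, under the assumption that the hidden word does not change between guesses. The word is copied to the device once, before the guess loop, and only the small reveal mask is copied back after each guess. The names <code>markMatches</code> and <code>playGuesses</code> are hypothetical and do not come from our code base.

```cuda
#include <cstdio>
#include <string>
#include <vector>
#include <cuda_runtime.h>

// Marks every position where the word contains the guessed letter.
// Marks from earlier guesses are preserved by the OR.
__global__ void markMatches(const char* word, int n, char guess,
                            unsigned char* revealed) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) revealed[i] |= (word[i] == guess);
}

// Applies each guess in turn and returns the word with
// unrevealed letters shown as '_'.
std::string playGuesses(const std::string& word, const std::string& guesses) {
    int n = (int)word.size();
    std::string shown(n, '_');
    if (n == 0) return shown;

    char* d_word = nullptr;
    unsigned char* d_revealed = nullptr;
    cudaMalloc(&d_word, n);
    cudaMalloc(&d_revealed, n);

    // One-time transfers, hoisted out of the guess loop.
    cudaMemcpy(d_word, word.data(), n, cudaMemcpyHostToDevice);
    cudaMemset(d_revealed, 0, n);

    std::vector<unsigned char> h_revealed(n);
    const int ntpb = 256;  // assumed block size for this sketch
    for (char g : guesses) {
        markMatches<<<(n + ntpb - 1) / ntpb, ntpb>>>(d_word, n, g, d_revealed);
        // Per guess, only the mask comes back; the word stays on the device.
        cudaMemcpy(h_revealed.data(), d_revealed, n, cudaMemcpyDeviceToHost);
    }

    for (int i = 0; i < n; ++i)
        if (h_revealed[i]) shown[i] = word[i];

    cudaFree(d_word);
    cudaFree(d_revealed);
    return shown;
}

int main() {
    printf("%s\n", playGuesses("dictionary", "ixa").c_str());
    return 0;
}
```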
 
The final version of the program runs approximately 50% faster than the previous version.
 
 
'''To calculate the number of threads per block we used the CUDA Occupancy Calculator.'''
[[Image:Nvidia Occupancy Calculator on Code.jpg|thumb|800px|center]]
In the previous version we determined the number of threads per block (NTPB) dynamically at run time. We could not use that information dynamically in this version because the kernel uses shared memory. On the school lab computers the NTPB was 1024.
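
One plausible reading of this constraint is that a statically declared <code>__shared__</code> array needs its size as a compile-time constant, so the block size has to be fixed in the source. The sketch below assumes that reading. It fixes <code>NTPB</code> at 1024 and uses <code>cudaGetDeviceProperties</code> only to check that the device supports that block size. The kernel <code>stageThroughShared</code> and the helper <code>ntpbFitsDevice</code> are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Fixed at compile time because it sizes the shared array below.
constexpr int NTPB = 1024;

// Copies the input to the output through a shared-memory tile.
__global__ void stageThroughShared(const char* in, char* out, int n) {
    __shared__ char tile[NTPB];  // size must be a compile-time constant
    int t = threadIdx.x;
    int i = blockIdx.x * NTPB + t;
    if (i < n) tile[t] = in[i];
    __syncthreads();
    if (i < n) out[i] = tile[t];
}

// Returns true if the device can launch blocks of NTPB threads.
bool ntpbFitsDevice(int device = 0) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;
    return NTPB <= prop.maxThreadsPerBlock;
}

int main() {
    if (!ntpbFitsDevice()) {
        printf("NTPB = %d is not supported on this device\n", NTPB);
        return 1;
    }

    const char h_in[] = "hangman";
    const int n = sizeof(h_in);  // includes the terminating '\0'
    char h_out[sizeof(h_in)] = {0};

    char* d_in = nullptr;
    char* d_out = nullptr;
    cudaMalloc(&d_in, n);
    cudaMalloc(&d_out, n);
    cudaMemcpy(d_in, h_in, n, cudaMemcpyHostToDevice);

    stageThroughShared<<<(n + NTPB - 1) / NTPB, NTPB>>>(d_in, d_out, n);

    cudaMemcpy(h_out, d_out, n, cudaMemcpyDeviceToHost);
    printf("%s\n", h_out);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```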
 
'''Real World Application'''
To make the application more "real world" friendly, we made the test data load from a large dictionary file. This lets you search for real words instead of gibberish.
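
A minimal host-side sketch of such a loader follows, written as it might appear in the same .cu file. It assumes a plain text file with one word per line; that layout, the <code>WordList</code> structure, and the <code>loadWords</code> function are assumptions made for illustration. The words are packed into one contiguous buffer with an offset table, a layout that can be copied to the device in a single transfer.

```cuda
#include <cstdio>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// All words packed back to back; word k occupies
// flat[offsets[k]] up to (not including) flat[offsets[k + 1]].
struct WordList {
    std::string flat;
    std::vector<int> offsets;
};

// Reads one word per line, skipping empty lines and
// dropping a trailing carriage return if present.
WordList loadWords(std::istream& in) {
    WordList w;
    w.offsets.push_back(0);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;
        w.flat += line;
        w.offsets.push_back((int)w.flat.size());
    }
    return w;
}

int main(int argc, char** argv) {
    WordList words;
    if (argc > 1) {
        // Path to the dictionary file is supplied by the user.
        std::ifstream file(argv[1]);
        words = loadWords(file);
    } else {
        // Small built-in sample so the sketch runs without a file.
        std::istringstream demo("apple\nbanana\n\ncherry\n");
        words = loadWords(demo);
    }
    printf("%zu words, %zu characters\n",
           words.offsets.size() - 1, words.flat.size());
    return 0;
}
```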
 
'''What Would We Do Differently?'''
We would have spent more time on our A1s. When we picked our A1 programs, we looked for programs that were cool and had unique uses. We profiled them without taking an in-depth look at the code base, so when it came to picking a topic for A2 we were left with only one program, since the other two were much too complex.