Sudo

1,687 bytes added, 19:39, 10 December 2015
Progress
=== Assignment 2 ===
 
 
 
The application I profiled in assignment 1 did not turn out to be a viable candidate for parallelization because of its design. I was forced to search for a new application to parallelize and came across a small steganography application. Profiling it made it apparent that performing the encoding on the GPU would improve the runtime. To compare the serial application fairly with the parallelized version, I had to rewrite the serial code slightly so that it performs many of the same operations, with the only difference being the encoding algorithm.
 
 
 
 
The following graph displays the runtime (measured in microseconds) of encoding .txt files of various sizes (measured in kilobytes).
 
[[File:CudaSteganographyRuntime.png]]
 
 
 
 
The following is the serial code:
 
[[File:CudaSteganographySerial.png]]
 
 
 
 
 
The following is the kernel I wrote:
 
[[File:CudaSteganographyKernel.png]]
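
Since the kernel is only available as an image, here is a minimal sketch of what a GPU steganography encoder can look like. It is not a transcription of the kernel in the image. It assumes a simple least-significant-bit scheme in which each message bit replaces the lowest bit of one cover byte, and the names <code>lsbEncode</code>, <code>cover</code> and <code>msg</code> are hypothetical. Each thread handles one message byte and writes to its own eight cover bytes, so no two threads touch the same location.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (assumed LSB scheme): thread i hides message byte i
// in the lowest bits of cover bytes 8*i .. 8*i+7.
__global__ void lsbEncode(unsigned char* cover, const unsigned char* msg, int msgLen) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= msgLen) return;  // guard the partially filled last block
    unsigned char m = msg[i];
    for (int b = 0; b < 8; ++b) {
        int j = 8 * i + b;
        // Clear the lowest bit of the cover byte, then store bit b of the message byte.
        cover[j] = (unsigned char)((cover[j] & 0xFE) | ((m >> b) & 1));
    }
}

int main() {
    const char text[] = "hidden text";
    const int msgLen = (int)strlen(text);
    // Stand-in for the cover data (for example image bytes); 8 cover bytes per message byte.
    std::vector<unsigned char> cover(8 * msgLen, 0xAB);

    unsigned char *dCover = nullptr, *dMsg = nullptr;
    cudaMalloc(&dCover, cover.size());
    cudaMalloc(&dMsg, msgLen);
    cudaMemcpy(dCover, cover.data(), cover.size(), cudaMemcpyHostToDevice);
    cudaMemcpy(dMsg, text, msgLen, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (msgLen + threads - 1) / threads;
    lsbEncode<<<blocks, threads>>>(dCover, dMsg, msgLen);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(cover.data(), dCover, cover.size(), cudaMemcpyDeviceToHost);

    // Decode on the host to check that the message survives the round trip.
    std::vector<unsigned char> decoded(msgLen, 0);
    for (int i = 0; i < msgLen; ++i)
        for (int b = 0; b < 8; ++b)
            decoded[i] |= (unsigned char)((cover[8 * i + b] & 1) << b);
    printf("%s\n", memcmp(decoded.data(), text, msgLen) == 0 ? "round trip ok" : "mismatch");

    cudaFree(dCover);
    cudaFree(dMsg);
    return 0;
}
```

Assigning one message byte per thread keeps every write independent, which matches the later observation that the threads do not need to share any data.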
 
=== Assignment 3 ===
 
 
I improved on my previous application by cleaning up some of the logic and adding optimizations. I noticed that even on a card with compute capability 3.0, the launch did not accept a grid whose x dimension was larger than 65535, so I had to rewrite my code to stay within that limit. Within the kernel itself, there was an opportunity to pre-fetch values into registers in order to reduce latency during operations on those values. Shared memory was not required because the threads did not need to share any data.
 
 
The following is the new kernel:
 
 
[[File:CudaSteganographyKernelA3.png]]
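
The two changes described for this assignment can be sketched as follows. This is not the code in the image. It again assumes a least-significant-bit scheme, and <code>lsbEncodeStrided</code> and <code>kMaxGridX</code> are hypothetical names. A grid-stride loop is one common way to keep the grid's x dimension at or below 65535 regardless of input size (a 2D grid would be another). The message byte is read from global memory once into a local variable, which the compiler will typically keep in a register. For reference, the CUDA documentation lists 65535 as the x limit for compute capability 2.x and a larger limit from 3.0 onward, so the limit seen here may depend on the architecture the code was compiled for; that explanation is an assumption.

```cuda
#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

const int kMaxGridX = 65535;  // grid x limit reported in the text

// Hypothetical kernel: each thread walks the message in steps of the total
// thread count, so a capped grid still covers every message byte.
__global__ void lsbEncodeStrided(unsigned char* cover, const unsigned char* msg, size_t msgLen) {
    const size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < msgLen; i += stride) {
        const unsigned char m = msg[i];      // single global read, reused for all 8 bits
        unsigned char* out = cover + 8 * i;  // this thread's private slice of the cover
        for (int b = 0; b < 8; ++b) {
            const unsigned char c = out[b];
            out[b] = (unsigned char)((c & 0xFE) | ((m >> b) & 1));
        }
    }
}

int main() {
    const size_t msgLen = 5000000;  // large enough that 64-thread blocks would exceed the cap
    std::vector<unsigned char> msg(msgLen), cover(8 * msgLen, 0xAB);
    for (size_t i = 0; i < msgLen; ++i) msg[i] = (unsigned char)(i * 31 + 7);

    unsigned char *dCover = nullptr, *dMsg = nullptr;
    cudaMalloc(&dCover, cover.size());
    cudaMalloc(&dMsg, msg.size());
    cudaMemcpy(dCover, cover.data(), cover.size(), cudaMemcpyHostToDevice);
    cudaMemcpy(dMsg, msg.data(), msg.size(), cudaMemcpyHostToDevice);

    const int threads = 64;
    const size_t wanted = (msgLen + threads - 1) / threads;       // blocks for one byte per thread
    const int blocks = (int)std::min<size_t>(wanted, kMaxGridX);  // clamp to the limit
    lsbEncodeStrided<<<blocks, threads>>>(dCover, dMsg, msgLen);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(cover.data(), dCover, cover.size(), cudaMemcpyDeviceToHost);

    // Host-side check: rebuild every message byte from the cover's lowest bits.
    size_t errors = 0;
    for (size_t i = 0; i < msgLen; ++i) {
        unsigned char d = 0;
        for (int b = 0; b < 8; ++b) d |= (unsigned char)((cover[8 * i + b] & 1) << b);
        if (d != msg[i]) ++errors;
    }
    printf("blocks launched: %d, decode errors: %zu\n", blocks, errors);

    cudaFree(dCover);
    cudaFree(dMsg);
    return 0;
}
```

With the grid clamped, some threads process more than one message byte, but each byte is still owned by exactly one thread, so shared memory and synchronization remain unnecessary.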
 
 
 
The following is a run-time comparison between my old and new kernels:
 
 
 
[[File:CudaSteganographyRuntimeA3.png]]