Changes

GPUSquad

1,209 bytes added, 19:26, 11 April 2018

→‎Assignment 3

</source>

A NOTE ON SCALABILITY:

In our attempts to make the kernel scalable with ghost cells, we scaled along only one dimension, and we were inconsistent about which one. The 1D kernel scaled along the n (y) dimension, while the 2D kernels scaled along the m (x) dimension. Scaling along the x dimension made the results comparable between the serial and 2D parallelized versions of the code, but it produced distributions that were strangely banded and skewed. In other words, we made the code render weird things faster:

[[File:mDimensionScale.png]]

FINAL TIMINGS <pre style="color: red"> THE GRAPH IMMEDIATELY BELOW IS INCORRECT: there was an error recording the 1D runtimes for assignment 2</pre>

Unfortunately, because our group was unable to use the profiling tools effectively, the cause of this discrepancy remains a mystery.

In conclusion, while it may be possible to parallelize our chosen algorithm well, doing so would require two things: synchronizing shared memory properly in both block dimensions (two dimensions of ghost cells rather than the one we implemented), and allocating shared memory so that the GPU reaches maximum occupancy. Our attempts fell short. Implementing constant memory seemed to speed up the kernel a bit, but our solution was not fully scalable in both dimensions, and shared memory was not implemented in a way that improved kernel efficiency.
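To illustrate what "two dimensions of ghost cells" would involve, here is a minimal CPU-side C++ sketch of the tile-plus-halo indexing that a 2D shared-memory kernel would typically use. It is not our kernel. The 5-point averaging stencil, the names <code>sweepDirect</code>, <code>sweepTiled</code> and <code>sharedBytesPerBlock</code>, and the tile width <code>TILE</code> are all assumptions made for illustration. Each loop over a tile stands in for one thread block, and the local <code>tile</code> array stands in for shared memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tile width; on a GPU this would be the block dimension.
constexpr int TILE = 4;

// Bytes of shared memory one block would need for a square tile of floats
// with a one-cell ghost border on every side.
constexpr std::size_t sharedBytesPerBlock(int tile) {
    return static_cast<std::size_t>(tile + 2) * (tile + 2) * sizeof(float);
}

// Reference sweep: 5-point average over the interior of an m x n grid
// (row-major, m columns along x, n rows along y). Border cells are copied.
std::vector<float> sweepDirect(const std::vector<float>& in, int m, int n) {
    std::vector<float> out(in);
    for (int y = 1; y < n - 1; ++y)
        for (int x = 1; x < m - 1; ++x)
            out[y * m + x] = 0.25f * (in[y * m + x - 1] + in[y * m + x + 1] +
                                      in[(y - 1) * m + x] + in[(y + 1) * m + x]);
    return out;
}

// Tiled sweep: each (bx, by) iteration emulates one thread block.
std::vector<float> sweepTiled(const std::vector<float>& in, int m, int n) {
    std::vector<float> out(in);
    float tile[TILE + 2][TILE + 2];  // stand-in for a __shared__ array
    for (int by = 0; by < n; by += TILE) {
        for (int bx = 0; bx < m; bx += TILE) {
            // Load phase: fill the tile including ghost cells in x AND y.
            // Cells that fall outside the grid are zero-filled and never used.
            for (int ty = 0; ty < TILE + 2; ++ty) {
                for (int tx = 0; tx < TILE + 2; ++tx) {
                    int gx = bx + tx - 1;
                    int gy = by + ty - 1;
                    bool inside = gx >= 0 && gx < m && gy >= 0 && gy < n;
                    tile[ty][tx] = inside ? in[gy * m + gx] : 0.0f;
                }
            }
            // On a GPU, a __syncthreads() barrier would go here so that no
            // thread reads the tile before every thread has finished loading.

            // Compute phase: update only interior grid cells from the tile.
            for (int ty = 1; ty <= TILE; ++ty) {
                for (int tx = 1; tx <= TILE; ++tx) {
                    int gx = bx + tx - 1;
                    int gy = by + ty - 1;
                    if (gx < 1 || gx >= m - 1 || gy < 1 || gy >= n - 1) continue;
                    out[gy * m + gx] = 0.25f * (tile[ty][tx - 1] + tile[ty][tx + 1] +
                                                tile[ty - 1][tx] + tile[ty + 1][tx]);
                }
            }
        }
    }
    return out;
}
```

Calling <code>sweepTiled(grid, m, n)</code> and <code>sweepDirect(grid, m, n)</code> on the same input should give matching results even when m and n are not multiples of <code>TILE</code>, which is the property our one-dimensional scaling lacked. <code>sharedBytesPerBlock</code> gives the per-block shared memory footprint, which is one of the inputs to an occupancy estimate.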

Moverall

