Changes


Sirius

6 bytes added, 10:55, 9 April 2018
Assignment 3
Solution:
----
One way to address low compute utilization is to increase the occupancy of each SM. The machine we used for testing has a compute capability of 6.1, for which CUDA's occupancy calculator reports a limit of 32 resident blocks and 2048 resident threads per SM. To reach maximum occupancy with all 32 block slots in use, each block needs 2048/32 = 64 threads. To determine an appropriate grid size, we divide the total number of pixels by 64 threads/block, rounding up so that a partial final block covers any leftover pixels. This lets us size the grid dynamically based on the dimensions of the image passed in.
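The sizing arithmetic described above can be sketched on the host side as follows. This is a minimal illustration, not the code used in the assignment: the constant names and the helper `gridSizeFor` are hypothetical, and the limits are simply the compute capability 6.1 figures quoted in the text.

```cpp
#include <cassert>
#include <cstddef>

// Per-SM limits for compute capability 6.1, as quoted in the text.
constexpr std::size_t kMaxResidentBlocksPerSM  = 32;
constexpr std::size_t kMaxResidentThreadsPerSM = 2048;

// Block size that fills an SM when all 32 block slots are in use: 2048 / 32 = 64.
constexpr std::size_t kThreadsPerBlock =
    kMaxResidentThreadsPerSM / kMaxResidentBlocksPerSM;

// Hypothetical helper: number of blocks needed so that every pixel gets a thread.
// Uses ceiling division so a partial final block covers leftover pixels.
inline std::size_t gridSizeFor(std::size_t numPixels,
                               std::size_t threadsPerBlock = kThreadsPerBlock) {
    assert(threadsPerBlock > 0);
    return (numPixels + threadsPerBlock - 1) / threadsPerBlock;
}
```

For a 640 x 480 image this gives 307200 / 64 = 4800 blocks. Because the grid is rounded up, a kernel launched with these dimensions would need a bounds check (for example, returning early when the thread's global index is not less than the pixel count) whenever the pixel count is not a multiple of 64.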
<br><br>