GPU610/gpuchill

3,161 bytes added, 23:16, 15 February 2019
Sudoku Brute Force Solver
I believe that using a GPU to enhance this program would greatly increase its speed. All of the check functions do essentially the same thing: they iterate through the possible inserted values, looking for any that violate the rules. If these iterations can be offloaded to the GPU, there should be a corresponding increase in speed.
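As a rough illustration of the kind of check described above, here is a minimal C++ sketch. It is not taken from the profiled solver: the <code>Grid</code> type and the function names <code>isValidPlacement</code> and <code>countCandidates</code> are hypothetical, and the sketch assumes a standard 9x9 board where 0 marks an empty cell.

```cpp
#include <array>
#include <cassert>

// Hypothetical 9x9 grid type; 0 marks an empty cell.
// This is an assumption for illustration, not the solver's own data structure.
using Grid = std::array<std::array<int, 9>, 9>;

// Returns true if `value` does not already appear in the row, the column,
// or the 3x3 box that contains the cell (row, col).
bool isValidPlacement(const Grid& grid, int row, int col, int value) {
    assert(row >= 0 && row < 9 && col >= 0 && col < 9);

    // Row and column checks share one loop.
    for (int i = 0; i < 9; ++i) {
        if (grid[row][i] == value) return false;
        if (grid[i][col] == value) return false;
    }

    // Top-left corner of the 3x3 box containing (row, col).
    const int boxRow = row - row % 3;
    const int boxCol = col - col % 3;
    for (int r = boxRow; r < boxRow + 3; ++r) {
        for (int c = boxCol; c < boxCol + 3; ++c) {
            if (grid[r][c] == value) return false;
        }
    }
    return true;
}

// Counts how many of the values 1..9 are legal at one cell.
// Each call to isValidPlacement only reads the grid, so the nine
// candidate checks do not depend on one another.
int countCandidates(const Grid& grid, int row, int col) {
    int count = 0;
    for (int value = 1; value <= 9; ++value) {
        if (isValidPlacement(grid, row, col, value)) ++count;
    }
    return count;
}
```

Because the checks only read the grid, the candidate values (and the cells) could in principle be tested in parallel, which is the property a GPU version would rely on. Whether that pays off for a board this small is something profiling would have to show.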
 
===Christopher Ginac Image Processing Library===
 
I decided to profile a user-created image processing library written by Christopher Ginac; you can follow his post about the library [https://www.dreamincode.net/forums/topic/76816-image-processing-tutorial/ here]. His library lets the user manipulate images in the .PGM format. Given the right parameters, users have the following options:
 
<pre>
What would you like to do:
[1] Get a Sub Image
[2] Enlarge Image
[3] Shrink Image
[4] Reflect Image
[5] Translate Image
[6] Rotate Image
[7] Negate Image
</pre>
 
I went with the Enlarge option to see how long it would take. To get a meaningful measurement I needed a fairly large image, which meant testing the limits of both the program and the disk space allowed on my Seneca account. Because the program creates a second image, my Seneca account ran out of space and the program could not write out the newly enlarged image. I therefore settled on an image of at most 16.3MB, so that the new image could also be written, for a total of 32.6MB of space.
 
<pre>
real 0m10.595s
user 0m5.325s
sys 0m1.446s
</pre>
That isn't really bad, but when we look deeper, we see where most of our time is being spent:
 
<pre>
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self                 self     total
 time   seconds   seconds      calls   s/call   s/call  name
 21.74      1.06     1.06          1     1.06     1.06  Image::operator=(Image const&)
 21.33      2.10     1.04          2     0.52     0.52  Image::Image(int, int, int)
 18.66      3.01     0.91  154056114     0.00     0.00  Image::getPixelVal(int, int)
 15.59      3.77     0.76          1     0.76     2.34  Image::enlargeImage(int, Image&)
 14.97      4.50     0.73          1     0.73     1.67  writeImage(char*, Image&)
  3.69      4.68     0.18          2     0.09     0.09  Image::Image(Image const&)
  2.67      4.81     0.13   17117346     0.00     0.00  Image::setPixelVal(int, int, int)
  0.82      4.85     0.04          1     0.04     0.17  readImage(char*, Image&)
  0.62      4.88     0.03          1     0.03     0.03  Image::getImageInfo(int&, int&, int&)
  0.00      4.88     0.00          4     0.00     0.00  Image::~Image()
  0.00      4.88     0.00          3     0.00     0.00  std::operator|(std::_Ios_Openmode, std::_Ios_Openmode)
  0.00      4.88     0.00          1     0.00     0.00  _GLOBAL__sub_I__ZN5ImageC2Ev
  0.00      4.88     0.00          1     0.00     0.00  readImageHeader(char*, int&, int&, int&, bool&)
  0.00      4.88     0.00          1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
</pre>
 
It seems most of our time in this part of the code is spent assigning our enlarged image to the new one ('Image::operator=', 1.06s) and creating our image objects in the first place ('Image::Image(int, int, int)', 1.04s over two calls). I think that if we were to use a GPU for this process, we would see a decrease in run time for this part of the library. There also seems to be room for improvement in the 'Image::enlargeImage' function itself. I feel that by offloading that functionality onto the GPU, we can reduce its 0.76s to something even lower.
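To show why an enlarge step looks like a good GPU candidate, here is a minimal C++ sketch of an enlargement over a flat pixel buffer. It is not the library's 'Image::enlargeImage': the <code>FlatImage</code> struct and the <code>enlargeNearest</code> function are hypothetical, and the sketch assumes enlarging means repeating each source pixel as a factor-by-factor block (nearest-neighbour scaling).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat, row-major grayscale image.
// This is an assumption for illustration, not the library's Image class.
struct FlatImage {
    int rows;
    int cols;
    std::vector<int> pixels;  // holds rows * cols values
};

// Nearest-neighbour enlargement: output pixel (r, c) copies
// source pixel (r / factor, c / factor).
FlatImage enlargeNearest(const FlatImage& src, int factor) {
    assert(factor >= 1);
    assert(src.pixels.size() == static_cast<std::size_t>(src.rows) * src.cols);

    FlatImage dst{src.rows * factor, src.cols * factor, {}};
    dst.pixels.resize(static_cast<std::size_t>(dst.rows) * dst.cols);

    for (int r = 0; r < dst.rows; ++r) {
        for (int c = 0; c < dst.cols; ++c) {
            // Each output pixel reads exactly one source pixel and writes
            // exactly one destination slot, so iterations are independent.
            const std::size_t srcIdx =
                static_cast<std::size_t>(r / factor) * src.cols + c / factor;
            const std::size_t dstIdx =
                static_cast<std::size_t>(r) * dst.cols + c;
            dst.pixels[dstIdx] = src.pixels[srcIdx];
        }
    }
    return dst;
}
```

Since no iteration of the double loop depends on another, each (r, c) pair could map to one GPU thread. Indexing a flat buffer directly also avoids a function call per pixel, which may matter given how often 'Image::getPixelVal' is called in the profile above.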
=== Assignment 2 ===
=== Assignment 3 ===