Test Team Please Ignore
01:16, 17 December 2015
=== Assignment 3 ===
This chart compares the program's run time across the assignments: a serial approach, a naive GPU approach, and a more optimized GPU approach.
The original algorithm did all of its calculations in doubles, so the biggest speed increase came from switching the data type from double to float.
While this reduced the precision, the end result looked about the same, and extreme precision isn't needed for this visualization anyway.
Additional speed increases were gained from reducing global memory access in the kernels and moving some initialization work outside of them.
Furthermore, we've used Thrust to speed up a few more parts of the calculations that were not using the GPU previously.
Lastly, we've reduced the number of memory allocations by allocating a single large block and splitting it up manually.
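The single-allocation idea can be sketched on the host side as follows. This is a minimal illustration, not the project's actual code: `FloatPool`, `Buffers`, `makeBuffers`, and the array names `a`, `b`, `c` are all hypothetical. On the GPU the same pattern would apply to one `cudaMalloc` call whose result is split with pointer offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: one backing allocation that hands out sub-ranges.
struct FloatPool {
    std::vector<float> storage;  // the single large block
    std::size_t used = 0;        // number of floats handed out so far

    explicit FloatPool(std::size_t total) : storage(total) {}

    // Return a pointer to the next `count` floats inside the block.
    float* take(std::size_t count) {
        assert(used + count <= storage.size());  // the block must be large enough
        float* p = storage.data() + used;
        used += count;
        return p;
    }
};

// Three illustrative arrays that would otherwise be allocated separately.
struct Buffers {
    float* a;
    float* b;
    float* c;
};

// Carve all three arrays out of the one pool, back to back.
Buffers makeBuffers(FloatPool& pool, std::size_t n) {
    Buffers bufs;
    bufs.a = pool.take(n);
    bufs.b = pool.take(n);
    bufs.c = pool.take(n);
    return bufs;
}
```

With this layout, `FloatPool pool(3 * n); Buffers bufs = makeBuffers(pool, n);` performs one allocation instead of three, and everything is released together when the pool goes out of scope.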
There was one large part of the program's run time we could not optimize during this course. The image was generated in the PPM format, which is an ASCII text file.
The slowest part was going through the entire array of pixels and composing the string output. Even an image of 5000 pixels on a side would result in a file gigabytes in size.
While I didn't realize it at the time, we could have calculated a maximum width for each row, allocated enough memory up front, and split the work by giving each thread a row to format.
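A minimal sketch of that row-slot layout is below, assuming 8-bit RGB pixels and one image row per text line. The names `putChannel`, `formatRow`, `formatImage`, and `kMaxCharsPerPixel` are hypothetical. The row loop runs sequentially here to keep the example short, but each call to `formatRow` reads only its own pixels and writes only its own slot, so the rows could be handed to separate threads, one row each.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Each 8-bit channel prints as at most 3 digits plus 1 separator,
// so one RGB pixel needs at most 12 characters.
constexpr std::size_t kMaxCharsPerPixel = 12;

// Write the decimal digits of v (0..255) to out; return how many were written.
std::size_t putChannel(unsigned v, char* out) {
    std::size_t n = 0;
    if (v >= 100) out[n++] = static_cast<char>('0' + v / 100);
    if (v >= 10)  out[n++] = static_cast<char>('0' + (v / 10) % 10);
    out[n++] = static_cast<char>('0' + v % 10);
    return n;
}

// Format one row of RGB pixels into its slot; return the number of chars used.
std::size_t formatRow(const unsigned char* rgb, std::size_t width, char* out) {
    std::size_t len = 0;
    for (std::size_t i = 0; i < width * 3; ++i) {
        len += putChannel(rgb[i], out + len);
        out[len++] = ' ';
    }
    out[len - 1] = '\n';  // the trailing separator becomes the line break
    return len;
}

// Build the whole ASCII PPM (P3) text using fixed-size, worst-case row slots.
std::string formatImage(const std::vector<unsigned char>& rgb,
                        std::size_t width, std::size_t height) {
    assert(width > 0);
    assert(rgb.size() == width * height * 3);
    const std::size_t slot = width * kMaxCharsPerPixel;  // maximum row length
    std::vector<char> scratch(slot * height);
    std::vector<std::size_t> lengths(height);

    // Independent per-row work: this is the loop that could be parallelized.
    for (std::size_t row = 0; row < height; ++row) {
        lengths[row] = formatRow(&rgb[row * width * 3], width, &scratch[row * slot]);
    }

    // Pack the rows together, dropping the unused tail of each slot.
    std::string text = "P3\n" + std::to_string(width) + " " +
                       std::to_string(height) + "\n255\n";
    for (std::size_t row = 0; row < height; ++row) {
        text.append(&scratch[row * slot], lengths[row]);
    }
    return text;
}
```

The trade-off is memory: the scratch buffer is sized for the worst case of 12 characters per pixel, even though most rows will use less.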
An alternative is to create the image in a more familiar binary format like PNG or JPG, which lend themselves to optimization much more easily.
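PNG and JPG both need an encoder library, so as a simpler stand-in the sketch below uses the binary variant of PPM (magic number `P6`) rather than either of those formats. It only illustrates why a binary layout is easier to produce: every pixel occupies exactly three bytes, so there is no per-value string conversion at all. The function name `encodeBinaryPpm` is hypothetical, and 8-bit RGB input is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Encode 8-bit RGB pixels as a binary PPM (P6): a short text header
// followed by the raw pixel bytes, three per pixel.
std::string encodeBinaryPpm(const std::vector<unsigned char>& rgb,
                            std::size_t width, std::size_t height) {
    assert(rgb.size() == width * height * 3);
    std::string out = "P6\n" + std::to_string(width) + " " +
                      std::to_string(height) + "\n255\n";
    // The pixel data is copied as-is; no formatting step is needed.
    out.append(reinterpret_cast<const char*>(rgb.data()), rgb.size());
    return out;
}
```

When saving the result, the output stream should be opened with `std::ios::binary` so that the pixel bytes are written unmodified.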