Unique Project Page

From CDOT Wiki
Revision as of 14:00, 24 February 2017 by Jkraitberg (talk | contribs)

Profiling

Introduction : GPU Benchmarking/Testing using Mandelbrot Sets : Kartik Nagarajan

This program generates Mandelbrot sets on the CPU and then saves them to the folder as PNG files using the FreeImage library.

The program is open source and can be fetched directly from GitHub at https://github.com/sol-prog/Mandelbrot_Set

FreeImage must be installed to compile the program.


Compilation Instructions:

For Unix-based systems:

         g++ -std=c++11 save_image.cpp utils.cpp mandel.cpp -lfreeimage

OSX:

         clang++ -std=c++11 save_image.cpp utils.cpp mandel.cpp -lfreeimage

The program can then be executed by running the compiled binary; it displays the time it took to generate the Mandelbrot set and save the pictures.
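The timing report described above can be reproduced around any piece of work with std::chrono. The following is only a minimal sketch: time_ms is a hypothetical helper name and is not taken from the Mandelbrot_Set repository, whose own timing code may differ.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of the Mandelbrot_Set repository):
// runs `work` once and returns the elapsed wall-clock time in milliseconds.
double time_ms(const std::function<void()>& work) {
    assert(static_cast<bool>(work));  // an empty std::function cannot be timed
    // steady_clock is monotonic, so system clock adjustments do not affect it.
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping the image generation and the file saving in separate time_ms calls would show how much of the reported time is computation and how much is disk output.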

Observations

The program takes a significant amount of time to run as the calculations are being done on the CPU. There are nested loops present within the program that can be parallelized to make the program faster.

The image size and the iteration count are hard-coded. Increasing them makes the program take significantly longer to run, which makes the workload demanding enough for GPU benchmarking, and running the process in a loop allows stability testing. The code is relatively straightforward, so the parallelization should be easy to implement and test.


Hotspot

The hotspot was found in the fractal() function, which calls get_iterations(); that function contains two nested for loops and a call to escape(), which contains a while loop. Profiling with Instruments on OSX showed that fractal() accounts for most of the runtime, so this is the function that will be parallelized using CUDA. Once it is parallelized, the iteration count and image size can be increased to make the computation stressful enough to benchmark a GPU, or the run can be looped to stress-test GPUs.
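The loop structure described above (two nested loops over pixels, with a while loop per pixel) can be sketched as follows. This is an illustration of the general escape-time technique, not the repository's code: escape_count and iterate_region are hypothetical names, and the escape radius of 2 is the usual choice for the Mandelbrot set rather than something confirmed from the source.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-point test: counts how many times
// z = z*z + c is applied before |z| exceeds 2, capped at max_iter.
int escape_count(std::complex<double> c, int max_iter) {
    assert(max_iter > 0);
    std::complex<double> z(0.0, 0.0);
    int iter = 0;
    while (std::norm(z) <= 4.0 && iter < max_iter) {  // std::norm(z) is |z|^2
        z = z * z + c;
        ++iter;
    }
    return iter;
}

// Hypothetical stand-in for the nested loops: one escape_count call per pixel,
// sampling the complex plane on a width x height grid starting at (x_min, y_min).
std::vector<int> iterate_region(int width, int height,
                                double x_min, double x_max,
                                double y_min, double y_max, int max_iter) {
    assert(width > 0 && height > 0);
    std::vector<int> counts(static_cast<std::size_t>(width) * height);
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            const double x = x_min + (x_max - x_min) * col / width;
            const double y = y_min + (y_max - y_min) * row / height;
            // No pixel reads another pixel's result, so iterations are independent.
            counts[static_cast<std::size_t>(row) * width + col] =
                escape_count({x, y}, max_iter);
        }
    }
    return counts;
}
```

Because each pixel is computed independently, the two outer loops are the natural candidates to map onto GPU threads, with the while loop remaining as per-thread work.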


Profiling Data Screenshots

Screenshot: Profile

Screenshot: Hotspot Code


Introduction : GPU Benchmarking/Testing for NBody : Joshua Kraitberg

This program uses Newtonian mechanics and a fourth-order symplectic Candy-Rozmus integrator (a symplectic algorithm conserves angular momentum and keeps the energy error bounded over long runs instead of letting it drift). The initial conditions are obtained from JPL Horizons, and constants (such as masses and the gravitational constant) are those recommended by the International Astronomical Union. The program currently does not take into account effects like general relativity, the non-spherical shapes of celestial objects, tidal effects on Earth, etc. It also does not include the 500 asteroids used by JPL Horizons in its model of the Solar System.
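To make the idea of a fourth-order symplectic step concrete, here is a minimal sketch on a one-dimensional unit harmonic oscillator instead of the Solar System. The coefficient values are the commonly published ones for the fourth-order Forest-Ruth / Candy-Rozmus scheme; State, force, symplectic4_step and energy are hypothetical names, and whether CRO_step in this project orders its stages the same way is an assumption.

```cpp
#include <cassert>
#include <cmath>

struct State { double q; double p; };  // position and momentum, unit mass

// Force for the hypothetical test problem: unit harmonic oscillator, F = -q.
inline double force(double q) { return -q; }

// One fourth-order symplectic step built from alternating position updates
// ("drifts") and momentum updates ("kicks"). The last kick weight is zero,
// so the step is the symmetric sequence drift-kick-drift-kick-drift-kick-drift.
State symplectic4_step(State s, double dt) {
    assert(dt > 0.0);
    const double theta = 1.0 / (2.0 - std::cbrt(2.0));
    const double drift[4] = {theta / 2.0, (1.0 - theta) / 2.0,
                             (1.0 - theta) / 2.0, theta / 2.0};
    const double kick[4] = {theta, 1.0 - 2.0 * theta, theta, 0.0};
    for (int i = 0; i < 4; ++i) {
        s.q += drift[i] * dt * s.p;         // advance position with current momentum
        s.p += kick[i] * dt * force(s.q);   // advance momentum with updated position
    }
    return s;
}

// Total energy of the oscillator, used to monitor the integration error.
double energy(State s) { return 0.5 * (s.p * s.p + s.q * s.q); }
```

Calling symplectic4_step repeatedly and comparing energy before and after gives a quick check that the energy error stays small over many steps. In the N-body setting, force would be replaced by the gravitational force on each body and q, p by vectors.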

Source

Compilation Instructions:

For Unix/Linux-based systems:

   g++ -std=c++11 c++/nbody.cpp

Observations

The program is quite fast for a single-threaded CPU application. Almost all the CPU time is spent manipulating data or iterating over vectors.

Hotspot

Essentially all of the running time is spent in the dowork function, which iteratively calls the CRO_step function found in the integrators.h file. CRO_step is where most of the vector calculations take place.
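As an illustration of the kind of vector arithmetic a Newtonian N-body step has to perform, the sketch below computes the gravitational acceleration on every body from all the others. It is not taken from integrators.h: Vec3, Body and accelerations are hypothetical names, and the gravitational constant is passed in so the caller chooses the unit system.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };
struct Body { Vec3 pos; double mass; };

// Hypothetical helper: acceleration on each body due to every other body,
// a_i = sum over j != i of G * m_j * (r_j - r_i) / |r_j - r_i|^3.
std::vector<Vec3> accelerations(const std::vector<Body>& bodies, double G) {
    assert(G > 0.0);
    std::vector<Vec3> acc(bodies.size(), Vec3{0.0, 0.0, 0.0});
    for (std::size_t i = 0; i < bodies.size(); ++i) {
        for (std::size_t j = 0; j < bodies.size(); ++j) {
            if (i == j) continue;  // a body exerts no force on itself
            const double dx = bodies[j].pos.x - bodies[i].pos.x;
            const double dy = bodies[j].pos.y - bodies[i].pos.y;
            const double dz = bodies[j].pos.z - bodies[i].pos.z;
            const double r2 = dx * dx + dy * dy + dz * dz;
            assert(r2 > 0.0);  // coincident bodies are not handled in this sketch
            const double scale = G * bodies[j].mass / (r2 * std::sqrt(r2));
            acc[i].x += scale * dx;
            acc[i].y += scale * dy;
            acc[i].z += scale * dz;
        }
    }
    return acc;
}
```

The double loop costs on the order of N squared operations per call, and each body's sum is independent of the others, which is why this kind of calculation is a common target for GPU parallelization.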

Profiling Data and Screenshots