Difference between revisions of "BetaT"


Revision as of 19:14, 15 February 2017

BetaT

Assignment 1

Profile Assessment

Navier–Stokes equation for Flow Velocity.

There are many different wave equations; this one is based on the Navier–Stokes equations.

All this program does is calculate the deviation of the wave: its velocity and its dip. The wave travels in one direction only and is going to drop; the program determines at what angle and at what velocity.

The wave equation takes the distance between wave troughs (that is, the distance between two successive waves), the height of the wave, and the time it takes each wave to reach its destination. From these it calculates the speed of the wave per second.

Navier–Stokes equations are useful because they describe the physics of many phenomena of scientific and engineering interest. They may be used to model the weather, ocean currents, water flow in a pipe and air flow around a wing. The Navier–Stokes equations in their full and simplified forms help with the design of aircraft and cars, the study of blood flow, the design of power stations, the analysis of pollution, and many other things. Coupled with Maxwell's equations they can be used to model and study magnetohydrodynamics. (Courtesy of Wikipedia: https://en.wikipedia.org/wiki/Navier%E2%80%93Stokes_equations)

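The problem with this application lies in the main function, where the finite-difference calculation is performed:

 // Finite-difference loop: each pass computes time step it from step it-1.
 for (int it = 1; it <= nt - 1; it++)
 {
     // Copy the previous time step into the work array un.
     for (int k = 0; k <= nx - 1; k++)
     {
         un[k][it-1] = u[k][it-1];
     }
     for (int i = 1; i <= nx - 1; i++)
     {
         // Boundary value at i = 0 (the same value is written on every pass).
         u[0][it] = un[1][it-1];
         // Backward-difference update for point i.
         u[i][it] = un[i][it-1] - c*dt/dx*(un[i][it-1] - un[i-1][it-1]);
     }
 }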


The user inputs two values, which are used as the bounds of the loops above.
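The update rule in the loop can be read as u[i][it] = u[i][it-1] - (c*dt/dx) * (u[i][it-1] - u[i-1][it-1]). The following is a minimal, self-contained sketch of that rule, not the project's actual source. The function name `upwind`, the initial profile `u0`, and passing `c`, `dt` and `dx` as parameters are all assumptions made for illustration. The sketch also leaves out the `un` copy, since each step only reads column it-1 and only writes column it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the original program): applies the same
// backward-difference update as the loop above to an initial profile u0.
// Returns u[i][it], the value at grid point i and time step it.
std::vector<std::vector<double>> upwind(const std::vector<double>& u0,
                                        int nt, double c, double dt, double dx)
{
    assert(u0.size() >= 2 && nt >= 1);  // the boundary rule reads u[1]
    const int nx = static_cast<int>(u0.size());

    std::vector<std::vector<double>> u(nx, std::vector<double>(nt, 0.0));
    for (int i = 0; i < nx; i++) {
        u[i][0] = u0[i];  // time step 0 is the initial profile
    }

    const double ratio = c * dt / dx;  // constant factor in the update
    for (int it = 1; it < nt; it++) {
        // Boundary value at i = 0, same rule as the original loop.
        u[0][it] = u[1][it - 1];
        for (int i = 1; i < nx; i++) {
            u[i][it] = u[i][it - 1]
                     - ratio * (u[i][it - 1] - u[i - 1][it - 1]);
        }
    }
    return u;
}
```

When c*dt/dx equals 1, the update reduces to u[i][it] = u[i-1][it-1], so the profile simply shifts one grid point per time step, which gives a quick sanity check for the loop.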

Testing the application

Tests were run with no optimization.

Navier–Stokes Equation

 n                Time (milliseconds)
 100 x 100        24
 500 x 500        352
 1000 x 1000      1090
 2000 x 2000      3936
 5000 x 5000      37799
 5000 x 10000     65955
 10000 x 10000    118682
 12500 x 12500    220198


gprof

The output gets a bit messy, but basically 89.19% of the program's run time is spent in main(), executing the for loops shown above. The remaining time is spent in std::vector::operator[], accessing the memory that holds the arrays; that memory might cause some slowdown in the future when it is transferred to the GPU across the bus.

The main thing to take away here is that main() accounts for 89.19% of the run time and takes 97 seconds.

Each sample counts as 0.01 seconds.

 %   cumulative   self              self     total           
time   seconds   seconds    calls   s/call   s/call  name    
89.19     97.08    97.08                             main
 4.73    102.22     5.14 1406087506     0.00     0.00  std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::operator[](unsigned int)
 4.49    107.11     4.88 1406087506     0.00     0.00  std::vector<double, std::allocator<double> >::operator[](unsigned int)

Potential Speed Increase with Amdahl's Law

Using Amdahl's Law ---- > Sn = 1 / ( 1 - P + P/n )

We can examine how much our program is capable of increasing its speed.

P is the part of the program we want to optimize, which from above is 89.19%. n is the number of processors we will use. One GPU card has 384 processors (CUDA cores) and the other GPU we will use has 1024 processors (CUDA cores).

Applying the formula gives us:

Amdahl's Law for a GPU with 384 cores ---- > Sn = 1 / ( 1 - 0.8919 + 0.8919/384 )

                                        Sn = 9.0561125222

Amdahl's Law for a GPU with 1024 cores ---- > Sn = 1 / ( 1 - 0.8919 + 0.8919/1024 )

                                         Sn = 9.176753777

Therefore, according to Amdahl's Law, we can expect roughly a 9x increase in speed: 97 seconds to execute main() / 9 = about 10.8 seconds to execute after moving to the GPU. Interestingly, according to the law, the difference in the number of GPU cores does not significantly increase the speed. Future tests will confirm or deny these results.
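The Amdahl's Law arithmetic above can be checked with a few lines of C++. This is only a sketch: the function names `amdahl_speedup` and `amdahl_limit` are made up for illustration and are not part of the profiled program. The second function evaluates 1 / ( 1 - P ), the value the formula approaches as n grows, which shows why going from 384 to 1024 cores changes the result so little.

```cpp
#include <cassert>

// Amdahl's Law: Sn = 1 / (1 - P + P/n)
// p: fraction of the run time that can be parallelised (0 to 1)
// n: number of processors
double amdahl_speedup(double p, int n)
{
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    return 1.0 / ((1.0 - p) + p / n);
}

// Value the speedup approaches as n grows without bound: 1 / (1 - P).
double amdahl_limit(double p)
{
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

With P = 0.8919, `amdahl_speedup` gives about 9.056 for n = 384 and about 9.177 for n = 1024, while `amdahl_limit` gives about 9.25, so no number of cores can push the predicted speedup past that value.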


Potential Speed Increase with Gustafson's Law

Gustafson's Law: S(n) = n - ( 1 - P ) ∙ ( n - 1 )

(Quadro K2000 GPU, 384 cores) S = 384 - ( 1 - 0.8919 ) * ( 384 - 1 ) = 342.5977

(GeForce GTX960 GPU, 1024 cores) S = 1024 - ( 1 - 0.8919 ) * ( 1024 - 1 ) = 913.4137


Using Gustafson's Law we see a drastic change in the predicted speed increase. This time the additional cores make a big difference, and applying these speedups we get:

(Quadro K2000 GPU) 97 seconds to execute / 342.5977 = 0.28 seconds

(GeForce GTX960 GPU) 97 seconds to execute / 913.4137 = 0.11 seconds
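As with Amdahl's Law, the Gustafson's Law formula is easy to evaluate in code for any P and n. The sketch below is illustrative only, and the function name `gustafson_speedup` is an assumption rather than anything from the original program.

```cpp
#include <cassert>

// Gustafson's Law: S(n) = n - (1 - P) * (n - 1)
// p: fraction of the run time that can be parallelised (0 to 1)
// n: number of processors
double gustafson_speedup(double p, int n)
{
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    return n - (1.0 - p) * (n - 1);
}
```

For example, `gustafson_speedup(0.8919, 1024)` evaluates to about 913.41. Unlike the Amdahl formula, this expression grows linearly with n, which is why the core count has such a large effect on the prediction.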