GPU610/SSD

3,520 bytes added, 15:35, 19 April 2013
This code performs harmonic analysis of turn-by-turn data from a circular accelerator (measured data or data from numerical simulation).
It is heavily used in optics reconstruction for the LHC ([http://en.wikipedia.org/wiki/Large_Hadron_Collider Large Hadron Collider]), and the same tools are being exported to the lower-energy machines at CERN. SUSSIX is a FORTRAN program for post-processing turn-by-turn BPM (beam position monitor) data; it computes the frequency, amplitude, and phase of tunes and resonant lines to a high degree of precision using an interpolated FFT ([http://en.wikipedia.org/wiki/Fast_Fourier_transform fast Fourier transform]).
For analysis of LHC BPM data, a specific version, sussix4drive (the FORTRAN code), run through the C steering code ''Drive God lin'' (the C code), has been implemented in the CCC by the beta-beating team.
To see the whole document, [http://matrix.senecac.on.ca/~sganouts/SUSSIX/Sussix_Project.doc click here].
 
== Team Members ==
# [mailto:sganouts@myseneca.ca?subject=GPU610 Sezar Gantous]
I decided to profile the CERN project Drive_God_Lin.
After profiling the project with gprof using the test data provided, I learned the following:
 
System specifications:
 
OS: Xubuntu (Ubuntu 12.10 (quantal)) 36-bit
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz
RAM: 1GB DDR2
GPU: nVIDIA GEFORCE GT 620
(summary of gprof)
=== Assignment 2 ===
 
Progress so far:
<br/>
- Fortran code converted to C with a simple conversion program found [http://www.webmo.net/support/f2c_linux.html Here]
 
 
- The project is on [https://github.com/sezar-gantous/GPU610-CERN GitHub] (with the modified makefile and the C-converted code)
 
The profile with the C converted code
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls  ms/call  ms/call  name
  38.75     19.47    19.47   314400     0.06     0.06  zfunr_
  30.04     34.56    15.09      524    28.80    28.80  ordres_
   9.83     39.50     4.94                             f__cabs
 
As you can see, there is no real difference or improvement just yet.
 
System specifications:
 
OS: Xubuntu (Ubuntu 12.10 (quantal)) 36-bit
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz
RAM: 1GB DDR2
GPU: nVIDIA GEFORCE GT 620
 
== Closest Pair ==
 
'''Unfortunately, we had to abandon the project since it was not really compatible with CUDA (there is no time to look into it further, as the semester is almost over). Team TudyBert, who are working on the same project, found that CUDA uses dynamic libraries whereas Drive_God_Lin uses static libraries, and there isn’t much to be done about that. As a result, we will be using Stephanie’s closest pair program.'''
 
 
The closest pair problem can be explained simply by imagining a random set of points spread in a [http://en.wikipedia.org/wiki/Metric_space metric space]. The task of finding the two points with the smallest distance between them is called the closest pair problem.
 
 
One of the more common algorithms for finding the closest pair is the brute-force algorithm, which calculates the distance between every pair of points. For n points (n = total number of points) that is
 
 n(n-1)/2
 
distance calculations, i.e. O(n^2). It then simply picks the pair of points with the smallest distance between them. However, this algorithm proved to be slow.
This is the function used in the closestPair.c program that Stephanie used for her assignment 1:
 
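 /* Checks every pair (i, j) with j > i and keeps the closest one in *a and *b. */
 double brute_force(point *pts, int max_n, point *a, point *b)
 {
     int i, j;
     double d, min_d = MAXDOUBLE;   /* smallest distance found so far */
     for (i = 0; i < max_n; i++) {
         for (j = i + 1; j < max_n; j++) {
             d = dist(pts[i], pts[j]);
             if (d >= min_d) continue;   /* not an improvement, skip */
             *a = pts[i];
             *b = pts[j];
             min_d = d;
         }
     }
     return min_d;
 }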
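
The excerpt above relies on a point type and a dist helper that are not shown here. As a rough, self-contained sketch of the same double loop, the block below assumes that point is a struct holding two doubles and swaps in a hypothetical dist_sq helper that returns the squared Euclidean distance (the closest pair is the same either way, and no math library is needed). The names point, dist_sq and brute_force_sq are illustrative assumptions, not the definitions from closestPair.c, and DBL_MAX stands in for MAXDOUBLE.

```c
#include <assert.h>
#include <float.h>

/* Assumed layout: the original point type is not shown in the source. */
typedef struct { double x, y; } point;

/* Hypothetical helper: squared Euclidean distance between p and q. */
static double dist_sq(point p, point q)
{
    double dx = p.x - q.x;
    double dy = p.y - q.y;
    return dx * dx + dy * dy;
}

/* Same loop structure as brute_force, but comparing squared distances.
   Returns the squared distance of the closest pair, stored in *a and *b. */
static double brute_force_sq(const point *pts, int max_n, point *a, point *b)
{
    double min_d = DBL_MAX;          /* smallest squared distance so far */
    assert(max_n >= 2);              /* a pair needs at least two points */
    for (int i = 0; i < max_n; i++) {
        for (int j = i + 1; j < max_n; j++) {
            double d = dist_sq(pts[i], pts[j]);
            if (d >= min_d) continue;    /* not an improvement, skip */
            *a = pts[i];
            *b = pts[j];
            min_d = d;
        }
    }
    return min_d;
}
```

Calling brute_force_sq on the five points (0,0), (5,5), (1,1), (9,2), (5,6) should pick (5,5) and (5,6); taking the square root of the returned value gives the actual distance.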
 
This is the profile:
 
 Flat profile:
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls  ms/call  ms/call  name
  99.78     41.58    41.58    16384     2.54     2.54  brute_force
   0.14     41.64     0.06                             closest
   0.05     41.66     0.02                             cmp_x
   0.02     41.67     0.01                             cmp_y
 
 
System specifications:
 
OS: Ubuntu 12.04 32-bit
CPU: Intel(R) Core 2 Duo(R) CPU E4600 @ 2.40GHz x 2
RAM: 2GB DDR2
GPU: nVIDIA GEFORCE GT 620
 
 
Since brute_force accounts for 99.78% of the run time, we presume that parallelizing it with CUDA will speed up the process significantly.
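
One way the work might be split for a GPU can be sketched in plain C: the inner loop for a given index i does not depend on any other i, so each i could in principle be handled by its own thread, followed by a reduction over the per-point minima. The block below is only a serial stand-in for that idea, not the team's CUDA code. The point layout, the dist_sq helper (squared distance) and the function names nearest_after and closest_pair_split are all assumptions made for illustration.

```c
#include <assert.h>
#include <float.h>

/* Assumed layout: the original point type is not shown in the source. */
typedef struct { double x, y; } point;

/* Hypothetical helper: squared Euclidean distance between p and q. */
static double dist_sq(point p, point q)
{
    double dx = p.x - q.x;
    double dy = p.y - q.y;
    return dx * dx + dy * dy;
}

/* Work for a single index i: find the nearest point among pts[i+1..n-1].
   It reads pts and writes only slot i, so every i is independent. */
static void nearest_after(const point *pts, int n, int i,
                          double *best_d, int *best_j)
{
    double min_d = DBL_MAX;
    int min_j = -1;                  /* -1 means "no later point" */
    for (int j = i + 1; j < n; j++) {
        double d = dist_sq(pts[i], pts[j]);
        if (d < min_d) { min_d = d; min_j = j; }
    }
    best_d[i] = min_d;
    best_j[i] = min_j;
}

/* Serial stand-in for "one thread per i" followed by a min-reduction.
   best_d and best_j are caller-provided scratch arrays of length n.
   Returns the squared distance of the closest pair (*out_i, *out_j). */
static double closest_pair_split(const point *pts, int n,
                                 double *best_d, int *best_j,
                                 int *out_i, int *out_j)
{
    int bi = 0;
    assert(n >= 2);
    for (int i = 0; i < n; i++)      /* on a GPU: one thread per i */
        nearest_after(pts, n, i, best_d, best_j);
    for (int i = 1; i < n; i++)      /* reduction over per-point minima */
        if (best_d[i] < best_d[bi]) bi = i;
    *out_i = bi;
    *out_j = best_j[bi];
    return best_d[bi];
}
```

The per-point work is uneven (index 0 scans n-1 points, the last index scans none), so a real kernel would probably need some load balancing; that is left out of this sketch.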
 
GitHub link: [https://github.com/sezar-gantous/GPU610-ClosestPair here]
 
=== Assignment 3 ===