Difference between revisions of "TudyBert"

Revision as of 17:29, 12 February 2013



Drive_God

Tudy and I are working on a branch of the LHC (Large Hadron Collider) BPM (beam position monitor) analysis code, with the goal of parallelizing it on GPUs. The branch can be found on GitHub at https://github.com/rogeliotomas/Drive_GPU. This code performs harmonic analysis of turn-by-turn data from a circular accelerator (measured data or data from numerical simulation). It is heavily used in the optics reconstruction for the LHC, and the same tools are being exported to the lower-energy machines at CERN.

Team Members

  1. Robert Stanica
  2. Tudy Minea
  3. Email All

Progress

Assignment 1

Robert Stanica:

Analysis of all LHC BPMs is a major real-time computational bottleneck. As such, an effort has been underway to reduce the real (wall-clock) run time of the SUSSIX code by parallelizing it. This has been achieved using the OpenMP API in both the Fortran and C implementations of the code. Since the code already parallelizes on the CPU, we expect that it can also be parallelized to run on the GPU.


For Assignment 1, I profiled sussix4drivexxNoO.f and Drive_God_lin.c. Most of the time was spent in two subroutines in the Fortran portion of the program.


  %   cumulative    self               self     total
 time   seconds    seconds     calls  ms/call  ms/call  name
47.04      5.80       5.80    311575     0.02     0.02  zfunr_
25.63      8.96       3.16       522     6.05     6.05  ordres_
 8.43     10.00       1.04    312021     0.00     0.00  cfft_
 3.89     10.48       0.48    312559     0.00     0.02  tunelasr_


Together, zfunr and ordres account for about 73% of the total run time (47.04% + 25.63%), so we'll focus on parallelizing these two subroutines first.
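
As a rough sanity check on that choice, the profile above can be plugged into Amdahl's law to see how much overall speedup is available if only zfunr and ordres are accelerated. This is a minimal sketch, not part of the Drive_God code: it assumes the "% time" column approximates each routine's share of wall-clock time, and the GPU speedup factors in the loop are hypothetical values chosen only for illustration.

```python
# Percentages copied from the "% time" column of the profile above.
PROFILE = {
    "zfunr_": 47.04,
    "ordres_": 25.63,
    "cfft_": 8.43,
    "tunelasr_": 3.89,
}


def amdahl_speedup(parallel_fraction, kernel_speedup):
    """Overall speedup when only `parallel_fraction` of the run time
    is accelerated by a factor of `kernel_speedup` (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    serial_part = 1.0 - parallel_fraction
    return 1.0 / (serial_part + parallel_fraction / kernel_speedup)


# Share of the run time spent in the two routines we plan to parallelize.
target_fraction = (PROFILE["zfunr_"] + PROFILE["ordres_"]) / 100.0

# Limit of the overall speedup as the two routines become infinitely fast.
upper_bound = 1.0 / (1.0 - target_fraction)

if __name__ == "__main__":
    print(f"fraction in zfunr_ + ordres_: {target_fraction:.4f}")
    # Hypothetical GPU speedups for the two routines (assumptions, not measurements).
    for kernel_speedup in (2.0, 10.0, 100.0):
        overall = amdahl_speedup(target_fraction, kernel_speedup)
        print(f"kernel speedup {kernel_speedup:6.1f}x -> overall {overall:.2f}x")
    print(f"upper bound on overall speedup: {upper_bound:.2f}x")
```

Under these assumptions, speeding up only these two routines caps the overall gain at roughly 3.7x, which is why the remaining routines (such as cfft) may be worth revisiting once zfunr and ordres are done.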

Assignment 2

Assignment 3