GPU610/TeamKCM

370 bytes added, 15:20, 5 December 2014
===== Sample Data = 500 =====
[[File:a3 as3v1-500.png]]
===== Sample Data = 1000 =====
[[File:a3 as3v1-2500.png]]
===== Sample Data = 5000 =====
[[File:a3 as3v1-5000.png]]
===== Sample Data = 10000 =====
[[File:a3 as3v1-10000.png]]
==== Chart ====
[[File:a3 duration chart.png]]
 
 
=== Assignment 3 v2 ===
==== Description of v2 ====
Changed double-precision values to single-precision floats.
 
==== Sample Data = 500 ====
[[File:as3v2-500.png]]
 
==== Sample Data = 1000 ====
[[File:as3v2-1000.png]]
 
==== Sample Data = 5000 ====
[[File:as3v2-5000.png]]
 
==== Sample Data = 10000 ====
[[File:as3v2-10000.png]]
 
==== Chart ====
[[File:Table.png]]
 
 
==== Conclusions / Problems Encountered ====
Using CUDA, our team achieved roughly an 8000% speed-up in total run time compared to the original project, while producing the same final result. We were confident that adding a kernel to the original project would yield a large speed-up, because each heat point at a given time was computed serially in a loop. To start, we focused on understanding the original project: finding the "hot spot" in the code and identifying the different variables and their uses. Working with the heat-equation problem was challenging, because we had to understand how heat points are calculated from the equation and determine which parts and variables were, and were not, needed to develop a kernel. The original project took no user input, so we also had to decide which part of the equation should depend on user input.

As we developed the kernel and worked with the program's resources, we had to decide how to use the different memory spaces (accessing global memory, using shared memory, etc.). A core piece of the program's logic is that computing a new heat value requires the previous heat value, so while developing the kernel our team had to figure out how many memory copies were needed and which parts could stay on the GPU side (in shared memory and registers). In the end, we managed to minimize the number of memory copies between CPU and GPU and to make greater use of shared memory and registers.