# Difference between revisions of "Group 6"


## Revision as of 22:58, 16 March 2019


# Group 6

## Team Members

## Progress

### Assignment 1 - Select and Assess

### Array Processing

Subject: Array Processing

Blaise Barney's introduction to parallel computing (https://computing.llnl.gov/tutorials/parallel_comp/) presents array processing as one of its parallel examples. The example "demonstrates calculations on 2-dimensional array elements; a function is evaluated on each array element."

A standard random-number method is used to initialize the 2-dimensional arrays. The purpose of this program is to perform a calculation on 2-dimensional arrays, which in this example is a matrix-matrix multiplication.
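The program's source is not shown on this page, so the following is only a minimal sketch of what the two profiled functions might look like. The signatures `init(float**, int)` and `multiply(float**, float**, float**, int)` come from the profile below; the bodies, the use of `std::rand`, and the helpers `allocate` and `release` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: allocate an n x n matrix as an array of row pointers.
float** allocate(int n) {
    assert(n > 0);
    float** m = new float*[n];
    for (int i = 0; i < n; ++i)
        m[i] = new float[n]();  // rows start zero-filled
    return m;
}

// Hypothetical helper: release a matrix created by allocate().
void release(float** m, int n) {
    for (int i = 0; i < n; ++i)
        delete[] m[i];
    delete[] m;
}

// Fill every element with a pseudo-random value in [0, 1]: O(n^2).
// Assumption: the original may use a different random source or range.
void init(float** a, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            a[i][j] = static_cast<float>(std::rand()) / RAND_MAX;
}

// c = a * b with the textbook triple loop: O(n^3).
void multiply(float** a, float** b, float** c, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}
```

A caller would allocate three matrices, call `init` on the two inputs, call `multiply`, and then `release` all three.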

In the following profile, n = 1000.

Flat profile:

    Each sample counts as 0.01 seconds.
      %   cumulative   self              self     total
     time   seconds   seconds    calls  Ts/call  Ts/call  name
    100.11      1.48     1.48                             multiply(float**, float**, float**, int)
      0.68      1.49     0.01                             init(float**, int)
      0.00      1.49     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z4initPPfi

Call graph

    granularity: each sample hit covers 2 byte(s) for 0.67% of 1.49 seconds

    index % time    self  children    called     name
                                                     <spontaneous>
    [1]     99.3    1.48    0.00                 multiply(float**, float**, float**, int) [1]
    -----------------------------------------------
                                                     <spontaneous>
    [2]      0.7    0.01    0.00                 init(float**, int) [2]
    -----------------------------------------------
                    0.00    0.00       1/1           __libc_csu_init [16]
    [10]     0.0    0.00    0.00       1         _GLOBAL__sub_I__Z4initPPfi [10]
    -----------------------------------------------

Index by function name

    [10] _GLOBAL__sub_I__Z4initPPfi (arrayProcessing.cpp)
     [2] init(float**, int)
     [1] multiply(float**, float**, float**, int)

From the call graph, multiply() accounts for more than 99% of the runtime because it contains three nested for loops, which makes it O(n^3). init() is the second busiest function, at O(n^2).
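As a rough check of those growth rates at the profiled size, assume one inner-loop iteration per multiply-add and one write per initialized element (the constant factors are an assumption; the profile does not give them):

$$
\frac{n^3}{n^2} = n, \qquad n = 1000 \;\Rightarrow\; \frac{10^{9}\ \text{multiply-adds}}{10^{6}\ \text{element writes}} = 1000
$$

The measured ratio, 1.48 s / 0.01 s ≈ 148, is smaller than 1000. That would be consistent with each init() element costing more than one inner iteration of multiply() (a random-number call, for example), although a single 0.01 s sample is too coarse to support a firm conclusion.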

Because the calculation of each element is independent of the others, the problem has an embarrassingly parallel solution. Array elements are evenly distributed so that each process owns a portion of the array (a subarray). The problem can then be solved in less time with multiple compute resources than with a single compute resource.
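The even distribution described above can be sketched as a row-wise split. This is only an illustration under stated assumptions: `row_range` and `multiply_rows` are hypothetical names, the split is by contiguous rows of the result matrix, and the code stops short of launching threads, processes, or GPU blocks so that it does not commit to any particular parallel framework.

```cpp
#include <cassert>
#include <utility>

// Half-open row range [first, last) owned by worker `rank` out of `workers`.
// Any remainder rows are spread over the lowest-ranked workers, so range
// sizes differ by at most one row.
std::pair<int, int> row_range(int rank, int workers, int n) {
    assert(workers > 0 && rank >= 0 && rank < workers && n >= 0);
    int base = n / workers;   // rows every worker gets
    int extra = n % workers;  // leftover rows to hand out one each
    int first = rank * base + (rank < extra ? rank : extra);
    int last = first + base + (rank < extra ? 1 : 0);
    return {first, last};
}

// Compute only rows [first, last) of c = a * b. No worker writes to another
// worker's rows, so the ranges can be processed independently.
void multiply_rows(float** a, float** b, float** c, int n, int first, int last) {
    for (int i = first; i < last; ++i)
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}
```

With n = 1000 and 4 workers, each worker would own 250 rows. Every worker still reads all of `b`, so only the output rows (and the matching rows of `a`) are truly partitioned.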

### The Monte Carlo Simulation (PI Calculation)

Subject: The Monte Carlo Simulation (PI Calculation)

Got the code from here: https://rosettacode.org/wiki/Monte_Carlo_methods#C.2B.2B

A Monte Carlo simulation is a way of approximating the value of a function where calculating the actual value is difficult or impossible.

It uses random sampling to define constraints on the value and then makes a sort of "best guess."
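For the pi calculation, the random-sampling idea can be sketched as follows. This is not the Rosetta Code listing linked above; it is a minimal illustration in which the function name `estimate_pi`, the seed parameter, and the choice of `std::mt19937` are all assumptions.

```cpp
#include <cassert>
#include <random>

// Estimate pi by sampling points uniformly in the unit square and counting
// how many fall inside the quarter circle x^2 + y^2 <= 1. The fraction
// inside approximates pi/4, so the estimate is 4 * inside / samples.
double estimate_pi(long samples, unsigned seed) {
    assert(samples > 0);
    std::mt19937 gen(seed);  // fixed seed gives a repeatable sequence
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    long inside = 0;
    for (long i = 0; i < samples; ++i) {
        double x = dist(gen);
        double y = dist(gen);
        if (x * x + y * y <= 1.0)
            ++inside;
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

Each sample is independent of the others, so the loop could be split across workers that each count their own hits and sum the counts at the end. The estimate tightens slowly: the error shrinks roughly with the square root of the number of samples.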

### Zhijian

Subject: