Difference between revisions of "Test Team Please Ignore"

From CDOT Wiki

Revision as of 18:16, 11 November 2015

Test Team Please Ignore

Team Members

  1. Kirill Lepetinskiy
  2. Shigemi Yoshimori
  3. Erquan Bi

mailto:ebi@senecacollege.ca?subject=gpu610 Email All

Progress

Assignment 1

Image Rotation

I profiled code found at http://www.dreamincode.net/forums/topic/76816-image-processing-tutorial/. The code provides multiple functions, and I tried three of them (enlarge, flip, and rotate image). Rotation took the longest time and is a good place to apply parallelization.

Flat profile:

Each sample counts as 0.01 seconds.

 %   cumulative   self              self     total
time   seconds   seconds    calls  ms/call  ms/call  name
34.55      0.19     0.19                             Image::rotateImage(int, Image&)
25.45      0.33     0.14                             Image::Image(Image const&)
18.18      0.43     0.10        1   100.00   100.00  Image::operator=(Image const&)
12.73      0.50     0.07        1    70.00    70.00  Image::Image(int, int, int)
 5.45      0.53     0.03                             writeImage(char*, Image&)
 3.64      0.55     0.02                             readImage(char*, Image&)
 0.00      0.55     0.00        1     0.00     0.00  _GLOBAL__sub_I_main
 0.00      0.55     0.00        1     0.00     0.00  Image::~Image()
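
To illustrate why rotation is a natural candidate for parallelization, here is a minimal sketch of a nearest-neighbour rotation. It is not the tutorial's Image::rotateImage: the function name rotateNearest, the flat row-major buffer, and the rotation about the image centre are all assumptions made for illustration. The point it shows is that every destination pixel is computed independently of the others.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; it is not the tutorial's
// Image::rotateImage and works on a flat row-major buffer instead.
std::vector<int> rotateNearest(const std::vector<int>& src,
                               int rows, int cols, double degrees) {
    assert(rows > 0 && cols > 0);
    assert(src.size() == static_cast<std::size_t>(rows) * cols);

    const double rad = degrees * std::acos(-1.0) / 180.0;
    const double c = std::cos(rad);
    const double s = std::sin(rad);
    const double r0 = (rows - 1) / 2.0;  // centre of rotation (row)
    const double c0 = (cols - 1) / 2.0;  // centre of rotation (column)

    std::vector<int> dst(src.size(), 0);  // pixels with no source stay 0
    for (int r = 0; r < rows; ++r) {
        for (int col = 0; col < cols; ++col) {
            // Map each destination pixel back to its nearest source pixel.
            // No iteration reads or writes another iteration's output.
            const double dy = r - r0;
            const double dx = col - c0;
            const long sr = std::lround(r0 + dy * c - dx * s);
            const long sc = std::lround(c0 + dy * s + dx * c);
            if (sr >= 0 && sr < rows && sc >= 0 && sc < cols) {
                dst[r * cols + col] = src[sr * cols + sc];
            }
        }
    }
    return dst;
}
```

Because the two loops carry no dependency between iterations, each (r, col) pair could in principle be handled by its own thread.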


Erquan Bi

Code source: https://people.sc.fsu.edu/~jburkardt/cpp_src/mandelbrot/mandelbrot.cpp

This program computes an image of the Mandelbrot set using the following functions:

  1. Carry out the iteration for each pixel: void iterPixel(int n, int* count, int count_max, double x_max, double x_min, double y_max, double y_min); It contains three nested loops and is a hotspot, consuming around 70% of the total time.
  2. Determine the coloring of each pixel: void pixelColor(int& c_max, int n, int *count);
  3. Set the image data: void setImageData(int n, int *r, int *g, int *b, int c_max, int* count); It contains two nested loops and is also a hotspot, taking up about 26% of the time.
  4. Write an image file: bool ppma_write(string file_out_name, int xsize, int ysize, int *r, int *g, int *b);
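
The three nested loops of the first step can be sketched as follows. This is an assumed reconstruction, not the code from the linked source: it reuses the parameter list of iterPixel, but the name iterPixelSketch, the pixel-to-coordinate mapping, the escape test |z|^2 > 4, and the convention that a count of 0 means "did not escape" are illustrative choices.

```cpp
#include <cassert>

// Sketch only: same parameter list as the iterPixel signature above,
// but the body is an assumed escape-time loop, not the original code.
void iterPixelSketch(int n, int* count, int count_max,
                     double x_max, double x_min,
                     double y_max, double y_min) {
    assert(n >= 2 && count != nullptr && count_max >= 0);
    for (int i = 0; i < n; ++i) {                    // loop 1: rows
        const double cy = y_max - i * (y_max - y_min) / (n - 1);
        for (int j = 0; j < n; ++j) {                // loop 2: columns
            const double cx = x_min + j * (x_max - x_min) / (n - 1);
            double zx = 0.0;
            double zy = 0.0;
            count[i * n + j] = 0;                    // 0 = did not escape
            for (int k = 1; k <= count_max; ++k) {   // loop 3: iterations
                const double t = zx * zx - zy * zy + cx;  // z = z*z + c
                zy = 2.0 * zx * zy + cy;
                zx = t;
                if (zx * zx + zy * zy > 4.0) {       // left the radius-2 disc
                    count[i * n + j] = k;
                    break;
                }
            }
        }
    }
}
```

Each pixel's count depends only on its own coordinates, so the two outer loops have no dependencies between iterations.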

iterPixel is the hotspot of this program. Its three nested loops run over the n rows, the n columns, and up to count_max iterations per pixel, so the work grows as O(n^2 * count_max); it reaches O(n^3) only if count_max is scaled along with n. Going from an n x n image to an (n+1) x (n+1) image adds 2n+1 pixels, each costing up to count_max iterations. The function consumes between 67% and 75% of the elapsed time in the profiles below, and its cost grows quickly with n. The program will be faster if this function can be sped up.
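
How much faster the whole program can get is bounded by the share of time spent in the hotspot. The sketch below applies Amdahl's law, which the original text does not name; the helper amdahlSpeedup and its parameters are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>

// Hypothetical helper: overall speedup when a fraction `hotFraction`
// of the run time is made `factor` times faster (Amdahl's law).
double amdahlSpeedup(double hotFraction, double factor) {
    assert(hotFraction >= 0.0 && hotFraction <= 1.0);
    assert(factor > 0.0);
    return 1.0 / ((1.0 - hotFraction) + hotFraction / factor);
}
```

With the 75% share quoted above, amdahlSpeedup(0.75, 10.0) gives an overall speedup of about 3.1, and no speedup of iterPixel alone can push the total past 4, because the remaining 25% of the run time is untouched.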

ebi@matrix:~/610/a1> time A1 501
real    0m0.336s
user    0m0.220s
sys     0m0.068s

ebi@matrix:~/610/a1> time A1 1001
real    0m1.289s
user    0m0.948s
sys     0m0.188s

ebi@matrix:~/610/a1> time A1 1501
real    0m2.859s
user    0m2.204s
sys     0m0.368s
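
One way to read these timings is to estimate the exponent p in t ~ n^p from pairs of runs. The helper growthExponent below is hypothetical and assumes a pure power-law relation between problem size and run time, which three measurements can only roughly support.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: exponent p in t ~ n^p estimated from two runs
// (n1, t1) and (n2, t2), assuming a pure power-law relation.
double growthExponent(double n1, double t1, double n2, double t2) {
    assert(n1 > 0.0 && n2 > 0.0 && n1 != n2);
    assert(t1 > 0.0 && t2 > 0.0);
    return std::log(t2 / t1) / std::log(n2 / n1);
}
```

Applied to the user times above, growthExponent(501, 0.220, 1001, 0.948) and growthExponent(1001, 0.948, 1501, 2.204) both come out at roughly 2.1, which suggests that the run time grows about quadratically in n over this range.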

A1.501.flt

Flat profile:

Each sample counts as 0.01 seconds.

 %   cumulative   self              self     total
time   seconds   seconds    calls  Ts/call  Ts/call  name
75.00      0.09     0.09                             iterPixel(int, int*, int, double, double, double, double)
25.00      0.12     0.03                             setImageData(int, int*, int*, int*, int, int*)
 0.00      0.12     0.00        1     0.00     0.00  _GLOBAL__sub_I_main
 0.00      0.12     0.00        1     0.00     0.00  ppma_write_data(std::basic_ofstream<char, std::char_traits<char> >&, int, int, int*, int*, int*)
 0.00      0.12     0.00        1     0.00     0.00  ppma_write_header(std::basic_ofstream<char, std::char_traits<char> >&, std::string, int, int, int)

A1.1001.flt

Flat profile:

Each sample counts as 0.01 seconds.

 %   cumulative   self              self     total
time   seconds   seconds    calls  ms/call  ms/call  name
67.31      0.35     0.35                             iterPixel(int, int*, int, double, double, double, double)
26.92      0.49     0.14                             setImageData(int, int*, int*, int*, int, int*)
 5.77      0.52     0.03        1    30.00    30.00  ppma_write_data(std::basic_ofstream<char, std::char_traits<char> >&, int, int, int*, int*, int*)
 0.00      0.52     0.00        1     0.00     0.00  _GLOBAL__sub_I_main
 0.00      0.52     0.00        1     0.00     0.00  ppma_write_header(std::basic_ofstream<char, std::char_traits<char> >&, std::string, int, int, int)

A1.1501.flt

Flat profile:

Each sample counts as 0.01 seconds.

 %   cumulative   self              self     total
time   seconds   seconds    calls  ms/call  ms/call  name
67.31      0.35     0.35                             iterPixel(int, int*, int, double, double, double, double)
26.92      0.49     0.14                             setImageData(int, int*, int*, int*, int, int*)
 5.77      0.52     0.03        1    30.00    30.00  ppma_write_data(std::basic_ofstream<char, std::char_traits<char> >&, int, int, int*, int*, int*)
 0.00      0.52     0.00        1     0.00     0.00  _GLOBAL__sub_I_main
 0.00      0.52     0.00        1     0.00     0.00  ppma_write_header(std::basic_ofstream<char, std::char_traits<char> >&, std::string, int, int, int)

Assignment 2

Assignment 3