Latest revision as of 09:53, 10 April 2017

Thunderbird

Team Members

eMail All

Progress

Assignment 1

Profiling: LZW algorithm

It's a simple version of LZW compression algorithm with 12 bit codes.

 void compress(string input, int size, string filename) {
   unordered_map<string, int> compress_dictionary(MAX_DEF);
     //Dictionary initializing with ASCII
     for ( int unsigned i = 0 ; i < 256 ; i++ ){
       compress_dictionary[string(1,i)] = i;
     }
     string current_string;
     unsigned int code;
     unsigned int next_code = 256;
     //Output file for compressed data
     ofstream outputFile;
     outputFile.open(filename + ".lzw");
 
     for(char& c: input){
     current_string = current_string + c;
     if ( compress_dictionary.find(current_string) ==compress_dictionary.end() ){
             if (next_code <= MAX_DEF)
                 compress_dictionary.insert(make_pair(current_string, next_code++));
             current_string.erase(current_string.size()-1);
             outputFile << convert_int_to_bin(compress_dictionary[current_string]);
             current_string = c;
         }   
     }   
     if (current_string.size())
             outputFile << convert_int_to_bin(compress_dictionary[current_string]);
     outputFile.close();
 }

Using compiler settings (gcc version 5.2.0):

 g++ -c -O2 -g -pg -std=c++14 lzw.cpp

10 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 10.txt
 real	0m4.302s
 user	0m3.072s
 sys	0m0.632s

 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  45.83      0.55     0.55                             compress(string, int, string) 
  36.67      0.99     0.44 14983735    29.37    29.37  _M_find_before_node(unsigned int, string const&, unsigned int) const 
   7.50      1.08     0.09 10489603     8.58     8.58  show_usage() 
   5.83      1.15     0.07  4493878    15.58    44.94  operator[](string const&)  
   4.17      1.20     0.05                             _Z22convert_char_to_stringB5cxx11PKci  
   0.00      1.20     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)  
   0.00      1.20     0.00     3841     0.00    29.37  _ZNSt10_HashtableINSt7  
   0.00      1.20     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      1.20     0.00        1     0.00     0.00  ~_Hashtable()

20 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 20.txt
 real	0m8.924s
 user	0m6.504s
 sys	0m2.008s

 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  49.33      1.47     1.47                             compress(string, int, string)
  34.56      2.50     1.03 29962271    34.38    34.38  _M_find_before_node(unsigned int, string const&, unsigned int) const
   7.05      2.71     0.21  8986654    23.37    57.74  operator[](string const&)
   6.71      2.91     0.20 20975363     9.53     9.53  show_usage()
   2.35      2.98     0.07                             _Z22convert_char_to_stringB5cxx11PKci
   0.00      2.98     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)
   0.00      2.98     0.00     3841     0.00    34.38  _ZNSt10_HashtableINSt7
   0.00      2.98     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      2.98     0.00        1     0.00     0.00  ~_Hashtable()

30 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 30.txt
 real	0m13.637s
 user	0m9.665s
 sys	0m2.984s

 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  45.59      1.86     1.86                             compress(string, int, string)
  37.25      3.38     1.52 44940806    33.82    33.82  _M_find_before_node(unsigned int, string const&, unsigned int) const
   7.60      3.69     0.31 13479429    23.00    56.82  operator[](string const&)
   6.62      3.96     0.27 31461123     8.58     8.58  show_usage()
   2.94      4.08     0.12                             _Z22convert_char_to_stringB5cxx11PKci
   0.00      4.08     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)
   0.00      4.08     0.00     3841     0.00    33.82  _ZNSt10_HashtableINSt7
   0.00      4.08     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      4.08     0.00        1     0.00     0.00  ~_Hashtable()

Profiling: Ray-tracing algorithm

Source Code: https://github.com/ksanghun/CUDA_raytrace/blob/master/GPUAssaginemt/cputest.cpp

@@ Line 2: / Line 2: @@
 == Team Members ==
 # [mailto:ksanghun@myseneca.ca?subject=GPU610 Sanghun Kim]
-# [mailto:hpark77@myseneca.ca?subject=GPU610 Heetai Park]
 # [mailto:wlee64@myseneca.ca?subject=GPU610 Wonho Lee]
-  [mailto:ksanghun@myseneca.ca;hpark77@myseneca.ca;wlee64@myseneca.ca?subject=GPU610 eMail All]
+  [mailto:ksanghun@myseneca.ca;wlee64@myseneca.ca?subject=GPU610 eMail All]
 == Progress ==
 === Assignment 1 ===
 ==== Profiling: LZW algorithm ====
 It's a simple version of LZW compression algorithm with 12 bit codes.
@@ Line 105: / Line 105: @@
 .00      4.08     0.00        1     0.00     0.00  ~_Hashtable()
+==== Profiling: Ray-tracing algorithm ====
+Source Code: https://github.com/ksanghun/CUDA_raytrace/blob/master/GPUAssaginemt/cputest.cpp
+[[File:Profiling_Raytrace.png]]
+==== '''Ray-Tracing Algorithm''' ====
+[[File:rt_1.png]]
+==== '''Ray-sphere Intersection''' ====
+[[File:rt_2.png]]
+==== '''Trace''' ====
+[[File:rt_3.png]]
+==== '''Floating-Point Considerations''' ====
+[[File:Raytrace_floatingerror.PNG ‎]]
 ----
+=== Assignment 2 ===
+==== 1. Parallelize ====
+- render()
+[[File:Render_CvsP2.png]]
+- main()
+[[File:main_CvsP2.png]]
+==== 2. Performance ====
+[[File:Data_CvsP.PNG]]
+[[File:Graph_CvsP.PNG]]
+----
-==== Profiling: Ray-tracing algorithm ====
+=== Assignment 3 ===
-Source Code: https://www.scratchapixel.com/lessons/3d-basic-rendering/introduction-to-ray-tracing/ray-tracing-practical-example
+==== 1. Optimize ====
-[[File:Profile.JPG]]
+- Global to constant memory
-==== Profiling: bubble vs quick algorithm ====
-It's a simple version of sorting algorithm - Source code
-  void BubbleSort(int arr[], int size) {
+[[File:PvsO2.png]]
-        int tmp;                          /*used for swapping*/
-        int i;
-        int j;
-        for (i = 0; i<size - 1; i++) {
-                for (j = 0; j<size - 1 - i; j++) {
-                        if (arr[j + 1] < arr[j]) {          /* compare the two neighbors */
-                                tmp = arr[j];                   /* swap a[j] and a[j+1]      */
-                                arr[j] = arr[j + 1];
-                                arr[j + 1] = tmp;
-                        }
-                }
-        }
-  }
-  void InsertionSort(int arr[], int left, int right) {
-        int curr;
-        int i, j;
-        for (i = left + 1; i <= right; i++) {
-                curr = arr[i];
-                for (j = i; j>0 && arr[j - 1] > curr; j--) {
-                        arr[j] = arr[j - 1];
-                }
-                arr[j] = curr;
-        }
-  }
-  void QuickSort(int arr[], int left, int right) {
-        if (right - left <= 3) {
-                InsertionSort(arr, left, right);
-        }
-        else {
-                int pivotpt = (left + right) / 2;   //index of the pivot
-                int i = left;
-                int j = right - 1;
-                swap(arr[pivotpt], arr[right]);
-                pivotpt = right;
-                int pivot = arr[pivotpt];
-                while (i<j) {
-                        while (i< right - 1 && arr[i]<pivot) i++;
-                        while (j > 0 && arr[j]>pivot) j--;
-                        if (i<j)
-                                swap(arr[i++], arr[j--]);
-                }
-                if (i == j && arr[i] < arr[pivotpt])
-                        i++;
-                swap(arr[i], arr[pivotpt]);
-                QuickSort(arr, left, i - 1);
-                QuickSort(arr, i + 1, right);
-        }
-  }
-  void QuickSort(int arr[], int size) {
-        QuickSort(arr, 0, size - 1);
-  }
-Using compiler settings (gcc version 5.2.0):
+==== 2. Performance ====
-  g++ -c -O2 -g -pg -std=c++14 a1.cpp
+[[File:Data_PvsO.PNG]]
-Sorting for 50000
- Flat profile:
- Each sample counts as 0.01 seconds.
-  %   cumulative   self              self     total
- time   seconds   seconds    calls  ns/call  ns/call  name
-.77      4.42     4.42                             BubbleSort(int*, int)
-.23      4.43     0.01                             QuickSort(int*, int, int)
-.00      4.43     0.00    16651     0.00     0.00  InsertionSort(int*, int, int)
-.00      4.43     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z10BubbleSortPii
+[[File:Graph_PvsO.PNG]]
-Sorting for 100000
- Flat profile:
- Each sample counts as 0.01 seconds.
-  %   cumulative   self              self     total
- time   seconds   seconds    calls  ns/call  ns/call  name
-.89     17.84    17.84                             BubbleSort(int*, int)
-.06     17.85     0.01    33355   299.81   299.81  InsertionSort(int*, int, int)
-.06     17.86     0.01                             QuickSort(int*, int, int)
-.00     17.86     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z10BubbleSortPii
+==== 3. GPU Occupancy ====
+[[File:rt_5.png]]
+----
-Sorting for 200000
+=== Conclusion ===
- Flat profile:
+==== 1. Output ====
- Each sample counts as 0.01 seconds.
+Video: https://youtu.be/3wV-ObHWZhg
-  %   cumulative   self              self     total
+==== 2. Performance ====
- time   seconds   seconds    calls  ns/call  ns/call  name
+[[File:Graph_CvsPvsO.PNG]]
-.97     71.90    71.90                             BubbleSort(int*, int)
-.01     71.91     0.01    66645   150.05   150.05  InsertionSort(int*, int, int)
-.01     71.92     0.01                             QuickSort(int*, int, int)
-.00     71.92     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z10BubbleSortPii

Difference between revisions of "Thunderbird"

Latest revision as of 09:53, 10 April 2017

Contents

Thunderbird

Team Members

Progress

Assignment 1

Profiling: LZW algorithm

Profiling: Ray-tracing algorithm

Ray-Tracing Algorithm

Ray-sphere Intersection

Trace

Floating-Point Considerations

Assignment 2

1. Parallelize

2. Performance

Assignment 3

1. Optimize

2. Performance

3. GPU Occupancy

Conclusion

1. Output

2. Performance

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

get involved with CDOT

courses

course projects

links

Tools