Difference between revisions of "SPO600 Vectorization Lab"

From CDOT Wiki
(Lab 6)
Line 6: Line 6:
  # Write a short program that creates two 1000-element integer arrays and fills them with random numbers, then sums those two arrays to a third array, and finally sums the third array to a long int and prints the result.
  # Compile this program on [[SPO600 Servers#AArch64: aarchie|aarchie]] in such a way that the code is auto-vectorized.
− # Annotate the emitted code (i.e., obtain a disassembly via <code>objdump -d</code> and add comments to the instructions in <code>&lt;main&gt;</code> explaining what the code does).
+ # Annotate the emitted code (i.e., obtain a disassembly via <code>objdump -d</code> and add comments to the instructions in <code>&lt;main&gt;</code> explaining what the code does). '''Prove that the code is vectorized''', for example, by pointing out the use of vector registers and SIMD instructions.
− # Review the vector instructions for AArch64. Find a way to scale an array of sound samples (see Lab 5) by a factor between 0.000-1.000 using SIMD. (Note: you may need to convert some data types). You DO NOT need to code this solution (but feel free if you want to!).
  # '''Write a blog post discussing your findings'''. Include:
  #* The source code
Line 13: Line 12:
  #* Your annotated disassembly listing
  #* Your reflections on the experience and the results
− #* Your proposed volume-sampling-via-SIMD solution.

  === Resources ===
Revision as of 23:37, 1 October 2017

Purpose of this Lab
This lab is designed to explore single instruction/multiple data (SIMD) vectorization, and the auto-vectorization capabilities of the GCC compiler.

Lab 5

  1. Write a short program that creates two 1000-element integer arrays and fills them with random numbers, adds the two arrays element by element into a third array, and finally sums the third array into a long int and prints the result.
  2. Compile this program on aarchie in such a way that the code is auto-vectorized.
  3. Annotate the emitted code (i.e., obtain a disassembly via objdump -d and add comments to the instructions in <main> explaining what the code does). Prove that the code is vectorized, for example, by pointing out the use of vector registers and SIMD instructions.
  4. Write a blog post discussing your findings. Include:
    • The source code
    • The compiler command line used to build the code
    • Your annotated disassembly listing
    • Your reflections on the experience and the results
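
As a starting point for step 1, the following is a minimal sketch rather than a reference solution. The function names, the fixed seed parameter, and the value range of -1000 to 999 are all assumptions made for illustration; the lab only asks for random numbers. The element-wise add and the reduction are kept as plain counted loops because that is the loop shape an auto-vectorizer can usually handle.

```c
#include <stdio.h>
#include <stdlib.h>

#define N 1000 /* array length given in step 1 */

/* Fill both input arrays with pseudo-random values.
   Assumption: the range -1000..999 is chosen only so that
   a[i] + b[i] can never overflow an int. */
void fill_random(int *a, int *b, unsigned seed)
{
    srand(seed);
    for (int i = 0; i < N; i++) {
        a[i] = rand() % 2000 - 1000;
        b[i] = rand() % 2000 - 1000;
    }
}

/* Element-wise add. 'restrict' promises the compiler that the
   three arrays do not overlap, which removes one common obstacle
   to vectorization. */
void add_arrays(const int *restrict a, const int *restrict b,
                int *restrict c)
{
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];
}

/* Reduce the third array into a long. */
long sum_array(const int *c)
{
    long total = 0;
    for (int i = 0; i < N; i++)
        total += c[i];
    return total;
}

/* Run the whole exercise: fill, add, sum, print the total. */
long run_lab(unsigned seed)
{
    static int a[N], b[N], c[N];
    fill_random(a, b, seed);
    add_arrays(a, b, c);
    long total = sum_array(c);
    printf("%ld\n", total);
    return total;
}
```

Calling run_lab() from main completes the program. For step 2, GCC versions of this era enable the vectorizer at -O3 (or with -ftree-vectorize at lower levels), and -fopt-info-vec asks the compiler to report which loops it vectorized. The exact flags that give the best result on aarchie are left to experiment.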

Resources

  • Auto-Vectorization in GCC - Main project page for the GCC auto-vectorizer.
  • Auto-vectorization with gcc 4.7 - An excellent discussion of the capabilities and limitations of the GCC auto-vectorizer, intrinsics for providing hints to GCC, and other code pattern changes that can improve results. Note that there has been some improvement in the auto-vectorizer since this article was written. This article is strongly recommended.
  • Intel (Auto)Vectorization Tutorial - This deals with the Intel compiler (ICC), but the general technical discussion is valid for other compilers such as GCC and LLVM.
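
The resources above mention giving the compiler hints. The sketch below shows two hints that GCC accepts: the C99 restrict qualifier and the __builtin_assume_aligned builtin. It is an illustration under assumptions, not necessarily the technique those articles describe: add_hinted is a hypothetical helper name, and the 16-byte alignment figure is an assumption that matches a 128-bit vector register.

```c
#include <stddef.h>

/* Hypothetical helper: c[i] = a[i] + b[i] for i in [0, n).
   Caller contract (assumed): the arrays do not overlap and each
   starts on a 16-byte boundary. Breaking either promise is
   undefined behaviour. */
void add_hinted(const int *restrict a, const int *restrict b,
                int *restrict c, size_t n)
{
    /* GCC builtin: returns its argument, but lets the optimizer
       assume the stated alignment. */
    const int *pa = __builtin_assume_aligned(a, 16);
    const int *pb = __builtin_assume_aligned(b, 16);
    int *pc = __builtin_assume_aligned(c, 16);

    for (size_t i = 0; i < n; i++)
        pc[i] = pa[i] + pb[i];
}
```

Whether these hints change the generated code depends on the compiler version and target, so it is worth comparing the disassembly with and without them.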