GPU621/Intel Advisor

7 bytes removed, 15:50, 23 November 2018
use border instead of frame for images
== Vectorization Examples ==
[[File:Vectorization-example-serial.png|border]]
=== Serial Version ===
=== SIMD Version ===
[[File:Vectorization-example-simd.png|border]]
<source lang="cpp">
The following image illustrates the loop-carried dependency when two pointers overlap.
[[File:Pointer-alias.png|border]]
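The overlap can also be shown in code. The following is a minimal sketch, not taken from the original page: <code>add_one</code>, <code>run_disjoint</code> and <code>run_overlapping</code> are hypothetical names chosen for illustration. When the destination pointer is the source pointer shifted by one element, each iteration reads a value written by the previous iteration, so the loop cannot simply process four elements at once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: stores src[i] + 1 into dst[i].
// The compiler cannot assume dst and src point to separate memory.
void add_one(int* dst, const int* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i] + 1;
    }
}

// Separate buffers: iterations are independent of each other.
std::vector<int> run_disjoint() {
    std::vector<int> src(4, 0), dst(4, 0);
    add_one(dst.data(), src.data(), 4);
    return dst;
}

// Overlapping buffers: dst is src shifted by one element, so iteration i
// reads the value that iteration i - 1 just wrote (loop-carried dependency).
std::vector<int> run_overlapping() {
    std::vector<int> buf(5, 0);
    add_one(buf.data() + 1, buf.data(), 4);
    return buf;
}
```

With separate buffers every output element is 1. With the overlapping buffer the values accumulate to 0, 1, 2, 3, 4, which a vectorized version that loads all four source elements before storing would not reproduce.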
== Memory Alignment ==
However, if the data is not aligned, the vectorizer may have to use a '''peeled''' loop to address the misalignment. Instead of vectorizing the entire loop, it inserts an extra loop that processes the elements at the front of the array that are not aligned in memory.
[[File:Memory-alignment-peeled.png|border]]
A remainder loop is the result of having a number of elements in the array that is not evenly divisible by the vector length (the total number of elements of a certain data type that can be loaded into a vector register).
[[File:Memory-alignment-remainder.png|border]]
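The arithmetic behind the peeled and remainder loops can be sketched as follows. This is an illustration only, under the assumption that elements are naturally aligned to their own size; <code>LoopSplit</code> and <code>split_loop</code> are hypothetical names, and a real compiler makes this decision internally and may choose differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: how many iterations land in each loop.
struct LoopSplit {
    std::size_t peel;       // scalar iterations before the first aligned address
    std::size_t body;       // iterations covered by full vector operations
    std::size_t remainder;  // leftover scalar iterations at the end
};

// address:      starting address of the array, as an integer
// count:        number of elements in the array
// elem_size:    size of one element in bytes
// vector_bytes: width of the vector register in bytes
LoopSplit split_loop(std::uintptr_t address, std::size_t count,
                     std::size_t elem_size, std::size_t vector_bytes) {
    assert(elem_size > 0 && vector_bytes % elem_size == 0);
    assert(address % elem_size == 0);  // assumption: naturally aligned elements

    const std::size_t lanes = vector_bytes / elem_size;  // the vector length
    const std::size_t misalign = address % vector_bytes;

    // Peel enough elements to reach the next vector-aligned address.
    std::size_t peel = (misalign == 0) ? 0 : (vector_bytes - misalign) / elem_size;
    if (peel > count) {
        peel = count;
    }

    // Whatever does not fill a whole vector goes to the remainder loop.
    const std::size_t rest = count - peel;
    const std::size_t remainder = rest % lanes;
    return {peel, rest - remainder, remainder};
}
```

For 19 floats and 16-byte registers, an array that starts on a 16-byte boundary needs no peeled iterations and leaves 3 for the remainder loop, while one that starts 8 bytes past a boundary peels 2 elements and leaves 1.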
=== Padding ===
For example, if you have a <code>4 x 19</code> array of floats and your system has access to 128-bit vector registers, then you should add 1 column to make the array <code>4 x 20</code>, so that the number of columns is evenly divisible by the number of floats that can be loaded into a 128-bit vector register, which is 4 floats.
[[File:Memory-alignment-padding.png|border]]
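The padded column count is the original count rounded up to the next multiple of the vector length. A minimal sketch of that calculation, where <code>padded_columns</code> is a hypothetical helper name and not part of any library:

```cpp
#include <cassert>
#include <cstddef>

// Rounds the column count up to the next multiple of the number of
// elements that fit in one vector register.
std::size_t padded_columns(std::size_t columns,
                           std::size_t register_bits,
                           std::size_t element_bits) {
    assert(element_bits > 0 && register_bits % element_bits == 0);
    const std::size_t lanes = register_bits / element_bits;  // 128 / 32 = 4 floats
    return (columns + lanes - 1) / lanes * lanes;
}
```

Calling <code>padded_columns(19, 128, 32)</code> gives 20, matching the <code>4 x 20</code> array above. With 256-bit registers the same 19 columns would be padded to 24.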
=== Aligned vs Unaligned Instructions ===
#endif // _WIN32
</source>
 
= Summary =