DPS921/OpenACC vs OpenMP Comparison

2,621 bytes added, 16:39, 30 November 2020
=== Example ===
<source>
#pragma acc kernels
{
    for (int i = 0; i < N; i++) {
        y[i] = a * x[i] + y[i];
    }
}
</source>
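For readers who want to compile the fragment above, the following is a minimal sketch of the same loop wrapped in a function. The function name <code>saxpy_acc</code>, the <code>restrict</code> qualifiers, and the explicit <code>copyin</code>/<code>copy</code> data clauses are assumptions added for illustration and are not part of the original example. Without an OpenACC-enabled compiler the pragma is ignored and the loop simply runs on the host.

```c
/* Hypothetical helper (name and data clauses are illustrative assumptions):
 * computes y = a * x + y over n elements. */
void saxpy_acc(int n, float a, const float *restrict x, float *restrict y)
{
    /* copyin: x is only read on the device.
     * copy:   y is sent to the device and copied back afterwards. */
    #pragma acc kernels copyin(x[0:n]) copy(y[0:n])
    {
        for (int i = 0; i < n; i++) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

The <code>restrict</code> qualifiers tell the compiler that <code>x</code> and <code>y</code> do not alias, which is typically what lets a <code>kernels</code> region treat the loop as safe to parallelize.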
== OpenMP GPU offloading ==
OpenMP is a natural point of comparison because, starting with OpenMP 4.0, it supports offloading to accelerators through the `target` constructs. OpenACC uses directives to tell the compiler where to parallelize loops and how to manage data between host and accelerator memories. OpenMP takes a more general and more prescriptive approach: it lets programmers explicitly distribute the execution of loops, code regions, and tasks across teams of threads. OpenMP's directives tell the compiler to generate parallel code in exactly the specified way, leaving little to the discretion of the compiler and the optimizer.
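As a rough illustration of this more explicit style, the saxpy loop from the OpenACC example can be offloaded with OpenMP as sketched below. The function name <code>saxpy_omp</code> and the particular <code>map</code> clauses are assumptions made for this sketch; other combinations of constructs are possible. If the compiler has no OpenMP or offloading support enabled, the pragma is ignored or falls back to the host and the loop still produces the same result.

```c
/* Hypothetical helper (name and map clauses are illustrative assumptions):
 * computes y = a * x + y over n elements. */
void saxpy_omp(int n, float a, const float *x, float *y)
{
    /* target:                offload the region to the device
     * teams distribute:      split iterations across teams
     * parallel for:          split each team's share across its threads
     * map(to: ...):          x is copied to the device only
     * map(tofrom: ...):      y is copied in and copied back */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Note how the programmer, not the compiler, states each level of parallelism; with <code>acc kernels</code> that decision is left to the compiler.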
== Code comparison ==
<source>
Explicit conversions

OpenACC                                     OpenMP

#pragma acc kernels                         #pragma omp target
{                                           {
  #pragma acc loop worker                     #pragma omp parallel for private(tmp)
  for(int i = 0; i < N; i++){                 for(int i = 0; i < N; i++){
    tmp = …;                                    tmp = …;
    array[i] = tmp * …;                         array[i] = tmp * …;
  }                                           }
  #pragma acc loop vector                     #pragma omp simd
  for(int i = 0; i < N; i++)                  for(int i = 0; i < N; i++)
    array2[i] = …;                              array2[i] = …;
}                                           }
 
</source><source>
ACC parallel

OpenACC                                     OpenMP

#pragma acc parallel                        #pragma omp target
{                                           #pragma omp parallel
  #pragma acc loop                          {
  for(int i = 0; i < N; i++){                 #pragma omp for private(tmp) nowait
    tmp = …;                                  for(int i = 0; i < N; i++){
    array[i] = tmp * …;                         tmp = …;
  }                                             array[i] = tmp * …;
  #pragma acc loop                            }
  for(int i = 0; i < N; i++)                  #pragma omp for simd
    array2[i] = …;                            for(int i = 0; i < N; i++)
}                                               array2[i] = …;
                                            }
 
</source><source>
 
ACC Kernels

OpenACC                                     OpenMP

#pragma acc kernels                         #pragma omp target
{                                           #pragma omp parallel
  for(int i = 0; i < N; i++){               {
    tmp = …;                                  #pragma omp for private(tmp)
    array[i] = tmp * …;                       for(int i = 0; i < N; i++){
  }                                             tmp = …;
  for(int i = 0; i < N; i++)                    array[i] = tmp * …;
    array2[i] = …;                            }
}                                             #pragma omp for simd
                                              for(int i = 0; i < N; i++)
                                                array2[i] = …;
                                            }
 
</source><source>
 
Copy vs. PCopy

OpenACC                                     OpenMP

int x[10],y[10];                            int x[10],y[10];
#pragma acc data copy(x) pcopy(y)           #pragma omp target data map(x,y)
{                                           {
  ...                                         ...
  #pragma acc kernels copy(x) pcopy(y)        #pragma omp target update to(x)
  {                                           #pragma omp target map(y)
    // Accelerator Code                       {
    ...                                         // Accelerator Code
  }                                             ...
  ...                                         }
}                                           }
</source>
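The OpenMP column of the last comparison can be turned into a small runnable sketch. The function <code>fill_doubled</code>, the array contents, and the loop bodies are hypothetical and only stand in for the elided "Accelerator Code". The point it tries to show is that <code>map</code> on a variable that is already present in an enclosing <code>target data</code> region does not transfer it again, so a refresh of <code>x</code> has to be requested explicitly with <code>target update</code>.

```c
#define N 10

/* Hypothetical helper: writes 2 * i into y_out[i], going through a
 * device-resident copy of x. All names here are illustrative assumptions. */
void fill_doubled(int y_out[N])
{
    int x[N], y[N];
    for (int i = 0; i < N; i++) {
        x[i] = 0;
        y[i] = 0;
    }

    /* x and y stay mapped on the device for the whole region. */
    #pragma omp target data map(x, y)
    {
        /* Host-side change made after x was first mapped. */
        for (int i = 0; i < N; i++) {
            x[i] = i;
        }

        /* Explicitly push the new host values of x to the device. */
        #pragma omp target update to(x)

        /* y is already present, so map(y) causes no extra transfer here. */
        #pragma omp target map(y)
        {
            for (int i = 0; i < N; i++) {
                y[i] = 2 * x[i];
            }
        }
    }   /* leaving the data region copies x and y back to the host */

    for (int i = 0; i < N; i++) {
        y_out[i] = y[i];
    }
}
```

Dropping the <code>target update</code> line would leave the device copy of <code>x</code> at its initial zeros on a real accelerator, which is the behavioural difference the copy versus pcopy comparison is about.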