Team False Sharing

108 bytes added, 18:19, 16 December 2017
=Identifying False Sharing=
When a cache line is marked ‘Invalid’, the processors must coordinate: the modified line is written back to memory and then reloaded by any other cache that needs it. False sharing makes this coordination far more frequent and can significantly degrade application performance.
In the figure, threads 0 and 1 use variables that are adjacent in memory and reside on the same cache line. The cache line is loaded into the caches of both CPU 0 and CPU 1. Even though the threads modify different variables, each write invalidates the other CPU's copy of the line, forcing a memory update to maintain cache coherency.
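To make "reside on the same cache line" concrete, the following is a minimal sketch that tests whether two addresses fall in the same line. The helper names `cache_line_index` and `same_cache_line` are hypothetical, and the 64-byte line size is an assumption (typical on x86-64, but not universal).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size in bytes; check your target hardware.
constexpr std::size_t kAssumedLineSize = 64;

// Index of the cache line containing address p.
// Treats the pointer value as a linear address, which holds on
// mainstream platforms but is implementation-defined in C++.
inline std::uintptr_t cache_line_index(const void* p,
                                       std::size_t line_size = kAssumedLineSize) {
    assert(line_size > 0);
    return reinterpret_cast<std::uintptr_t>(p) / line_size;
}

// True when both addresses fall in the same cache line, i.e. when
// writes to them from different CPUs can cause false sharing.
inline bool same_cache_line(const void* p, const void* q,
                            std::size_t line_size = kAssumedLineSize) {
    return cache_line_index(p, line_size) == cache_line_index(q, line_size);
}
```

Under these assumptions, eight adjacent 4-byte `int` partial sums occupy only 32 bytes, so if the array starts on a line boundary, `same_cache_line(&s[0], &s[7])` is true and every thread writes to the same line.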
<source lang="cpp">
#include <iostream>
#include <omp.h>
#include "timer.h"
#define NUM_THREADS 8
#define SIZE 1000000

int main() {
    int* a = new int[SIZE];
    int* b = new int[SIZE];
    int* sum_local = new int[NUM_THREADS]; // adjacent per-thread sums share a cache line
    int sum = 0;
    int threadsUsed;
    for (int i = 0; i < SIZE; i++) { // initialize arrays
        a[i] = 1;
        b[i] = 1;
    }
    omp_set_num_threads(NUM_THREADS);
    stopwatch.start(); // stopwatch: timer object assumed to be provided via timer.h
    #pragma omp parallel
    {
        int threadNum = omp_get_thread_num();
        sum_local[threadNum] = 0;
        #pragma omp for
        for (int i = 0; i < SIZE; i++) { // calculate sum of product of arrays
            if (i == 0) { threadsUsed = omp_get_num_threads(); }
            sum_local[threadNum] += a[i] * b[i]; // each thread writes its own slot
        }
        #pragma omp atomic
        sum += sum_local[threadNum]; // combine the per-thread partial sums
    }
    stopwatch.stop();
    delete[] a;
    delete[] b;
    delete[] sum_local;
    return 0;
}
</source>
=Eliminating False Sharing=
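One common way to remove the false sharing in the example above is to pad each thread's partial sum out to a full cache line, so that no two threads write to the same line. The sketch below illustrates that idea under stated assumptions and is not necessarily the approach this page goes on to describe: `PaddedSum`, `padded_dot`, and the 64-byte `kPadBytes` value are hypothetical names and values introduced here, and C++17 is assumed so that `std::vector` honors the over-alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Assumed cache-line size in bytes; not universal.
constexpr std::size_t kPadBytes = 64;

// alignas rounds sizeof(PaddedSum) up to kPadBytes, so consecutive
// elements of an array of PaddedSum never share a cache line.
struct alignas(kPadBytes) PaddedSum {
    long long value = 0;
};

// Sum of element-wise products, accumulated in padded per-thread slots.
long long padded_dot(const std::vector<int>& a, const std::vector<int>& b,
                     int max_threads) {
    assert(a.size() == b.size());
    assert(max_threads > 0);
    std::vector<PaddedSum> partial(max_threads);
    const long long n = static_cast<long long>(a.size());

    #pragma omp parallel num_threads(max_threads)
    {
        int t = 0; // stays 0 when compiled without OpenMP (serial fallback)
#ifdef _OPENMP
        t = omp_get_thread_num(); // always < max_threads for this team
#endif
        #pragma omp for
        for (long long i = 0; i < n; ++i) {
            partial[t].value += static_cast<long long>(a[i]) * b[i];
        }
    }

    // Combine the partial sums serially after the parallel region.
    long long total = 0;
    for (const PaddedSum& p : partial) {
        total += p.value;
    }
    return total;
}
```

The padding trades memory for speed: each partial sum now occupies 64 bytes instead of 8. OpenMP's `reduction(+:sum)` clause is another way to get private per-thread accumulators without managing the padding by hand.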