Changes

Team Darth Vector

1,109 bytes added, 21:31, 16 December 2017

→‎TBB Memory Allocation & Problems in Parallel Programming

tbb::parallel_invoke(myFuncA, myFuncB, myFuncC);

</pre>

==TBB Memory Allocation & Fixing Issues from Parallel Programming==

TBB provides memory allocators that follow the same interface as the STL's '''std::allocator''' template class. Where TBB's allocators improve on the standard one is in their support for problems that are common in parallel programming. The two allocators are '''scalable_allocator<type>''' and '''cache_aligned_allocator<type>''', and they reduce the performance problems caused by poor '''Scalability''' and by '''False Sharing''' respectively.

===False Sharing===

As you may have seen from the workshop "False Sharing", a major performance hit can occur in parallel code when two threads use data that sits on the same cache line, even if each thread touches a different variable. When threads write to the same cache line, the cores compete for access and the line is moved back and forth between their caches. Moving the line costs a significant number of clock cycles, which causes the performance problem. To address this, TBB includes an allocator known as '''cache_aligned_allocator<type>'''. Two objects that are both allocated through it are guaranteed not to share a cache line, so they cannot cause false sharing with each other. Note that if only one of the objects is allocated by this allocator and the other is allocated some other way, false sharing may still occur. For compatibility's sake (so that programmers can simply use "find and replace"), cache_aligned_allocator takes the same template arguments as the STL allocator. To use it with an STL container, you only need to pass the cache_aligned_allocator type as the container's allocator argument (the 2nd template argument for std::vector). The following is an example provided by Intel to demonstrate this: <nowiki>std::vector<int,cache_aligned_allocator<int> >;</nowiki>
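The drop-in pattern described above can be sketched without TBB installed. The allocator below, '''LineAlignedAllocator''', is a hypothetical stand-in written for illustration only and is not TBB's implementation. It assumes a 64-byte cache line and a C++17 compiler (for aligned '''operator new'''). It shows where an allocator type goes in a container declaration and what "no two allocations share a cache line" means in terms of addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Assumption: 64-byte cache lines (typical on x86-64, not universal).
constexpr std::size_t kAssumedCacheLine = 64;

// Hypothetical stand-in for a cache-aligned allocator; NOT TBB's code.
template <typename T>
struct LineAlignedAllocator {
    using value_type = T;

    LineAlignedAllocator() noexcept = default;
    template <typename U>
    LineAlignedAllocator(const LineAlignedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Round the request up to whole lines so the block neither starts
        // nor ends in the middle of a line another allocation could use.
        const std::size_t lines =
            (n * sizeof(T) + kAssumedCacheLine - 1) / kAssumedCacheLine;
        void* p = ::operator new(lines * kAssumedCacheLine,
                                 std::align_val_t(kAssumedCacheLine));
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p, std::align_val_t(kAssumedCacheLine));
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const LineAlignedAllocator<T>&,
                const LineAlignedAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const LineAlignedAllocator<T>&,
                const LineAlignedAllocator<U>&) noexcept { return false; }

// Index of the (assumed) cache line that an address falls on.
inline std::uintptr_t line_index(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) / kAssumedCacheLine;
}

// True when the address is the first byte of an (assumed) cache line.
inline bool starts_on_line(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kAssumedCacheLine == 0;
}

// The allocator is passed as the 2nd template argument of std::vector.
using AlignedInts = std::vector<int, LineAlignedAllocator<int>>;

inline void example() {
    AlignedInts a(3, 7);  // e.g. owned by thread A
    AlignedInts b(3, 9);  // e.g. owned by thread B
    assert(starts_on_line(a.data()) && starts_on_line(b.data()));
    assert(line_index(a.data()) != line_index(b.data()));
}
```

Note that the guarantee only applies between separate allocations. Elements inside a single vector are still contiguous, so two threads writing to neighbouring elements of the same vector can still share a cache line.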

===Scaling Issue===

When working in parallel, several threads may need to allocate memory at the same time. With an allocator that serves every request from a single shared pool, only one thread can allocate at a time while the other threads are required to wait, which slows the program down as more threads are added. Intel describes this issue in parallel programming as '''Scalability''' and answers it with '''scalable_allocator<type>''', which permits concurrent memory allocation and is considered ideal for "''programs that rapidly allocate and free memory''".
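One common way to let threads allocate without waiting on each other is to give each thread its own cache of free blocks. The sketch below illustrates that general idea with the standard library only. '''ThreadCachedAllocator''' is a hypothetical name, and this is an assumption about the technique, not a description of how TBB's scalable_allocator works internally. It handles only single-object requests on the fast path, does not support over-aligned types, and never returns cached blocks to the system, so treat it as a teaching aid.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Hypothetical allocator with a per-thread free list; NOT TBB's code.
template <typename T>
class ThreadCachedAllocator {
public:
    using value_type = T;

    ThreadCachedAllocator() noexcept = default;
    template <typename U>
    ThreadCachedAllocator(const ThreadCachedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n == 1) {
            FreeNode*& head = free_list();
            if (head != nullptr) {
                // Fast path: reuse a block cached by this thread.
                // No lock is taken and no shared state is touched.
                FreeNode* node = head;
                head = node->next;
                return reinterpret_cast<T*>(node);
            }
        }
        // Slow path: fall back to the global heap.
        return static_cast<T*>(::operator new(block_bytes(n)));
    }

    void deallocate(T* p, std::size_t n) noexcept {
        if (n == 1) {
            // Keep the block in the calling thread's cache for reuse.
            FreeNode*& head = free_list();
            head = new (static_cast<void*>(p)) FreeNode{head};
            return;
        }
        ::operator delete(p);
    }

private:
    struct FreeNode { FreeNode* next; };

    // Every block must be large enough to hold a FreeNode once freed.
    static std::size_t block_bytes(std::size_t n) noexcept {
        const std::size_t bytes = n * sizeof(T);
        return bytes < sizeof(FreeNode) ? sizeof(FreeNode) : bytes;
    }

    // One list head per thread (and per element type T).
    static FreeNode*& free_list() noexcept {
        thread_local FreeNode* head = nullptr;
        return head;
    }
};

template <typename T, typename U>
bool operator==(const ThreadCachedAllocator<T>&,
                const ThreadCachedAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const ThreadCachedAllocator<T>&,
                const ThreadCachedAllocator<U>&) noexcept { return false; }

inline void example() {
    ThreadCachedAllocator<long> alloc;
    long* first = alloc.allocate(1);
    alloc.deallocate(first, 1);        // block goes into this thread's cache
    long* second = alloc.allocate(1);  // served from the cache
    assert(second == first);
    alloc.deallocate(second, 1);
}
```

Node-based containers such as std::list allocate one node at a time, so they are the kind of "rapidly allocate and free" workload that benefits from this pattern. With TBB available, the same container declaration would name scalable_allocator instead of the stand-in.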

Agodwin

