
CDOT Wiki β

Changes

GPU621/Code

47 bytes added, 10:23, 28 November 2018
Interpreting results
*Version 2 – the matrix multiplication logic is still inside the parallel for statement, but it is being dynamically scheduled and certain variables are selected to be private or shared.
 '''[https://github.com/coreyjjames/CoreyJJames/tree/Lab3_VTune_Example Example Code]'''
- Run the program with VTune threading analysis.
- The point of interest in the program is under the platform tab.
- You will notice in version 1 that some of the threads finish before others. The work is not being spread evenly.
- In version 2, that issue is resolved: all the threads end at the same time. When I ran version 2, I saw an improvement of around 0.6s.
*Turn off optimization so you can see source code hotspots.
*Rebuild after any changes.
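
The idea behind version 2 can be sketched roughly as follows. This is a minimal illustration, not the code from the linked example: the function name <code>multiply_dynamic</code>, the flat row-major layout, and the exact clause list are assumptions made for this sketch. It keeps the matrix multiplication inside the parallel for, asks OpenMP to hand out rows dynamically, and states explicitly which variables are shared and which are private.

```cpp
#include <cassert>

// Hypothetical helper for illustration: computes c = a * b for two
// n x n matrices stored in row-major order.
void multiply_dynamic(const double* a, const double* b, double* c, int n) {
    assert(n > 0);
    int i, j, k;
    // schedule(dynamic): a thread takes the next row as soon as it is free,
    // so threads tend to finish at about the same time.
    // shared: every thread uses the same matrices and the same size.
    // private: each thread needs its own inner loop counters.
    // The outer loop counter i is made private by the parallel for itself.
    #pragma omp parallel for schedule(dynamic) shared(a, b, c, n) private(j, k)
    for (i = 0; i < n; ++i) {
        for (j = 0; j < n; ++j) {
            double sum = 0.0;  // declared inside the loop, so private per thread
            for (k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

Build with OpenMP enabled (for example <code>/Qopenmp</code> with the Intel compiler on Windows or <code>-fopenmp</code> with GCC); without it the pragma is ignored and the loop runs on one thread. Swapping <code>schedule(dynamic)</code> for <code>schedule(static)</code> and rerunning the threading analysis is one way to compare how evenly the work is spread.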
 
==='''Interpreting results'''===
Interpreting the results from VTune will be a different process for your program than for mine.
To be successful, make sure to read through the results and look for anomalies.
'''Example of anomalies:'''
First of all, you need to have Intel Parallel Studio XE installed on your machine. It gives you access to tools such as Intel Advisor, VTune Amplifier and Inspector from within your Visual Studio window.
Let's start by creating a project in Visual Studio. I am using sample code from the Intel Advisor folder, which provides different samples for you to test its functionality. They can be found under <Your-Installed-Directory>\IntelSWTools\Advisor 2019\samples\en. I am using the "nqueens_Advisor" one. After choosing your project, build the solution and look for the icon shown in the image below.
[[File:Advisor.png|900px]]
[[File:Select.png|250px]]
 
==='''How it works'''===