Difference between revisions of "GPU621 Team Tsubame"

From CDOT Wiki

Revision as of 04:47, 21 November 2016

Intel Advisor

Team Member

  1. Yanhao Lei

Progress

Notes

What is Intel Advisor?

The Intel Advisor package provides a Threading Advisor and a Vectorization Advisor that help programmers find opportunities to parallelize and vectorize serial applications written in C, C++, or Fortran.

How does it work?

Vectorization Workflow:

Advisor surveys a release-mode binary of the application together with its source code. The resulting Survey Report shows the time spent processing the instructions in the call stack, the loops that can be vectorized, and estimates of the benefit of vectorizing un-vectorized and under-vectorized loops. You can refine the Survey Report, and so get better suggestions, by running an optional Trip Count Analysis, which determines the number of times loops and functions are executed. After changing the application based on the suggestions in the first Survey Report, run the Survey Analysis a second time. If the new report states that all loops are vectorized, Advisor has completed its job; in complex programs this is often not the case because of dependencies and memory issues. To resolve these issues, mark the suspicious sections of the code and use the Dependencies analysis and the Memory Access Patterns (MAP) analysis to identify the causes and make the appropriate changes.
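To make the dependency issue concrete, here is a minimal sketch contrasting a loop with independent iterations and a loop with a loop-carried dependency. The function names scale and inclusive_scan_serial are hypothetical and are not taken from Prefix Scan.zip; how Advisor reports each loop will depend on the compiler and its settings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise scaling. Each iteration reads in[i] and
// writes out[i] only, so no iteration depends on another and a compiler is
// free to vectorize the loop.
void scale(const std::vector<float>& in, std::vector<float>& out, float k) {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = k * in[i];
}

// Hypothetical helper: serial inclusive prefix sum. Iteration i reads
// out[i - 1], which iteration i - 1 has just written. This read-after-write
// dependency between iterations prevents vectorizing the loop as written.
void inclusive_scan_serial(const std::vector<float>& in, std::vector<float>& out) {
    assert(in.size() == out.size());
    if (in.empty()) return;
    out[0] = in[0];
    for (std::size_t i = 1; i < in.size(); ++i)
        out[i] = out[i - 1] + in[i];
}
```

A loop like the second one is the kind of section you would mark for a Dependencies analysis; removing the dependency usually requires restructuring the algorithm rather than changing compiler options.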

Threading Workflow:

The Threading Workflow also begins with a survey of execution times and an optional count of invocations to generate the Survey Report. You must then add annotations to the source code to mark the sections that Advisor should evaluate for parallelization. With the annotations in place, Advisor can determine whether the annotated areas are suitable for parallelization and estimate the performance gain if they are parallelized. Lastly, a Dependencies analysis can identify data sharing issues within the annotated code sections. As in the Vectorization Workflow, you can modify the source code and repeat these analyses as necessary to parallelize a serial application.
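As a rough illustration of what annotated code can look like, the sketch below marks a loop as a parallel site with one task per iteration, using the ANNOTATE_SITE_BEGIN, ANNOTATE_ITERATION_TASK, and ANNOTATE_SITE_END macros from advisor-annotate.h. The function add_into and the site and task names are hypothetical. The no-op fallback macros are an assumption added only so that the sketch compiles when the Advisor header is not on the include path; they are not part of Advisor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Use the real annotation header when the include path provides it;
// otherwise fall back to no-op stand-ins (an assumption of this sketch).
#if defined(__has_include)
#  if __has_include("advisor-annotate.h")
#    include "advisor-annotate.h"
#    define SKETCH_HAS_ADVISOR 1
#  endif
#endif
#ifndef SKETCH_HAS_ADVISOR
#  define ANNOTATE_SITE_BEGIN(name)
#  define ANNOTATE_ITERATION_TASK(name)
#  define ANNOTATE_SITE_END()
#endif

// Hypothetical example: element-wise addition of two vectors.
void add_into(const std::vector<double>& a,
              const std::vector<double>& b,
              std::vector<double>& out) {
    assert(a.size() == b.size() && a.size() == out.size());
    ANNOTATE_SITE_BEGIN(add_site);           // start of the candidate parallel region
    for (std::size_t i = 0; i < a.size(); ++i) {
        ANNOTATE_ITERATION_TASK(add_task);   // each iteration is treated as one task
        out[i] = a[i] + b[i];
    }
    ANNOTATE_SITE_END();                     // end of the candidate parallel region
}
```

The annotations do not change what the program computes; they only tell Advisor where a parallel region would begin and end so that the Suitability and Dependencies analyses have something to model.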

How is it actually used?

The following walk-through assumes that you have Visual Studio 2015 and Intel Advisor 2017 installed.

Preparations:

1. Download and unzip Prefix Scan.zip to a preferred location and open it with Visual Studio 2015.

2. Set the Balanced Tree project as StartUp Project.

File:S1-2.png

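3. Find Advisor’s installation directory by executing the following command in cmd.exe (the leading > is the prompt): >set advisor
This lists every environment variable whose name begins with ADVISOR, including the one that holds the installation directory.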

4. Change the following project properties:

a. In C/C++ > General > Additional Include Directories, add the Advisor’s directory using macro notation: $(ADVISOR_..._DIR)include (or $(ADVISOR_..._DIR)\include if the environment variable does not end with a backslash).

b. In C/C++ > General > Debug Information Format, confirm it is set to Program Database (/Zi).

c. In Linker > Debugging > Generate Debug Info, set it to Optimize for debugging (/DEBUG).

d. In C/C++ > Optimization > Optimization, confirm it is set to Maximize Speed (/O2) or higher.

e. On the same page, set Inline Function Expansion to Only __inline (/Ob1).

f. In C/C++ > Code Generation > Runtime Library, confirm it has been set to Multi-threaded DLL (/MD); another option is to set this field to Multi-threaded Debug DLL (/MDd).

g. Enable OpenMP under C/C++ > Language > OpenMP Support by setting it to Generate Parallel Code (/Qopenmp).

h. Click OK to save the properties.

5. Comment out the “terminate” section in w3.main.cpp to end the application without waiting for user input.

6. Clean the Solution and Build the Project to generate the binary.

7. Launch Advisor through Windows Start > All Programs > Intel Parallel Studio XE 2017 > Analyzers > Advisor 2017.

8. Select File > New > Project… to start a new project.

9. Provide a name for the project in the Create a Project window.

10. Under the Analysis Target tab, add the path to Balanced Tree.exe to the Application field using the Browse… button beside the field (or type the path in manually).

11. In the Application parameters field, add the parameters to use when executing the application.

12. Ensure the Inherit settings from Survey Hotspots Analysis Type checkbox is checked in Suitability Analysis and Dependencies Analysis.

13. Optionally, check “Collect information about FLOPS, L1 memory traffic, and AVX-512 mask usage” for a complete Trip Count Analysis.

14. Under the Binary/Symbol Search tab, add the Visual Studio project’s Release folder as a search directory. The Survey Analysis will warn that some symbols are missing; you can ignore these warnings.

15. Under the Source Search tab, provide the location of the application’s source code.

16. Select OK to complete the project creation process.
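Before running any analysis, it can be worth confirming that the OpenMP setting from step 4g actually reached the compiler. The sketch below is one hedged way to do that: it relies on the standard _OPENMP macro, which OpenMP-enabled compilers define. The helper name openmp_status is hypothetical and is not part of the Prefix Scan project.

```cpp
#include <string>

// Hypothetical helper: reports whether this translation unit was compiled
// with OpenMP support enabled. _OPENMP is defined by the compiler only when
// OpenMP is switched on; its value encodes the supported specification date.
std::string openmp_status() {
#ifdef _OPENMP
    return "OpenMP enabled, _OPENMP = " + std::to_string(_OPENMP);
#else
    return "OpenMP disabled";
#endif
}
```

Printing the returned string once at program start, for example with std::cout, shows at a glance whether the build matches the project properties.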