GPU621/Intel Inspector


Latest revision as of 17:56, 11 August 2021



Intel Parallel Studio Inspector

Project Overview

Intel Inspector is a dynamic memory and threading error debugger that can detect and locate memory leaks, deadlocks, and race conditions. The purpose of this project is to introduce Intel Inspector and demonstrate how to use it to debug our code.

Group Members

Joyce Wei
Saumya Vasa

Features

The purpose of Intel Inspector is to help us find difficult, non-deterministic errors in large programs. As a program grows and its logic becomes more complicated, memory leaks and threading errors become harder to find. Some of its main features are:

  • Locate Nondeterministic Threading Errors

Threading errors are usually nondeterministic and difficult to reproduce. Intel Inspector helps detect and locate them, including data races (heap and stack races), deadlocks, lock hierarchy violations, and cross-thread stack access errors.

  • Detect Hard-to-Find Memory Errors

Memory errors such as memory leaks, corruption, mismatched allocation and deallocation APIs, inconsistent use of memory APIs, illegal memory accesses, and uninitialized memory reads can be difficult to find. Intel Inspector finds these errors and integrates with a debugger to identify the associated issues. It also diagnoses memory growth and locates the call stack causing it.

  • Simplify the Diagnosis of Difficult Errors

Debugger breakpoints help diagnose errors by breaking into the debugger just before the error occurs. When debugging outside of Intel Inspector, a breakpoint stops execution at the right location, but that location might be executed thousands of times before the error occurs. By combining debugging with analysis, Intel Inspector determines when a problem occurs and breaks into the debugger at the right time and location.

  • Find Persistent Memory Errors

Intel® Optane™ DC persistent memory is a new memory technology with high-capacity persistent memory for the data center. It maintains data even when the power is shut off, but this data must first be properly flushed out of volatile cache memory. Persistence Inspector helps find possible persistent memory errors so that the system operates correctly when the power is restored.

It detects:

  • Missing or redundant cache flushes
  • Missing store fences
  • Out-of-order persistent memory stores
  • Incorrect undo logging for the

Software supported

Intel Inspector supports various languages (C, C++, and Fortran), operating systems (Windows and Linux), IDEs (Visual Studio, Eclipse, etc.), and compilers (Intel C++, Intel Fortran, Visual C++, GCC, etc.). It also supports OpenMP, TBB, parallel language extensions for the Intel C++ Compiler, Microsoft PPL, Win32 and POSIX threads, and the Intel MPI Library.

Tutorial

We will analyze a simple program with a memory leak to demonstrate the steps for running an Intel Inspector analysis.

Step 1: Write the code to be analyzed and build the project.

Step1 intelInspector.png


Step 2: Go to Tools > Intel Inspector > Memory Error Analysis (several other analysis types are available, depending on your requirements).

Step2 IntelInspector.png

Analysis Panel Details

Step3 IntelInspector.png

Step4 IntelInspector.png

On-demand Memory Analysis

Intel Inspector customarily displays memory leaks at the end of an analysis run when an application exits; however, you can also use the Intel Inspector on-demand memory leak detection feature to gather memory leak information while an application is running. This is useful if:

  • An application does not terminate (such as a server process).
  • You want memory leak information, but you do not want to wait for an application to terminate.
  • You want to determine if memory is leaked during a specific interval of application execution, or during a specific user action.

Invalid Memory Access

Here we write to c[1] after the array has already been deleted. Analyzing this code reports an invalid memory access error along with the line number where the invalid access occurs. A screenshot of the analysis follows the code.

#include <iostream>
int main()
{
	char* c;
	c = new char[100]; // requests heap memory for 100 chars
	for (int i = 0; i < 100; i++) {
		c[i] = 'a';
	}
	std::cout << c[10] << std::endl;

	delete[] c; // the memory is released here
	c[1] = 'a'; // invalid access: writes to memory that was already freed
}


Intel inspector2 - Microsoft Visual Studio 8 11 2021 3 57 42 PM.png


Walkthrough

Memory Leak

Memory Leak

This C++ program allocates an array of five integers and then terminates without freeing it.

int main()
{
	int* myInts = new int[5] ; // allocated but never freed

	return 0 ; 
}


The result shows a memory leak at the new int[5] allocation (line 4 in the screenshot).

MemoryLeak notFreed.png


The Inspector User Guide has provided the following solutions.

UserGuide MemoryLeak.png
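
As a minimal sketch of how the leak above could be avoided (the function names `sum_with_raw_array` and `sum_with_vector` are hypothetical and not part of the original project), the array can either be released with a matching `delete[]` or be owned by a standard container that frees it automatically. This is one common approach, not necessarily the exact remedy shown in the User Guide screenshot.

```cpp
#include <cstddef>
#include <vector>

// Option 1 (illustrative): keep the raw pointer, but pair new[] with delete[].
int sum_with_raw_array(std::size_t n) {
    int* myInts = new int[n];
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        myInts[i] = static_cast<int>(i);
        total += myInts[i];
    }
    delete[] myInts;  // releases the array, so nothing is leaked
    return total;
}

// Option 2 (illustrative): let std::vector own the storage.
// The vector's destructor frees the memory when the function returns.
int sum_with_vector(std::size_t n) {
    std::vector<int> myInts(n);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        myInts[i] = static_cast<int>(i);
        total += myInts[i];
    }
    return total;
}
```

Calling either function with n = 5 fills the array with 0 through 4 and returns their sum. Rerunning the Memory Error Analysis on code written this way should no longer report the leak.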


Mismatched Deallocation and Missing Allocation

This example deallocates the same array twice, first with delete and then with delete[].

int main()
{
	int* myInts = new int[5] ; 

	delete myInts ; // mismatched: scalar delete on memory allocated with new[]
	delete[] myInts ; // correct form, but the memory has already been deallocated

	return 0 ; 
}


When the deallocation type does not match the allocation type, Inspector marks both the allocation line and the mismatched deallocation line.

MemoryLeak mismatched.png


A missing allocation error is reported on line 7 because the memory was already deallocated on the previous line.

MemoryLeak missing.png


The Inspector User Guide has provided the following solutions.

UserGuide Mismatched.png
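
One way to rule out both the mismatched and the repeated deallocation is to hand ownership to a smart pointer. The sketch below assumes C++11 or later; the function name `last_element_after_fill` is hypothetical and only serves the illustration. `std::unique_ptr<int[]>` calls `delete[]` exactly once when it goes out of scope, so there is no manual `delete` to get wrong.

```cpp
#include <cstddef>
#include <memory>

// Illustrative helper: fills an array of n ints (n must be greater than 0)
// and returns the last element.
int last_element_after_fill(std::size_t n) {
    // The array specialization of unique_ptr uses delete[], matching new[].
    std::unique_ptr<int[]> myInts(new int[n]);
    for (std::size_t i = 0; i < n; ++i) {
        myInts[i] = static_cast<int>(i) * 2;
    }
    return myInts[n - 1];
    // myInts is destroyed here and the array is freed exactly once.
}
```

If a raw pointer must be kept, the equivalent manual rule is a single `delete[]` for each `new[]` and a single `delete` for each `new`.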


Invalid Memory Access

This example assigns a value to an element of an array that has already been deallocated.

int main()
{
	int* myInts = new int[5] ; 

	delete[] myInts ;

	myInts[0] = 2 ; 

	return 0 ; 
}


Inspector reports a problem of type invalid memory access. It lists the line where the value is assigned to the deallocated object, along with the lines where the object was allocated and deallocated.

InvalidMemoryAccess.png


The following diagram illustrates how an invalid memory access occurs.

UserGuide InvalidMemoryAccess Diagram.png


The Inspector User Guide has provided the following solutions.

UserGuide InvalidMemoryAccess.png
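
A minimal sketch of the corrected ordering, assuming the intent was simply to store a value in the array: every access happens while the allocation is still live, and the pointer is reset after `delete[]`. The function name `write_then_release` is hypothetical.

```cpp
// Illustrative rewrite of the example above with the access moved
// before the deallocation.
int write_then_release() {
    int* myInts = new int[5];

    myInts[0] = 2;           // write while the array is still allocated
    int first = myInts[0];   // copy out anything needed later

    delete[] myInts;
    myInts = nullptr;        // the stale address is no longer kept around,
                             // and a later "if (myInts)" check can detect this

    return first;
}
```

Resetting the pointer does not make a later dereference safe; it only makes the mistake easier to detect than a silent write into freed memory.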


Race Condition

Code with Race Condition

This C program uses OpenMP to calculate the value of pi by computing the area under a curve.

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main()
{
	long long int i, n = 10000000;
	double x, pi;
	double sum = 0.0;
	double step = 1.0 / (double)n;

#pragma omp parallel for private(i,x)
		for (i = 0; i < n; i++)
		{
			x = (i + 0.5) * step;
			sum += 4.0 / (1.0 + x * x);
		}

	pi = step * sum;

	printf("pi = %17.15f\n", pi);

	return 0;
}


Inspector reports a race condition on line 15. The timeline at the bottom right of the screenshot shows that threads #1 and #2 are competing to write to the sum variable.

RaceCondition.png


The following diagram illustrates a race condition in which two threads try to write data at the same time.

UserGuide RaceCondition Diagram WW.png


At this point in the timeline, thread #0 is trying to read data while thread #6 is trying to write data.

This shows that Inspector is able to capture the activity of each thread.

RaceCondition Timeline.png


The following diagram illustrates a race condition in which one thread tries to read data while another tries to write it.

UserGuide RaceCondition Diagram RW.png


The Inspector User Guide has provided the following solutions.

UserGuide RaceCondition.png


Fixed Code

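The following version fixes the race by adding #pragma omp atomic before the update to sum, so that only one thread at a time can modify it.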

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main()
{
	long long int i, n = 10000000;
	double x, pi;
	double sum = 0.0;
	double step = 1.0 / (double)n;

#pragma omp parallel for private(i,x)
		for (i = 0; i < n; i++)
		{
			x = (i + 0.5) * step;
#pragma omp atomic
			sum += 4.0 / (1.0 + x * x);
		}

	pi = step * sum;

	printf("pi = %17.15f\n", pi);

	return 0;
}


Reference

Intel Inspector User Guide