Ghost Cells

4,395 bytes added, 15:17, 6 April 2019
Profiles
|
<source>
Reading in data
Setting buffer
==6484== NVPROF is profiling process 6484, command: .\pcie.exe .\test3.csv .\output3.csv 1000
==6484== Profiling application: .\pcie.exe .\test3.csv .\output3.csv 1000
==6484== Warning: 43 API trace records have same start and end timestamps.
This can happen because of short execution duration of CUDA APIs and low timer resolution on the underlying operating system.
==6484== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   99.86%  29.120ms      1000  29.119us  24.158us  30.589us  DPS::update(float*, float const *, int, int, float, float)
                    0.09%  26.269us         2  13.134us  12.990us  13.279us  [CUDA memcpy HtoD]
                    0.04%  13.119us         1  13.119us  13.119us  13.119us  [CUDA memcpy DtoH]
      API calls:   71.59%  183.43ms         2  91.713ms  10.265us  183.42ms  cudaMalloc
                   15.25%  39.069ms         1  39.069ms  39.069ms  39.069ms  cuDevicePrimaryCtxRelease
                    8.04%  20.601ms         3  6.8671ms  81.478us  20.424ms  cudaMemcpy
                    3.76%  9.6313ms      1000  9.6310us  6.7360us  335.53us  cudaLaunchKernel
                    1.26%  3.2196ms        96  33.537us       0ns  1.6234ms  cuDeviceGetAttribute
                    0.05%  127.03us         1  127.03us  127.03us  127.03us  cuModuleUnload
                    0.04%  107.78us         2  53.890us  22.454us  85.327us  cudaFree
                    0.00%  10.265us         1  10.265us  10.265us  10.265us  cuDeviceTotalMem
                    0.00%  9.6230us         1  9.6230us  9.6230us  9.6230us  cuDeviceGetPCIBusId
                    0.00%  1.2820us         2     641ns     320ns     962ns  cuDeviceGet
                    0.00%     962ns         3     320ns       0ns     641ns  cuDeviceGetCount
                    0.00%     962ns         1     962ns     962ns     962ns  cuDeviceGetName
                    0.00%     321ns         1     321ns     321ns     321ns  cuDeviceGetUuid
                    0.00%     321ns         1     321ns     321ns     321ns  cuDeviceGetLuid
</source>
|}
|
<source>
Allocate initial memory
Reading in data
Setting buffer
==2720== NVPROF is profiling process 2720, command: .\alt.exe .\test3.csv .\output3.csv 1000
==2720== Profiling application: .\alt.exe .\test3.csv .\output3.csv 1000
==2720== Warning: 50 API trace records have same start and end timestamps.
This can happen because of short execution duration of CUDA APIs and low timer resolution on the underlying operating system.
==2720== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   99.88%  25.679ms         1  25.679ms  25.679ms  25.679ms  DPS::update(float*, int, int, float, float, unsigned int, unsigned int)
                    0.06%  16.670us         1  16.670us  16.670us  16.670us  [CUDA memcpy HtoD]
                    0.05%  12.575us         1  12.575us  12.575us  12.575us  [CUDA memcpy DtoH]
                    0.00%     576ns         1     576ns     576ns     576ns  [CUDA memset]
      API calls:   70.46%  158.87ms         1  158.87ms  158.87ms  158.87ms  cudaMalloc
                   16.71%  37.678ms         1  37.678ms  37.678ms  37.678ms  cudaDeviceReset
                   11.48%  25.877ms         2  12.938ms  60.947us  25.816ms  cudaMemcpy
                    1.25%  2.8161ms        96  29.334us       0ns  1.3867ms  cuDeviceGetAttribute
                    0.06%  133.12us         1  133.12us  133.12us  133.12us  cudaFree
                    0.02%  47.475us         1  47.475us  47.475us  47.475us  cudaMemset
                    0.01%  18.605us         1  18.605us  18.605us  18.605us  cudaLaunchKernel
                    0.01%  11.548us         1  11.548us  11.548us  11.548us  cuDeviceTotalMem
                    0.00%  9.9440us         1  9.9440us  9.9440us  9.9440us  cuDeviceGetPCIBusId
                    0.00%  1.2830us         1  1.2830us  1.2830us  1.2830us  cuDeviceGetName
                    0.00%     963ns         3     321ns       0ns     642ns  cuDeviceGetCount
                    0.00%     642ns         1     642ns     642ns     642ns  cuDeviceGetLuid
                    0.00%     641ns         2     320ns       0ns     641ns  cuDeviceGet
                    0.00%       0ns         1       0ns       0ns       0ns  cuDeviceGetUuid
</source>
|}
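
The two profiles above can be compared directly. As a rough sketch, the Python snippet below copies the total <code>DPS::update</code> kernel time and the total <code>cudaLaunchKernel</code> API time from each nvprof summary and prints the ratio between the <code>pcie.exe</code> and <code>alt.exe</code> runs. The dictionary layout and the helper name <code>ratio</code> are illustrative assumptions, not part of the project code. Kernel launches are asynchronous with respect to the host, so the API time and the kernel time should not simply be added together.

```python
# Figures copied by hand from the two nvprof summaries above.
# All times are converted to microseconds. The structure and names
# here are illustrative assumptions, not taken from the project source.
PROFILES = {
    "pcie.exe": {
        "kernel_total_us": 29120.0,   # DPS::update, 1000 launches
        "kernel_calls": 1000,
        "launch_api_us": 9631.3,      # cudaLaunchKernel, 1000 calls
    },
    "alt.exe": {
        "kernel_total_us": 25679.0,   # DPS::update, 1 launch
        "kernel_calls": 1,
        "launch_api_us": 18.605,      # cudaLaunchKernel, 1 call
    },
}


def ratio(metric):
    """Return the pcie.exe value divided by the alt.exe value for a metric."""
    return PROFILES["pcie.exe"][metric] / PROFILES["alt.exe"][metric]


if __name__ == "__main__":
    print(f"kernel time, pcie.exe / alt.exe:     {ratio('kernel_total_us'):.3f}")
    print(f"launch API time, pcie.exe / alt.exe: {ratio('launch_api_us'):.1f}")
```

With these figures the total kernel time of the 1000-launch version is about 1.13 times that of the single-launch version, while the time spent in <code>cudaLaunchKernel</code> differs by a factor of roughly 500. Both numbers come from a single run of each executable, so they are indicative only.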
 
====== GPU Offload Vs CPU ======
[[File:Gc-spa.png | 800px]]
=== Assignment 3 ===