From CDOT Wiki


62 bytes removed, 8 April
During assignment 2, we tried a simple kernel in the shape of a dot product. It achieved nothing special: as predicted at the end of assignment 1, repeatedly calling cudaMalloc and cudaMemcpy had a severe cost in run time.
====Initial implementation====
//version 1 dot product
__global__ void kdot(const float* d_a, const float* d_b, float* d_p, int ni, int nj, int nk) {
[[Source Code ms3 neural net]]
This is the final iteration; we outline the takeaways below.
