Revision as of 19:07, 10 April 2016

GPU621/DPS921 | Participants | Groups and Projects | Resources | Glossary

Intel Data Analytics Acceleration Library (DAAL)

Team Member

Intro OLD

Local DAAL Examples Location: C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2016\windows\daal\examples

Data: http://open.canada.ca/data/en/dataset/cad804cd-454e-4bd7-9f22-fcee64f60719

New Data: http://open.canada.ca/data/en/dataset/be3880f2-0d04-4583-8265-611b231ebce8

Parser code: https://software.intel.com/en-us/node/610127

Low Order Moments: https://software.intel.com/en-us/node/599561

Our goal is to parse and process this crime data and add more meaning to it, using various parallel techniques taught in the course and comparing them with the DAAL library.

Introduction

DAAL is a C++ and Java/Scala library for data analytics. It is similar to MKL, with some differences:

MKL focuses on computation. DAAL focuses on the entire data flow (acquisition, transformation, processing).
DAAL is optimized for all kinds of Intel-based devices (from data centers to home computers).

DAAL supports three processing modes:

Offline Processing (Batch) - The data fits in memory and is processed all at once.
Online Processing (Streaming) - The data is too big to fit in memory, so DAAL processes it in chunks and combines the partial results into the final result.
Distributed Processing - The data is processed across multiple nodes. DAAL does not prescribe a communication method and leaves that choice to the developer (Hadoop, Spark, MPI, etc.).
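To make the difference between batch and online processing concrete, here is a minimal sketch in plain C++. It does not use the DAAL API. The names PartialMoments, computeChunk, mergePartials, finalMean and finalVariance are hypothetical and exist only for this illustration. Each chunk is reduced to a small partial result, and the partial results are then combined into the final mean and variance. The same merge step could combine results from several nodes in the distributed mode.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical partial result for one chunk (not a DAAL type):
// enough information to recover the mean and variance later.
struct PartialMoments {
    std::size_t count = 0;
    double sum = 0.0;
    double sumOfSquares = 0.0;
};

// Reduce one chunk that fits in memory to a partial result.
PartialMoments computeChunk(const std::vector<double>& chunk) {
    PartialMoments p;
    for (double x : chunk) {
        ++p.count;
        p.sum += x;
        p.sumOfSquares += x * x;
    }
    return p;
}

// Combine two partial results; the order of merging does not matter.
PartialMoments mergePartials(const PartialMoments& a, const PartialMoments& b) {
    PartialMoments r;
    r.count = a.count + b.count;
    r.sum = a.sum + b.sum;
    r.sumOfSquares = a.sumOfSquares + b.sumOfSquares;
    return r;
}

// Finalize: mean of all observations seen so far.
double finalMean(const PartialMoments& p) {
    assert(p.count > 0);
    return p.sum / static_cast<double>(p.count);
}

// Finalize: population variance, E[x^2] - (E[x])^2.
// This simple formula can lose precision on large or badly scaled data.
double finalVariance(const PartialMoments& p) {
    const double m = finalMean(p);
    return p.sumOfSquares / static_cast<double>(p.count) - m * m;
}
```

In batch mode, computeChunk would be called once on the whole data set. In online mode, it would be called once per chunk as the data arrives, with mergePartials folding each new partial result into a running total before the finalize step.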

Parallel

Code Examples

Batch Sorting

/* file: sorting_batch.cpp
* Copyright 2014-2016 Intel Corporation All Rights Reserved.*/

#include "daal.h"
#include "service.h"

using namespace daal;
using namespace daal::algorithms;
using namespace daal::data_management;
using namespace std;

/* Input data set parameters */
string datasetFileName = "../data/batch/sorting.csv";

int main(int argc, char *argv[])
{
    checkArguments(argc, argv, 1, &datasetFileName);

    /* Initialize FileDataSource<CSVFeatureManager> to retrieve the input data from a .csv file */
    FileDataSource<CSVFeatureManager> dataSource(datasetFileName, DataSource::doAllocateNumericTable, DataSource::doDictionaryFromContext);

    /* Retrieve the data from the input file */
    dataSource.loadDataBlock();

    /* Create algorithm objects to sort data using the default (radix) method */
    sorting::Batch<> algorithm;

    /* Print the input observations matrix */
    printNumericTable(dataSource.getNumericTable(), "Initial matrix of observations:");

    /* Set input objects for the algorithm */
    algorithm.input.set(sorting::data, dataSource.getNumericTable());

    /* Sort data observations */
    algorithm.compute();

    /* Get the sorting result */
    services::SharedPtr<sorting::Result> res = algorithm.getResult();

    printNumericTable(res->get(sorting::sortedData), "Sorted matrix of observations:");

    return 0;
}
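For comparison, the following is a rough sketch of what we understand the sorting algorithm to produce, written in plain C++ without DAAL. It assumes that each feature (column) is sorted independently in ascending order, so the rows of the result are not the original observations reordered. The Matrix alias and the sortColumns function are hypothetical names for this illustration, and std::sort stands in for the radix method used by DAAL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Rows are observations, columns are features (hypothetical layout for this sketch).
using Matrix = std::vector<std::vector<double>>;

// Sort every column independently in ascending order and return the result.
Matrix sortColumns(const Matrix& data) {
    if (data.empty()) {
        return data;
    }
    const std::size_t rows = data.size();
    const std::size_t cols = data[0].size();
    Matrix sorted(data);
    std::vector<double> column(rows);
    for (std::size_t j = 0; j < cols; ++j) {
        // Copy column j out of the row-major matrix.
        for (std::size_t i = 0; i < rows; ++i) {
            assert(data[i].size() == cols);  // all rows must have the same length
            column[i] = data[i][j];
        }
        // Comparison sort here; DAAL's default method is radix sort.
        std::sort(column.begin(), column.end());
        // Write the sorted column back.
        for (std::size_t i = 0; i < rows; ++i) {
            sorted[i][j] = column[i];
        }
    }
    return sorted;
}
```

A sequential version like this gives a baseline to time against the DAAL batch example when comparing parallel techniques.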

Useful Link

https://software.intel.com/en-us/daal

