Changes

GPU621/Apache Spark

No change in size, 12:54, 30 November 2020

== Architecture ==

Hadoop has a master-slave architecture, as shown in Figure 13.1. A small Hadoop cluster consists of a single master node and multiple worker nodes. The master node runs a JobTracker, TaskTracker, NameNode, and DataNode, while each worker node acts as both a TaskTracker and a DataNode. A file on HDFS is split into multiple blocks, and each block is replicated across DataNodes within the Hadoop cluster. The NameNode is the master server that manages the file system metadata, while the DataNodes store and maintain the blocks. The DataNodes are responsible for serving read and write requests for the blocks they hold, and they also perform block creation, deletion, and replication upon instruction from the NameNode. [[File: Hadoop_1.png|thumb|upright=1|right|alt=Hadoop cluster|13.1 A multi-node Hadoop Cluster]]
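
The block splitting and replication described above can be illustrated with a small Python sketch. It is a toy model and not Hadoop code: the function names <code>split_into_blocks</code> and <code>place_replicas</code>, the DataNode names, the 128 MiB block size, and the replication factor of 3 are all assumptions chosen for illustration, and the round-robin placement stands in for the rack-aware policy that a real NameNode applies.

```python
# Toy model of HDFS block splitting and replica placement (not Hadoop code).
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size: 128 MiB
REPLICATION = 3                 # assumed replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block for a file of file_size bytes."""
    if file_size <= 0:
        return []
    full_blocks, remainder = divmod(file_size, block_size)
    # Every block is full-sized except possibly the last one.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Build the kind of block map a NameNode keeps: block index -> DataNodes.

    Placement is simple round-robin (an assumption for illustration);
    each block lands on `replication` distinct DataNodes.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(300 * 1024 * 1024)  # hypothetical 300 MiB file
    block_map = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
    for index, nodes in block_map.items():
        print(f"block {index} ({blocks[index]} bytes) -> {nodes}")
```

In this sketch a 300 MiB file yields two full 128 MiB blocks and one 44 MiB block, and every block is stored on three different DataNodes, so losing a single DataNode leaves two copies of each block available.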

== Components ==

Abalachandran7

