GPU621/Apache Spark

No change in size, 12:54, 30 November 2020
== Architecture ==
Hadoop has a master-slave architecture, as shown in Figure 13.1. A small Hadoop cluster consists of a single master node and multiple worker nodes. The master node runs a JobTracker, TaskTracker, NameNode, and DataNode. A worker node acts as both a TaskTracker and a DataNode. A file on HDFS is split into multiple blocks, and each block is replicated within the Hadoop cluster. The NameNode is the master server that manages the file system metadata, while the DataNodes store and maintain the blocks. The DataNodes serve block read and write requests from clients, and they also perform block creation, deletion, and replication upon instruction from the NameNode. [[File: Hadoop_1.png|thumb|upright=1|right|alt=Hadoop cluster|13.1 A multi-node Hadoop Cluster]]
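
The block splitting and replication described above can be illustrated with a small Python sketch. This is a toy model, not Hadoop code. The function names <code>split_into_blocks</code> and <code>place_replicas</code>, the DataNode names, the 128 MB block size, and the replication factor of 3 are all assumptions chosen for illustration. Replicas are assigned round-robin here, whereas real HDFS placement also takes rack topology into account.

```python
# Toy model of HDFS block splitting and replica placement.
# All names and constants below are illustrative assumptions, not Hadoop APIs.

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (128 MB)
REPLICATION = 3                 # assumed replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes of the blocks a file of file_size bytes is split into."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    # Every block is full-sized except possibly the last one.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Map each block index to `replication` distinct DataNodes (round-robin)."""
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    placement = {}
    for block in range(num_blocks):
        # Start at a different node for each block so the load is spread out.
        placement[block] = [
            datanodes[(block + offset) % len(datanodes)]
            for offset in range(replication)
        ]
    return placement


# Example: a 300 MB file on a hypothetical four-node cluster.
blocks = split_into_blocks(300 * 1024 * 1024)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
```

In this sketch the <code>placement</code> dictionary plays the role of the block map that the NameNode keeps as metadata, while the DataNode names stand in for the machines that actually hold the block data.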
== Components ==
