
Cluster FailOver

Thread ID: 127346 | Nov 9, 2016 09:26 AM | Nov 10, 2016 05:27 AM | Big Data Platform | 1
Tags: General
Oula Alshiekh
Asked On November 9, 2016 09:26 AM

As you know, the main strength of a Hadoop cluster is its ability to keep serving data even if one of its data nodes crashes.
But in a cluster with three nodes (NameNode, SecondaryNameNode, and DataNode), can we still retrieve data if the only data node has crashed while the name node and secondary name node are still alive?

Karthikeyan SankaraVadivel [Syncfusion]
Replied On November 10, 2016 05:27 AM

Hi Oula, 


We regret the inconvenience caused.


I would like to explain the cluster failover in some detail. 


Name nodes (active and standby) store only metadata, i.e. information about the data; the actual data resides only on data nodes. To retrieve data from HDFS, both an active name node and the data nodes holding the required data must be available.
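To make the split between metadata and data concrete, here is a minimal sketch in Python. It is a toy model, not HDFS code: `ToyNameNode`, `can_read`, and the block and node names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for a name node: holds metadata only, no file contents."""
    # block id -> names of the data nodes that hold a replica of that block
    block_locations: dict = field(default_factory=dict)


def can_read(namenode_alive, namenode, live_datanodes, blocks):
    """A file is readable only if the name node is up and every block
    has at least one replica on a data node that is currently alive."""
    if not namenode_alive:
        return False
    return all(
        any(dn in live_datanodes for dn in namenode.block_locations.get(b, ()))
        for b in blocks
    )


# Assumed layout: one data node ("dn1") holding every block of the file.
nn = ToyNameNode(block_locations={"blk_1": ["dn1"], "blk_2": ["dn1"]})
file_blocks = ["blk_1", "blk_2"]

print(can_read(True, nn, {"dn1"}, file_blocks))  # data node is up
print(can_read(True, nn, set(), file_blocks))    # the only data node is down
```

With the single data node down, the second call reports that the file cannot be read, even though the name node still knows where every block should be.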


To avoid a single point of failure in the name node, we have active and standby name nodes. If the active name node fails, the standby name node automatically becomes the active name node without affecting the cluster.
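The failover behaviour described above can be sketched as a small state model. This is only an illustration: `ToyNameNodePair` and the names `nn1` and `nn2` are hypothetical, and the sketch leaves out the coordination that a real cluster relies on to detect the failure and promote the standby.

```python
class ToyNameNodePair:
    """Hypothetical model of an active/standby name node pair."""

    def __init__(self):
        self.state = {"nn1": "active", "nn2": "standby"}
        self.alive = {"nn1": True, "nn2": True}

    def fail(self, name):
        """Mark a name node as crashed and promote a live standby if needed."""
        self.alive[name] = False
        was_active = self.state[name] == "active"
        self.state[name] = "down"
        if was_active:
            for other, role in self.state.items():
                if role == "standby" and self.alive[other]:
                    self.state[other] = "active"
                    break

    def active(self):
        """Return the name of the active name node, or None if there is none."""
        for name, role in self.state.items():
            if role == "active":
                return name
        return None


pair = ToyNameNodePair()
print(pair.active())   # nn1 serves requests
pair.fail("nn1")
print(pair.active())   # nn2 has taken over
```

If both name nodes fail, `active()` returns `None`, which corresponds to a cluster that can no longer answer metadata requests.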


Similarly, to provide fault tolerance for data nodes, data is replicated, i.e. multiple copies of the same data are stored across different data nodes. By default, the replication factor in a Syncfusion Hadoop cluster is 3, so at least 3 data nodes are needed to hold all three copies.
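The arithmetic behind this can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that each data node holds at most one replica of a given block and that the failures happen before any re-replication can take place.

```python
def placeable_replicas(replication_factor, live_datanodes):
    """Replicas of one block that can actually be stored,
    assuming at most one replica of a block per data node."""
    return min(replication_factor, live_datanodes)


def tolerated_datanode_failures(replication_factor, live_datanodes):
    """Worst-case number of data nodes that can crash at once
    while every block still has at least one replica left."""
    replicas = placeable_replicas(replication_factor, live_datanodes)
    return max(replicas - 1, 0)


for nodes in (1, 2, 3, 5):
    print(
        nodes,
        placeable_replicas(3, nodes),
        tolerated_datanode_failures(3, nodes),
    )
```

With a replication factor of 3 and a single data node, only one replica can be stored, so the cluster tolerates zero data node failures. With three or more data nodes it tolerates two.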


In your case there is only one data node, so you must bring that data node back up to retrieve the data. After that, you can add more data nodes to the existing cluster if you did not include enough of them when the cluster was created.
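Once the data node is back (or new data nodes have been added), one way to check the cluster state is with the standard `hdfs dfsadmin -report` and `hdfs fsck` commands. The sketch below only wraps those two commands from Python; the helper names are hypothetical, and it assumes the `hdfs` command-line tool is available on the PATH of the machine where it runs, which may not match how your cluster is set up.

```python
import shutil
import subprocess


def health_check_commands(path="/"):
    """Build the commands: a data node report and a file system check."""
    return [
        ["hdfs", "dfsadmin", "-report"],  # lists data nodes and their status
        ["hdfs", "fsck", path],           # reports missing or under-replicated blocks
    ]


def run_health_checks(path="/"):
    """Run each command and return (command, exit code) pairs.
    Returns an empty list if the hdfs CLI is not installed (assumption: it is on PATH)."""
    if shutil.which("hdfs") is None:
        print("hdfs command not found on PATH; skipping checks")
        return []
    results = []
    for cmd in health_check_commands(path):
        completed = subprocess.run(cmd, capture_output=True, text=True)
        print(completed.stdout)
        results.append((cmd, completed.returncode))
    return results
```

Calling `run_health_checks("/")` on a cluster node prints both reports, which you can read to confirm that the data nodes are live and that no blocks are missing.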

Please refer to the following link,



To learn more about high availability and data replication, please refer to the following pages.

High availability: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

Data Replication: http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Data_Replication 


Please let me know if you have any queries. 



Karthikeyan S 

