
WebHDFS request with data node failure

I have a Syncfusion cluster that consists of four nodes: mn1, mn2, dn1, and dn2. I asked before about high availability with WebHDFS requests in this question:
https://www.syncfusion.com/forums/129179/high-availability-with-web-request

So if we want to upload a file called myfile under hadoophome, we should issue the URL:
http://mn1:50070/webhdfs/v1/hadoophome/myfile/?user.name=root&op=OPEN

In the case of a name node failure we have to handle that from the client application, as you mentioned, and it worked well. But what about a data node failure?

Suppose that dn2 is off and I issue an upload request from my project to the active name node mn1. mn1 should redirect this request to dn1 (I mean, to live nodes only), but sometimes it redirects it to dn1 and the request succeeds, and sometimes to dn2 and the request fails with a request timeout.

Should data node failure be handled from the project too? And how do I specify the node that the request should be redirected to, and the offset value, in an upload request?

Any help please?
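For context, a WebHDFS upload is a two-step handshake: the client sends an op=CREATE PUT to the name node with no body, receives a 307 redirect whose Location header points at a data node the name node has chosen, and then sends the file body to that location. A minimal sketch in Python follows; the host name, path, and user are taken from the question above, and the `requests` usage in the comments is illustrative, not Syncfusion-specific:

```python
# Sketch of the two-step WebHDFS upload handshake (op=CREATE).
from urllib.parse import urlencode

def build_create_url(namenode, path, user, overwrite=True):
    """Build the first-step WebHDFS CREATE URL that is sent to the name node."""
    params = urlencode({
        "user.name": user,
        "op": "CREATE",
        "overwrite": str(overwrite).lower(),
    })
    return f"http://{namenode}:50070/webhdfs/v1/{path}?{params}"

url = build_create_url("mn1", "hadoophome/myfile", "root")
print(url)

# With a live cluster, the actual upload would then look like:
#   import requests
#   r = requests.put(url, allow_redirects=False)      # step 1: no body sent
#   datanode_url = r.headers["Location"]              # name node picks a DN
#   requests.put(datanode_url, data=open("myfile", "rb"))  # step 2: send data
```

Because the name node chooses the data node in step 1, the client never decides which data node receives the block; that choice is exactly what fails while a dead node has not yet been marked as such.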

1 Reply

Nandhini K Syncfusion Team May 18, 2017 01:42 PM UTC

Hi Shadi, 
 
Thank you for using Syncfusion products. 
 
Please find the responses to your queries below.

Query: Should data node failure be handled from the project too?

Response: No, you do not need to handle it from the client-side project; it should be handled by the name node itself. The reason the request is sometimes redirected to a dead node (here, dn2) is that the name node takes a certain amount of time to mark a node as dead, based on the properties mentioned below. By default it takes about 10 minutes to mark a node as dead, but you can change this as needed in hdfs-site.xml using the cluster manager by referring to this link.

Property: dfs.namenode.heartbeat.recheck-interval
Default value: 300000
Description: Decides the interval at which to check for expired data nodes. Together with dfs.heartbeat.interval, it also determines the interval for deciding whether a data node is stale. The unit of this configuration is milliseconds.

Property: dfs.heartbeat.interval
Default value: 3
Description: Determines the data node heartbeat interval in seconds.

Property: dfs.client.write.exclude.nodes.cache.expiry.interval.millis
Default value: 600000
Description: The maximum period to keep a data node in the excluded-nodes list at a client. After this period (in milliseconds), the previously excluded node(s) are removed from the cache automatically.
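For reference, the name node computes the dead-node timeout from the two heartbeat properties as 2 × dfs.namenode.heartbeat.recheck-interval + 10 × dfs.heartbeat.interval (with the heartbeat interval converted to milliseconds), which with the defaults works out to the roughly 10 minutes mentioned above:

```python
# Dead-node timeout as derived from the two heartbeat properties (defaults shown).
recheck_interval_ms = 300000   # dfs.namenode.heartbeat.recheck-interval
heartbeat_interval_s = 3       # dfs.heartbeat.interval

timeout_ms = 2 * recheck_interval_ms + 10 * heartbeat_interval_s * 1000
print(timeout_ms)          # 630000 ms
print(timeout_ms / 60000)  # 10.5 minutes
```

Lowering either property makes the name node notice a dead data node sooner, at the cost of more false positives on a congested network.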
 
Query: How to specify the node that the request should be redirected to?

Response: By default we can't do that, and it is not recommended either; the recommendation is to let the name node handle it.
Query: Offset value in an upload request?

Response: We have created a custom sample for your requirement. In the sample, enter the offset value of the file being uploaded; it will upload the file with its text starting from the given offset value.
 
Sample:
Note:
The offset value can be set only in requests for file OPEN-type operations; upload operations do not include an open operation.
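As an illustration of an OPEN request that starts reading at a given byte offset (the host, path, and user below are carried over from the question and are assumptions, not a prescribed configuration):

```python
# Illustrative only: build a WebHDFS OPEN URL that reads from byte offset 1024.
namenode = "mn1"
path = "hadoophome/myfile"
url = (f"http://{namenode}:50070/webhdfs/v1/{path}"
       f"?user.name=root&op=OPEN&offset=1024")
print(url)

# With a live cluster this would be fetched with, e.g.:
#   import requests
#   data = requests.get(url).content   # bytes from offset 1024 onward
```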
Please find more details about offset in the following link:
 
 
Regards, 
Nandhini K 

