
WebHDFS request with data node failure

Thread ID: 130531
Created: May 17, 2017 04:25 AM
Updated: May 18, 2017 09:42 AM
Platform: Big Data Platform
Replies: 1
Tags: General
Shadi
Asked On May 17, 2017 04:25 AM

I have a Syncfusion cluster that consists of four nodes: mn1, mn2, dn1, and dn2. I asked earlier about high availability with WebHDFS requests in this question:
https://www.syncfusion.com/forums/129179/high-availability-with-web-request

So if we want to upload a file called myfile under hadoophome, we should issue the URL:
http://mn1:50070/webhdfs/v1/hadoophome/myfile/?user.name=root&op=OPEN

In the case of a name node failure, we have to handle it from the client application, as you mentioned, and that worked well. But what about the case of a data node failure?

Suppose that dn2 is down and I issue an upload request from my project to the active name node mn1. mn1 should redirect this request to dn1 (I mean, to live nodes only). But sometimes it redirects to dn1 and the request succeeds, and sometimes to dn2 and the request fails with a request timeout.

Should data node failure be handled from the project too? And how do we specify the node that the request should be redirected to, and the offset value, in the upload request?

Any help please?

Nandhini K [Syncfusion]
Replied On May 18, 2017 09:42 AM

Hi Shadi, 
 
Thank you for using Syncfusion products. 
 
Please find our responses below.

Query: Should data node failure be handled from the project too?

Response: No, you do not need to handle it from the client-side project; it should be handled by the Name Node itself. The reason the Name Node sometimes redirects to a dead node (here, dn2) is that it takes a certain amount of time to mark a node as dead, based on the properties listed below. With the default values, a node is marked dead after 2 × dfs.namenode.heartbeat.recheck-interval + 10 × dfs.heartbeat.interval = 2 × 300 s + 10 × 3 s = 630 s, i.e. roughly 10 minutes. You can change these values as needed in hdfs-site.xml using the cluster manager by referring to this link.
 
Property: dfs.namenode.heartbeat.recheck-interval
Default value: 300000
Description: Decides the interval at which to check for expired data nodes. Together with dfs.heartbeat.interval, it also determines the interval for deciding whether a data node is stale. The unit of this configuration is milliseconds.

Property: dfs.heartbeat.interval
Default value: 3
Description: Determines the data node heartbeat interval in seconds.

Property: dfs.client.write.exclude.nodes.cache.expiry.interval.millis
Default value: 600000
Description: The maximum period to keep a data node in the excluded nodes list at a client. After this period, in milliseconds, the previously excluded node(s) are removed automatically from the cache.
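For instance, a shorter detection window could be configured in hdfs-site.xml along these lines (the values below are illustrative, not recommendations; with these settings a dead node would be marked after roughly 2 × 30 s + 10 × 3 s = 90 s):

```xml
<!-- hdfs-site.xml: illustrative values for faster dead-node detection -->
<property>
  <name>dfs.namenode.heartbeat.recheck-interval</name>
  <value>30000</value> <!-- check for expired data nodes every 30 s -->
</property>
<property>
  <name>dfs.heartbeat.interval</name>
  <value>3</value> <!-- data node heartbeat every 3 s (default) -->
</property>
<property>
  <name>dfs.client.write.exclude.nodes.cache.expiry.interval.millis</name>
  <value>600000</value> <!-- keep excluded data nodes cached at the client for 10 min (default) -->
</property>
```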
 
Query: How do we specify the node that the request should be redirected to?

Response: By default we can't do that, and it is not recommended either. The recommendation is to let the Name Node handle it.

Query: What about the offset value in the upload request?

Response: We have created a custom sample for your requirement. In the sample, enter the offset value of the file being uploaded; it will upload the file with the text starting from the given offset value.
 
Sample:
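Since the sample attachment is not included in this thread, here is a minimal sketch of the idea it describes, assuming the standard WebHDFS two-step CREATE flow (PUT to the Name Node, follow the 307 redirect to a live Data Node); the function names and port 50070 are illustrative, taken from the question's cluster:

```python
import http.client
import urllib.parse

def build_create_url_path(path, user):
    """Build the path+query part of a WebHDFS CREATE request."""
    query = urllib.parse.urlencode({
        "user.name": user,
        "op": "CREATE",
        "overwrite": "true",
    })
    return "/webhdfs/v1%s?%s" % (path, query)

def upload_from_offset(namenode, port, path, user, local_file, offset):
    """Upload local_file starting at byte `offset`, as the sample does."""
    with open(local_file, "rb") as f:
        f.seek(offset)          # skip the first `offset` bytes of the file
        data = f.read()

    # Step 1: PUT with no body; the Name Node replies 307 with the
    # Location of a live Data Node it has chosen for the write.
    conn = http.client.HTTPConnection(namenode, port)
    conn.request("PUT", build_create_url_path(path, user))
    resp = conn.getresponse()
    datanode_url = resp.getheader("Location")
    conn.close()

    # Step 2: PUT the file content to the redirected Data Node URL.
    parts = urllib.parse.urlsplit(datanode_url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port)
    conn.request("PUT", "%s?%s" % (parts.path, parts.query), body=data)
    resp = conn.getresponse()   # expect 201 Created on success
    conn.close()
    return resp.status
```

For example, `upload_from_offset("mn1", 50070, "/hadoophome/myfile", "root", "myfile", 1024)` would write everything from byte 1024 onward.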
Note:
The offset value can only be set for file OPEN operations; upload operations do not include an open step.

Please find more details about offset in the following link
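To illustrate the OPEN case, a short sketch of building such a request URL (the host, path, and user follow the question's cluster; the `offset`/`length` query parameters are the standard WebHDFS OPEN syntax):

```python
import urllib.parse

def build_open_url(namenode, path, user, offset, length=None):
    """Build a WebHDFS OPEN URL that reads from byte `offset` onward,
    optionally limited to `length` bytes."""
    params = {"user.name": user, "op": "OPEN", "offset": str(offset)}
    if length is not None:
        params["length"] = str(length)
    return "http://%s:50070/webhdfs/v1%s?%s" % (
        namenode, path, urllib.parse.urlencode(params))

# Read myfile from byte 1024 onward:
print(build_open_url("mn1", "/hadoophome/myfile", "root", 1024))
# http://mn1:50070/webhdfs/v1/hadoophome/myfile?user.name=root&op=OPEN&offset=1024
```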
 
 
Regards, 
Nandhini K 

