Tuesday, November 30, 2010

Error in starting jobtracker or namenode: rsync: connection unexpectedly closed (0 bytes received so far) [receiver]

After setting up a Hadoop cluster, you may get the error "rsync: connection unexpectedly closed (0 bytes received so far) [receiver]" or "unexplained error (code 255) at io.c(463) [receiver=2.6.8]" when you try to start the mapred daemons (start-mapred.sh), the dfs daemons (start-dfs.sh), or all daemons (start-all.sh).

This means that something is wrong with the ssh connections in the cluster. For rsync (or Hadoop in a cluster) to work, you should be able to ssh between the following Hadoop components without any password or other prompts.
- Jobtracker to Tasktrackers
- Jobtracker to Namenode
- Namenode to DataNodes
- Namenode to Jobtracker
- Datanodes to NameNode
- Tasktrackers to Jobtracker

Once ssh works in all six directions listed above, these errors should go away.
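
A quick way to test one direction is to run ssh in a mode that fails instead of prompting. The following is only a minimal sketch, not part of Hadoop: the helper names ssh_check_command and can_ssh are made up for illustration, and it assumes an OpenSSH client, whose BatchMode and ConnectTimeout options disable password prompts and bound the connection wait.

```python
import subprocess


def ssh_check_command(host, timeout_s=5):
    """Build an ssh command line that fails rather than prompting (hypothetical helper)."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never ask for a password or passphrase
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly if the host is unreachable
        host,
        "true",                               # trivial remote command; we only need the exit code
    ]


def can_ssh(host, timeout_s=5):
    """Return True if this machine can ssh to host with no prompt of any kind."""
    result = subprocess.run(
        ssh_check_command(host, timeout_s),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Run it on the source machine as the user that starts the Hadoop daemons, for example can_ssh("n.jeka.com") on the Jobtracker. A False result means that direction would still prompt or fail.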

Suppose a Hadoop cluster is composed of the following machines:
j.jeka.com: Jobtracker
n.jeka.com: Namenode
t1.jeka.com: Datanode and Tasktracker
t2.jeka.com: Datanode and Tasktracker

Then the following ssh connections should work:

Jobtracker to Namenode
j.jeka.com > n.jeka.com

Jobtracker to Tasktrackers
j.jeka.com > t1.jeka.com
j.jeka.com > t2.jeka.com

Namenode to Jobtracker
n.jeka.com > j.jeka.com

Namenode to DataNodes
n.jeka.com > t1.jeka.com
n.jeka.com > t2.jeka.com

Datanodes to NameNode
t1.jeka.com > n.jeka.com
t2.jeka.com > n.jeka.com

Tasktrackers to Jobtracker
t1.jeka.com > j.jeka.com
t2.jeka.com > j.jeka.com

Please note that you do not need to be able to ssh from one Tasktracker to another.
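
For a larger cluster, the list of required connections can be generated rather than written by hand. This is a rough sketch: required_ssh_pairs is a hypothetical helper, and it assumes, as in the example cluster, that every worker machine runs both a Datanode and a Tasktracker.

```python
def required_ssh_pairs(jobtracker, namenode, workers):
    """List the (source, destination) pairs that need passwordless ssh.

    Assumes every worker runs both a Datanode and a Tasktracker.
    """
    # The two master nodes must reach each other.
    pairs = [(jobtracker, namenode), (namenode, jobtracker)]
    for worker in workers:
        pairs.append((jobtracker, worker))  # Jobtracker to Tasktracker
        pairs.append((namenode, worker))    # Namenode to Datanode
        pairs.append((worker, namenode))    # Datanode to Namenode
        pairs.append((worker, jobtracker))  # Tasktracker to Jobtracker
    return pairs


pairs = required_ssh_pairs(
    "j.jeka.com", "n.jeka.com", ["t1.jeka.com", "t2.jeka.com"]
)
for source, destination in pairs:
    print(f"{source} > {destination}")
```

For the example cluster this yields ten pairs, and no worker-to-worker pair is among them.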
