Friday, December 10, 2010

ERROR: hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink

While running a job, I once got the following exception:


10/12/10 21:09:05 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.1.73.148:50010
10/12/10 21:09:05 INFO hdfs.DFSClient: Abandoning block blk_3623545154924652323_87440
10/12/10 21:09:11 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
10/12/10 21:09:11 INFO hdfs.DFSClient: Abandoning block blk_-4726571439643867938_87441


REASON
The error contains the IP address (10.1.73.148) of the tasktracker/datanode machine for which the exception is thrown. The usual cause is that the datanode daemon is not running on that machine. You can check this by logging into the machine (10.1.73.148 in this example) and running the command
ps -eaf | grep "DataNode" | grep -v "grep"
If no lines are returned, the datanode daemon is not running on 10.1.73.148.
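
The same thing can be checked from the client side, without logging in to the datanode. The following is only a rough sketch: it pulls the address that follows "firstBadLink" out of the job's console output and tests whether that port accepts TCP connections. The function names extract_bad_links and port_open are made up for this example and are not part of Hadoop, and the sketch assumes bash and the coreutils timeout command are installed.

```shell
#!/usr/bin/env bash
# Sketch: find the failing datanode address in DFSClient log output
# and test whether its data-transfer port accepts connections.
# extract_bad_links and port_open are hypothetical helpers, not Hadoop tools.

# Print each distinct host:port that follows "firstBadLink" on stdin.
extract_bad_links() {
    sed -n 's/.*firstBadLink \([0-9.]*:[0-9]*\).*/\1/p' | sort -u
}

# Return 0 if a TCP connection to host ($1), port ($2) succeeds within 5 seconds.
# Uses bash's /dev/tcp redirection, so bash must be available.
port_open() {
    timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# If a log file is given as the first argument, test every address found in it.
if [ -n "${1:-}" ]; then
    extract_bad_links < "$1" | while IFS=: read -r host port; do
        if port_open "$host" "$port"; then
            echo "$host:$port is accepting connections"
        else
            echo "$host:$port is NOT reachable"
        fi
    done
fi
```

Save the job's console output to a file and pass that file as the only argument. A "NOT reachable" result is consistent with a stopped datanode, but it could also point to a firewall or network problem between the client and that machine.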

What happened is that the HDFS client tried to write a data block through a pipeline of datanodes that included 10.1.73.148 and could not connect to it. If enough other machines are running datanode daemons, this is not a problem: Hadoop abandons the block (as the log above shows), asks for a new block on other machines, and continues the job. But if for any reason no other usable datanode is available, your job will fail.


RESOLUTION
Log on to 10.1.73.148 and run the following command
hadoop-daemon.sh start datanode
The above command should start the datanode daemon on 10.1.73.148. You can double-check this by running the command
ps -eaf | grep "DataNode" | grep -v "grep"
It should return one line.
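
If the ps check and hadoop-daemon.sh seem to disagree, it may help to look at the pid file. hadoop-daemon.sh decides whether a daemon is already running from a pid file, and it refuses to start with a message like "datanode running as process <pid>. Stop it first." when that file points at a live process. Here is a minimal sketch of the same test, assuming the default settings (pid directory /tmp, identity string equal to the user that started the daemon). The name datanode_status is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: check the datanode through the pid file written by hadoop-daemon.sh.
# Assumes the defaults HADOOP_PID_DIR=/tmp and HADOOP_IDENT_STRING=$USER.
# datanode_status is a hypothetical helper, not a Hadoop command.

datanode_status() {
    pid_file="${HADOOP_PID_DIR:-/tmp}/hadoop-${HADOOP_IDENT_STRING:-${USER:-$(id -un)}}-datanode.pid"
    if [ ! -f "$pid_file" ]; then
        echo "no pid file at $pid_file"
        return 1
    fi
    pid=$(cat "$pid_file")
    # kill -0 sends no signal; it only tests whether the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo "datanode running as process $pid"
    else
        echo "stale pid file: process $pid is gone"
        return 1
    fi
}
```

Run datanode_status on the datanode machine as the user that started the daemon. If your hadoop-env.sh sets HADOOP_PID_DIR or HADOOP_IDENT_STRING, export the same values first so the function looks at the right file.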

That's it. Try running the job again. It should no longer throw the exception.

3 comments:

  1. This is not true.
    My cluster's datanodes are running, and still ps -eaf | grep "DataNode" | grep -v "grep" returns nothing. When I use hadoop-daemon.sh start datanode, I get the message: datanode running as process 22134. Stop it first.
