Sunday, January 25, 2015

Issue when putting files to Hadoop - 0 datanodes running

I set up Hadoop as I have described in my previous posts. While I was working on some interesting Map-Reduce projects, I faced an issue when sending some of my local files to HDFS. I had already started dfs, but the exception says:
14/10/25 17:46:20 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException( File /user/ashansa/input._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
I have faced this issue several times, so I think it will be helpful if I explain how to overcome it. (I believe this is a common issue when running HDFS, since I got it several times myself and found that many others have faced it too.)

What worked for me?

1. Stop the hadoop cluster
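For a typical single-node setup this is done from the Hadoop installation directory; the exact script depends on your version, so treat the commands below as a sketch (stop-all.sh is the old Hadoop 1.x form, deprecated in favor of the separate scripts):

```shell
# Run from your Hadoop installation directory.

# Hadoop 1.x:
bin/stop-all.sh

# Hadoop 2.x and later: stop HDFS (and YARN, if you run it) separately
sbin/stop-dfs.sh
sbin/stop-yarn.sh
```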

2. Clean the hadoop tmp directory
    You can find the path to your hadoop tmp directory in hdfs-site.xml: look for the property that sets it, and read the path from its <value> tag. Then delete the contents of that directory.
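As an example, the relevant entry typically looks like the following. Note that hadoop.tmp.dir is the conventional property name but, depending on your setup, it may live in core-site.xml instead, and the path shown here is only a placeholder:

```xml
<property>
  <name>hadoop.tmp.dir</name>
  <!-- Example path only; use the value from your own config -->
  <value>/app/hadoop/tmp</value>
</property>
```

Once you have located the path, remove the directory's contents, e.g. `rm -rf /app/hadoop/tmp/*` (substituting your own path).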

3. Format the namenode (note: this wipes the HDFS metadata, so anything already stored in HDFS is lost)
             bin/hadoop namenode -format

4. Start the hadoop cluster
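Starting mirrors the stop step; again, the script names vary by version (start-all.sh is the Hadoop 1.x form). The jps check at the end is a quick way to confirm the DataNode actually came up this time:

```shell
# Hadoop 1.x:
bin/start-all.sh

# Hadoop 2.x and later:
sbin/start-dfs.sh
sbin/start-yarn.sh

# Verify the daemons are running before retrying the put;
# the output should list DataNode, NameNode, and SecondaryNameNode.
jps
```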

With these few simple steps, I was able to overcome this issue.
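Once the datanodes are up, the failed copy can be retried. For example (the local file name `input` here is a stand-in for your own file; the HDFS path is the one from the error message above):

```shell
# Retry the upload; 'input' stands in for your local file
bin/hadoop fs -put input /user/ashansa/input

# Confirm it landed in HDFS
bin/hadoop fs -ls /user/ashansa
```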
Hope this will help you too.