Data Replication error in Hadoop

I am implementing a Hadoop single-node cluster on my machine by following Michael Noll's tutorial and have come across a data replication error:

Here is the error message:

> hadoop@laptop:~/hadoop$ bin/hadoop dfs -copyFromLocal tmp/testfiles testfiles
>
> 12/05/04 16:18:41 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/testfiles/testfiles/file1.txt could only be replicated to 0 nodes, instead of 1
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>     at org.apache.hadoop.ipc.Client.call(Client.java:740)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>     at $Proxy0.addBlock(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>     at $Proxy0.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>
> 12/05/04 16:18:41 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
> 12/05/04 16:18:41 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hadoop/testfiles/testfiles/file1.txt" - Aborting...
> copyFromLocal: java.io.IOException: File /user/hadoop/testfiles/testfiles/file1.txt could only be replicated to 0 nodes, instead of 1
> 12/05/04 16:18:41 ERROR hdfs.DFSClient: Exception closing file /user/hadoop/testfiles/testfiles/file1.txt : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/testfiles/testfiles/file1.txt could only be replicated to 0 nodes, instead of 1
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/testfiles/testfiles/file1.txt could only be replicated to 0 nodes, instead of 1
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>     at org.apache.hadoop.ipc.Client.call(Client.java:740)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>     at $Proxy0.addBlock(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>     at $Proxy0.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

Also, when I execute:

bin/stop-all.sh

it says that the datanode is not running and therefore cannot be stopped. However, the output of jps shows that the datanode process is present.

I tried formatting the namenode and changing the owner and permissions, but nothing seems to work. I hope I haven't missed any other relevant information.

Thanks in advance.

Best answer

The solution that worked for me was to start the namenode and datanode one by one rather than together with bin/start-all.sh. With this approach, any error the datanode hits while setting itself up on the network is clearly visible, and many posts suggest that the namenode needs some time to start up, so it should be given that time before the datanodes are started. Also, in my case the namenode and datanode had different IDs, and I had to change the datanode's ID to match the namenode's.

The step-by-step procedure is as follows (a command sketch follows the list):

  1. Start the namenode: bin/hadoop namenode. Check for errors, if any.
  2. Start the datanode: bin/hadoop datanode. Check for errors, if any.
  3. Now start the tasktracker and jobtracker using bin/start-mapred.sh.
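
A rough sketch of that sequence, assuming the Hadoop 1.x layout from Michael Noll's tutorial (the pause length is arbitrary; watching the namenode log until it is up works just as well):

    # Start the daemons one at a time so any error is visible immediately.
    bin/hadoop namenode &    # namenode first; watch its output for errors
    sleep 30                 # give the namenode time to come up (arbitrary duration)
    bin/hadoop datanode &    # datanode next; registration errors surface here
    bin/start-mapred.sh      # finally the jobtracker and tasktrackers
    jps                      # NameNode, DataNode, JobTracker, TaskTracker should all appear

If the datanode aborts with a namespaceID mismatch, the IDs the answer refers to are stored in the VERSION files under the name and data directories (typically .../dfs/name/current/VERSION and .../dfs/data/current/VERSION); editing the datanode's namespaceID to match the namenode's is the usual fix.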
Other answers

Although this is already solved, I'm adding this for future readers. The advice to check whether the namenode and the datanode had started was useful, and further investigation led me to delete the Hadoop temporary storage directory. For me, doing that resolved the error.
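
As a sketch, that amounts to something like the following, assuming hadoop.tmp.dir is /app/hadoop/tmp as in Michael Noll's tutorial (substitute your own value; note that this wipes everything stored in HDFS):

    bin/stop-all.sh            # stop all daemons first
    rm -rf /app/hadoop/tmp/*   # delete the Hadoop temporary storage directory
    bin/hadoop namenode -format
    bin/start-all.sh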

I had the same problem. I took a look at the datanode logs, and there was a warning saying that dfs.data.dir had incorrect permissions... so I just changed the permissions and everything worked, which is kind of weird.

Specifically, my dfs.data.dir was set to /home/hadoop/hd_tmp, and the error I got was:

...
...
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /home/hadoop/hd_tmp/dfs/data, expected: rwxr-xr-x, while actual: rwxrwxr-x
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
...
...

So I simply executed these commands:

  • I stopped all the daemons with "bin/stop-all.sh".
  • I changed the permissions of the directory with "chmod -R 755 /home/hadoop/hd_tmp".
  • I formatted the namenode again with "bin/hadoop namenode -format".
  • I restarted the daemons with "bin/start-all.sh".
  • And voilà, the datanode was up and running! (I checked with the "jps" command, which showed a process named DataNode.)

And then everything worked fine.
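
For convenience, here is the same sequence as one copy-pasteable block (the /home/hadoop/hd_tmp path comes from this answer; point it at whatever your own dfs.data.dir is set to):

    bin/stop-all.sh                    # stop all Hadoop daemons
    chmod -R 755 /home/hadoop/hd_tmp   # rwxr-xr-x, the permission the datanode expects
    bin/hadoop namenode -format        # reformat HDFS (destroys existing HDFS data)
    bin/start-all.sh                   # restart the daemons
    jps                                # a DataNode process should now be listed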

In my case, I had wrongly set a single directory as the destination for both dfs.name.dir and dfs.data.dir. The correct format is:

 <property>
   <name>dfs.name.dir</name>
   <value>/path/to/name</value>
 </property>

 <property>
   <name>dfs.data.dir</name>
   <value>/path/to/data</value>
 </property>

I removed the extra properties from hdfs-site.xml and then this issue was gone. Hadoop needs to improve its error messages. I had tried each of the solutions above, and none of them worked.

I ran into the same problem. When I looked at localhost:50070, in the cluster summary all properties showed 0 except "DFS Used% 100". Usually this situation arises because there are errors in the three *-site.xml files under HADOOP_INSTALL/conf, or in the hosts file.
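
A few quick checks along those lines (a sketch; HADOOP_INSTALL stands for your installation directory, and the property names are the usual ones in a 1.x setup):

    cat $HADOOP_INSTALL/conf/core-site.xml    # fs.default.name: correct host and port?
    cat $HADOOP_INSTALL/conf/hdfs-site.xml    # dfs.replication, dfs.name.dir, dfs.data.dir
    cat $HADOOP_INSTALL/conf/mapred-site.xml  # mapred.job.tracker
    hostname                                  # should resolve via /etc/hosts
    cat /etc/hosts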

In my case, the cause was a failure to resolve the hostname. I solved the problem simply by adding an "IP_Address hostname" entry to /etc/hosts.
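
For example (the address and hostname below are placeholders for your machine's actual IP and hostname):

    # append a mapping to /etc/hosts
    echo "192.168.1.10   my-hostname" | sudo tee -a /etc/hosts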

I had to delete the /tmp/hadoop-<user-name> folder, format the namenode again, and then start the daemons using:

sbin/start-dfs.sh

sbin/start-yarn.sh
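
Put together, that is roughly the following (a sketch assuming a Hadoop 2.x layout, where bin/hdfs namenode -format is the formatting command; <user-name> is a placeholder):

    rm -rf /tmp/hadoop-<user-name>   # remove the stale HDFS storage directory
    bin/hdfs namenode -format        # reformat the namenode
    sbin/start-dfs.sh                # start the namenode and datanodes
    sbin/start-yarn.sh               # start the YARN daemons
    jps                              # NameNode and DataNode should be listed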
