Where does HDFS store files locally by default?
  • Date: 2010-03-01 19:19:11
  • Tags: hadoop, hdfs

I am running Hadoop with the default configuration on a one-node cluster, and would like to find out where HDFS stores files locally.

Any ideas?

Thanks.

Answers

You need to look in your hdfs-default.xml configuration file for the dfs.data.dir setting. The default value is ${hadoop.tmp.dir}/dfs/data; note that ${hadoop.tmp.dir} is itself defined in core-default.xml.

The configuration options are described in the Hadoop documentation. The description for this setting is:

Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
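For illustration, a minimal hdfs-site.xml override might look like the following; the /data/1 and /data/2 mount points are hypothetical and stand in for whatever local disks you actually have:

<configuration>
  <!-- Store DataNode blocks on two hypothetical local disks -->
  <property>
    <name>dfs.data.dir</name>
    <value>/data/1/dfs/data,/data/2/dfs/data</value>
  </property>
</configuration>

(On Hadoop 2.x and later the property is named dfs.datanode.data.dir, as noted in the answers below.)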

It seems that for the current version (2.7.1) the directory is:

/tmp/hadoop-${user.name}/dfs/data

This is based on the dfs.datanode.data.dir and hadoop.tmp.dir settings from: http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml and http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml
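Reading those two defaults side by side shows where that path comes from; these entries paraphrase the 2.7.1 default files linked above:

<!-- hdfs-default.xml (2.7.1) -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file://${hadoop.tmp.dir}/dfs/data</value>
</property>

<!-- core-default.xml (2.7.1) -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-${user.name}</value>
</property>

Substituting the second value into the first gives file:///tmp/hadoop-${user.name}/dfs/data, which is the directory shown above.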

As "more recent answer" and to clarify hadoop version numbers:

If you use Hadoop 1.2.1 (or something similar), @Binary Nerd s answer is still true.

But if you use Hadoop 2.1.0-beta (or something similar), you should read the configuration documentation here and the option you want to set is: dfs.datanode.data.dir

For Hadoop 3.0.0, the HDFS data path is likewise given by the property dfs.datanode.data.dir.
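If you have a running cluster, you can also query the resolved value directly with the getconf utility that ships with Hadoop 2 and 3:

hdfs getconf -confKey dfs.datanode.data.dir

This prints the effective setting after all of the *-site.xml overrides have been applied.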

First, locate the Hadoop installation directory, typically under /usr/lib. There you will find the etc/hadoop directory, where all the configuration files live.

In that directory you will find the hdfs-site.xml file, which contains all the details about HDFS. It defines two relevant properties (illustrated in the example below):

dfs.namenode.name.dir – tells where the NameNode stores its metadata on the local filesystem.

dfs.datanode.data.dir – tells where the DataNode stores its block data on the local filesystem.
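As an illustration, a minimal hdfs-site.xml that sets both properties explicitly might look like this; the /var/hadoop paths are hypothetical:

<configuration>
  <!-- Hypothetical path: where the NameNode keeps its metadata (fsimage and edit logs) -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/var/hadoop/dfs/name</value>
  </property>
  <!-- Hypothetical path: where the DataNode keeps HDFS block files -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/var/hadoop/dfs/data</value>
  </property>
</configuration>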




