English 中文(简体)
Hadoop Map中的脂肪和鞋类优化
原标题:Sort and shuffle optimization in Hadoop MapReduce
最佳回答

The project description is aimed "optimization". This feature is already present in the current Hadoop-MapReduce and it can probably run in a lot less time. Sounds like a valuable enhancement to me.

问题回答

合并功能(见http://wiki.apache.org/hadoop/HadoopMapReduce)下描述),后者比较无休无止。 但是,我认为,合并者只是将钥匙价值乘以单一地图工作,而不是某个定点或 r的所有配对。

I think it is very challenging task. In my understanding the idea is to make a computation tree instead of "flat" map-reduce.The good example of it is Google s Dremel engine (called BigQuey now). I would suggest to read this paper: http://sergey.melnix.com/pub/melnik_VLDB10.pdf
If you interesting in this kind of architecture - you can also take a look on the open source clone of this technology - Open Dremel. http://code.google.com/p/dremel/





相关问题
Hadoop - namenode is not starting up

I am trying to run hadoop as a root user, i executed namenode format command hadoop namenode -format when the Hadoop file system is running. After this, when i try to start the name node server, it ...

What default reducers are available in Elastic MapReduce?

I hope I m asking this in the right way. I m learning my way around Elastic MapReduce and I ve seen numerous references to the "Aggregate" reducer that can be used with "Streaming" job flows. In ...

Establishing Eclipse project environment for HadoopDB

I have checked-out a project from SourceForge named HadoopDB. It uses some class in another project named Hive. I have used Eclipse Java build path setting to link source to the Hive project root ...

Hadoop: intervals and JOIN

I m very new to Hadoop and I m currently trying to join two sources of data where the key is an interval (say [date-begin/date-end]). For example: input1: 20091001-20091002 A 20091011-20091104 ...

hadoop- determine if a file is being written to

Is there a way to determine if a file in hadoop is being written to? eg- I have a process that puts logs into hdfs. I have another process that monitors for the existence of new logs in hdfs, but I ...

Building Apache Hive - impossible to resolve dependencies

I am trying out the Apache Hive as per http://wiki.apache.org/hadoop/Hive/GettingStarted and am getting this error from Ivy: Downloaded file size doesn t match expected Content Length for http://...

热门标签