I'm looking for research/implementation projects based on Hadoop, and I came across the list published on the wiki page: . But that page was last updated in September 2009, so I'm not sure whether some of these ideas have already been implemented. I'm particularly interested in the "Sort and Shuffle optimization in the MR framework" suggestion, which talks about "combining the results of several maps before the shuffle. This can reduce seek work and intermediate storage".
Has anyone attempted this? Is it implemented in the current version of Hadoop?
The project description is aimed at "optimization". This feature is already present in the current Hadoop-MapReduce, and it could probably run in a lot less time. Sounds like a valuable enhancement to me.
The combine functionality (described at http://wiki.apache.org/hadoop/HadoopMapReduce) is the closest existing thing. However, I believe the combiner only merges the key-value pairs of a single map task, rather than all the pairs for a given node or rack.
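To make the combiner's role concrete, here is a minimal sketch in plain Python (not the actual Hadoop API) of the idea: a combiner pre-aggregates the output of a single map task before it is spilled to disk and shuffled, using word count as the example. The function names are hypothetical and exist only for illustration.

```python
from collections import Counter

def map_task(lines):
    # Map phase: emit a (word, 1) pair for every word in this task's input split.
    return [(word, 1) for line in lines for word in line.split()]

def combine(pairs):
    # Combiner: locally aggregate the pairs emitted by a SINGLE map task,
    # so fewer records are written to intermediate storage and shuffled.
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return list(counts.items())

split = ["apple banana apple", "banana apple"]
raw = map_task(split)           # 5 intermediate pairs before combining
combined = combine(raw)         # 2 pairs after local aggregation
print(len(raw), len(combined))  # prints: 5 2
```

Note that this combining happens per map task; the question above is asking about going further and merging the outputs of *several* map tasks (per node or rack) before the shuffle.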
I think it is a very challenging task. In my understanding, the idea is to build a computation tree instead of a "flat" map-reduce. A good example of this is Google's Dremel engine (now called BigQuery). I would suggest reading this paper: http://sergey.melnix.com/pub/melnik_VLDB10.pdf
If you are interested in this kind of architecture, you can also take a look at an open-source clone of this technology, Open Dremel: http://code.google.com/p/dremel/
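To illustrate the "computation tree" idea mentioned above, here is a hypothetical Python sketch (not the Dremel implementation) of multi-level aggregation: instead of one reducer consuming all map outputs flat, partial results are merged level by level up a tree, which is how a Dremel-style serving tree cuts down the work any single node must do.

```python
def tree_aggregate(values, fan_in=2):
    # Merge partial results level by level, fan_in inputs per parent node,
    # until a single root result remains (here the aggregation is a sum).
    level = values
    while len(level) > 1:
        level = [sum(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return level[0]

# Eight leaf (map-side) partial sums aggregated through a binary tree:
print(tree_aggregate([1, 2, 3, 4, 5, 6, 7, 8]))  # prints: 36
```

With a fan-in of 2 and 8 leaves, each node merges only 2 inputs across 3 levels, rather than one node merging all 8 at once.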