How to map/reduce two MongoDB collections

I am new to map/reduce and am trying to figure out a way to collect the following data using map/reduce instead of doing it in my (slow) application logic:

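I have a collection of projects that has a 1:n relationship with a collection of tasks. Now I'd like to get an array of results in which the first entry is the project with the most tasks and the last entry is the project with the fewest tasks.

Even better would be an array that also tells me how many tasks each project has (assuming each project name is unique):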

[project_1: 23, project_2: 42, project_3: 82]

For the map, I tried something like this:

map = function () {
  emit(this.project_id, { count:1 });
}

And for the reduce:

reduce = function (key, values) {
  var sum = 0;
  values.forEach(function(doc){ sum += 1; });
  return { count:sum };
}

I ran this against my tasks collection:

var mr = db.tasks.mapReduce(map, reduce, { out: "results" });

But I get unexpected results when querying:

db[mr.result].find();

I'm using Mongoid on Rails and am completely lost. Can someone point me in the right direction?

Thx in advance. Felix

Best answer

At first glance, I see at least one problem: the summation step in the reduce function should be

  values.forEach(function(doc){ sum += doc.count ; });

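since this function may be reducing values that were produced by earlier reduce steps, so the incoming count values can be greater than 1.

That's a common oversight, mentioned here: http://www.mongodb.org/display/DOCS/Troubleshooting+MapReduce

Hope that helps!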
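To make the difference concrete, here is a minimal sketch in plain JavaScript (no MongoDB required) that imitates a re-reduce. The helper reduceInBatches and the key "project_1" are made up for illustration; MongoDB decides on its own if and when reduce is fed its earlier output, so treat this as a rough model rather than the actual engine behaviour.

```javascript
// Plain-JavaScript simulation of reduce being called on its own
// earlier output ("re-reduce"). Nothing here talks to MongoDB.

// Reducer from the question: counts the values instead of summing them.
function reduceBuggy(key, values) {
  var sum = 0;
  values.forEach(function (doc) { sum += 1; });
  return { count: sum };
}

// Corrected reducer: adds up the count carried by each value.
function reduceFixed(key, values) {
  var sum = 0;
  values.forEach(function (doc) { sum += doc.count; });
  return { count: sum };
}

// Hypothetical helper (an assumption for this sketch): reduce the emitted
// values in batches, then reduce the partial results once more.
function reduceInBatches(reduceFn, key, values, batchSize) {
  var partials = [];
  for (var i = 0; i < values.length; i += batchSize) {
    partials.push(reduceFn(key, values.slice(i, i + batchSize)));
  }
  return reduceFn(key, partials);
}

// Five tasks of one made-up project; each map call emitted { count: 1 }.
var emitted = [];
for (var n = 0; n < 5; n++) {
  emitted.push({ count: 1 });
}

var buggyResult = reduceInBatches(reduceBuggy, "project_1", emitted, 2);
var fixedResult = reduceInBatches(reduceFixed, "project_1", emitted, 2);

console.log(buggyResult); // { count: 3 }  (number of batches, not tasks)
console.log(fixedResult); // { count: 5 }
```

With the corrected reducer in place, each document in the output collection should have the shape { _id: <project_id>, value: { count: <n> } }, so something like db.results.find().sort({ "value.count": -1 }) in the mongo shell should list the project with the most tasks first.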
