MapReduce on child objects not embedded

I am having a problem creating a MapReduce algorithm that will get me the stats I need. I have a user object that can create a post, and a post can have many likes from other users.

User
--Post
----Likes

The Post is not embedded in the User because we access posts separately, not just in a user context. The stat I need is the number of likes an author has received, and I need to get this through the likes on the user's posts. The problem is that because the posts are not embedded, I cannot access them in my map function. Here are the map and reduce functions I currently have:

def reputation_map
  <<-MAP
    function() {
      var posts = db.posts.find({user_id: this._id});
      emit(this._id, {posts: posts});
    }
  MAP
end

def reputation_reduce
  <<-REDUCE
    function(key, values) {
      var count = 0;
      while (values.hasNext()) {
        values.next();
        count += 1;
      }
      return {posts: count};
    }
  REDUCE
end

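This should only return the number of posts for each user (just the post frequency), so I have not even reached the likes level yet, and it does not count anything. What is the correct way to do this?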

Best answer

Map Reduce is really designed to operate on a single collection at a time.
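As a rough sketch of what a single-collection version could look like, the map function below runs over the posts collection only and emits one value per post, keyed by the author. It assumes each post stores its author's `user_id` (as the query in the question suggests) and an embedded `likes` array; the `likes` field, the sample documents and the `runMapReduce` helper are hypothetical stand-ins so the functions can be tried locally without a server. Note that reduce receives `values` as a plain array, not a cursor, and has to return the same shape that map emits.

```javascript
// Sample documents standing in for the posts collection.
// Assumption: each post has a user_id and an embedded likes array.
const samplePosts = [
  { _id: 1, user_id: "alice", likes: ["bob", "carol"] },
  { _id: 2, user_id: "alice", likes: ["bob"] },
  { _id: 3, user_id: "bob", likes: [] },
];

// Map: called once per post, with `this` bound to the post document.
function reputationMap() {
  emit(this.user_id, { likes: this.likes ? this.likes.length : 0 });
}

// Reduce: `values` is an array of emitted values for one key.
// The return value has the same shape as the emitted values.
function reputationReduce(key, values) {
  var total = 0;
  values.forEach(function (v) {
    total += v.likes;
  });
  return { likes: total };
}

// Local stand-in for the server-side emit(): groups values by key.
const groups = new Map();
function emit(key, value) {
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

// Hypothetical helper that mimics the map/reduce loop in memory.
function runMapReduce(docs, map, reduce) {
  groups.clear();
  docs.forEach(function (doc) {
    map.call(doc);
  });
  const result = {};
  for (const [key, values] of groups) {
    // Like the server, skip reduce when a key has only one value.
    result[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return result;
}

const likesByAuthor = runMapReduce(samplePosts, reputationMap, reputationReduce);
console.log(likesByAuthor);
```

On a real deployment only the two functions would be passed to the map/reduce call on the posts collection; the rest is scaffolding for the local run.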

Technically, it is possible to query a separate collection from inside a Map function, as you have done, but use caution: this is neither recommended nor supported. You may run into issues, especially if the collection is sharded.

A similar question has been asked before: "How do I call Mongo from within my map/reduce functions? Is this a good practice?"

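If you are aggregating results from multiple collections, you may find that doing so in your application is the safest and most straightforward approach.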
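A minimal sketch of combining the two collections in application code might look like the following. The arrays stand in for the results of two separate queries (one for users, one for posts), and the `likes` array on each post, the `name` field and the function name `likesPerAuthor` are assumptions made for illustration.

```javascript
// Stand-ins for the results of two separate queries.
const userDocs = [
  { _id: "u1", name: "Alice" },
  { _id: "u2", name: "Bob" },
  { _id: "u3", name: "Carol" },
];
const postDocs = [
  { _id: "p1", user_id: "u1", likes: ["u2", "u3"] },
  { _id: "p2", user_id: "u1", likes: ["u2"] },
  { _id: "p3", user_id: "u2", likes: ["u1"] },
];

// Sums the likes on each author's posts. Authors without posts get 0.
function likesPerAuthor(users, posts) {
  const totals = new Map(users.map((u) => [u._id, 0]));
  for (const post of posts) {
    if (!totals.has(post.user_id)) continue; // post by an unknown user
    const likeCount = (post.likes || []).length;
    totals.set(post.user_id, totals.get(post.user_id) + likeCount);
  }
  return users.map((u) => ({
    user_id: u._id,
    name: u.name,
    likes: totals.get(u._id),
  }));
}

console.log(likesPerAuthor(userDocs, postDocs));
```

This keeps the cross-collection logic out of the database entirely, at the cost of pulling the post documents (or at least their `user_id` and `likes` fields) into the application.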

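Another approach, if each author has a "likes" value, is to store that value as a field in each author's document and spend slightly more overhead on each update to increment it, rather than periodically running a potentially resource-intensive calculation over all the votes for every author.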
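One way to picture the counter approach is with the `$inc` update operator, which adds to a numeric field and creates it if it is missing. The sketch below only builds the filter and update documents and applies them to an in-memory object; the field name `likes_count` and both helper functions are hypothetical, and `applyInc` is a local stand-in for what the server would do.

```javascript
// Builds the filter and update for one like (delta = 1) or unlike (delta = -1).
// Assumption: the counter lives in a likes_count field on the user document.
function buildLikeUpdate(authorId, delta = 1) {
  return {
    filter: { _id: authorId },
    update: { $inc: { likes_count: delta } },
  };
}

// Local stand-in for $inc: adds to the field, treating a missing field as 0.
function applyInc(doc, update) {
  const copy = { ...doc };
  for (const [field, amount] of Object.entries(update.$inc)) {
    copy[field] = (copy[field] || 0) + amount;
  }
  return copy;
}

const like = buildLikeUpdate("u1");
const unlike = buildLikeUpdate("u1", -1);

let author = { _id: "u1", name: "Alice" };
author = applyInc(author, like.update); // first like
author = applyInc(author, like.update); // second like
author = applyInc(author, unlike.update); // one like withdrawn
console.log(author.likes_count);

// Against a real database, the same documents would be passed to an update
// call on the users collection, e.g. db.users.updateOne(like.filter, like.update)
// in the mongo shell.
```

Reading an author's reputation then becomes a single-document lookup instead of a periodic aggregation.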

Hopefully this gives you some food for thought on how to retrieve the values you need.

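If you would like help writing a Map Reduce operation on a single collection, the community is here to help. Please include a sample input document and a description of the expected output.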

For more information on Map Reduce, the documentation may be found here: http://www.mongodb.org/display/DOCS/MapReduce

Additionally, there are some good Map Reduce examples in the MongoDB Cookbook: http://cookbook.mongodb.org/

The "Extras" section of the cookbook article "Finding Max And Min Values with Versioned Documents" http://cookbook.mongodb.org/patterns/finding_max_and_min/ contains a good step-by-step walkthrough of a Map Reduce operation, explaining how the functions are executed.
