English 中文(简体)
activerecord has_many :through find with one sql call
原标题:

I have a these 3 models:

class User < ActiveRecord::Base
  has_many :permissions, :dependent => :destroy
  has_many :roles, :through => :permissions
end

class Permission < ActiveRecord::Base
  belongs_to :role
  belongs_to :user
end
class Role < ActiveRecord::Base
  has_many :permissions, :dependent => :destroy
  has_many :users, :through => :permissions
end

I want to find a user and it s roles in one sql statement, but I can t seem to achieve this:

The following statement:

user = User.find_by_id(x, :include => :roles)

Gives me the following queries:

  User Load (1.2ms)   SELECT * FROM `users` WHERE (`users`.`id` = 1) LIMIT 1
  Permission Load (0.8ms)   SELECT `permissions`.* FROM `permissions` WHERE (`permissions`.user_id = 1) 
  Role Load (0.8ms)   SELECT * FROM `roles` WHERE (`roles`.`id` IN (2,1)) 

Not exactly ideal. How do I do this so that it does one sql query with joins and loads the user s roles into memory so saying:

user.roles

doesn t issue a new sql query

最佳回答

Loading the Roles in a separate SQL query is actually an optimization called "Optimized Eager Loading".

Role Load (0.8ms)   SELECT * FROM `roles` WHERE (`roles`.`id` IN (2,1))

(It is doing this instead of loading each role separately, the N+1 problem.)

The Rails team found it was usually faster to use an IN query with the associations looked up previously instead of doing a big join.

A join will only happen in this query if you add conditions on one of the other tables. Rails will detect this and do the join.

For example:

User.all(:include => :roles, :conditions => "roles.name =  Admin ")

See the original ticket, this previous Stack Overflow question, and Fabio Akita s blog post about Optimized Eager Loading.

问题回答

As Damien pointed out, if you really want a single query every time you should use join.

But you might not want a single SQL call. Here s why (from here):

Optimized Eager Loading


Let’s take a look at this:

Post.find(:all, :include => [:comments])

Until Rails 2.0 we would see something like the following SQL query in the log:

SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, `posts`.`body` AS t0_r2, `comments`.`id` AS t1_r0, `comments`.`body` AS t1_r1 FROM `posts` LEFT OUTER JOIN `comments` ON comments.post_id = posts.id 

But now, in Rails 2.1, the same command will deliver different SQL queries. Actually at least 2, instead of 1. “And how can this be an improvement?” Let’s take a look at the generated SQL queries:

SELECT `posts`.`id`, `posts`.`title`, `posts`.`body` FROM `posts` 

SELECT `comments`.`id`, `comments`.`body` FROM `comments` WHERE (`comments`.post_id IN (130049073,226779025,269986261,921194568,972244995))

The :include keyword for Eager Loading was implemented to tackle the dreaded 1+N problem. This problem happens when you have associations, then you load the parent object and start loading one association at a time, thus the 1+N problem. If your parent object has 100 children, you would run 101 queries, which is not good. One way to try to optimize this is to join everything using an OUTER JOIN clause in the SQL, that way both the parent and children objects are loaded at once in a single query.

Seemed like a good idea and actually still is. But for some situations, the monster outer join becomes slower than many smaller queries. A lot of discussion has been going on and you can take a look at the details at the tickets 9640, 9497, 9560, L109.

The bottom line is: generally it seems better to split a monster join into smaller ones, as you’ve seen in the above example. This avoid the cartesian product overload problem. For the uninitiated, let’s run the outer join version of the query:

mysql> SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, `posts`.`body` AS t0_r2, `comments`.`id` AS t1_r0, `comments`.`body` AS t1_r1 FROM `posts` LEFT OUTER JOIN `comments` ON comments.post_id = posts.id ;

+-----------+-----------------+--------+-----------+---------+
| t0_r0     | t0_r1           | t0_r2  | t1_r0     | t1_r1   |
+-----------+-----------------+--------+-----------+---------+
| 130049073 | Hello RailsConf | MyText |      NULL | NULL    | 
| 226779025 | Hello Brazil    | MyText | 816076421 | MyText5 | 
| 269986261 | Hello World     | MyText |  61594165 | MyText3 | 
| 269986261 | Hello World     | MyText | 734198955 | MyText1 | 
| 269986261 | Hello World     | MyText | 765025994 | MyText4 | 
| 269986261 | Hello World     | MyText | 777406191 | MyText2 | 
| 921194568 | Rails 2.1       | NULL   |      NULL | NULL    | 
| 972244995 | AkitaOnRails    | NULL   |      NULL | NULL    | 
+-----------+-----------------+--------+-----------+---------+
8 rows in set (0.00 sec)

Pay attention to this: do you see lots of duplications in the first 3 columns (t0_r0 up to t0_r2)? Those are the Post model columns, the remaining being each post’s comment columns. Notice that the “Hello World” post was repeated 4 times. That’s what a join does: the parent rows are repeated for each children. That particular post has 4 comments, so it was repeated 4 times.

The problem is that this hits Rails hard, because it will have to deal with several small and short-lived objects. The pain is felt in the Rails side, not that much on the MySQL side. Now, compare that to the smaller queries:

mysql> SELECT `posts`.`id`, `posts`.`title`, `posts`.`body` FROM `posts` ;
+-----------+-----------------+--------+
| id        | title           | body   |
+-----------+-----------------+--------+
| 130049073 | Hello RailsConf | MyText | 
| 226779025 | Hello Brazil    | MyText | 
| 269986261 | Hello World     | MyText | 
| 921194568 | Rails 2.1       | NULL   | 
| 972244995 | AkitaOnRails    | NULL   | 
+-----------+-----------------+--------+
5 rows in set (0.00 sec)

mysql> SELECT `comments`.`id`, `comments`.`body` FROM `comments` WHERE (`comments`.post_id IN (130049073,226779025,269986261,921194568,972244995));
+-----------+---------+
| id        | body    |
+-----------+---------+
|  61594165 | MyText3 | 
| 734198955 | MyText1 | 
| 765025994 | MyText4 | 
| 777406191 | MyText2 | 
| 816076421 | MyText5 | 
+-----------+---------+
5 rows in set (0.00 sec)

Actually I am cheating a little bit, I manually removed the created_at and updated_at fields from the all the above queries in order for you to understand it a little bit clearer. So, there you have it: the posts result set, separated and not duplicated, and the comments result set with the same size as before. The longer and more complex the result set, the more this matters because the more objects Rails would have to deal with. Allocating and deallocating several hundreds or thousands of small duplicated objects is never a good deal.

But this new feature is smart. Let’s say you want something like this:

>> Post.find(:all, :include => [:comments], :conditions => ["comments.created_at > ?", 1.week.ago.to_s(:db)])

In Rails 2.1, it will understand that there is a filtering condition for the ‘comments’ table, so it will not break it down into the small queries, but instead, it will generate the old outer join version, like this:

SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, `posts`.`body` AS t0_r2, `posts`.`created_at` AS t0_r3, `posts`.`updated_at` AS t0_r4, `comments`.`id` AS t1_r0, `comments`.`post_id` AS t1_r1, `comments`.`body` AS t1_r2, `comments`.`created_at` AS t1_r3, `comments`.`updated_at` AS t1_r4 FROM `posts` LEFT OUTER JOIN `comments` ON comments.post_id = posts.id WHERE (comments.created_at >  2008-05-18 18:06:34 ) 

So, nested joins, conditions, and so forth on join tables should still work fine. Overall it should speed up your queries. Some reported that because of more individual queries, MySQL seems to receive a stronger punch CPU-wise. Do you home work and make your stress tests and benchmarks to see what happens.

Including a model loads the datas. But makes a second query.
For what you want to do, you should use the :joins parameter.

user = User.find_by_id(x, :joins => :roles)




相关问题
rails collection_select vs. select

collection_select and select Rails helpers: Which one should I use? I can t see a difference in both ways. Both helpers take a collection and generates options tags inside a select tag. Is there a ...

SSL slowness in EC2

We ve deployed our rails app to EC2. In our setup, we have two proxies on small instances behind round-robin DNS. These run nginx load balancers for a dynamically growing and shrinking farm of web ...

Auth-code with A-Za-z0-9 to use in an URL parameter

As part of a web application I need an auth-code to pass as a URL parameter. I am currently using (in Rails) : Digest::SHA1.hexdigest((object_id + rand(255)).to_s) Which provides long strings like : ...

RubyCAS-Client question: Rails

I ve installed RubyCAS-Client version 2.1.0 as a plugin within a rails app. It s working, but I d like to remove the ?ticket= in the url. Is this possible?

activerecord has_many :through find with one sql call

I have a these 3 models: class User < ActiveRecord::Base has_many :permissions, :dependent => :destroy has_many :roles, :through => :permissions end class Permission < ActiveRecord::...

Ordering a hash to xml: Rails

I m building an xml document from a hash. The xml attributes need to be in order. How can this be accomplished? hash.to_xml

Text Editor for Ruby-on-Rails

guys which text editor is good for Rubyonrails? i m using Windows and i was using E-Texteditor but its not free n its expired now can anyone plese tell me any free texteditor? n which one is best an ...

How to get SQL queries for each user where env is production

I’m developing an application dedicated to generate statistical reports, I would like that user after saving their stat report they save sql queries too. To do that I wrote the following module: ...

热门标签