English 中文(简体)
SQL Joins, Count(), and group by to sort posts by # of yes/no votes
原标题:

I have posts, votes, and comments tables. Each post can have N yes votes , N no votes and N comments. I am trying to get a set of posts sorted by number of yes votes.

I have a query that does exactly this, but is running far too slowly. On a data set of 1500 posts and 15K votes, it s take .48 seconds on my dev machine. How can I optimize this?

select
  p.*,
  v.yes,
  x.no
from
  posts p
left join (select post_id, vote_type_id, count(1) as yes from votes where (vote_type_id = 1) group by post_id) v on v.post_id = p.id
left join (select post_id, vote_type_id, count(1) as no from votes where (vote_type_id = 2) group by post_id) x on x.post_id = p.id
left join (select post_id, count(1) as comment_count from comments group by post_id) p on p.confession_id = p.id
order by
  yes desc
limit
    0, 10

EDIT:

  • Votes and Comments both have a post_id FK
  • Adding an index on vote_type_id and post_id in the votes table shaved .1sec off the query execution.
最佳回答

Add a yes_count column and use a trigger to update the vote count for each post when the vote is made. You can index this column, then it should be very fast.

问题回答

Use explain for checking the query execution plan so you can see why it is slow, usually it is enough to see the plan and later create appropriate indexes. The 1.5k and 15k tables are really small so that query should be much faster.

Why don t you add a column yes and no ? Rather than adding a new entry at every post, just increment the count.

If I misunderstood your database or you can t modify it, at least do you have a foreign key on votes.post_id to post.id? Foreign keys are crutial if you do any join.

First off, your current query shouldn t compile, as it uses p as an alias for both the comments and the posts table.

Second, you re joining votes twice: once for no, and once for yes. Using a CASE statement, you can compute the sums of both with a single join. Here s a sample query:

select
  p.*,
  sum(case when v.vote_type_id = 1 then 1 else 0 end) as yes,
  sum(case when v.vote_type_id = 2 then 1 else 0 end) as no,
  count(c.id) as comment_count
from posts p
left join votes v on v.post_id = p.id
left join comments c on c.post_id = p.id
order by yes desc
limit 0, 10

Third, you could verify that the proper foreign keys exists for the relations between posts, votes and comments. An (post_id, vote_type_id) index on the votes could also help.





相关问题
SQL SubQuery getting particular column

I noticed that there were some threads with similar questions, and I did look through them but did not really get a convincing answer. Here s my question: The subquery below returns a Table with 3 ...

please can anyone check this while loop and if condition

<?php $con=mysql_connect("localhost","mts","mts"); if(!con) { die( unable to connect . mysql_error()); } mysql_select_db("mts",$con); /* date_default_timezone_set ("Asia/Calcutta"); $date = ...

php return a specific row from query

Is it possible in php to return a specific row of data from a mysql query? None of the fetch statements that I ve found return a 2 dimensional array to access specific rows. I want to be able to ...

Character Encodings in PHP and MySQL

Our website was developed with a meta tag set to... <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> This works fine for M-dashes and special quotes, etc. However, I ...

Pagination Strategies for Complex (slow) Datasets

What are some of the strategies being used for pagination of data sets that involve complex queries? count(*) takes ~1.5 sec so we don t want to hit the DB for every page view. Currently there are ~...

Averaging a total in mySQL

My table looks like person_id | car_id | miles ------------------------------ 1 | 1 | 100 1 | 2 | 200 2 | 3 | 1000 2 | 4 | 500 I need to ...

热门标签