English 中文(简体)
Mysql: Optimizing Selecting rows from multiple ranges (using indexes?)
原标题:

My table (projects):

id, lft, rgt
1, 1, 6
2, 2, 3
3, 4, 5
4, 7, 10
5, 8, 9
6, 11, 12
7, 13, 14

As you may have noticed, this is hierarchical data using the nested set model. Tree pretty-printed:

1
 2
 3
4
 5
6
7

I want to select all sub projects under project 1 and 4. I can do this with:

SELECT p.id
FROM projects AS p, projects AS ps
WHERE (ps.id = 1 OR ps.id = 4)
AND p.lft BETWEEN ps.lft AND ps.rgt

However, this is very slow with a large table, when running EXPLAIN (Query) i get:

+----+-------------+-------+-------+------------------------+---------+---------+------+------+-------------------------------------------------+
| id | select_type | table | type  | possible_keys          | key     | key_len | ref  | rows | Extra                                           |
+----+-------------+-------+-------+------------------------+---------+---------+------+------+-------------------------------------------------+
|  1 | SIMPLE      | ps    | range | PRIMARY,lft,rgt,lftRgt | PRIMARY | 4       | NULL |    2 | Using where                                     | 
|  1 | SIMPLE      | p     | ALL   | lft,lftRgt             | NULL    | NULL    | NULL | 7040 | Range checked for each record (index map: 0x12) | 
+----+-------------+-------+-------+------------------------+---------+---------+------+------+-------------------------------------------------+

(The project table has indexes on lft, rgt, and lft-rgt. As you can see, mysql does not use any index, and loops through the 7040 records)

I have found that if I only select for one of the super project, mysql manages to use the indexes:

SELECT p.id
FROM projects AS p, projects AS ps
WHERE ps.id = 1
AND p.lft BETWEEN ps.lft AND ps.rgt

EXPLAINs to:

+----+-------------+-------+-------+------------------------+---------+---------+-------+------+-------------+
| id | select_type | table | type  | possible_keys          | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+-------+------------------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | ps    | const | PRIMARY,lft,rgt,lftRgt | PRIMARY | 4       | const |    1 |             | 
|  1 | SIMPLE      | p     | range | lft,lftRgt             | lft     | 4       | NULL  |    7 | Using where | 
+----+-------------+-------+-------+------------------------+---------+---------+-------+------+-------------+

FINALLY, my question: I there any way i can SELECT rows matching multiple ranges, and still benefit from indexes?

最佳回答

From 7.2.5.1. The Range Access Method for Single-Part Indexes in MySQL reference manual:

Currently, MySQL does not support merging multiple ranges for the range access method for spatial indexes. To work around this limitation, you can use a UNION with identical SELECT statements, except that you put each spatial predicate in a different SELECT.

So you need to have a union of two different selects.

问题回答

have you tried a union? take your second example, add "union" underneath and the repeat but matching id 4. i don t know if it would work, but it seems like an obvious thing to try.

edit:

SELECT p.id
FROM projects AS p, projects AS ps
WHERE ps.id = 1
AND p.lft BETWEEN ps.lft AND ps.rgt
UNION
SELECT p.id
FROM projects AS p, projects AS ps
WHERE ps.id = 4
AND p.lft BETWEEN ps.lft AND ps.rgt

Your query does merge the multiple ranges.

It uses a range access method to combine the multiple ranges on p (which is leading in the join).

For each row returned from p, it checks the best method to retrieve all rows from ps for the given values of p.lft and p.rgt. Depending on the query selectivity, it may be either a fullscan over ps or a index lookup over one of two possible indexes.

The number of rows shown in the EXPLAIN means nothing: the EXPLAIN just shows the worst possible outcome. It doesn t necessarily mean that all these rows will be examined. Whether they will or not the optimizer can only tell in runtime.

The documentation snippet about the impossibility to merge the multiple ranges is only valid for SPATIAL indexes (R-Tree those that you create over GEOMETRY types). These indexes are good for the queries that search upwards (the ancestors of a given project) but not downwards.

A plain B-Tree index can combine the multiple ranges. From the documentation:

For all types of indexes, multiple range conditions combined with OR or AND form a range condition.

The real problem is that the optimizer in MySQL cannot make a single correct decision: either use a single fullscan (with ps leading), or make several range scans.

Say, you have 10,000 rows and your projects boundaries are 0-500 and 2000-2500. The optimizer will see that each boundary will benefit from the index, the range check will result in two range accesses, while a single fullscan would be better.

It may be even worse if your project boundaries are, say, 0-3000 and 5000-6000. In this case the optimizer will make two fullscans, while one would suffice.

To help the optimizer make the correct decision, you should make the covering index on (lft, id) in this order:

CREATE INDEX ix_lft_id ON projects (lft, id)

The tipping point for using the fullscan over a covering index rather than a range condition is 90%, that means you will never have more than a one fullscan in your actual plan.





相关问题
SQL SubQuery getting particular column

I noticed that there were some threads with similar questions, and I did look through them but did not really get a convincing answer. Here s my question: The subquery below returns a Table with 3 ...

please can anyone check this while loop and if condition

<?php $con=mysql_connect("localhost","mts","mts"); if(!con) { die( unable to connect . mysql_error()); } mysql_select_db("mts",$con); /* date_default_timezone_set ("Asia/Calcutta"); $date = ...

php return a specific row from query

Is it possible in php to return a specific row of data from a mysql query? None of the fetch statements that I ve found return a 2 dimensional array to access specific rows. I want to be able to ...

Character Encodings in PHP and MySQL

Our website was developed with a meta tag set to... <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> This works fine for M-dashes and special quotes, etc. However, I ...

Pagination Strategies for Complex (slow) Datasets

What are some of the strategies being used for pagination of data sets that involve complex queries? count(*) takes ~1.5 sec so we don t want to hit the DB for every page view. Currently there are ~...

Averaging a total in mySQL

My table looks like person_id | car_id | miles ------------------------------ 1 | 1 | 100 1 | 2 | 200 2 | 3 | 1000 2 | 4 | 500 I need to ...

热门标签