Question

Suppose you want to find the last record entered into the database (highest ID) matching a string: Model.where(:name => 'Joe'). There are 100,000+ records and many matches (say, thousands).

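What is the most efficient way to do this? Does PostgreSQL need to find all the matching records, or can it find just the last one? Is this a particularly slow query?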

Working with Rails 3.0.7, Ruby 1.9.2 and PostgreSQL 8.3.

Answer 1

The important part here is to have a matching index. You can try this small test setup:

Create schema x for the test.

-- DROP SCHEMA x CASCADE;  -- to wipe it all for a retest or when done.
CREATE SCHEMA x;
CREATE TABLE x.tbl(id serial, name text);

Insert 10000 rows with unique names:

INSERT INTO x.tbl(name) SELECT 'x' || generate_series(1,10000);

Insert another 10000 rows with repeating names:

INSERT INTO x.tbl(name) SELECT 'y' || generate_series(1,10000)%20;

Delete a random 10% to make it more realistic:

DELETE FROM x.tbl WHERE random() < 0.1;

ANALYZE x.tbl;

Query:

SELECT *
FROM   x.tbl
WHERE  name = 'y17'
ORDER  BY id DESC
LIMIT  1;

--> Total runtime: 5.535 ms

CREATE INDEX tbl_name_idx on x.tbl(name);

--> Total runtime: 1.228 ms

DROP INDEX x.tbl_name_idx;
CREATE INDEX tbl_name_id_idx on x.tbl(name, id);

--> Total runtime: 0.053 ms

DROP INDEX x.tbl_name_id_idx;
CREATE INDEX tbl_name_id_idx on x.tbl(name, id DESC);

--> Total runtime: 0.048 ms

DROP INDEX x.tbl_name_id_idx;
CREATE INDEX tbl_name_idx on x.tbl(name);
CLUSTER x.tbl using tbl_name_idx;

--> Total runtime: 1.144 ms

DROP INDEX x.tbl_name_idx;
CREATE INDEX tbl_name_id_idx on x.tbl(name, id DESC);
CLUSTER x.tbl using tbl_name_id_idx;

--> Total runtime: 0.047 ms

Conclusion

With a fitting index, the query performs more than 100x faster.
Top performer is a multicolumn index with the filter column first and the sort column last.
Matching sort order in the index helps a little in this case.

Clustering helps with the simple index, because many rows still have to be read from the table, and these can be found in adjacent blocks after clustering. It doesn't help with the multicolumn index in this case, because only one record has to be fetched from the table.
Read more about multicolumn indexes in the manual.
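
To make the reasoning behind the (name, id) index concrete, here is a rough sketch in plain Ruby. It is only a toy model, not how PostgreSQL implements B-tree indexes: the index is assumed to be a sorted array of [name, id] pairs, and last_id_for is a hypothetical helper. The data mirrors the test table above, except that the random 10% delete is skipped so the result is predictable.

```ruby
# Toy model of a (name, id) index: [name, id] pairs kept in sorted order.
# Hypothetical data mirroring the test table (without the random delete).
rows = (1..10_000).map { |i| ["x#{i}", i] } +
       (1..10_000).map { |i| ["y#{i % 20}", 10_000 + i] }
index = rows.sort # arrays compare element-wise: by name first, then by id

# Return the highest id stored under `name`, or nil if the name is absent.
def last_id_for(index, name)
  # Binary search for the first entry whose name sorts after `name`.
  pos = index.bsearch_index { |entry_name, _id| entry_name > name } || index.size
  return nil if pos.zero?

  # The entry just before that position is the last one for `name`, if any.
  entry_name, id = index[pos - 1]
  entry_name == name ? id : nil
end

puts last_id_for(index, "y17") # prints 19997
```

Because all entries for one name sit next to each other, ordered by id, the lookup touches a handful of entries instead of every match. That is roughly why the multicolumn index beats the index on name alone in the timings above.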

All of these effects grow with the size of the table. 10000 rows of two small columns is only a very small test case.

Answer 2

You can put the query together in Rails and the ORM will write the proper SQL:

Model.where(:name => "Joe").order('created_at DESC').first

This should not result in retrieving all Model records, nor even in a table scan.

Answer 3

This is probably the easiest:

SELECT [columns] FROM [table] WHERE [criteria] ORDER BY [id column] DESC LIMIT 1

Note: Indexing is important here. A huge DB will be slow to search no matter how you do it if you're not indexing the right way.
