Question

By way of introduction...
I've run into this question: Difference between 2 adjacent fields - Date - PHP MYSQL, and was trying to achieve the same goal, i.e. iterate through dates and get the difference, with pure MySQL.
Another question there (Subtracting one row of data from another in SQL) helped me understand how to do something similar with MySQL. It did not solve the problem, as the solutions still depend either on fixed values or on an assumed order of data, but it did help me understand the methodology.
There is one other question (How to get next/previous record in MySQL?) with answers describing how to get values from the next/previous row. It's still dependent on some fixed values, but I learned how to use the technique.

Say I have this table foo:

CREATE TABLE `foo` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dateof` date NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

  id | dateof
-----+------------
   1 | 2012-01-01
   2 | 2012-01-02
  11 | 2012-01-04
  12 | 2012-01-01
  13 | 2012-01-02
  14 | 2012-01-09
 111 | 2012-01-01
 112 | 2012-01-01
 113 | 2012-01-01
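
To reproduce the problem, the sample rows above could be loaded with a statement along these lines. This is only a sketch: it assumes the foo table has just been created as shown and is still empty.

```sql
-- Load the nine sample rows. The ids are given explicitly so that the
-- "holes" in the primary key (2 -> 11, 14 -> 111) are reproduced.
INSERT INTO foo (id, dateof) VALUES
    (1,   '2012-01-01'),
    (2,   '2012-01-02'),
    (11,  '2012-01-04'),
    (12,  '2012-01-01'),
    (13,  '2012-01-02'),
    (14,  '2012-01-09'),
    (111, '2012-01-01'),
    (112, '2012-01-01'),
    (113, '2012-01-01');
```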

There are two assumptions:

The primary key (id) is ordered ascending, and "holes" are allowed.
Every date in the dateof column is valid, meaning no NULLs and no defaults (0000-00-00).

I want to iterate through every row and calculate the number of days passed since the previous entry, to get this:

  id | date       | days_diff
-----+------------+-----------
   1 | 2012-01-01 |     0
   2 | 2012-01-02 |     1
  11 | 2012-01-04 |     2
  12 | 2012-01-01 |    -3
  13 | 2012-01-02 |     1
  14 | 2012-01-09 |     7
 111 | 2012-01-01 |    -8
 112 | 2012-01-01 |     0
 113 | 2012-01-01 |     0

From everything I learned, I came up with this solution (call it solution 1, since there is another one):

SELECT
    f.id,
    DATE_FORMAT(f.dateof, '%b %e, %Y') AS date,
    (SELECT DATEDIFF(f.dateof, f2.dateof)
        FROM foo f2
        WHERE f2.id = (
            SELECT MAX(f3.id) FROM foo f3 WHERE f3.id < f.id
        )
    ) AS days_diff
FROM foo f;

(Example fiddle here.)

It works like a charm... as long as there are only a few entries in the db. With more rows it gets much worse:

EXPLAIN:
id select_type        table type   possible_keys key     key_len ref    rows  Extra
1  PRIMARY            f     ALL    NULL          NULL    NULL    NULL   17221   
2  DEPENDENT SUBQUERY f2    eq_ref PRIMARY       PRIMARY 4       func   1     Using where
3  DEPENDENT SUBQUERY f3    index  PRIMARY       PRIMARY 4       NULL   17221 Using where; Using index

Duration: 8.672 sec. Fetch: 228.515 sec.

I thought about adding an index on the dateof column:

CREATE TABLE `foo` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `dateof` date DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `dateof` (`dateof`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

...and gained a tiny improvement:

EXPLAIN:
id select_type        table type   possible_keys key     key_len ref  rows  Extra
1  PRIMARY            f     index  NULL          dateof  4       NULL 18369 Using index
2  DEPENDENT SUBQUERY f2    eq_ref PRIMARY       PRIMARY 4       func 1     Using where
3  DEPENDENT SUBQUERY f3    index  PRIMARY       dateof  4       NULL 18369 Using where; Using index

18031 rows: Duration: 8.406 sec. Fetch: 219.281 sec.

I recalled reading somewhere about the advantages of MyISAM over InnoDB in some cases. So I changed the engine to MyISAM:

ALTER TABLE `foo` ENGINE = MyISAM;

18031 rows: Duration: 5.671 sec. Fetch: 151.610 sec.

Sure, it's better, but still slow.

I tried another algorithm (solution 2):

SELECT
  f.id,
  DATE_FORMAT(f.dateof, '%b %e, %Y') AS date,
  (SELECT DATEDIFF(f.dateof, f2.dateof)
    FROM foo f2
    WHERE f2.id < f.id
    ORDER BY f2.id DESC
    LIMIT 1
  ) AS days_diff
FROM foo f;

...but it's even slower:

18031 rows: Duration: 15.609 sec. Fetch: 184.656 sec.

Is there any other way to optimize this query or the data structure so that this task completes faster?

Answer 1

It is no surprise that your approaches are very slow even for a moderately sized table.

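In theory, it should be possible to calculate the result in O(n) time using an analytic function; unfortunately, MySQL doesn't support them. However, you can emulate LAG in MySQL by using a variable: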

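SELECT
    f.id,
    DATE_FORMAT(f.dateof, '%b %e, %Y') AS date,
    -- @prev still holds the previous row's date at this point
    DATEDIFF(f.dateof, @prev) AS days_diff,
    -- remember the current date for the next row
    @prev := f.dateof
FROM foo AS f, (SELECT @prev := NULL) AS vars
ORDER BY f.id;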

This should be several orders of magnitude faster than what you are trying to do.
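
The answer above predates window functions in MySQL; they were added in MySQL 8.0. On such a server, a sketch along the following lines should give the same result without user variables. It assumes MySQL 8.0 or later and the foo table from the question. The COALESCE is only there to show 0 instead of NULL on the first row, as in the desired output.

```sql
-- Assumes MySQL 8.0+, where window functions such as LAG are available.
SELECT
    id,
    DATE_FORMAT(dateof, '%b %e, %Y') AS date,
    -- LAG(dateof) returns the date of the previous row in id order,
    -- or NULL for the first row; COALESCE turns that NULL diff into 0.
    COALESCE(DATEDIFF(dateof, LAG(dateof) OVER (ORDER BY id)), 0) AS days_diff
FROM foo
ORDER BY id;
```

Because the window is ordered by id, gaps in the primary key do not matter: each row is compared with the nearest lower id, whatever its value.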
