DevilKing's blog

冷灯看剑,剑上几分功名?炉香无需计苍生,纵一穿烟逝,万丈云埋,孤阳还照古陵

0%

MySQL order by limit

原文链接

1
2
3
4
5
6
7
8
CREATE TABLE `t_tbl_test_time_08` (
`f_some_id` int(11) unsigned DEFAULT '0',
`f_qiye_id` int(11) DEFAULT '0',
`f_type` tinyint(3) DEFAULT '0' COMMENT '有效联系类型 1: QQ联系,2:拨打电话,3:发送邮件,4:发送短信,5:添加跟进记录,6:拜访客户,7:EC联系,8:更新客户阶段',
`f_contact_time` timestamp NULL DEFAULT '1970-01-01 16:00:01',
UNIQUE KEY `some_qiye_type` (`f_some_id`,`f_qiye_id`,`f_type`),
KEY `f_contact_time` (`f_contact_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
mysql> explain extended select f_some_id from d_ec_some1.t_tbl_test_time_08
where f_qiye_id=5077665 and f_type=9
and f_contact_time > '2017-10-17 14:23:49' and f_contact_time < '2017-10-17 14:23:53'
order by f_some_id limit 300;
+----+-------------+-----------------------+-------+----------------+---------------+---------+------+-------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+-------+----------------+---------------+---------+------+-------+----------+-------------+
| 1 | SIMPLE | t_tbl_test_time_08 | index | f_contact_time | some_qiye_type | 12 | NULL | 16032 | 2248.49 | Using where |
+----+-------------+-----------------------+-------+----------------+---------------+---------+------+-------+----------+-------------+
1 row in set

-- 指定一个索引
mysql> explain extended select f_some_id from d_ec_some1.t_tbl_test_time_08 use index(f_contact_time)
where f_qiye_id=5077665 and f_type=9
and f_contact_time > '2017-10-17 14:23:49' and f_contact_time < '2017-10-17 14:23:53'
order by f_some_id limit 300;
+----+-------------+-----------------------+-------+----------------+----------------+---------+------+--------+----------+---------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+-------+----------------+----------------+---------+------+--------+----------+---------------------------------------------------------------+
| 1 | SIMPLE | t_tbl_test_time_08 | range | f_contact_time | f_contact_time | 5 | NULL | 360478 | 100 | Using index condition; Using where; Using MRR; Using filesort |
+----+-------------+-----------------------+-------+----------------+----------------+---------+------+--------+----------+---------------------------------------------------------------+
1 row in set

第一个explain结果里面,type=index,表示 full index scan。注意这里看到的 index 不代表“查询用到索引了”。全索引扫描比全表扫描并不好到拿去,甚至更慢(因为随机IO)。是否用到正确的索引要看key那一列:
some_qiye_type 索引定义是 (f_some_id,f_qiye_id,f_type),从key_len=12看得出这三列都用上了(12=4+5+3)。但实际这个执行计划需要200多秒。

第二个explain,在sql里面指定了 use index(f_contact_time),依据是where条件里面f_contact_time的范围固定4s,猜想数据量不会很大,过滤效果会比较好。

这里的filtered部分,表示执行后的时间?

Extra列:

  • Using index condition: 用到了索引下推特性。Using where是回表拿数据。关于ICP见文后参考。
  • Using MRR: 用到了 Multi-Range Read 优化特性。mysql在通过二级索引范围查找的时候,得到的记录在物理上是无序的,为了减少去获取数据的随机IO,它会在内存缓冲区里面先根据rowid快速排序,然后顺序IO去拉取数据。(这个缓冲区大小参数由 read_rnd_buffer_size 控制)
  • Using filesort: 需要内存排序。对应 order by f_some_id

rows扫描虽然36w行,比前面的 16032 要多,但这个执行计划实际运行只需要0.7s,要快将近300倍。

这里我们祭出优化sql的两大法宝:profiling和optimizer_trace,来尝试找出是什么因素。

  • profiling:可以定位出sql从接受到返回结果,时间都耗在哪里
  • optimizer_trace: 跟踪优化器的成本评估过程,可以情况的看到它如何从多个候选索引里,做出选择
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
mysql> set profiling=1;
mysql> select f_some_id from d_ec_some1.t_tbl_test_time_08 where f_qiye_id=5077665 and f_type=9
and f_contact_time > '2017-10-17 14:23:49' and f_contact_time < '2017-10-17 14:23:53' order by f_some_id limit 300;

mysql> show profile block io,cpu for query 1;
+----------------------+------------+------------+------------+--------------+---------------+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+----------------------+------------+------------+------------+--------------+---------------+
| starting | 0.000121 | 0 | 0 | 0 | 0 |
| checking permissions | 3.2E-5 | 0 | 0 | 0 | 0 |
| Opening tables | 3.7E-5 | 0 | 0 | 0 | 0 |
| init | 4.2E-5 | 0 | 0 | 0 | 0 |
| System lock | 2.9E-5 | 0 | 0 | 0 | 0 |
| optimizing | 3.3E-5 | 0 | 0 | 0 | 0 |
| statistics | 0.005796 | 0 | 0.000999 | 448 | 0 |
| preparing | 4.3E-5 | 0 | 0 | 0 | 0 |
| Sorting result | 2.8E-5 | 0 | 0 | 0 | 0 |
| executing | 2.7E-5 | 0 | 0 | 0 | 0 |
| Sending data | 172.824522 | 189.040262 | 2.441629 | 1504928 | 6896 |
| end | 8.3E-5 | 0 | 0 | 0 | 0 |
| query end | 3E-5 | 0 | 0 | 0 | 0 |
| closing tables | 3.3E-5 | 0 | 0 | 0 | 0 |
| freeing items | 7E-5 | 0 | 0 | 0 | 0 |
| logging slow query | 3.1E-5 | 0 | 0 | 0 | 0 |
| Opening tables | 3.4E-5 | 0 | 0 | 0 | 0 |
| System lock | 7E-5 | 0 | 0 | 0 | 8 |
| cleaning up | 9.5E-5 | 0 | 0 | 0 | 0 |
+----------------------+------------+------------+------------+--------------+---------------+
19 rows in set

mysql> show status like "Handler%";
+----------------------------+---------+
| Variable_name | Value |
+----------------------------+---------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 4 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 1 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 9430930 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 1 |
+----------------------------+---------+
18 rows in set

那么解决这个问题,就是要去掉干扰因素,或者提供更优的选项给它。

  1. 去掉干扰因素
    干扰它的索引是 some_qiye_type,是个唯一索引,因为恰好f_some_id开头,索引优化器想当然的用它来避免排序。
    去掉这个干扰因素就是调换 f_qiye_id,f_some_id的顺序。调换顺序还有个好处,有f_qiye_id等值条件,可以用在索引检索上。
    但是它的负面作用有两个:
    • 原本这个表上有 f_some_id 的等值、join ref以及分页查询,调换索引这两个字段顺序后,全都变成慢查询
    • f_qiye_id作为第一列,满足条件的值可能会有上百万,对这个查询的改观不大
  2. 提供更优的索引
    添加索引 (f_qiye_id,f_contact_time) 看起来不错。这样一来,该类查询都会走这个索引

另外这个表上只有一个唯一索引,而且该唯一索引有字段允许null,所以没有主键。
加一个自增主键 f_id bigint unsigned not null
修改f_some_id字段为 f_some_id bigint unsigned NOT NULL
修改f_qiye_id字段为 f_qiye_id int unsigned NOT NULL
修改字段f_type字段为 tinyint NOT NULL
总之一句话:所有(作为索引的)字段,都定义为NOT NULL,f_some_id全部定义为bigint。最终表结构:

1
2
3
4
5
6
7
8
9
10
CREATE TABLE `t_tbl_test_time_16` (
`f_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`f_some_id` bigint(20) unsigned NOT NULL,
`f_qiye_id` int(11) unsigned NOT NULL DEFAULT '0',
`f_type` tinyint(3) NOT NULL DEFAULT '0' COMMENT '有效联系类型 1: QQ联系,2:拨打电话,3:发送邮件,4:发送短信,5:添加跟进记录,6:拜访客户,7:EC联系,8:更新客户阶段',
`f_contact_time` timestamp NOT NULL DEFAULT '1970-01-01 16:00:01',
PRIMARY KEY(f_id),
UNIQUE KEY `some_qiye_type` (`f_some_id`,`f_qiye_id`,`f_type`) USING BTREE,
KEY `idx_qiye_time` (`f_qiye_id`,`f_contact_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;