MySQL: Covering Indexes

Date: 2019-11-19  Tags: mysql, covering index

What is a covering index?

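Explanation 1: the columns in the select list can be obtained from the index alone, with no need to read the table. In other words, the queried columns must be covered by the index being used.
Explanation 2: an index is an efficient way to find rows. When the data you want can be read by searching the index, there is no need to go on and read the rows from the table. If an index contains (or covers) all the data needed by the columns and conditions of a query, it is called a covering index.
Explanation 3: it is a form of nonclustered composite index that includes every column used in the query's Select, Join, and Where clauses (that is, the indexed columns cover exactly the columns referenced in the select list and in the Where conditions, so the index contains all the data the query is looking for).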

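Not every type of index can act as a covering index. A covering index has to store the values of the indexed columns, whereas hash, spatial, and full-text indexes do not store those values, so MySQL can only use B-Tree indexes as covering indexes.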

When you issue a query that is covered by an index (also called an index-covered query), the Extra column of EXPLAIN shows "Using index".
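As a minimal sketch of the definitions above, the statements below compare a query that an index covers with one that it does not. The table `demo_orders`, its columns, and the index name `idx_customer_status` are hypothetical and do not appear in the original article; InnoDB is assumed.

```sql
-- Hypothetical table for illustration only; not part of the original article.
CREATE TABLE demo_orders (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer_id INT UNSIGNED NOT NULL,
    status      TINYINT      NOT NULL,
    note        VARCHAR(255) NULL,
    PRIMARY KEY (id),
    KEY idx_customer_status (customer_id, status)
) ENGINE = InnoDB;

-- Every referenced column (customer_id, status) is stored in
-- idx_customer_status, so Extra is expected to contain "Using index".
EXPLAIN SELECT customer_id, status
FROM   demo_orders
WHERE  customer_id = 42;

-- note is not part of the index, so the rows have to be read from the
-- table and "Using index" should not appear.
EXPLAIN SELECT customer_id, note
FROM   demo_orders
WHERE  customer_id = 42;
```

Comparing the Extra column of the two plans is usually the quickest way to check whether a given query is actually covered.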

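Several optimization scenarios:

1. Optimizing a query with no WHERE condition:

In the execution plan, type is ALL, which means a full table scan.

How can this be improved? The fix is simple: create an index on the queried column, as follows: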

ALTER TABLE t1 ADD KEY(staff_id);

Now look at the execution plan again:

explain select sql_no_cache count(staff_id) from t1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: staff_id
      key_len: 1
          ref: NULL
         rows: 1023849
        Extra: Using index
1 row in set (0.00 sec)

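possible_keys: NULL shows that without a WHERE condition the optimizer cannot use an index to look up rows. What it uses here is another benefit of the index: the data is read from the index itself, which reduces the number of data blocks that have to be read. A query with no WHERE condition can be served as an index-covered query, but only if it returns few enough columns; select * is out of the question. After all, creating an index with an overly long key length is never a good idea.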
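To make the "fewer data blocks" point concrete, here is a hedged sketch that compares the on-disk size of each index of t1 with the size of the table itself. It assumes that t1 is an InnoDB table with persistent statistics enabled and that the current user is allowed to read the mysql schema; neither assumption comes from the original article.

```sql
-- Assumption: InnoDB with persistent statistics; values are estimates.
-- The row named PRIMARY is the clustered index, i.e. the table data itself.
SELECT index_name,
       stat_value                      AS pages,
       stat_value * @@innodb_page_size AS approx_bytes
FROM   mysql.innodb_index_stats
WHERE  database_name = DATABASE()
  AND  table_name    = 't1'
  AND  stat_name     = 'size'
ORDER  BY stat_value DESC;
```

A narrow secondary index such as the one on staff_id should occupy far fewer pages than PRIMARY, which is why scanning it is cheaper than scanning the whole table.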

Query cost

In terms of elapsed time, the query is 0.13 sec faster.

2. Optimizing secondary lookups

Consider the following query:

select sql_no_cache rental_date from t1 where inventory_id<80000;
…
…
| 2005-08-23 15:08:00 |
| 2005-08-23 15:09:17 |
| 2005-08-23 15:10:42 |
| 2005-08-23 15:15:02 |
| 2005-08-23 15:15:19 |
| 2005-08-23 15:16:32 |
+---------------------+
79999 rows in set (0.13 sec)

Execution plan:

explain select sql_no_cache rental_date from t1 where inventory_id<80000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: range
possible_keys: inventory_id
          key: inventory_id
      key_len: 3
          ref: NULL
         rows: 153734
        Extra: Using index condition
1 row in set (0.00 sec)

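Extra: Using index condition means the inventory_id index is used to find the matching entries (with index condition pushdown), but rental_date is not stored in that index, so every match needs a second lookup: 79999 bookmark values are used to go back to the table. As you would expect, this still carries a noticeable performance cost.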
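The following sketch separates the two ideas that are easy to confuse here: index condition pushdown and index coverage. It assumes MySQL 5.6 or later, where the `index_condition_pushdown` flag of `optimizer_switch` exists, and it changes the setting for the current session only.

```sql
-- Turn index condition pushdown off for this session and re-check the plan.
SET SESSION optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT rental_date FROM t1 WHERE inventory_id < 80000;

-- Turn it back on and check again.
SET SESSION optimizer_switch = 'index_condition_pushdown=on';
EXPLAIN SELECT rental_date FROM t1 WHERE inventory_id < 80000;
```

With the single-column index on inventory_id, neither plan should report "Using index", because rental_date still has to be fetched from the table in both cases. Pushdown changes where the condition is evaluated, not whether the table is visited.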

Let's try creating a composite index for this SQL, as follows:

alter table t1 add key(inventory_id,rental_date);

Execution plan:

explain select sql_no_cache rental_date from t1 where inventory_id<80000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: range
possible_keys: inventory_id,inventory_id_2
          key: inventory_id_2
      key_len: 3
          ref: NULL
         rows: 162884
        Extra: Using index
1 row in set (0.00 sec)

Extra: Using index means there is no lookup back to the table; the query is covered by the index.

3. Optimizing pagination queries

Consider the following query:

select tid,return_date from t1 order by inventory_id limit 50000,10;
+-------+---------------------+
| tid   | return_date         |
+-------+---------------------+
| 50001 | 2005-06-17 23:04:36 |
| 50002 | 2005-06-23 03:16:12 |
| 50003 | 2005-06-20 22:41:03 |
| 50004 | 2005-06-23 04:39:28 |
| 50005 | 2005-06-24 04:41:20 |
| 50006 | 2005-06-22 22:54:10 |
| 50007 | 2005-06-18 07:21:51 |
| 50008 | 2005-06-25 21:51:16 |
| 50009 | 2005-06-21 03:44:32 |
| 50010 | 2005-06-19 00:00:34 |
+-------+---------------------+
10 rows in set (0.75 sec)

Before any optimization, its execution plan is quite poor:

explain select tid,return_date from t1 order by inventory_id limit 50000,10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1023675
        
1 row in set (0.00 sec)

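This is a full table scan. Add the extra sort on top of it and the performance cost is considerable.

How can a covering index help here?

We create an index that contains the sort column and the returned column. Because tid is the primary key, and a secondary index entry also stores the primary key value (as in InnoDB), the composite index below contains the tid values as well.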

alter table t1 add index liu(inventory_id,return_date);

So how well does it work?

select tid,return_date from t1 order by inventory_id limit 50000,10;
+-------+---------------------+
| tid   | return_date         |
+-------+---------------------+
| 50001 | 2005-06-17 23:04:36 |
| 50002 | 2005-06-23 03:16:12 |
| 50003 | 2005-06-20 22:41:03 |
| 50004 | 2005-06-23 04:39:28 |
| 50005 | 2005-06-24 04:41:20 |
| 50006 | 2005-06-22 22:54:10 |
| 50007 | 2005-06-18 07:21:51 |
| 50008 | 2005-06-25 21:51:16 |
| 50009 | 2005-06-21 03:44:32 |
| 50010 | 2005-06-19 00:00:34 |
+-------+---------------------+
10 rows in set (0.03 sec)

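As you can see, after adding the composite index the query time drops from 0.75 sec to 0.03 sec, about 0.7 s faster!

Let's look at the improved execution plan: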

explain select tid,return_date from t1 order by inventory_id limit 50000,10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: liu
      key_len: 9
          ref: NULL
         rows: 50010
        Extra: Using index
1 row in set (0.00 sec)

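The execution plan also shows that the composite index is used and that no lookup back to the table is needed.

For comparison, consider the following rewritten SQL, whose idea is to use an index to eliminate the sort: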

select a.tid,a.return_date from t1 a inner join (select tid from t1 order by inventory_id limit 800000,10) b on a.tid=b.tid;

On top of that, we create an index on the inventory_id column and drop the earlier covering index:

alter table t1 add index idx_inid(inventory_id);
drop index liu on t1;

Then collect statistics.
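The article does not show how the statistics were collected. One common way, given here as an assumption rather than as the author's exact step, is ANALYZE TABLE followed by a look at the index list:

```sql
-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE t1;

-- Confirm which indexes now exist on t1 and inspect their estimated cardinality.
SHOW INDEX FROM t1;
```

After the two statements above, idx_inid should be listed and liu should be gone, which matches the setup the comparison query runs against.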

select a.tid,a.return_date from t1 a inner join (select tid from t1 order by inventory_id limit 800000,10) b on a.tid=b.tid;
+--------+---------------------+
| tid    | return_date         |
+--------+---------------------+
| 800001 | 2005-08-24 13:09:34 |
| 800002 | 2005-08-27 11:41:03 |
| 800003 | 2005-08-22 18:10:22 |
| 800004 | 2005-08-22 16:47:23 |
| 800005 | 2005-08-26 20:32:02 |
| 800006 | 2005-08-21 14:55:42 |
| 800007 | 2005-08-28 14:45:55 |
| 800008 | 2005-08-29 12:37:32 |
| 800009 | 2005-08-24 10:38:06 |
| 800010 | 2005-08-23 12:10:57 |
+--------+---------------------+

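This approach takes about 140 ms longer than the previous one. Although it uses an index to eliminate the sort, it still has to go back to the table through the primary key values. So when the select list returns few columns, or the columns are narrow, building a composite index is the more effective way to optimize a pagination query, because it needs no lookup back to the table!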
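To see where the extra lookup in the join rewrite comes from, it can help to run EXPLAIN on it. This is a sketch under the assumptions that t1 is an InnoDB table and that idx_inid is the only index on inventory_id at this point; the exact plan depends on the MySQL version and on the data.

```sql
-- The inner query only needs inventory_id and tid. In InnoDB, idx_inid
-- stores the primary key tid, so the derived table can typically be built
-- from the index alone. The outer join then fetches return_date for each
-- of the 10 tid values through the primary key, which is the lookup back
-- to the table described above.
EXPLAIN
SELECT a.tid, a.return_date
FROM   t1 a
INNER  JOIN (SELECT tid
             FROM   t1
             ORDER  BY inventory_id
             LIMIT  800000, 10) b
       ON a.tid = b.tid;
```

With the composite index (inventory_id, return_date) in place instead, the original single-table query needs no join and no primary key lookup at all.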

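References:

[1] 袋鼠云 technical team blog, https://yq.aliyun.com/articles/62419

[2] Baron Schwartz et al., translated by 宁海元 et al.; High Performance MySQL (3rd edition, Chinese translation: 《高性能MySQL》); 电子工业出版社, 2013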