This post is a set of hands-on notes I took after reading the column 《MySQL实战45讲》. The material is fairly dry; if you run into unfamiliar keywords in this article, I strongly recommend running the experiments yourself. Only by reproducing them and verifying where each number comes from can you truly understand what is going on.
MySQL version: 5.7
Business requirement: find the 10 most-read articles over the last month.
To compare the effects in the experiments below, I added three indexes:
```sql
CREATE TABLE `article_rank` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `aid` int(11) unsigned NOT NULL,
  `pv` int(11) unsigned NOT NULL DEFAULT '1',
  `day` int(11) NOT NULL COMMENT 'date, e.g. 20171016',
  PRIMARY KEY (`id`),
  KEY `idx_day` (`day`),
  KEY `idx_day_aid_pv` (`day`,`aid`,`pv`),
  KEY `idx_aid_day_pv` (`aid`,`day`,`pv`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
```
Optimizer Trace is a feature added in MySQL 5.6.3. It dumps the MySQL optimizer's decisions and execution process as text in JSON format, which is convenient both for programmatic analysis and for reading.
- Use the `session_status` table in the `performance_schema` database to count the number of rows read by InnoDB.
- Use the `OPTIMIZER_TRACE` table (in `information_schema`) to see the details of how a statement was executed.
All of the experiments below are executed with the following steps:
```sql
#0. If optimizer_trace was enabled earlier, turn it off first
SET optimizer_trace="enabled=off";
#1. Enable optimizer_trace
SET optimizer_trace='enabled=on';
#2. Record the number of rows already read before running the target SQL
select VARIABLE_VALUE into @a from performance_schema.session_status where variable_name = 'Innodb_rows_read';
#3. Run the SQL we want to analyze
todo
#4. Query the optimizer_trace details
select trace from `information_schema`.`optimizer_trace`\G;
#5. Record the number of rows read after running the target SQL
select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
```
Official documentation: https://dev.mysql.com/doc/int...
I ran four experiments; the SQL executed in step 3 of each is listed below.
| Experiment | SQL |
|---|---|
| Experiment 1 | select aid, sum(pv) as num from article_rank force index(idx_day_aid_pv) where day > 20190115 group by aid order by num desc LIMIT 10; |
| Experiment 2 | select aid, sum(pv) as num from article_rank force index(idx_day) where day > 20190115 group by aid order by num desc LIMIT 10; |
| Experiment 3 | select aid, sum(pv) as num from article_rank force index(idx_aid_day_pv) where day > 20190115 group by aid order by num desc LIMIT 10; |
| Experiment 4 | select aid, sum(pv) as num from article_rank force index(PRI) where day > 20190115 group by aid order by num desc LIMIT 10; |
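Before forcing each index, it can also be useful to see which plan the optimizer would pick on its own. The sketch below only shows the statements; the actual `EXPLAIN` output is omitted because it depends on the table statistics, so treat this as a quick comparison tool rather than a result of the experiments above.

```sql
-- Compare the optimizer's own choice with one of the forced plans
-- (output not shown; it varies with table statistics).
EXPLAIN select aid, sum(pv) as num from article_rank
 where day > 20190115 group by aid order by num desc LIMIT 10;

EXPLAIN select aid, sum(pv) as num from article_rank force index(idx_aid_day_pv)
 where day > 20190115 group by aid order by num desc LIMIT 10;
```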
```sql
mysql> select `aid`,sum(`pv`) as num from article_rank force index(idx_day_aid_pv) where `day`>'20190115' group by aid order by num desc LIMIT 10;
# result omitted
10 rows in set (25.05 sec)
```
{ "steps": [ { "join_preparation": "略" }, { "join_optimization": "略" }, { "join_execution": { "select#": 1, "steps": [ { "creating_tmp_table": { "tmp_table_info": { "table": "intermediate_tmp_table", "row_length": 20, "key_length": 4, "unique_constraint": false, "location": "memory (heap)", "row_limit_estimate": 838860 } } }, { "converting_tmp_table_to_ondisk": { "cause": "memory_table_size_exceeded", "tmp_table_info": { "table": "intermediate_tmp_table", "row_length": 20, "key_length": 4, "unique_constraint": false, "location": "disk (InnoDB)", "record_format": "fixed" } } }, { "filesort_information": [ { "direction": "desc", "table": "intermediate_tmp_table", "field": "num" } ], "filesort_priority_queue_optimization": { "limit": 10, "rows_estimate": 1057, "row_size": 36, "memory_available": 262144, "chosen": true }, "filesort_execution": [ ], "filesort_summary": { "rows": 11, "examined_rows": 649091, "number_of_tmp_files": 0, "sort_buffer_size": 488, "sort_mode": "<sort_key, additional_fields>" } } ] } } ] }
```sql
mysql> select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select @b-@a;
+---------+
| @b-@a   |
+---------+
| 6417027 |
+---------+
1 row in set (0.01 sec)
```
```sql
mysql> select `aid`,sum(`pv`) as num from article_rank force index(idx_day) where `day`>'20190115' group by aid order by num desc LIMIT 10;
# result omitted
10 rows in set (42.06 sec)
```
{ "steps": [ { "join_preparation": "略" }, { "join_optimization": "略" }, { "join_execution": { "select#": 1, "steps": [ { "creating_tmp_table": { "tmp_table_info": { "table": "intermediate_tmp_table", "row_length": 20, "key_length": 4, "unique_constraint": false, "location": "memory (heap)", "row_limit_estimate": 838860 } } }, { "converting_tmp_table_to_ondisk": { "cause": "memory_table_size_exceeded", "tmp_table_info": { "table": "intermediate_tmp_table", "row_length": 20, "key_length": 4, "unique_constraint": false, "location": "disk (InnoDB)", "record_format": "fixed" } } }, { "filesort_information": [ { "direction": "desc", "table": "intermediate_tmp_table", "field": "num" } ], "filesort_priority_queue_optimization": { "limit": 10, "rows_estimate": 1057, "row_size": 36, "memory_available": 262144, "chosen": true }, "filesort_execution": [ ], "filesort_summary": { "rows": 11, "examined_rows": 649091, "number_of_tmp_files": 0, "sort_buffer_size": 488, "sort_mode": "<sort_key, additional_fields>" } } ] } } ] }
```sql
mysql> select @b-@a;
+---------+
| @b-@a   |
+---------+
| 9625540 |
+---------+
1 row in set (0.00 sec)
```
```sql
mysql> select `aid`,sum(`pv`) as num from article_rank force index(idx_aid_day_pv) where `day`>'20190115' group by aid order by num desc LIMIT 10;
# result omitted
10 rows in set (5.38 sec)
```
{ "steps": [ { "join_preparation": "略" }, { "join_optimization": "略" }, { "join_execution": { "select#": 1, "steps": [ { "creating_tmp_table": { "tmp_table_info": { "table": "intermediate_tmp_table", "row_length": 20, "key_length": 0, "unique_constraint": false, "location": "memory (heap)", "row_limit_estimate": 838860 } } }, { "filesort_information": [ { "direction": "desc", "table": "intermediate_tmp_table", "field": "num" } ], "filesort_priority_queue_optimization": { "limit": 10, "rows_estimate": 649101, "row_size": 24, "memory_available": 262144, "chosen": true }, "filesort_execution": [ ], "filesort_summary": { "rows": 11, "examined_rows": 649091, "number_of_tmp_files": 0, "sort_buffer_size": 352, "sort_mode": "<sort_key, rowid>" } } ] } } ] }
```sql
mysql> select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select @b-@a;
+----------+
| @b-@a    |
+----------+
| 14146056 |
+----------+
1 row in set (0.00 sec)
```
```sql
mysql> select `aid`,sum(`pv`) as num from article_rank force index(PRI) where `day`>'20190115' group by aid order by num desc LIMIT 10;
# query result omitted
10 rows in set (21.90 sec)
```
{ "steps": [ { "join_preparation": "略" }, { "join_optimization": "略" }, { "join_execution": { "select#": 1, "steps": [ { "creating_tmp_table": { "tmp_table_info": { "table": "intermediate_tmp_table", "row_length": 20, "key_length": 4, "unique_constraint": false, "location": "memory (heap)", "row_limit_estimate": 838860 } } }, { "converting_tmp_table_to_ondisk": { "cause": "memory_table_size_exceeded", "tmp_table_info": { "table": "intermediate_tmp_table", "row_length": 20, "key_length": 4, "unique_constraint": false, "location": "disk (InnoDB)", "record_format": "fixed" } } }, { "filesort_information": [ { "direction": "desc", "table": "intermediate_tmp_table", "field": "num" } ], "filesort_priority_queue_optimization": { "limit": 10, "rows_estimate": 1057, "row_size": 36, "memory_available": 262144, "chosen": true }, "filesort_execution": [ ], "filesort_summary": { "rows": 11, "examined_rows": 649091, "number_of_tmp_files": 0, "sort_buffer_size": 488, "sort_mode": "<sort_key, additional_fields>" } } ] } } ] }
```sql
mysql> select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select @b-@a;
+----------+
| @b-@a    |
+----------+
| 17354569 |
+----------+
1 row in set (0.00 sec)
```
Here is the SQL from this case with the forced index removed:
```sql
select `aid`,sum(`pv`) as num from article_rank where `day`>20190115 group by aid order by num desc LIMIT 10;
```
Let's take experiment 1 as an example. Because the SQL contains a `group by`, we can see in the optimizer_trace that during execution (`join_execution`) a temporary table is first created (`creating_tmp_table`) to hold the result of the `group by` clause. The table stores two fields, `aid` and `num`. How is this temporary table stored, and why is `row_length` 20? I wrote three separate posts on that question:
https://mengkang.net/1334.html
https://mengkang.net/1335.html
https://mengkang.net/1336.html
Because of `memory_table_size_exceeded`, the temporary table `intermediate_tmp_table` has to be written to disk using the InnoDB engine.
```sql
mysql> show global variables like '%table_size';
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| max_heap_table_size | 16777216 |
| tmp_table_size      | 16777216 |
+---------------------+----------+
```
https://dev.mysql.com/doc/ref...
https://dev.mysql.com/doc/ref...

max_heap_table_size

This variable sets the maximum size to which user-created MEMORY tables are permitted to grow. The value of the variable is used to calculate MEMORY table MAX_ROWS values. Setting this variable has no effect on any existing MEMORY table, unless the table is re-created with a statement such as CREATE TABLE or altered with ALTER TABLE or TRUNCATE TABLE. A server restart also sets the maximum size of existing MEMORY tables to the global max_heap_table_size value.

tmp_table_size

The maximum size of internal in-memory temporary tables. This variable does not apply to user-created MEMORY tables.

The actual limit is determined from whichever of the values of tmp_table_size and max_heap_table_size is smaller. If an in-memory temporary table exceeds the limit, MySQL automatically converts it to an on-disk temporary table. The internal_tmp_disk_storage_engine option defines the storage engine used for on-disk temporary tables.
In other words, the in-memory temporary table here is limited to 16MB, and each row takes 20 bytes, so it can hold at most floor(16777216/20) = 838860 rows. That is why `row_limit_estimate` is 838860.
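A quick way to double-check this arithmetic, and (as a hypothetical follow-up that the experiments above do not perform) to raise the in-memory limit so the spill to disk could be avoided:

```sql
-- Reproduce the row_limit_estimate figure: a 16MB cap divided by 20-byte rows.
SELECT @@tmp_table_size, @@max_heap_table_size, FLOOR(16777216 / 20) AS row_limit_estimate;

-- Hypothetical: raise both limits for the current session (the effective limit
-- is the smaller of the two) so the group-by temporary table could stay in memory.
SET SESSION tmp_table_size      = 64 * 1024 * 1024;
SET SESSION max_heap_table_size = 64 * 1024 * 1024;
```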
Let's count the total number of rows after the `group by`.
```sql
mysql> select count(distinct aid) from article_rank where `day`>'20190115';
+---------------------+
| count(distinct aid) |
+---------------------+
|              649091 |
+---------------------+
```
649091 < 838860
Question: why is `memory_table_size_exceeded` triggered, then?
The process of writing data into the temporary table is as follows:

1. Create the temporary table with two fields, `aid` and `num`; because the query is `group by aid`, `aid` is the primary key of the temporary table.
2. In experiment 1, scan the index `idx_day_aid_pv` and read the `aid` and `pv` values from its leaf nodes one by one.
3. If the temporary table has no row for that `aid`, insert one; if a row for that `aid` already exists, add the incoming `pv` to that row's accumulated value (see the sketch after this list).
4. Sort `intermediate_tmp_table` by the `num` field in descending order.
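The insert-or-accumulate behaviour in step 3 can be mimicked by hand with a user-created temporary table. This is only an illustration of the logic, under the assumption that the made-up name `tmp_rank` stands in for MySQL's internal `intermediate_tmp_table`:

```sql
-- Illustration only: emulate the internal group-by temporary table by hand.
CREATE TEMPORARY TABLE tmp_rank (
  aid int unsigned NOT NULL,
  num bigint unsigned NOT NULL,
  PRIMARY KEY (aid)              -- aid is the primary key because we group by aid
);

-- Scan the rows matching day > 20190115 and either insert a new (aid, pv) pair
-- or accumulate pv into the row that already exists for that aid.
INSERT INTO tmp_rank (aid, num)
SELECT aid, pv FROM article_rank FORCE INDEX(idx_day_aid_pv) WHERE `day` > 20190115
ON DUPLICATE KEY UPDATE num = num + VALUES(num);

-- Finally sort by num descending and keep the top 10.
SELECT aid, num FROM tmp_rank ORDER BY num DESC LIMIT 10;
```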
Rows examined during the sort: this is the total number of rows after `group by`, which we computed above to be 649091. That is why `filesort_summary.examined_rows` is 649091 in every experiment. `filesort_summary.number_of_tmp_files` is 0, which means no temporary files were used for sorting.
MySQL allocates a block of memory to each thread for sorting, called the `sort_buffer`. Its size is determined by `sort_buffer_size`.
```sql
mysql> show global variables like 'sort_buffer_size';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| sort_buffer_size | 262144 |
+------------------+--------+
1 row in set (0.01 sec)
```
In other words, the default value of `sort_buffer_size` is 256KB.
https://dev.mysql.com/doc/ref...
Default Value (Other, 64-bit platforms, >= 5.6.4) 262144
There is more than one way the sort can be carried out:

1. Initialize the `sort_buffer` and decide which fields to put into it. Since we sort by `num` here, the `sort_key` is `num` and the `additional_fields` is `aid`.
2. Put the data (`aid`, `num`) from the temporary table produced for the `group by` clause (`intermediate_tmp_table`) into the `sort_buffer`. Since `number_of_tmp_files` is 0, we know the memory was sufficient and no external files were needed for a merge sort.
3. Quick-sort the data in the `sort_buffer` by `num`.

After going through the experiment results in the appendix, I compiled the most important figures for comparison:
| Experiment | index | query_time (s) | filesort_summary.examined_rows | filesort_summary.sort_mode | filesort_priority_queue_optimization.rows_estimate | converting_tmp_table_to_ondisk | Innodb_rows_read |
|---|---|---|---|---|---|---|---|
| Experiment 1 | idx_day_aid_pv | 25.05 | 649091 | additional_fields | 1057 | true | 6417027 |
| Experiment 2 | idx_day | 42.06 | 649091 | additional_fields | 1057 | true | 9625540 |
| Experiment 3 | idx_aid_day_pv | 5.38 | 649091 | rowid | 649101 | false | 14146056 |
| Experiment 4 | PRI | 21.90 | 649091 | additional_fields | 1057 | true | 17354569 |
The case of experiment 1 has already been analyzed above.
```sql
mysql> select count(distinct aid) from article_rank where `day`>'20190115';
+---------------------+
| count(distinct aid) |
+---------------------+
|              649091 |
+---------------------+
```
Same fields, same number of rows — why do some experiments use `additional_fields` sort while others use `rowid` sort?

We analyzed earlier that for InnoDB tables `additional_fields` is preferred over `rowid` because it avoids going back to the table and therefore reduces disk access. But note that this holds for InnoDB. In experiment 3 the temporary table is an in-memory table using the MEMORY engine: "going back to the table" there just means reading the row directly from memory by its position (you can think of it as looking up an element in an in-memory array by its index), with no disk access at all. In that case the fewer columns involved in the sort the better, since they take less memory, so `rowid` sort is chosen.
Another reason is that we use `limit 10` here, so the heap has only a few members and does not take much memory. Do not forget that choosing the priority-queue sort algorithm is still subject to the `sort_buffer_size` limit.
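As an aside on the sort-mode choice (this parameter is not part of the experiments above): for an InnoDB table the decision between `<sort_key, additional_fields>` and `<sort_key, rowid>` is driven by `max_length_for_sort_data` — if the columns the query needs are wider than this threshold, MySQL falls back to rowid sort.

```sql
-- Inspect the threshold that governs the InnoDB sort-mode choice.
show variables like 'max_length_for_sort_data';

-- Hypothetical experiment (not run in this post): lowering it for the session
-- should make even an InnoDB sort switch to <sort_key, rowid> mode.
SET SESSION max_length_for_sort_data = 16;
```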
For a detailed discussion of sorting with in-memory tables, see lecture 17 of 《MySQL实战45讲》, “如何正确地显示随机消息” (how to display random messages correctly).
For the filesort_priority_queue_optimization algorithm used here, see http://www.javashuo.com/article/p-ctofdnun-np.html
Analysis of how the priority-queue sort runs: the (`num`, `rowid`) pairs of the first 10 rows (where `num` comes from `sum(pv)`) are used to build a min-heap of 10 elements, with the smallest `num` at the top; each remaining row is then compared against the heap top and replaces it if its `num` is larger. Based on this analysis, 649091 rows are read first, and then 10 more rows are read when fetching the final result rows back by `rowid`, so the total is 649101 rows.
Experiment 3's figure matches this, but the other experiments all show 1057 rows — how is that number computed? I have not figured it out.
When stored in the temporary table, each row holds the `aid` and `num` fields, so the width should be 4 + 15 = 19 bytes. Experiment 3 uses rowid sort, i.e. `num` (15 bytes) + row ID (6 bytes), which should be 21 bytes, yet the actual `row_size` is 24; the others use additional_fields sort, i.e. 15 + 4 + 6 = 25 bytes, yet the actual `row_size` is 36.
Whether the temporary table gets converted to disk: all four experiments write the same 649091 rows into the in-memory temporary table, so why do the other three run out of memory? One thing to note is that when the temporary table is created in experiment 3 its `key_length` is 0, while in the others it is 4.
Also, each time we computed `@b-@a` in the experiments above we had queried the `OPTIMIZER_TRACE` table, which itself needs a temporary table, and the default value of `internal_tmp_disk_storage_engine` is InnoDB. With the InnoDB engine, reading the data back out of that temporary table increments `Innodb_rows_read` by 1.
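A minimal way to check this claim in an otherwise idle session (the exact delta of 1 is an assumption based on the reasoning above and may differ on other configurations):

```sql
-- Measure how much Innodb_rows_read grows just from reading OPTIMIZER_TRACE.
select VARIABLE_VALUE into @x from performance_schema.session_status where variable_name = 'Innodb_rows_read';
select trace from `information_schema`.`optimizer_trace`\G
select VARIABLE_VALUE into @y from performance_schema.session_status where variable_name = 'Innodb_rows_read';
-- Per the analysis above, this difference should come out to 1.
select @y-@x;
```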
Let's first query the following two numbers; they are needed below.
```sql
mysql> select count(*) from article_rank;
+----------+
| count(*) |
+----------+
| 14146055 |
+----------+

mysql> select count(*) from article_rank where `day`>'20190115';
+----------+
| count(*) |
+----------+
|  3208513 |
+----------+
```
In experiment 1, the total number of rows matching the condition is 3208513. Since the `idx_day_aid_pv` index is used and the query only needs `aid` and `pv`, it is a covering index and no lookups back to the primary key are needed.

However, after the temporary table is created (`creating_tmp_table`), it exceeds the in-memory limit (`memory_table_size_exceeded`), so the temporary table holding these 3208513 rows is written to disk, again with the InnoDB engine.

So the final figure for experiment 1 is 3208513 * 2 + 1 = 6417027.
Compared with experiment 1, experiment 2 not only spills the temporary table to disk, but also, because the index is `idx_day`, cannot use a covering index and has to go back to the primary key for every row, so the final figure is 3208513 * 3 + 1 = 9625540.
In experiment 3, because the leftmost index column is `aid`, the `day > 20190115` predicate cannot be used for filtering, so the entire index has to be traversed (it covers the data of every row). But the temporary table created in this process (MEMORY engine) is handled entirely in memory, so the final figure is 14146055 + 1 = 14146056.

Note that if the slow query log is enabled, the number of examined rows it reports differs from the count here, because scans of the in-memory temporary table are counted there as well.
Experiment 4 first scans the full table via the primary key, which means reading 14146055 rows, and then puts the 3208513 matching rows into the temporary table, so the final figure is 14146055 + 3208513 + 1 = 17354569.
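As a quick check of the arithmetic behind all four figures (this is plain arithmetic evaluated in SQL, not an additional measurement):

```sql
-- Recompute the predicted Innodb_rows_read for the four experiments,
-- following the reasoning in the paragraphs above.
SELECT
  3208513 * 2 + 1        AS exp1_idx_day_aid_pv,  -- covering index scan + reading the on-disk tmp table back, + 1
  3208513 * 3 + 1        AS exp2_idx_day,         -- index scan + primary-key lookups + on-disk tmp table, + 1
  14146055 + 1           AS exp3_idx_aid_day_pv,  -- full index traversal, tmp table stays in memory, + 1
  14146055 + 3208513 + 1 AS exp4_primary_key;     -- full table scan + rows written to the on-disk tmp table, + 1
```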
《MySQL实战45讲》
https://time.geekbang.org/column/article/73479
https://time.geekbang.org/column/article/73795
https://dev.mysql.com/doc/ref...
https://juejin.im/entry/59019...