MySQL 重复索引探讨(持续更新中...)

时间 2019-11-17

标签 mysql 重复索引探讨持续更新栏目 MySQL 繁體版

原文原文链接

资料参考：http://xiezhenye.com/2015/01/%E6%89%BE%E5%88%B0-mysql-%E6%95%B0%E6%8D%AE%E5%BA%93%E4%B8%AD%E7%9A%84%E4%B8%8D%E8%89%AF%E7%B4%A2%E5%BC%95.html html

<一> 建立'有问题的'表

1.建立表test1

CREATE TABLE test1 (
  id int(11) NOT NULL,
  f1 int(11) DEFAULT NULL,
  f2 int(11) DEFAULT NULL,
  f3 int(11) DEFAULT NULL,
  PRIMARY KEY (id),
  KEY k1 (f1,id),
  KEY k2 (id,f1),
  KEY k3 (f1),
  KEY k4 (f1,f3),
  KEY k5 (f1,f3,f2)
)

2.建立表 test2

CREATE TABLE test2 (
  id1 int(11) NOT NULL DEFAULT 0,
  id2 int(11) NOT NULL DEFAULT 0,
  b int(11) DEFAULT NULL,
  PRIMARY KEY (id1,id2),
  KEY k1 (b)
)

<二> 存在问题的索引

1. 包含主键的索引

innodb 自己是聚簇表，每一个二级索引自己就包含主键，相似f1,id 的索引，虽然实际没什么害处，但反映使用者对mysql 索引的不了解。而 id,f1 这种多余索引，会浪费存储空间，并影响数据更新性能。包含主键的索引用这样一句sql 就能所有找出来：mysql

select c.*, pk from 
       (select table_schema, table_name, index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols 
         from INFORMATION_SCHEMA.STATISTICS 
         where index_name != 'PRIMARY' and table_schema != 'mysql'
     group by table_schema, table_name, index_name) c,
       (select table_schema, table_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') pk 
         from INFORMATION_SCHEMA.STATISTICS 
         where index_name = 'PRIMARY' and table_schema != 'mysql'
     group by table_schema, table_name) p  
     where c.table_name = p.table_name and c.table_schema = p.table_schema and c.cols like concat('%', pk, '%');

结果：
sql

2.重复的索引

包含重复前缀的索引，索引能由另外一个包含该前缀的索引彻底代替，是多余索引。多余的索引会浪费存储空间，并影响数据更新性能。这样的索引一样用一句 sql 能够找出来。性能

select c1.table_schema, c1.table_name, c1.index_name,c1.cols,c2.index_name, c2.cols from
       (select table_schema, table_name, index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols 
         from INFORMATION_SCHEMA.STATISTICS 
         where table_schema != 'mysql' and index_name!='PRIMARY'
     group by table_schema,table_name,index_name) c1,   
       (select table_schema, table_name,index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols 
         from INFORMATION_SCHEMA.STATISTICS 
         where table_schema != 'mysql' and index_name != 'PRIMARY'
     group by table_schema, table_name, index_name) c2 
     where c1.table_name = c2.table_name and c1.table_schema = c2.table_schema and c1.cols like concat(c2.cols, '%') and c1.index_name != c2.index_name;

结果：spa

2.2 关于冗余索引，补充一些

若是建立了索引(A,B)，再建立索引(A)，则(A) 是冗余索引，由于这只是前一个索引的前缀索引。code

所以索引(A,B)也能够当作索引(A)来使用(这种冗余只是对B-Tree 索引来讲的)。htm

可是若是再建立索引(B,A)，则不是冗余索引，索引(B)也不是，由于B 不是索引(A,B)的最左前缀列。索引

另外，其它不一样类型的索引(如哈希索引或者全文索引)也不会是B-Tree 索引的冗余索引，而不管覆盖的索引列是什么。get

3. 低区分度索引

这样的索引因为仍然会扫描大量记录，在实际查询时一般会被忽略。可是在某些状况下仍然是有用的。所以须要根据实际状况进一步分析。这里是区分度小于 10% 的索引，能够根据须要调整参数。it

select p.table_schema, p.table_name, c.index_name, c.car, p.car total from
       (select table_schema, table_name, index_name, max(cardinality) car
         from INFORMATION_SCHEMA.STATISTICS
     where index_name != 'PRIMARY'
     group by table_schema, table_name,index_name) c,
       (select table_schema, table_name, max(cardinality) car
         from INFORMATION_SCHEMA.STATISTICS
     where index_name = 'PRIMARY' and table_schema != 'mysql'
     group by table_schema,table_name) p
     where c.table_name = p.table_name and c.table_schema = p.table_schema and p.car > 0 and c.car / p.car < 0.1;

结果：

4. 复合主键

因为 innodb 是聚簇表，每一个二级索引都会包含主键值。复合主键会形成二级索引庞大，而影响二级索引查询性能，并影响更新性能。一样须要根据实际状况进一步分析。

sql 为：

select table_schema, table_name, group_concat(column_name order by seq_in_index separator ',') cols, max(seq_in_index) len
        from INFORMATION_SCHEMA.STATISTICS
        where index_name = 'PRIMARY' and table_schema != 'mysql'
        group by table_schema, table_name having len>1;

结果为：