mysql的聚簇索引和非聚簇索引

时间 2019-11-08

标签 mysql 索引栏目 MySQL 繁體版

原文原文链接

mysql聚簇索引和非聚簇索引

这篇文章主要介绍mysql中innodb的聚簇索引和非聚簇索引,那首先咱们要先看下聚簇索引和非聚簇索引的概念是什么,是干什么用的.html

聚簇索引和非聚簇索引的概念

咱们先引用官网上的一段话来看看它们是干吗的mysql

Every InnoDB table has a special index called the clustered index where the data for the rows is stored. Typically, the clustered index is synonymous with the primary key. To get the best performance from queries, inserts, and other database operations, you must understand how InnoDB uses the clustered index to optimize the most common lookup and DML operations for each table.

When you define a PRIMARY KEY on your table, InnoDB uses it as the clustered index. Define a primary key for each table that you create. If there is no logical unique and non-null column or set of columns, add a new auto-increment column, whose values are filled in automatically.

If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index.

If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column containing row ID values. The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.
复制代码

有耐性的朋友能够本身翻译看看,这里我们大概翻译了一下,总结出上面这段话的意思:
每一个InnoDB表都有一个特殊的索引，称为聚簇索引，用于存储行数据。
1.若是建立了一个主键，InnoDB会将其用做聚簇索引(若是主键没有逻辑惟一且非空的列或列集，最好是设置成自动递增的)
2.若是没有为表建立主键，则MySQL会在全部键列都不为NULL的状况下找到第一个UNIQUE索引，InnoDB会将其用做汇集索引
3.若是表没有PRIMARY KEY或合适的UNIQUE索引，则InnoDB在包含行ID值的合成列上内部生成一个名为GEN_CLUST_INDEX的隐藏的汇集索引(隐藏的是看不到的,也就是说不会出如今desc table中,行ID是一个6字节的字段，随着插入新行而单调增长)
从这三种状况来看的话,就是说无论你有没有建立主键,mysql都会给你弄一个聚簇索引给安排上,你建立了就用你设置的主键为聚簇索引,没有建立就给你来个隐藏的.sql

聚簇索引(也称为主键索引)就是携带了行数据的索引,非聚簇索引就是除了聚簇索引以外的索引.这样提及来可能有点干巴巴的,我们画个图来理解一下.
假设有一张表test数据库

create table test(
id int primary key,
age int not null,
name varchar(16),
PRIMARY KEY (`id`),
KEY `idx_age` (`age`) USING BTREE,
KEY `idx_name` (`name`) USING BTREE,
)engine=InnoDB;
复制代码

主键是id,而后有两个普通索引idx_age,idx_name(btree类型的索引),使用的是innodb引擎. 咱们知道id就是聚簇索引,idx_age,idx_name是非聚簇索引. 如今有三条数据(1,11,'甲'),(2,12,'乙'),(2,13,'丙').那么他们在数据库中存储的形式是,以下: 聚簇索引: bash

非聚簇索引:

能够看到聚簇索引后面是直接跟着的数据,而非聚簇索引指向的是聚簇索引的key值. 所以非聚簇索引查询数据须要先查到聚簇索引的key,而后用这个key去查询真正的数据(这个过程称为回表). 也就是说非聚簇索引是须要查询两次如图:

因此能走聚簇索引的尽可能走聚簇索引(也能够说是尽可能走主键),看起来都是走索引,实际上主键要更快. 并且主键索引若是是自增的int类型,由于长度比较小,占用的空间也比较小.

覆盖索引

咱们上面说到若是是非聚簇索引的话会须要回表,查询两次,可是若是要查询得字段,数据直接就在索引上是能够不须要回表的.这种索引称为覆盖索引. 好比咱们要查询上面的test表中的age和name两个字段.函数

select id,age,name from test where age = 13;
复制代码

直接查询的话,会根据age的索引找到id的key,而后再用id去查询出数据. 可是若是咱们建立一个(age,name)的联合索引,状况就不同了.
优化

由于要返回的值,id在联合索引指向的key上,age和name共同组成了联合索引, 所以数据都在(age,name)的联合索引上,并不须要回表在去查询一次,能够大大提升查询得效率.
固然这个查询要比较频繁,使用率比较高,毕竟建立索引也是要消耗资源的,实际状况要根据查询频率和索引大小来作出判断.
有联合索引存在的状况下能走覆盖索引固然是最好的,提升了查询效率.
注:还有在某些count聚合函数使用的时候可使用覆盖索引来优化count,好比说select count(age) from test.
由于age是有索引了,直接使用到的也是age,因此覆盖索引了,无需回表.

总结:

1.聚簇索引和非聚簇索引,查询得时候使用聚簇索引能提升查询效率,尽可能避免走非聚簇索引回表的耗时操做
2.覆盖索引能提升查询效率,主要是避免了回表的操做,查询得时候根据具体状况创建合适的索引走覆盖索引提升查询速度ui

参考资料:

1.dev.mysql.com/doc/refman/…
2.mp.weixin.qq.com/s/y0pjtNUZh…spa