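A question that comes up often on forums and in interviews: in MySQL, does where in use an index?
To get to the bottom of it I ran some tests and found that the number of rows in the table affects whether the index is used. Let's take a look.
The MySQL version is 5.7, the storage engine is the default InnoDB, the index type is the default B+ tree index, and the explain execution plan is used to confirm whether an index is hit.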
First we create a table:
create table staffs(
  id int primary key auto_increment,
  name varchar(24) not null default '' comment 'name',
  age int not null default 0 comment 'age',
  pos varchar(20) not null default '' comment 'position',
  add_time timestamp not null default current_timestamp comment 'hire time'
) charset utf8 comment 'staff records table';
Then insert three rows:
insert into staffs(name,age,pos,add_time) values('z3',22,'manager',now());
insert into staffs(name,age,pos,add_time) values('July',23,'dev',now());
insert into staffs(name,age,pos,add_time) values('2000',23,'dev',now());
Add a single-column index on name:
alter table staffs add index idx_staffs_name(name);
mysql> explain select * from staffs where name in ('z3', '2000');
+----+-------------+--------+------------+------+-----------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys   | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_name | NULL | NULL    | NULL |    3 |    66.67 | Using where |
+----+-------------+--------+------------+------+-----------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
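As you can see, the index was not used (type is ALL and key is NULL). The estimated row count is 3, and filtered is 66.67%, which is the estimated share of rows left after the server layer filters what the storage engine returns. In other words, the storage engine returns 3 rows, the MySQL server layer filters out 1, and 2 remain. (For details on explain, see the earlier post: http://www.javashuo.com/article/p-nawevcyl-ds.html)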
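To check whether the optimizer skipped the index by choice, rather than because in cannot use it at all, one option is to force the index and compare the two plans. This is a sketch that was not part of the original test. force index is a standard MySQL index hint, and the plan you get will depend on your data and version.

```sql
-- Not part of the original test: ask the optimizer to prefer the
-- single-column index and compare the plan with the unhinted query.
EXPLAIN SELECT * FROM staffs FORCE INDEX (idx_staffs_name)
WHERE name IN ('z3', '2000');
```

If the hinted plan shows type range with key idx_staffs_name, the index is usable for in and the full scan in the unhinted plan was a cost-based decision.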
Prepare the composite index for the next test:
alter table staffs drop index idx_staffs_name;
alter table staffs add index idx_staffs_nameAgePos(name, age, pos);
mysql> explain select * from staffs where name = 'z3';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref   | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 74      | const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from staffs where name in ('z3', '2000');
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    3 |    66.67 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.04 sec)
As you can see, the = query uses the index thanks to the leftmost-prefix rule, while the in query does not use it.
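The key_len column shows how many leading columns of the composite index a plan actually uses. As a rough check, assuming the table definition above (utf8 at up to 3 bytes per character, a 2-byte length prefix for varchar, a 4-byte int, and no extra byte because the columns are not null), the expected values can be worked out directly:

```sql
-- name varchar(24) utf8 not null: 24 chars * 3 bytes + 2-byte length prefix
-- age int not null: 4 bytes
SELECT 24 * 3 + 2     AS name_key_len,      -- 74: only name is used
       24 * 3 + 2 + 4 AS name_age_key_len;  -- 78: name and age are used
```

A key_len of 74 therefore means only the name part of idx_staffs_nameAgePos is used, and 78 means both name and age are used.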
mysql> explain select * from staffs where name = 'z3' and age = 22;
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref         | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 78      | const,const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from staffs where name = 'z3' and age in (22, 23);
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    3 |    66.67 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
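Likewise, with = on both columns the composite index is used column by column, but once the second column is queried with in, even the first column is dragged down and no index is used at all.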
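To see why the optimizer prefers the full scan here, one option is the optimizer trace, which is available from MySQL 5.6 onward. The sketch below was not part of the original test. It records the cost estimates the optimizer compares for the table scan and for the range scan on the index.

```sql
-- Not part of the original test: capture the optimizer's cost comparison.
SET optimizer_trace = 'enabled=on';

SELECT * FROM staffs WHERE name = 'z3' AND age IN (22, 23);

-- The trace for the last statement is exposed through information_schema.
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```

In the trace, the range analysis section lists the estimated cost of a table scan next to the cost of each candidate index, which is where a very small table tends to tip the decision toward the scan.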
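To insert a large amount of data quickly and build the indexes afterwards, we first drop the original table and recreate the same table without any indexes, so that no time is spent updating indexes during the inserts. A stored procedure does the inserting.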
DELIMITER $$
CREATE PROCEDURE test_insert()
BEGIN
  declare i int;
  set i = 1;
  WHILE (i < 10000) DO
    INSERT INTO staffs(`name`,`age`,`pos`) VALUES(CONCAT('a', i), FLOOR(20 + RAND() * (100 - i + 1)),'dev');
    set i = i + 1;
  END WHILE;
  commit;
END$$
DELIMITER ;
CALL test_insert();
Query OK, 0 rows affected (8 min 7.84 sec)
Inserting 9,999 rows took more than 8 minutes, which is still rather slow.
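A likely reason for the slowness is that autocommit is on by default, so every insert in the loop is committed on its own and the final commit has nothing left to do. A sketch of a variant that wraps the whole loop in one transaction follows. The procedure name test_insert_tx is hypothetical, and no timing was measured for it here.

```sql
DELIMITER $$
-- Hypothetical variant of test_insert(): same rows, one transaction.
CREATE PROCEDURE test_insert_tx()
BEGIN
  DECLARE i INT DEFAULT 1;
  -- Open an explicit transaction so the rows are committed once at the end
  -- instead of once per INSERT.
  START TRANSACTION;
  WHILE i < 10000 DO
    INSERT INTO staffs(`name`, `age`, `pos`)
    VALUES (CONCAT('a', i), FLOOR(20 + RAND() * (100 - i + 1)), 'dev');
    SET i = i + 1;
  END WHILE;
  COMMIT;
END$$
DELIMITER ;

CALL test_insert_tx();
```

The inserted values are generated exactly as in the original procedure, so the later explain tests would run against the same kind of data.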
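Following the same steps as before, create the index and then query. (The commands are the same as above and are omitted to save space here and in the rest of the post.)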
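For reference, the omitted commands are the same index statements used on the three-row table. The analyze table step is an addition that was not in the original test. It refreshes the index statistics after the bulk insert so the optimizer works from current estimates.

```sql
-- Single-column index test (same statement as on the small table).
ALTER TABLE staffs ADD INDEX idx_staffs_name(name);

-- Assumption: not in the original steps; refreshes index statistics.
ANALYZE TABLE staffs;

-- Later, for the composite index test:
-- ALTER TABLE staffs DROP INDEX idx_staffs_name;
-- ALTER TABLE staffs ADD INDEX idx_staffs_nameAgePos(name, age, pos);
```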
mysql> explain select * from staffs where name in ('a1', 'a2000');
+----+-------------+--------+------------+-------+-----------------+-----------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys   | key             | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------+-----------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_name | idx_staffs_name | 74      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------+-----------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
The index is used: 2 rows, with filtered at 100%.
As before, drop the single-column index first and create the composite index.
mysql> explain select * from staffs where name in ('a1', 'a2000');
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 74      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
The index is used.
mysql> explain select * from staffs where name in ('a1', 'a2000') and age = 23;
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 78      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
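Adding another condition after the in column still uses the index, and key_len 78 shows that both name and age are used.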
mysql> explain select * from staffs where name = 'a1' and age in (22, 23);
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 78      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
mysql> explain select * from staffs where name in ('a1', 'a2000') and age in (22, 23);
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 78      | NULL |    4 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
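Using in on the middle column makes no difference either: the index is still used.
3.1 When the table is small, = lookups still use the composite index in leftmost-prefix order, but in queries used neither the single-column index nor the composite index. A likely reason is that MySQL considers the table too small and estimates that a full table scan is faster.
3.2 When the table is large, in queries used the single-column index in these tests, and the composite index was also used column by column in order.
3.3 Of course, the in lists here are short. If the list were very long, so that the matching rows made up a large share of the table, MySQL would presumably skip the index and do a full table scan instead.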
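3.4 The case where the in condition is a subquery has not been tested here, and neither have versions below 5.7. Here is a passage quoted from another blogger:
Versions before 5.5 indeed do not use the index; MySQL optimized this in later versions. In version 5.5, released in 2010, the optimizer can optimize the in operator automatically: columns that have an index can use it, and columns without an index still go through a full table scan.
For example, in versions before 5.5 (everything below refers to versions before 5.5), take select * from a where id in (select id from b); The execution plan for this statement does not first fetch all the ids from table b and then compare them with the ids of table a. MySQL converts the in subquery into a correlated exists subquery, so the statement is effectively equivalent to: select * from a where exists(select * from b where b.id=a.id);
A correlated exists subquery works like this: each row of table a is fetched in a loop and compared against table b on the condition a.id=b.id. MySQL checks whether the id of each row in a exists in b and, if it does, returns that row of a.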
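The subquery case left untested above could be checked on 5.7 along the following lines. This is only a sketch: the table vip_staffs and its column staff_id are hypothetical and exist purely for illustration. In 5.7, show warnings after explain displays the statement as rewritten by the optimizer, which is also where the "1 warning" in the explain outputs in this post comes from.

```sql
-- Hypothetical helper table holding the ids to look up.
CREATE TABLE vip_staffs (staff_id INT PRIMARY KEY);
INSERT INTO vip_staffs VALUES (1), (2000);

-- Plan for an in condition fed by a subquery.
EXPLAIN SELECT * FROM staffs
WHERE id IN (SELECT staff_id FROM vip_staffs);

-- Shows the rewritten query (Note 1003), e.g. whether the subquery
-- was turned into a semi-join or left as a dependent subquery.
SHOW WARNINGS;
```

Comparing the rewritten statement with the quoted description would show whether the exists-style rewrite still applies to the version under test.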