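A colleague on the operations team recently asked us for a report on the system's users and their order (trip) status. We took the obvious route and wrote a single statistics query: the user table users joined directly to the trip tables, with a group by over the trips. The query turned out to be surprisingly slow.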
explain select
    users.`mobile_num`,
    concat(users.`lastName`, users.`firstName`) as userName,
    users.`company`,
    (case `users`.`idPhotoCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `idPhotoCheckStatus`,
    (case `users`.`driverLicenseCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `driverLicenseCheckStatus`,
    (case `users`.`companyCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `companyCheckStatus`,
    (case `users`.`unionCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `unionCheckStatus`,
    count(passenger_trip.id) as ptrip_num
from users
left join passenger_trip
    on passenger_trip.userId = users.id and passenger_trip.status != 'cancel'
left join driver_trip
    on driver_trip.`userId` = users.`id` and driver_trip.`status` != 'cancel'
where company != '本公司名' and company != '本公司昵称'
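Our first reaction was that the database had hung: the users table holds about 100,000 rows and the trip table about 100,000 as well, so it should not be this slow! We looked at the plan with explain and checked the indexes on the join columns, and all we found was the most common kind of relational query, implemented with a join as you would expect.
Then it occurred to us: 100,000 × 100,000, once the Cartesian product is taken, means filtering through 10 billion rows, doesn't it?! So we tried writing the query a different way.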
explain select
    users.`mobile_num`,
    concat(users.`lastName`, users.`firstName`) as userName,
    users.`company`,
    (case `users`.`idPhotoCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `idPhotoCheckStatus`,
    (case `users`.`driverLicenseCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `driverLicenseCheckStatus`,
    (case `users`.`companyCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `companyCheckStatus`,
    (case `users`.`unionCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `unionCheckStatus`,
    (select count(passenger_trip.id)
       from passenger_trip
      where passenger_trip.userId = users.id and passenger_trip.status != 'cancel') as ptrip_num,
    (select count(driver_trip.id)
       from driver_trip
      where driver_trip.userId = users.id and driver_trip.status != 'cancel') as dtrip_num
from users
where company != '本公司名' and company != '公司昵称'
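This turned out to be many times faster than the direct join: the query went from never finishing to returning within 10 seconds. We checked the execution plan again at this point.
We then adjusted the SQL further: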
explain select
    users.`mobile_num`,
    concat(users.`lastName`, users.`firstName`) as userName,
    users.`company`,
    (case `users`.`idPhotoCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `idPhotoCheckStatus`,
    (case `users`.`driverLicenseCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `driverLicenseCheckStatus`,
    (case `users`.`companyCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `companyCheckStatus`,
    (case `users`.`unionCheckStatus` when '2' then '已认证' when '3' then '已驳回' else '待认证' end) as `unionCheckStatus`,
    ptrip_num,
    dtrip_num
from users
left join (
    select count(passenger_trip.id) as ptrip_num, passenger_trip.`userId`
    from passenger_trip
    where passenger_trip.status != 'cancel'
    group by passenger_trip.`userId`
) as ptrip on ptrip.userId = users.id
left join (
    select count(driver_trip.id) as dtrip_num, driver_trip.`userId`
    from driver_trip
    where driver_trip.status != 'cancel'
    group by driver_trip.`userId`
) as dtrip on dtrip.userId = users.id
where company != '本公司名' and company != '公司昵称'
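This one returned within 5 seconds, which is what we had expected all along: filtering data on the order of 100,000 rows should take a few seconds at most.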
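The second and third query shapes can be compared side by side on toy data. The sketch below is only an illustration: it uses Python's built-in sqlite3 module instead of MySQL, keeps just the id, userId and status columns, and the rows are made up. One detail it brings out is that the derived-table form yields NULL rather than 0 for users with no trips, so the sketch wraps those columns in COALESCE to make the two results match.

```python
import sqlite3

# In-memory toy schema; only the columns the counts depend on (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, company TEXT);
    CREATE TABLE passenger_trip (id INTEGER PRIMARY KEY, userId INTEGER, status TEXT);
    CREATE TABLE driver_trip (id INTEGER PRIMARY KEY, userId INTEGER, status TEXT);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "acme"), (2, "acme"), (3, "globex")])
# User 1: three passenger trips (one cancelled) and two driver trips.
# User 2: one passenger trip. User 3: no trips at all.
conn.executemany("INSERT INTO passenger_trip (userId, status) VALUES (?, ?)",
                 [(1, "done"), (1, "done"), (1, "cancel"), (2, "done")])
conn.executemany("INSERT INTO driver_trip (userId, status) VALUES (?, ?)",
                 [(1, "done"), (1, "done")])

# Shape 2: correlated scalar subqueries in the select list.
correlated = conn.execute("""
    SELECT users.id,
           (SELECT COUNT(p.id) FROM passenger_trip p
             WHERE p.userId = users.id AND p.status != 'cancel') AS ptrip_num,
           (SELECT COUNT(d.id) FROM driver_trip d
             WHERE d.userId = users.id AND d.status != 'cancel') AS dtrip_num
    FROM users
    ORDER BY users.id
""").fetchall()

# Shape 3: aggregate each trip table once, then join the small results.
derived = conn.execute("""
    SELECT users.id,
           COALESCE(ptrip.ptrip_num, 0),
           COALESCE(dtrip.dtrip_num, 0)
    FROM users
    LEFT JOIN (SELECT userId, COUNT(id) AS ptrip_num FROM passenger_trip
                WHERE status != 'cancel' GROUP BY userId) AS ptrip
           ON ptrip.userId = users.id
    LEFT JOIN (SELECT userId, COUNT(id) AS dtrip_num FROM driver_trip
                WHERE status != 'cancel' GROUP BY userId) AS dtrip
           ON dtrip.userId = users.id
    ORDER BY users.id
""").fetchall()

print(correlated)  # [(1, 2, 2), (2, 1, 0), (3, 0, 0)]
print(derived == correlated)  # True
```

Without the COALESCE calls, the last two rows of the derived-table result would contain NULL (None in Python) where the correlated form reports 0.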
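The reason for the difference is actually simple: the parts of a SQL statement are executed in a certain order.
With the first form, the direct join means filtering within 10 billion rows;
the latter two forms run the subqueries first, each a query on the order of 100,000 rows, and then do one more join against the 100,000-row main table, so the amount of data involved is orders of magnitude smaller than with the first form.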
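The row multiplication behind this can be seen on a small scale. The following is a rough sketch, again using Python's sqlite3 with invented table sizes (10 users, 20 trips of each kind per user), not the real schema or data. It counts how many intermediate rows the direct double join produces compared with joining pre-aggregated subqueries. It only illustrates the mechanism; the number of rows the original MySQL query actually touched is not known.

```python
import sqlite3

USERS, TRIPS_PER_USER = 10, 20  # hypothetical toy sizes

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE passenger_trip (id INTEGER PRIMARY KEY, userId INTEGER, status TEXT);
    CREATE TABLE driver_trip (id INTEGER PRIMARY KEY, userId INTEGER, status TEXT);
""")
conn.executemany("INSERT INTO users (id) VALUES (?)",
                 [(u,) for u in range(1, USERS + 1)])
trips = [(u, "done") for u in range(1, USERS + 1) for _ in range(TRIPS_PER_USER)]
conn.executemany("INSERT INTO passenger_trip (userId, status) VALUES (?, ?)", trips)
conn.executemany("INSERT INTO driver_trip (userId, status) VALUES (?, ?)", trips)

# Direct join: every passenger trip of a user pairs with every driver trip
# of the same user before any aggregation happens.
direct_rows = conn.execute("""
    SELECT COUNT(*)
    FROM users
    LEFT JOIN passenger_trip p ON p.userId = users.id AND p.status != 'cancel'
    LEFT JOIN driver_trip d ON d.userId = users.id AND d.status != 'cancel'
""").fetchone()[0]

# Pre-aggregated: each subquery collapses to one row per user first.
preagg_rows = conn.execute("""
    SELECT COUNT(*)
    FROM users
    LEFT JOIN (SELECT userId, COUNT(id) AS n FROM passenger_trip
                WHERE status != 'cancel' GROUP BY userId) AS ptrip
           ON ptrip.userId = users.id
    LEFT JOIN (SELECT userId, COUNT(id) AS n FROM driver_trip
                WHERE status != 'cancel' GROUP BY userId) AS dtrip
           ON dtrip.userId = users.id
""").fetchone()[0]

print(direct_rows)  # 4000 = 10 users * 20 * 20
print(preagg_rows)  # 10, one row per user
```

In this toy setup a count(passenger_trip.id) taken over the direct join would also be inflated, reporting 400 per user instead of 20, because each passenger trip is repeated once for every driver trip of the same user.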