在用flask-sqlalchemy对一个千万级别表进行count操做时,出现了耗时严重、内存飙升的状况。
要统计出一天内车辆访问次数,原代码以下:sql
car_visit_counts = CarVisit.query.filter( CarVisit.park == car_visit.park, CarVisit.plate_number == car_visit.plate_number, CarVisit.visited_at >= today_start_time(), ).count()
发现代码运行特别慢,因此把生成的sql打印出来看一下:flask
SELECT COUNT(*) AS count_1 FROM ( SELECT car_visits.id AS car_visits_id , car_visits.park_id AS car_visits_park_id , car_visits.store_id AS car_visits_store_id , car_visits.car_id AS car_visits_car_id , car_visits.brand_id AS car_visits_brand_id , ... FROM car_visits WHERE %(param_1)s = car_visits.park_id AND car_visits.plate_number = %(plate_number_1)s AND car_visits.visited_at >= %(visited_at_1)s ) AS anon_1
能够发现进行了一次子查询,这样的话会生成临时表,效率低下,将原语句改变一下:session
car_visit_counts = db.session.query(func.count(CarVisit.id)).filter( CarVisit.park == car_visit.park, CarVisit.plate_number == car_visit.plate_number, CarVisit.visited_at >= today_start_time(), ).scalar()
此时在看一下打印的sql语句:scala
SELECT COUNT(car_visits.id) AS count_1 FROM car_visits WHERE %(param_1)s = car_visits.park_id AND car_visits.plate_number = %(plate_number_1)s AND car_visits.visited_at >= %(visited_at_1)s
子查询消失了,速度也快了好多。code