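SQL statement optimization runs through the entire lifecycle of a database application: early development, product testing, and later production maintenance. Different types of SQL performance problems call for different optimization methods. Indexes are essential to improving SQL query performance, and both the choice of columns and the way those columns are combined in an index have a major effect on how a query performs. This article walks through a concrete case.
Customer A's core business database runs on DB2 UDB. The business department reported that one module was responding slowly, and an analysis of that module's code traced the problem to a single poorly performing SQL statement.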
db2fox@bivm:~/test> cat t1.sql
select name,location,address from t1 where name=16123
db2fox@bivm:~/test>
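Step 1: Analyze the execution plan of the SQL statement
DB2 provides a tool for analyzing SQL execution plans: db2expln. The execution plan shows which "path" the DB2 optimizer chose to access the data, and the quality of that plan directly affects the performance of the SQL statement.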
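When the same explain run has to be repeated after every change, it can help to assemble the db2expln call in a small script. The sketch below only builds the argument list; the flags are copied verbatim from the transcripts in this article, and the function name `build_db2expln_command` is a hypothetical helper, not part of DB2.

```python
import shlex


def build_db2expln_command(database, stmt_file, output_file, terminator=";"):
    """Assemble the db2expln call used in this article as an argument list.

    Hypothetical helper: the flags are the ones shown in the article's
    transcripts; nothing is executed here.
    """
    return [
        "db2expln",
        "-database", database,
        "-i",
        "-g",
        "-stmtfile", stmt_file,
        "-terminator", terminator,
        "-output", output_file,
    ]


cmd = build_db2expln_command("fox", "t1.sql", "t1.exp")
# shlex.join quotes the ';' so the line can be pasted into a shell.
print(shlex.join(cmd))
```

On a machine with a DB2 client installed, the list could be handed to `subprocess.run(cmd, check=True)`; that step is left out here because it depends on the local environment.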
db2fox@bivm:~/test> db2expln -database fox -i -g -stmtfile t1.sql -terminator ';' -output t1.exp
db2fox@bivm:~/test> cat t1.exp
DB2 Universal Database Version 10.5, 5622-044 (c) Copyright IBM Corp. 1991, 2012
Licensed Material - Program Property of IBM
IBM DB2 Universal Database SQL and XQUERY Explain Tool
******************** DYNAMIC ***************************************
==================== STATEMENT ==========================================
Isolation Level = Cursor Stability
Blocking = Block Unambiguous Cursors
Query Optimization Class = 5
Partition Parallel = No
Intra-Partition Parallel = No
SQL Path = "SYSIBM", "SYSFUN", "SYSPROC", "SYSIBMADM",
"DB2FOX"
Statement:
select name, location, address
from t1
where name=16123
Section Code Page = 1208
Estimated Cost = 3517.214111 --> the COST value of the execution plan
Estimated Cardinality = 3600.000000
( 2) Access Table Name = DB2FOX.T1 ID = 4,513
| #Columns = 3
| Skip Inserted Rows
| Avoid Locking Committed Data
| Currently Committed for Cursor Stability
| May participate in Scan Sharing structures
| Scan may start anywhere and wrap, for completion
| Fast scan, for purposes of scan sharing management
| Scan can be throttled in scan sharing management
| Relation Scan
| | Prefetch: Eligible
| Lock Intents
| | Table: Intent Share
| | Row : Next Key Share
| Sargable Predicate(s)
| | #Predicates = 1
( 1) | | Return Data to Application
| | | #Columns = 3
( 1) Return Data Completion
End of section
Optimizer Plan:
Rows
Operator
(ID)
Cost
3600
RETURN
( 1)
3517.21
|
3600
TBSCAN --> the execution plan chose a full table scan
( 2)
3517.21
|
90000
Table:
DB2FOX
T1
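This is a very simple SQL statement, yet its execution plan chose a full table scan. A full table scan usually has a high cost and poor efficiency, and index access is usually far more efficient, although in some special cases a full table scan does beat index access. Many factors influence the optimizer's choice, including the size of the table, the size of the result set, whether an index exists, and I/O prefetching.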
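The figures that matter in the explain output above (estimated cost, estimated cardinality, access method) can be pulled out with a few regular expressions, which makes before/after comparisons easier. This is a minimal sketch: `summarize_explain` is a hypothetical helper, and it only recognizes the "Relation Scan" and "Index Scan:" markers that appear in this article's output.

```python
import re

# A few lines copied from the db2expln output shown in this article.
SAMPLE = """\
Estimated Cost = 3517.214111
Estimated Cardinality = 3600.000000
( 2) Access Table Name = DB2FOX.T1 ID = 4,513
| Relation Scan
"""


def summarize_explain(text):
    """Extract cost, cardinality and access method from db2expln text."""
    cost = re.search(r"Estimated Cost\s*=\s*([\d.]+)", text)
    card = re.search(r"Estimated Cardinality\s*=\s*([\d.]+)", text)
    if "Index Scan:" in text:
        access = "index scan"
    elif "Relation Scan" in text:
        access = "table scan"
    else:
        access = "unknown"
    return {
        "cost": float(cost.group(1)) if cost else None,
        "cardinality": float(card.group(1)) if card else None,
        "access": access,
    }


summary = summarize_explain(SAMPLE)
print(summary)
```

Reading a real file would be `summarize_explain(open("t1.exp").read())`, assuming the output file name used in the transcripts.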
db2fox@bivm:~/test> db2 "select count(*) from t1"
1
-----------
90000
1 record(s) selected.
fox@bivm:~/test> db2 "select substr(indname,1,10),substr(tabname,1,20),
substr(colnames,1,20) from syscat.indexes where tabname='T1'"
1 2 3
---------- -------------------- --------------------
I_T1 T1 +LOCATION+NAME
db2fox@bivm:~/test> db2 "select firstkeycard, first2keycard from syscat.indexes
> where indname='I_T1'"
FIRSTKEYCARD FIRST2KEYCARD
-------------------- ---------------------------------------------------
3 --> very many duplicate values    59093 --> very few duplicate values
1 record(s) selected.
db2fox@bivm:~/test>
db2fox@bivm:~/test> db2 "describe table t1"
                                Data type                     Column
Column name                     schema    Data type name      Length     Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
NAME                            SYSIBM    CHARACTER                   40     0 Yes
LOCATION                        SYSIBM    CHARACTER                   50     0 Yes
ADDRESS                         SYSIBM    VARCHAR                    130     0 Yes
3 record(s) selected.
db2fox@bivm:~/test>
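Table T1 has an index named I_T1 and holds about 90,000 rows. The (LOCATION, NAME) key has 59,093 distinct values, so NAME contains very few duplicates, which makes the statement that is hurting the business a good candidate for index access. Yet the current execution plan chose a full table scan! Look closely at the statement text again: select name,location,address from t1 where name=16123. The where condition name=16123 compares against a numeric value, while the NAME column of t1 is defined as a character type. This mismatch is probably what is steering the optimizer's choice of plan.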
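Why a type mismatch can defeat an index is easier to see in a toy model. The Python sketch below is not DB2 internals: it treats an index as a sorted list of strings (the values are made up) and assumes that comparing against a number forces each stored string to be converted first, which is the kind of effect the article suspects.

```python
from bisect import bisect_left, bisect_right

# Hypothetical NAME values, kept sorted the way index keys are.
index_keys = sorted(str(n) for n in range(16000, 16200))


def find_by_string(key):
    """Same type on both sides: binary search jumps to the matching keys."""
    lo = bisect_left(index_keys, key)
    hi = bisect_right(index_keys, key)
    return index_keys[lo:hi]


def find_by_number(value):
    """Mismatched types: every stored string is converted before the
    comparison, so the sort order cannot be used and all keys are visited."""
    visited = 0
    matches = []
    for key in index_keys:
        visited += 1
        if float(key) == value:
            matches.append(key)
    return matches, visited


print(find_by_string("16123"))
print(find_by_number(16123))
```

Both calls find the same row, but the numeric lookup reports that it visited every key, which mirrors the difference between an index lookup and a scan.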
Step 2: Modify the SQL text
Put quotation marks around the value in the where condition so that the optimizer can choose the index.
db2fox@bivm:~/test> cat t1.sql
select name,location,address from t1 where name='16123'
db2fox@bivm:~/test>
db2fox@bivm:~/test> db2expln -database fox -i -g -stmtfile t1.sql -terminator ';' -output t1.exp
DB2 Universal Database Version 10.5, 5622-044 (c) Copyright IBM Corp. 1991, 2012
Licensed Material - Program Property of IBM
IBM DB2 Universal Database SQL and XQUERY Explain Tool
Output is available in "t1.exp".
db2fox@bivm:~/test> cat t1.exp
DB2 Universal Database Version 10.5, 5622-044 (c) Copyright IBM Corp. 1991, 2012
Licensed Material - Program Property of IBM
IBM DB2 Universal Database SQL and XQUERY Explain Tool
******************** DYNAMIC ***************************************
==================== STATEMENT ==========================================
Isolation Level = Cursor Stability
Blocking = Block Unambiguous Cursors
Query Optimization Class = 5
Partition Parallel = No
Intra-Partition Parallel = No
SQL Path = "SYSIBM", "SYSFUN", "SYSPROC", "SYSIBMADM",
"DB2FOX"
Statement:
select name, location, address
from t1
where name='16123'
Section Code Page = 1208
Estimated Cost = 132.665771 --> the COST value is markedly better than before the change
Estimated Cardinality = 2.596810
( 2) Access Table Name = DB2FOX.T1 ID = 4,513
| Index Scan: Name = DB2FOX.I_T1 ID = 1
| | Regular Index (Not Clustered)
| | Index Columns:
| | | 1: LOCATION (Ascending)
| | | 2: NAME (Ascending)
| #Columns = 2
| Skip Inserted Rows
| Avoid Locking Committed Data
| Currently Committed for Cursor Stability
| Evaluate Predicates Before Locking for Key
| #Key Columns = 2
| | Start Key: Inclusive Value
| | | 1: [GAP Unconstrained]
| | | 2: '16123 ...'
| | Stop Key: Inclusive Value
| | | 1: [GAP Unconstrained]
| | | 2: '16123 ...'
| Data Prefetch: Sequential(2), Readahead
| Index Prefetch: Sequential(4), Readahead
| Lock Intents
| | Table: Intent Share
| | Row : Next Key Share
| Sargable Predicate(s)
( 1) | | Return Data to Application
| | | #Columns = 3
( 1) Return Data Completion
End of section
Optimizer Plan:
Rows
Operator
(ID)
Cost
2.59681
RETURN
( 1)
132.666
|
2.59681
FETCH
( 2)
132.666
/ \
2.59681 90000
IXSCAN Table:
( 3) DB2FOX
115.093 T1
|
59093
Index:
DB2FOX --> the index is now used
I_T1
db2fox@bivm:~/test>
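Re-running the SQL statement to verify the change shows a clear improvement, but the result still did not meet business expectations. SQL performance depends heavily on indexes: using indexes correctly and designing them sensibly is the primary means of improving SQL performance, and the quality of an index directly affects how well a statement performs.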
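Step 3: Analyze the relevant index
Index I_T1 is a composite index built on the LOCATION and NAME columns. In general, the leading column of a composite index (the leftmost column) has the greatest influence on the where conditions of a query, and the leading column of I_T1 is LOCATION. It is therefore worth considering either a new index on the NAME column alone or a new composite index with NAME as the leading column.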
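The effect of the leading column can be sketched with sorted Python lists standing in for the two index layouts. This is a toy model under stated assumptions, not how DB2 stores or scans indexes: the data is invented (three made-up LOCATION values, unique NAME values), and the lookup on the composite index is modelled as one probe per distinct leading value.

```python
from bisect import bisect_left

LOCATIONS = ["LOC_A", "LOC_B", "LOC_C"]  # hypothetical values
table = [(LOCATIONS[i % 3], str(10000 + i)) for i in range(9000)]

composite_index = sorted(table)                          # (LOCATION, NAME)
name_index = sorted((name, loc) for loc, name in table)  # NAME leads


def probe(index, prefix):
    """Binary-search to the first entry starting with `prefix`, then
    collect the contiguous run of entries that share it."""
    pos = bisect_left(index, prefix)
    found = []
    while pos < len(index) and index[pos][:len(prefix)] == prefix:
        found.append(index[pos])
        pos += 1
    return found


def find_name_in_composite(name):
    """NAME is the second column, so each LOCATION group is probed."""
    hits, probes = [], 0
    for loc in LOCATIONS:
        probes += 1
        hits.extend(probe(composite_index, (loc, name)))
    return hits, probes


def find_name_in_name_index(name):
    """NAME is the leading column, so a single probe is enough."""
    return probe(name_index, (name,)), 1


print(find_name_in_composite("16123"))
print(find_name_in_name_index("16123"))
```

With only three leading values the composite index is still usable, just less direct; the gap would widen as the number of distinct leading values grows.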
db2fox@bivm:~> db2 "create index i_t1_name on t1(name)"
DB20000I The SQL command completed successfully.
db2fox@bivm:~> db2 "describe indexes for table t1"
Index    Index      Unique  Number of  Index            Index         Null
schema   name       rule    columns    type             partitioning  keys
-------- ---------- ------- ---------- ---------------- ------------- ----
DB2FOX   I_T1       D                2 RELATIONAL DATA  -             Y
DB2FOX   I_T1_NAME  D                1 RELATIONAL DATA  -             Y
2 record(s) selected.
db2fox@bivm:~/test> db2expln -database fox -i -g -stmtfile t1.sql -terminator ';' -output t1.exp
DB2 Universal Database Version 10.5, 5622-044 (c) Copyright IBM Corp. 1991, 2012
Licensed Material - Program Property of IBM
IBM DB2 Universal Database SQL and XQUERY Explain Tool
Output is available in "t1.exp".
db2fox@bivm:~/test> cat t1.exp
DB2 Universal Database Version 10.5, 5622-044 (c) Copyright IBM Corp. 1991, 2012
Licensed Material - Program Property of IBM
IBM DB2 Universal Database SQL and XQUERY Explain Tool
******************** DYNAMIC ***************************************
==================== STATEMENT ==========================================
Isolation Level = Cursor Stability
Blocking = Block Unambiguous Cursors
Query Optimization Class = 5
Partition Parallel = No
Intra-Partition Parallel = No
SQL Path = "SYSIBM", "SYSFUN", "SYSPROC", "SYSIBMADM",
"DB2FOX"
Statement:
select name, location, address
from t1
where name='16123'
Section Code Page = 1208
Estimated Cost = 27.005688 --> the COST value is markedly better than before the change
Estimated Cardinality = 2.898831
( 2) Access Table Name = DB2FOX.T1 ID = 4,513
| Index Scan: Name = DB2FOX.I_T1_NAME ID = 2
| | Regular Index (Not Clustered)
| | Index Columns:
| | | 1: NAME (Ascending)
| #Columns = 2
| Skip Inserted Rows
| Avoid Locking Committed Data
| Currently Committed for Cursor Stability
| Evaluate Predicates Before Locking for Key
| #Key Columns = 1
| | Start Key: Inclusive Value
| | | 1: '16123 ...'
| | Stop Key: Inclusive Value
| | | 1: '16123 ...'
| Data Prefetch: Sequential(1), Readahead
| Index Prefetch: Sequential(1), Readahead
| Lock Intents
| | Table: Intent Share
| | Row : Next Key Share
| Sargable Predicate(s)
( 1) | | Return Data to Application
| | | #Columns = 3
( 1) Return Data Completion
End of section
Optimizer Plan:
Rows
Operator
(ID)
Cost
2.89883
RETURN
( 1)
27.0057
|
2.89883
FETCH
( 2)
27.0057
/ \
2.89883 90000
IXSCAN Table:
( 3) DB2FOX
13.5494 T1
|
30731
Index:
DB2FOX
I_T1_NAME --> the optimizer chose the newly created index
db2fox@bivm:~/test>
The execution plans above show the COST value falling from the initial 3517.214111 to a final 27.005688, a very clear improvement for this SQL statement.
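To put the three estimates side by side, the short calculation below divides the original cost by each later cost. The numbers are the "Estimated Cost" values from the plans in this article; DB2 expresses them in timerons, an optimizer estimate, so the ratios indicate relative plan cost rather than measured elapsed time.

```python
# Estimated Cost values taken from the three db2expln runs in this article.
costs = {
    "numeric literal, table scan": 3517.214111,
    "quoted literal, index I_T1": 132.665771,
    "quoted literal, index I_T1_NAME": 27.005688,
}

baseline = costs["numeric literal, table scan"]
ratios = {label: baseline / cost for label, cost in costs.items()}

for label, cost in costs.items():
    print(f"{label}: {cost:.2f} timerons, {ratios[label]:.1f}x cheaper than the first plan")
```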
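Indexes are typically used to speed up access to a table. However, they can also serve logical data design. For example, a unique index does not allow entries with duplicate values in the indexed columns, which guarantees that no two rows of a table are the same. Indexes can also be created to order the values in a column in ascending or descending sequence.
Important: When creating indexes, keep in mind that although they can improve query performance, they have a negative effect on write performance. This happens because for every row the database manager writes to a table, it must also update any affected indexes. For that reason, create an index only when it clearly improves overall performance.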
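The write-side cost can be made concrete with a toy table that counts how many index entries each insert has to maintain. This is an illustrative Python model only; `ToyTable` and its column names are hypothetical and say nothing about how DB2 implements index maintenance.

```python
from bisect import insort


class ToyTable:
    """A list of rows plus one sorted key list per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.index_updates = 0  # index entries written so far

    def insert(self, row):
        self.rows.append(row)
        row_id = len(self.rows) - 1
        # Every index on the table needs a new entry for the new row.
        for col, keys in self.indexes.items():
            insort(keys, (row[col], row_id))
            self.index_updates += 1


plain = ToyTable([])
indexed = ToyTable(["name", "location"])
for i in range(100):
    row = {"name": str(16000 + i), "location": f"LOC_{i % 3}"}
    plain.insert(row)
    indexed.insert(row)

print(plain.index_updates, indexed.index_updates)
```

The same 100 inserts cost nothing extra on the unindexed table and two extra sorted insertions per row on the table with two indexes, which is the trade-off the paragraph above describes.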
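When creating indexes, also consider the structure of the tables and the types of queries most often run against them. For example, columns that appear in the WHERE clause of a frequently issued query are good candidates for an index. For queries that run less often, however, the negative effect of an index on the performance of INSERT and UPDATE statements may outweigh its benefits.
Similarly, columns that appear in the GROUP BY clause of a frequently run query may benefit from an index, especially when the number of values used to group the rows is smaller than the number of rows being grouped.
An index can also be compressed when it is created. Afterwards, you can use the ALTER INDEX statement to modify the index and enable or disable compression.
To remove an index, use the DROP INDEX statement.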
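Guidelines and considerations when designing indexes
Note: Columns in an index key should be ordered from the column with the fewest duplicate values to the column with the most duplicate values. This ordering provides the best performance.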
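Applied to this case, the ordering guideline can be worked through with the statistics already shown. The sketch assumes that 3 (FIRSTKEYCARD of I_T1) is the number of distinct LOCATION values and that 30731 (the figure shown for I_T1_NAME in the last plan) is the number of distinct NAME values; `suggested_key_order` is a hypothetical helper.

```python
table_rows = 90000

# Distinct-value counts taken from the catalog query and the final plan
# in this article (assumed to be per-column distinct counts).
distinct_values = {"LOCATION": 3, "NAME": 30731}


def suggested_key_order(distinct):
    """Order columns from fewest duplicates (most distinct values) to most."""
    return sorted(distinct, key=distinct.get, reverse=True)


order = suggested_key_order(distinct_values)
rows_per_value = {col: table_rows / n for col, n in distinct_values.items()}

print(order)
for col in order:
    print(f"{col}: about {rows_per_value[col]:.1f} rows per distinct value")
```

Under these assumptions NAME comes first, since one NAME value narrows the table to about three rows while one LOCATION value still leaves 30000.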
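In this case, the optimization goal was reached and the business requirement met by modifying the SQL text and designing a new index. When a database shows performance problems, work from the symptoms to the underlying cause until you find a concrete fix. Database optimization is a systematic process: it often cannot be completed in a single step and has to proceed in stages. A deep understanding of how the database works is the basis for diagnosing performance problems quickly.
Reference: https://www.ibm.com/developerworks/cn/data/library/techarticle/dm-1511-db2sql-optmize/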