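After a ZLHIS service pack (SP) upgrade, a key customer in the Guiyang area saw performance drop sharply. Operations on medical orders (both creating and modifying them) were especially slow. On-site staff also reported that even an update of a single row in the order record table was very slow. The problem had already lasted more than an hour and was seriously affecting the system; order-related business was essentially at a standstill.
After talking to them by phone, it sounded as if a table-level lock was preventing transactions from acquiring the TM lock. The on-site staff extracted the AWR report for the affected period, and I made a brief analysis: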
| DB Name | DB Id | Instance | Inst num | Release | RAC | Host |
| ORCL | 1160490627 | orcl | 1 | 10.2.0.1.0 | NO | ZYHOSPIT-C55630 |

|  | Snap Id | Snap Time | Sessions | Cursors/Session |
| Begin Snap: | 38900 | 17-Feb-12 08:01:04 | 277 | 59.6 |
| End Snap: | 38901 | 17-Feb-12 09:00:52 | 414 | 61.4 |
| Elapsed: |  | 59.81 (mins) |  |  |
| DB Time: |  | 1,768.12 (mins) |  |  |
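As a cross-check, the DB Time figure can also be derived from the AWR repository instead of being read off the report. The query below is a minimal sketch: it assumes the two snapshot IDs shown in the report (38900 and 38901), a single-instance database, and a user allowed to read the DBA_HIST views.

```sql
-- DB time accumulated between two AWR snapshots, in minutes.
-- VALUE in DBA_HIST_SYS_TIME_MODEL is cumulative and stored in microseconds.
SELECT ROUND((e.value - b.value) / 1000000 / 60, 2) AS db_time_mins
FROM   dba_hist_sys_time_model b,
       dba_hist_sys_time_model e
WHERE  b.snap_id         = 38900      -- begin snapshot from the report
AND    e.snap_id         = 38901      -- end snapshot from the report
AND    b.dbid            = e.dbid
AND    b.instance_number = e.instance_number
AND    b.stat_name       = 'DB time'
AND    e.stat_name       = 'DB time';
```

Dividing the result by the elapsed minutes gives the average number of active sessions over the interval; with the report's figures that is 1,768.12 / 59.81, or roughly 29.6.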
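DB time is nearly 30 times the elapsed time (1,768.12 / 59.81 ≈ 29.6), so system performance was very poor. Next, look at the Top 5 Timed Events to see what makes up most of the DB time: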
| Event | Waits | Time(s) | Avg Wait(ms) | % Total Call Time | Wait Class |
| enq: TM - contention | 25,550 | 75,077 | 2,938 | 70.8 | Application |
| CPU time |  | 16,319 |  | 15.4 |  |
| db file sequential read | 1,987,166 | 8,702 | 4 | 8.2 | User I/O |
| db file scattered read | 1,386,286 | 2,933 | 2 | 2.8 | User I/O |
| enq: TX - row lock contention | 411 | 1,094 | 2,662 | 1.0 | Application |
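The enq: TM - contention wait accounts for 70.8% of total DB time, with an average wait of 2,938 ms (about 2.94 s). This indicates severe TM enqueue waits, so the next step is to query the V$ views with a script.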
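Before looking at the locks themselves, the waiting sessions can be listed straight from V$SESSION. This is a sketch rather than the script used on site; it relies on the wait columns that V$SESSION exposes from 10g onward, and the column aliases are only illustrative.

```sql
-- Sessions currently waiting on the TM enqueue.
-- For enqueue waits, the low-order 16 bits of P1 hold the requested mode;
-- for TM enqueues, P2 is the object number of the table involved.
SELECT sid,
       serial#,
       username,
       BITAND(p1, 65535) AS requested_mode,
       p2                AS object_id,
       blocking_session,
       seconds_in_wait
FROM   v$session
WHERE  event = 'enq: TM - contention'
ORDER  BY seconds_in_wait DESC;
```

The OBJECT_ID values returned here should match the ID1 column that V$LOCK reports for the same enqueue.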
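A TM lock ensures that the structure of a table cannot change while its contents are being modified. For example, once you have updated a table you hold a TM lock on it, which prevents another user from running DROP or ALTER against that table. If you hold a TM lock on a table and another user attempts DDL on it, that user gets the following error message: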
drop table dept
*
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified
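If a transaction modifies several tables, it acquires a TM lock on each of them. The enqueue lock modes seen most often are 3 and 6, so which mode is being held here? The following SQL answers that: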
Select Decode(Request, 0, 'Holder: ', 'Waiter: ') || Sid Sess, Id1, Id2, Lmode, Request, Type
From V$lock
Where (Id1, Id2, Type) In (Select Id1, Id2, Type From V$lock Where Request > 0)
Order By Id1, Request;
|  | SESS | ID1 | ID2 | LMODE | REQUEST | TYPE |
| 1 | Holder: 220 | 52074 | 0 | 3 | 0 | TM |
| 2 | Waiter: 224 | 52074 | 0 | 0 | 2 | TM |
| 3 | Waiter: 138 | 52074 | 0 | 0 | 2 | TM |
| 4 | Waiter: 125 | 52074 | 0 | 0 | 2 | TM |
| 5 | Waiter: 243 | 52074 | 0 | 0 | 2 | TM |
| 6 | Waiter: 401 | 52074 | 0 | 0 | 2 | TM |
| 7 | Waiter: 136 | 52074 | 0 | 0 | 2 | TM |
| 8 | Waiter: 506 | 52074 | 0 | 0 | 2 | TM |
| 9 | Waiter: 502 | 52074 | 0 | 0 | 2 | TM |
| 10 | Waiter: 61 | 52074 | 0 | 0 | 2 | TM |
| 11 | Waiter: 7 | 52074 | 0 | 0 | 2 | TM |
| 12 | Waiter: 99 | 52074 | 0 | 0 | 2 | TM |
| 13 | Waiter: 207 | 52074 | 0 | 0 | 2 | TM |
| 14 | Waiter: 491 | 52074 | 0 | 0 | 2 | TM |
| 15 | Waiter: 245 | 52074 | 0 | 0 | 2 | TM |
| 16 | Waiter: 140 | 52074 | 0 | 0 | 3 | TM |
| 17 | Waiter: 150 | 52074 | 0 | 0 | 3 | TM |
| 18 | Waiter: 66 | 52074 | 0 | 0 | 3 | TM |
| 19 | Waiter: 116 | 52074 | 0 | 0 | 3 | TM |
| 20 | Waiter: 132 | 52074 | 0 | 0 | 3 | TM |
| 21 | Waiter: 106 | 52074 | 0 | 0 | 3 | TM |
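The holder has a TM lock in mode 3, while the waiters are requesting locks in mode 2 (and, for the last few sessions, mode 3). For a TM enqueue the ID1 column holds the object_id of the table; looking up OBJECT_ID 52074 in dba_objects shows that it is the 病人医嘱记录 (patient order record) table.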
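The lookup described above can be sketched as follows. The first statement resolves the ID1 value to a table name; the second repeats the lock listing with the numeric modes spelled out, using the standard meanings of Oracle lock modes 0 to 6. The column aliases are illustrative only.

```sql
-- Resolve the ID1 value of the TM enqueue to an object name.
SELECT owner, object_name, object_type
FROM   dba_objects
WHERE  object_id = 52074;

-- TM locks on that object, with held and requested modes decoded.
SELECT l.sid,
       o.object_name,
       DECODE(l.lmode,   0, 'None', 1, 'Null', 2, 'Row share (RS)',
                         3, 'Row exclusive (RX)', 4, 'Share (S)',
                         5, 'Share row exclusive (SRX)', 6, 'Exclusive (X)') AS held_mode,
       DECODE(l.request, 0, 'None', 1, 'Null', 2, 'Row share (RS)',
                         3, 'Row exclusive (RX)', 4, 'Share (S)',
                         5, 'Share row exclusive (SRX)', 6, 'Exclusive (X)') AS requested_mode
FROM   v$lock l,
       dba_objects o
WHERE  l.type = 'TM'
AND    l.id1  = o.object_id
AND    l.id1  = 52074
ORDER  BY l.request, l.sid;
```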
Wait for TM Enqueue in Mode 3
Unindexed foreign key columns are the primary cause of TM lock contention in mode 3. However, this only applies to databases prior to Oracle9i Database. Depending on the operation, when foreign key columns are not indexed, Oracle either takes up a DML share lock (S – mode 4) or share row exclusive lock (SRX – mode 5) on the child table whenever the parent key or row is modified. (The share row exclusive lock is taken on the child table when the parent row is deleted and the foreign key constraint is created with the ON DELETE CASCADE option. Without this option, Oracle takes the share lock.) The share lock or share row exclusive lock on the child table prohibits other processes from getting a row exclusive lock (RX – mode 3) on the table. The waiting session will wait until the blocking session commits or rolls back its transaction.
Here is a philosophical question for you: Are you going to start building new indexes for all the foreign key columns in your databases? DBAs are divided on this. Our take is that you should hold your horses and don’t get carried away building new indexes just yet. If you do, you will introduce many new indexes to the database, some that are unnecessary. For example, you don’t need to create new indexes on foreign key columns when the parent tables they reference are static. You only need to create indexes on foreign key columns of the child table that is being identified by the enqueue wait event. The object ID for the child table is recorded in the P2 column, which corresponds to the ID1 column of the V$LOCK view. Query the DBA_OBJECTS view using the object ID and you will see the name of the child table. Yes, you will be operating in reactive mode, but it beats creating unnecessary indexes in the database, which not only wastes storage and increases maintenance, but may open up another can of worms for SQL tuning.
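In short: unindexed foreign key columns are the main cause of TM lock contention in mode 3, although according to this text that applies only to databases before Oracle9i. When a foreign key column is not indexed, Oracle takes a share lock (mode 4) or a share row exclusive lock (mode 5) on the child table whenever the parent key or parent row is modified, depending on the operation. That lock prevents other sessions from obtaining a row exclusive lock (mode 3) on the table, and they keep waiting until the blocking session commits or rolls back its transaction.
Our database is Oracle 10g, so this explanation would seem not to apply to our situation. Even so, we used the following SQL to find foreign key columns that have no index on them: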
SELECT TABLE_NAME,
CONSTRAINT_NAME,
CNAME1 || NVL2(CNAME2, ',' || CNAME2, NULL) ||
NVL2(CNAME3, ',' || CNAME3, NULL) ||
NVL2(CNAME4, ',' || CNAME4, NULL) ||
NVL2(CNAME5, ',' || CNAME5, NULL) ||
NVL2(CNAME6, ',' || CNAME6, NULL) ||
NVL2(CNAME7, ',' || CNAME7, NULL) ||
NVL2(CNAME8, ',' || CNAME8, NULL) COLUMNS
FROM (SELECT B.TABLE_NAME,
B.CONSTRAINT_NAME,
MAX(DECODE(POSITION, 1, COLUMN_NAME, NULL)) CNAME1,
MAX(DECODE(POSITION, 2, COLUMN_NAME, NULL)) CNAME2,
MAX(DECODE(POSITION, 3, COLUMN_NAME, NULL)) CNAME3,
MAX(DECODE(POSITION, 4, COLUMN_NAME, NULL)) CNAME4,
MAX(DECODE(POSITION, 5, COLUMN_NAME, NULL)) CNAME5,
MAX(DECODE(POSITION, 6, COLUMN_NAME, NULL)) CNAME6,
MAX(DECODE(POSITION, 7, COLUMN_NAME, NULL)) CNAME7,
MAX(DECODE(POSITION, 8, COLUMN_NAME, NULL)) CNAME8,
COUNT(*) COL_CNT
FROM (SELECT SUBSTR(TABLE_NAME, 1, 30) TABLE_NAME,
SUBSTR(CONSTRAINT_NAME, 1, 30) CONSTRAINT_NAME,
SUBSTR(COLUMN_NAME, 1, 30) COLUMN_NAME,
POSITION
FROM USER_CONS_COLUMNS) A,
USER_CONSTRAINTS B
WHERE A.CONSTRAINT_NAME = B.CONSTRAINT_NAME
AND B.CONSTRAINT_TYPE = 'R'
GROUP BY B.TABLE_NAME, B.CONSTRAINT_NAME) CONS
WHERE COL_CNT > ALL
(SELECT COUNT(*)
FROM USER_IND_COLUMNS I
WHERE I.TABLE_NAME = CONS.TABLE_NAME
AND I.COLUMN_NAME IN (CNAME1, CNAME2, CNAME3, CNAME4, CNAME5,
CNAME6, CNAME7, CNAME8)
AND I.COLUMN_POSITION <= CONS.COL_CNT
GROUP BY I.INDEX_NAME)
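This query uses the DECODE function to pivot rows into columns and so assemble the column list of each foreign key. Among the results, the rows for the order record table show that it does have unindexed foreign keys: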
| TABLE_NAME | CONSTRAINT_NAME | COLUMNS |
| 病人医嘱记录 | 病人医嘱记录_FK_前提ID | 前提ID |
| 病人医嘱记录 | 病人医嘱记录_FK_病人科室ID | 病人科室ID |
| 病人医嘱记录 | 病人医嘱记录_FK_开嘱科室ID | 开嘱科室ID |
| 病人医嘱记录 | 病人医嘱记录_FK_执行科室ID | 执行科室ID |
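A result like this can be double-checked for a single table by listing the columns that its existing indexes cover. The query is a minimal sketch and assumes it is run as the schema owner, as the USER_ views in the query above imply.

```sql
-- Indexed columns of the order record table, in index column order.
-- A foreign key counts as indexed only if its columns are the
-- leading columns of some index on the table.
SELECT index_name,
       column_position,
       column_name
FROM   user_ind_columns
WHERE  table_name = '病人医嘱记录'
ORDER  BY index_name, column_position;
```

If none of the four foreign key columns shows up with COLUMN_POSITION = 1, the result above is confirmed.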
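Attention focused on 前提ID (prerequisite ID). The other foreign key columns all reference the department table, which is a base reference table whose data rarely changes. 前提ID, by contrast, is a self-referencing foreign key rather than a simple parent/child foreign key. Its definition, taken from the upgrade script, is: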
ALTER TABLE 病人医嘱记录
ADD CONSTRAINT 病人医嘱记录_FK_前提ID
FOREIGN KEY (前提ID)
REFERENCES 病人医嘱记录(ID);
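前提ID references the primary key column (ID) of the same table. ID is almost never updated, but rows are inserted very frequently. Our testing showed that, even on 10g, a self-referencing foreign key constraint like this causes TM locking on the table when rows are updated or inserted; as long as the foreign key column has no index, inserts into the table trigger the TM lock. The next step was therefore to create the index: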
CREATE INDEX 病人医嘱记录_IX_前提ID
ON 病人医嘱记录(前提ID)
PCTFREE 10
TABLESPACE zl9CisRec
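online nologging;
Because this is a production database, the index was created with the ONLINE option, and NOLOGGING was added to minimize redo generation and speed up the build. If you use the PARALLEL option while creating an index, be sure to set the degree of parallelism back to 1 once the build has finished, so that later statements do not spawn large numbers of parallel processes.
Once the index had been created, the affected operations returned to normal.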
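The clean-up after the index build can be sketched as follows. Resetting the degree applies only if PARALLEL was used for the build. Switching the index back to LOGGING is not something the original fix mentions; it is shown here as an optional, assumed follow-up.

```sql
-- Only needed if the index was built with PARALLEL n: reset the degree to 1.
ALTER INDEX 病人医嘱记录_IX_前提ID NOPARALLEL;

-- NOLOGGING stays on as an attribute of the index after the build;
-- switch it back if later index maintenance should be fully logged.
ALTER INDEX 病人医嘱记录_IX_前提ID LOGGING;

-- Confirm the resulting attributes.
SELECT index_name, degree, logging, status
FROM   user_indexes
WHERE  index_name = '病人医嘱记录_IX_前提ID';
```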
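Summary: as the OWI text explains, not every foreign key needs an index. Whether to create one depends on how often the referenced primary key changes and on whether an index on the foreign key column would actually improve performance; avoid creating a "zombie index" that is never or rarely used. In this case we likewise did not index the foreign keys that reference the department table. As always, each situation has to be analyzed on its own terms rather than handled by rote. This case also shows that, even on 10g, a self-referencing foreign key whose referenced primary key changes frequently (inserts included) must be indexed.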