Oracle性能分析(一) DB Time和DB CPU

分析数据库的负载高低,一般能够经过CPU利用率,磁盘响应速度等进行分析,Oracle数据库提供AWR报告,报告的内容不少不少,可是却没有一个定性的指标,能回答:sql

1. 到底数据库当前的负载是高仍是低?数据库

2. 我感受我数据库很慢,个人数据库硬件是是否是应该升级了?服务器

 

先看一段典型的AWR报告,截取部分片断网络

 

Host Name Platform CPUs Cores Sockets Memory (GB)
ABCD Linux x86 64-bit 12 12 12 94.15
  Snap Id Snap Time Sessions Cursors/Session Instances
Begin Snap: 13647 24-Aug-18 00:00:40 575 4.1 2
End Snap: 13695 25-Aug-18 00:00:06 558 4.6 2
Elapsed:   1,439.43 (mins)      
DB Time:   2,762.39 (mins)      

这里有三个关键信息:性能

1. 首先,咱们的服务器是12个CPU,12核spa

2. 这个AWR报告的区间是从  24-Aug-18 00:00:40 到 25-Aug-18 00:00:06,整个区间是1,439.43 (mins), 24小时。orm

3. 而后是DB TIME: 2,762.39 (mins), 这是一个关键的性能指标,就是全部数据库请求的运行时间加总。blog

 

举个栗子,假设一家理发店24小时(1,439.43 (mins))营业,一共有12个理发师(12 CPU),顾客在理发店的累计逗留时间就是 DB TIME:2,762.39 (mins)。内存

若是拿顾客累计逗留时间除以营业时间,2762.30/1439.43=1.92, 也就是说,在营业时间内,店内平均有1.92个顾客。ci

顾客的累计逗留时间(DB TIME)不是理发师(CPU)的累计工做时间,若是两个顾客都要烫发,正巧只有一个烫发机,则其中一个顾客须要等待(阻塞),这段时间理发师没有干活,

因此:

DB CPU 是必定会小于 DB Time的, 理发师剪头发的时间必定小于顾客在店内逗留的时间。

 

继续看AWR报告,给出了理发师的工做时间,DB CPUS(S)=0.9 ,咱们有12个理发师,也就是全天CPU的利用率是 0.9/12=7.5%

Load Profile

 

 

 

  Per Second Per Transaction Per Exec Per Call
DB Time(s): 1.9 0.1 0.00 0.00
DB CPU(s): 0.9 0.0 0.00 0.00
Redo size (bytes): 2,286,644.3 69,730.7    
Logical read (blocks): 117,205.0 3,574.1    
Block changes: 9,588.5 292.4    
Physical read (blocks): 1,768.0 53.9    
Physical write (blocks): 287.0 8.8    
Read IO requests: 201.7 6.2    
Write IO requests: 75.0 2.3    
Read IO (MB): 13.8 0.4    
Write IO (MB): 2.2 0.1    
Global Cache blocks received: 29.1 0.9    
Global Cache blocks served: 37.5 1.1    
User calls: 3,122.8 95.2    
Parses (SQL): 473.3 14.4    
Hard parses (SQL): 9.5 0.3    
SQL Work Area (MB): 15.8 0.5    
Logons: 0.1 0.0    
Executes (SQL): 11,292.7 344.4    
Rollbacks: 0.7 0.0    
Transactions: 32.8      

 

全天CPU利用率只有7.5%,总体看比较低,可是不意味着数据库的负荷很低,有两种状况:

1. 负荷可能集中在某个时间段,这就须要咱们观察每一个时间段的数据库负荷。

2. 数据库的瓶颈可能不在CPU,而在磁盘或者网络

1)数据库内存较少,大部分时间消耗在磁盘的读取上,这个能够从磁盘的响应速度判断

2)网络带宽不足,数据库要发送大量数据到客户端,在这里造成了等待。

  

因此如今咱们想分时间段得到DB TIME和DB CPU这两个指标,若是须要按每一个时间段导出一个AWR报告,这是一件使人抓狂的事,还好,AWR报告实际上就是经过一些数据库视图生成的,这些视图就是数据库定时的快照,因此咱们能够利用SQL语句很方便的生成。

首先看下快照的主表:

SELECT * FROM DBA_HIST_SNAPSHOT ORDER BY SNAP_ID

    SNAP_ID DBID INSTANCE_NUMBER STARTUP_TIME BEGIN_INTERVAL_TIME END_INTERVAL_TIME FLUSH_ELAPSED SNAP_LEVEL ERROR_COUNT SNAP_FLAG SNAP_TIMEZONE
1 13648 1647988888 2 10-8月 -18   05.20.05.000 下午 24-8月 -18   12.00.40.204 上午 24-8月 -18   12.30.25.506 上午 +00000 00:00:00.5 1 0 0 +0 08:00:00
2 13648 1647988888 1 28-7月 -18   08.01.15.000 下午 24-8月 -18   12.00.40.229 上午 24-8月 -18   12.30.25.524 上午 +00000 00:00:02.5 1 0 0 +0 08:00:00
3 13649 1647988888 1 28-7月 -18   08.01.15.000 下午 24-8月 -18   12.30.25.524 上午 24-8月 -18   01.00.16.839 上午 +00000 00:00:02.2 1 0 0 +0 08:00:00
4 13649 1647988888 2 10-8月 -18   05.20.05.000 下午 24-8月 -18   12.30.25.506 上午 24-8月 -18   01.00.16.820 上午 +00000 00:00:01.2 1 0 0 +0 08:00:00
5 13650 1647948888 1 28-7月 -18   08.01.15.000 下午 24-8月 -18   01.00.16.839 上午 24-8月 -18   01.30.08.125 上午 +00000 00:00:02.9 1 0 0 +0 08:00:00
6 13650 1647988888 2 10-8月 -18   05.20.05.000 下午 24-8月 -18   01.00.16.820 上午 24-8月 -18   01.30.08.072 上午 +00000 00:00:00.9 1 0 0 +0 08:00:00
7 13651 1647988888 1 28-7月 -18   08.01.15.000 下午 24-8月 -18   01.30.08.125 上午 24-8月 -18   02.00.02.352 上午 +00000 00:00:01.6 1 0 0 +0 08:00:00
8 13651 1647988888 2 10-8月 -18   05.20.05.000 下午 24-8月 -18   01.30.08.072 上午 24-8月 -18   02.00.02.334 上午 +00000 00:00:00.7 1 0 0 +0 08:00:00

 经过这个表咱们能够判断以下:

1. 半小时一个快照(这个是配置值,可改),快照并非彻底的整点,存在必定的偏差,即快照的区间接近半小时,可能会多几十秒或者少几十秒。

2. 这里有两个INSTANCE_NUMBER,由于这个是集群,因此实际上有两个物理机。

 

DBA_HIST_SYS_TIME_MODEL 表里面有咱们须要的DB CPU和DB time,这个值是累计值,即须要算一个区间的数,须要那结束时间的值减去开始时间的值。

SELECT * FROM DBA_HIST_SYS_TIME_MODEL
WHERE SNAP_ID='13648'
AND STAT_NAME IN ('DB CPU','DB time')

    SNAP_ID DBID INSTANCE_NUMBER STAT_ID STAT_NAME VALUE
1 13648 1647988888 2 2748282437 DB CPU 41061757271
2 13648 1647988888 1 2748282437 DB CPU 2214287108077
3 13648 1647988888 2 3649082374 DB time 105530922248
4 13648 1647988888 1 3649082374 DB time 3992370209915

 

 

建立一个表存储原始数据,

CREATE TABLE SYSTEMLOAD AS 
WITH TEMP AS(
  SELECT a.INSTANCE_NUMBER,
         to_number(to_char(A.BEGIN_INTERVAL_TIME,'yyyymmdd')) AS BEGIN_DATE,
         TRUNC(((A.BEGIN_INTERVAL_TIME+0)-trunc(A.BEGIN_INTERVAL_TIME+0))*48) AS SNAPORDER,
         SUBSTR('0' || TRUNC(TRUNC(((A.BEGIN_INTERVAL_TIME+0)-trunc(A.BEGIN_INTERVAL_TIME+0))*48)/2) ||
         (CASE WHEN mod(TRUNC(((A.BEGIN_INTERVAL_TIME+0)-trunc(A.BEGIN_INTERVAL_TIME+0))*48),2)=1 THEN '30' ELSE '00' END),-4)
         AS BEGIN_TIME,
         A.BEGIN_INTERVAL_TIME+0 AS BEGIN_INTERVAL_TIME,
         A.END_INTERVAL_TIME+0 AS END_INTERVAL_TIME,
         ((A.END_INTERVAL_TIME+0)-(A.BEGIN_INTERVAL_TIME+0))*60*60*24 AS TIMEINTERVAL,
         B.SNAP_ID AS STARTSNAP_ID,
         A.SNAP_ID AS ENDSNAP_ID
  FROM DBA_HIST_SNAPSHOT A
  INNER JOIN DBA_HIST_SNAPSHOT B  /*寻找前一个SNAP_ID*/
  ON A.SNAP_ID-1=B.SNAP_ID
  AND A.INSTANCE_NUMBER=B.INSTANCE_NUMBER
  ORDER BY A.SNAP_ID
)
SELECT A.*,
       DBTIME_E.VALUE-DBTIME_S.VALUE AS DBTIME,
       DBCPU_E.VALUE-DBCPU_S.VALUE AS DBCPU,  
       REDOSIZE_E.VALUE-REDOSIZE_S.VALUE AS REDOSIZE,
       PHYSICALREAD_E.VALUE-PHYSICALREAD_S.VALUE AS PHYSICALREAD,
       PHYSICALWRITE_E.VALUE-PHYSICALWRITE_S.VALUE AS PHYSICALWRITE      
FROM TEMP A
LEFT OUTER JOIN DBA_HIST_SYS_TIME_MODEL  DBTIME_S 
ON a.STARTSNAP_ID=DBTIME_S.SNAP_ID
AND A.INSTANCE_NUMBER=DBTIME_S.INSTANCE_NUMBER
AND DBTIME_S.STAT_NAME='DB time'
LEFT OUTER JOIN DBA_HIST_SYS_TIME_MODEL  DBTIME_E
ON a.ENDSNAP_ID=DBTIME_E.SNAP_ID
AND A.INSTANCE_NUMBER=DBTIME_E.INSTANCE_NUMBER
AND DBTIME_E.STAT_NAME='DB time'
LEFT OUTER JOIN DBA_HIST_SYS_TIME_MODEL  DBCPU_S 
ON a.STARTSNAP_ID=DBCPU_S.SNAP_ID
AND A.INSTANCE_NUMBER=DBCPU_S.INSTANCE_NUMBER
AND DBCPU_S.STAT_NAME='DB CPU'
LEFT OUTER JOIN DBA_HIST_SYS_TIME_MODEL  DBCPU_E
ON a.ENDSNAP_ID=DBCPU_E.SNAP_ID
AND A.INSTANCE_NUMBER=DBCPU_E.INSTANCE_NUMBER
AND DBCPU_E.STAT_NAME='DB CPU'
LEFT OUTER JOIN DBA_HIST_SYSSTAT REDOSIZE_S
ON A.STARTSNAP_ID=REDOSIZE_S.SNAP_ID
AND A.INSTANCE_NUMBER=REDOSIZE_S.INSTANCE_NUMBER
AND REDOSIZE_S.STAT_NAME='redo size'
LEFT OUTER JOIN DBA_HIST_SYSSTAT REDOSIZE_E
ON A.ENDSNAP_ID=REDOSIZE_E.SNAP_ID
AND A.INSTANCE_NUMBER=REDOSIZE_E.INSTANCE_NUMBER
AND REDOSIZE_E.STAT_NAME='redo size'
LEFT OUTER JOIN DBA_HIST_SYSSTAT PHYSICALREAD_S
ON A.STARTSNAP_ID=PHYSICALREAD_S.SNAP_ID
AND A.INSTANCE_NUMBER=PHYSICALREAD_S.INSTANCE_NUMBER
AND PHYSICALREAD_S.STAT_NAME='physical read bytes'
LEFT OUTER JOIN DBA_HIST_SYSSTAT PHYSICALREAD_E
ON A.ENDSNAP_ID=PHYSICALREAD_E.SNAP_ID
AND A.INSTANCE_NUMBER=PHYSICALREAD_E.INSTANCE_NUMBER
AND PHYSICALREAD_E.STAT_NAME='physical read bytes'
LEFT OUTER JOIN DBA_HIST_SYSSTAT PHYSICALWRITE_S
ON A.STARTSNAP_ID=PHYSICALWRITE_S.SNAP_ID
AND A.INSTANCE_NUMBER=PHYSICALWRITE_S.INSTANCE_NUMBER
AND PHYSICALWRITE_S.STAT_NAME='physical write bytes'
LEFT OUTER JOIN DBA_HIST_SYSSTAT PHYSICALWRITE_E
ON A.ENDSNAP_ID=PHYSICALWRITE_E.SNAP_ID
AND A.INSTANCE_NUMBER=PHYSICALWRITE_E.INSTANCE_NUMBER
AND PHYSICALWRITE_E.STAT_NAME='physical write bytes'
AND DBTIME_E.VALUE-DBTIME_S.VALUE>0 /*重启可能致使从新累积,不加入这些异常数据*/
--WHERE A.INSTANCE_NUMBER=1
ORDER BY A.STARTSNAP_ID

 

输出单日的运行指标:

SELECT BEGIN_DATE,
       MAX(BEGIN_TIME) AS BEGIN_TIME,
       ROUND(SUM(DBTIME)/SUM(TIMEINTERVAL)/1000000,2) AS AVGDBTIME,
       ROUND(SUM(DBCPU)/SUM(TIMEINTERVAL)/1000000,2) AS AVGDBCPU,
       ROUND(SUM(REDOSIZE)/SUM(TIMEINTERVAL)/1024/1024,2) AS AVGREDOSIZE ,
       ROUND(SUM(PHYSICALREAD)/SUM(TIMEINTERVAL)/1024/1024,2) AS AVGPHYSICALREAD,
       ROUND(SUM(PHYSICALWRITE)/SUM(TIMEINTERVAL)/1024/1024,2) AS AVGPHYSICALWRITE
FROM SYSTEMLOAD
WHERE INSTANCE_NUMBER=1
AND BEGIN_DATE='20180831'
GROUP BY BEGIN_DATE,SNAPORDER
ORDER BY BEGIN_DATE,SNAPORDER

 

如下为输出,能够看出DB TIME和DB CPU其实都不高,若是算CPU的利用率,将AVGDBCPU除以12便可,大概不到20%

    BEGIN_DATE BEGIN_TIME AVGDBTIME AVGDBCPU AVGREDOSIZE AVGPHYSICALREAD AVGPHYSICALWRITE
1 20180831 0 1.71 1.16 0.98 1.37 1.19
2 20180831 30 0.98 0.89 0.31 0.14 0.44
3 20180831 100 1.31 0.88 0.28 0.12 0.43
4 20180831 130 1.19 1.06 0.47 0.05 0.55
5 20180831 200 1.79 1.38 0.49 4.81 0.69
6 20180831 230 1.75 1.62 0.67 0.58 0.75
7 20180831 300 1.42 1.33 0.53 0.73 0.73
8 20180831 330 1.44 1.28 0.59 5.46 0.67
9 20180831 400 0.96 0.92 0.2 0.02 0.37
10 20180831 430 1.03 0.95 0.4 0.06 0.38
11 20180831 500 1.16 1.02 1.03 1.47 1.32
12 20180831 530 1.11 1 0.4 0.65 0.4
13 20180831 600 1.74 1.21 2.08 5.51 1.46
14 20180831 630 1.37 1.13 0.51 1.5 0.57
15 20180831 700 1.24 1.12 0.41 0.66 0.64
16 20180831 730 1.29 1.06 0.62 2.06 0.66
17 20180831 800 5.89 2.02 3.64 43.82 3.93
18 20180831 830 2.28 1.66 1.82 11.51 2.21
19 20180831 900 3.29 2.08 3.69 12.46 3.15
20 20180831 930 3.15 1.7 7.08 27.65 7.41
21 20180831 1000 2.37 1.36 6.3 20.12 5.92
22 20180831 1030 1.93 1.1 5.46 25.37 5.04
23 20180831 1100 2.54 1.3 5.55 32.08 5.92
24 20180831 1130 1.76 1.03 4.82 17.46 5.66
25 20180831 1200 1.03 0.57 2.19 18.68 2.47
26 20180831 1230 1.22 0.89 2.67 17.34 3.04
27 20180831 1300 1.68 1.14 4.73 21.53 4.47
28 20180831 1330 1.92 1.05 4.83 21.42 5.06
29 20180831 1400 1.88 1.23 5.21 22.94 5.14
30 20180831 1430 1.18 0.65 4.03 10.04 4.96
31 20180831 1500 1.32 0.62 2.97 6.28 2.83
32 20180831 1530 1.68 0.91 3.85 14.67 5.12
33 20180831 1600 5.56 1.05 5.9 41.36 5.46
34 20180831 1630 2.82 0.86 3.34 7.82 4.21
35 20180831 1700 2.85 1.31 3.24 37.8 4.28
36 20180831 1730 2.7 1.16 3.43 19.22 4.11
37 20180831 1800 2.36 0.73 2.57 25.6 3.81
38 20180831 1830 1.82 1.12 1.98 17.04 6.45
39 20180831 1900 1.63 0.83 2.36 15.45 3.15
40 20180831 1930 0.9 0.48 1.59 4.32 2.08
41 20180831 2000 1.09 0.28 1.01 9.93 2.01
42 20180831 2030 0.61 0.21 0.85 0.21 1.08
43 20180831 2100 0.49 0.21 0.77 0.21 1.03
44 20180831 2130 0.41 0.2 0.96 1.19 1.06
45 20180831 2200 2.35 0.72 2.79 22.78 4.89
46 20180831 2230 0.78 0.39 0.82 8.1 1.54
47 20180831 2300 0.2 0.12 0.66 0.53 0.87
48 20180831 2330 0.19 0.1 0.53 0.07 0.77

 

总结:

统计了DB CPU和DB TIME两个关键指标,可是DB CPU是数据库层面的数据,并非服务器层面的数据,有可能服务器上面还运行了其余程序,占用了大量的CPU,致使DB CPU这个值很低,可是实际上服务器的CPU很是繁忙。下一篇文章将得到服务器层面的CPU利用率。

相关文章
相关标签/搜索