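Most people know you can grab a dump from Task Manager. It is quick and dirty, but it cannot cover the countless ways a program can die, such as the memory bloat and CPU spikes covered below.
Since the manual route is so weak, what better tools are there? Besides adplus, this post recommends a gem: procdump. Download it from https://docs.microsoft.com/zh... ; it even supports Linux 😘😘😘. I won't go into the installation details.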
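Memory bloat
I'm sure plenty of you have run into this one. The case I see most often is a small static cache that someone, deliberately or not, forgets to release, so entries pile up without limit until the process blows up. How do you catch that with procdump?
To make the demo easy, I'll first write an example that allocates memory endlessly.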
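using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        // The list keeps every string reachable, so the GC can never reclaim them.
        List<string> list = new List<string>();
        for (int i = 0; i < int.MaxValue; i++)
        {
            list.Add(string.Join(",", Enumerable.Range(0, 10000)));
        }
        Console.ReadLine();
    }
}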
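With the program running, tell procdump to capture a full memory dump automatically once memory exceeds 1 GB (-m 1024 sets the commit threshold in MB, -ma requests a full dump), using the following command.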
C:\Windows\system32>procdump ConsoleApp2 -m 1024 -ma E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug

ProcDump v10.0 - Sysinternals process dump utility
Copyright (C) 2009-2020 Mark Russinovich and Andrew Richards
Sysinternals - www.sysinternals.com

Process:               ConsoleApp2.exe (24112)
Process image:         E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug\ConsoleApp2.exe
CPU threshold:         n/a
Performance counter:   n/a
Commit threshold:      >= 1024 MB
Threshold seconds:     10
Hung window check:     Disabled
Log debug strings:     Disabled
Exception monitor:     Disabled
Exception filter:      [Includes]
                       *
                       [Excludes]
Terminate monitor:     Disabled
Cloning type:          Disabled
Concurrent limit:      n/a
Avoid outage:          n/a
Number of dumps:       1
Dump folder:           E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug\
Dump filename/mask:    PROCESSNAME_YYMMDD_HHMMSS
Queue to WER:          Disabled
Kill after dump:       Disabled

Press Ctrl-C to end monitoring without terminating the process.

[21:23:43] Commit: 1087Mb
[21:23:43] Dump 1 initiated: E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug\ConsoleApp2.exe_210323_212343.dmp
[21:23:43] Dump 1 writing: Estimated dump file size is 1179 MB.
[21:23:44] Dump 1 complete: 1179 MB written in 1.3 seconds
[21:23:44] Dump count reached.
The last five lines show that the dump file was generated automatically when committed memory reached 1087 MB. Next, let's take a look with windbg.
Running !address -summary is all it takes:

0:000> !address -summary

Mapping file section regions...
Mapping module regions...
Mapping PEB regions...
Mapping TEB and stack regions...
Mapping heap regions...
Mapping page heap regions...
Mapping other regions...
Mapping stack trace database regions...
Mapping activation context regions...

--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free                                     63         b30b4000 (   2.798 GB)           69.94%
<unknown>                               228         48547000 (   1.130 GB)  93.99%   28.25%
Image                                   210          4115000 (  65.082 MB)   5.29%    1.59%
Stack                                    21           700000 (   7.000 MB)   0.57%    0.17%
Heap                                     12           170000 (   1.438 MB)   0.12%    0.04%
Other                                     7            5a000 ( 360.000 kB)   0.03%    0.01%
TEB                                       7            13000 (  76.000 kB)   0.01%    0.00%
PEB                                       1             3000 (  12.000 kB)   0.00%    0.00%

--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_PRIVATE                             250         47121000 (   1.110 GB)  92.36%   27.76%
MEM_IMAGE                               217          411e000 (  65.117 MB)   5.29%    1.59%
MEM_MAPPED                               19          1cfd000 (  28.988 MB)   2.35%    0.71%

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE                                 63         b30b4000 (   2.798 GB)           69.94%
MEM_COMMIT                              357         47f12000 (   1.124 GB)  93.49%   28.10%
MEM_RESERVE                             129          502a000 (  80.164 MB)   6.51%    1.96%

--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READWRITE                          177         437d5000 (   1.055 GB)  87.70%   26.36%
PAGE_EXECUTE_READ                        35          33c7000 (  51.777 MB)   4.21%    1.26%
PAGE_READONLY                            90           c41000 (  12.254 MB)   1.00%    0.30%
PAGE_WRITECOPY                           34           70b000 (   7.043 MB)   0.57%    0.17%
PAGE_READWRITE|PAGE_GUARD                14            23000 ( 140.000 kB)   0.01%    0.00%
PAGE_EXECUTE_READWRITE                    7             7000 (  28.000 kB)   0.00%    0.00%

--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
Free                                        80010000         7f130000 (   1.986 GB)
<unknown>                                   438e1000          200f000 (  32.059 MB)
Image                                       660e0000           f55000 (  15.332 MB)
Stack                                         e00000            fd000 (1012.000 kB)
Heap                                          c97000            98000 ( 608.000 kB)
Other                                       ff2c0000            33000 ( 204.000 kB)
TEB                                           990000             3000 (  12.000 kB)
PEB                                           98d000             3000 (  12.000 kB)
See the (1.055 GB) on the PAGE_READWRITE line above? It lines up with the 1087M reported in the console a moment ago, so everything checks out.
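To make that comparison concrete, the hexadecimal sizes printed by !address -summary can be converted to megabytes by hand. The sketch below is only an illustration: the class and method names are made up for this post, and it assumes the Total Size column is a plain byte count in hex, which is consistent with the GB values windbg prints next to it.

```csharp
using System;
using System.Globalization;

static class RegionSize
{
    // Parses a hex byte count as printed in the "Total Size" column.
    public static long ParseHexBytes(string hex) =>
        long.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);

    // Converts bytes to megabytes (1 MB = 1024 * 1024 bytes here).
    public static double ToMegabytes(long bytes) => bytes / (1024.0 * 1024.0);

    static void Main()
    {
        // Values copied from the !address -summary output above.
        long readWrite = ParseHexBytes("437d5000"); // PAGE_READWRITE
        long committed = ParseHexBytes("47f12000"); // MEM_COMMIT

        Console.WriteLine($"PAGE_READWRITE: {ToMegabytes(readWrite):F0} MB");
        Console.WriteLine($"MEM_COMMIT:     {ToMegabytes(committed):F0} MB");
    }
}
```

The two values work out to roughly 1080 MB and 1151 MB, the same ballpark as the 1087 MB commit size procdump reported when it fired.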
Running !dumpheap -stat -min 1024 is all it takes:

0:000> !dumpheap -stat -min 1024
Statistics:
      MT    Count    TotalSize Class Name
65d42788        2        13044 System.Object[]
65d42d74        2        98328 System.String[]
65d42c60       73      1082988 System.Char[]
65d424e4    11452   1119913984 System.String
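The last line of the output shows more than 10,000 System.String objects (11452, to be exact). Next, add the -type option together with -min 10240 to list only the strings larger than 10 KB.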
0:000> !dumpheap -type System.String -min 10240
 Address       MT     Size
03c75568 65d424e4    97792
03c8d378 65d424e4    97792
4a855060 65d424e4    97792

Statistics:
      MT    Count    TotalSize Class Name
65d424e4    11452   1119913984 System.String
Total 11452 objects

0:000> !gcroot 4a855060
Thread 36e4:
*** WARNING: Unable to verify checksum for ConsoleApp2.exe
    00b3f358 012108d1 ConsoleApp2.Program.Main(System.String[]) [E:\net5\ConsoleApp1\ConsoleApp2\Program.cs @ 18]
        ebp+18: 00b3f370
            ->  02c71fd8 System.Collections.Generic.List`1[[System.String, mscorlib]]
            ->  02cce2ec System.String[]
            ->  4a855060 System.String

Found 1 unique roots (run '!GCRoot -all' to see all roots).
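As a sanity check, the Size of 97792 reported for every string can be reproduced from the test program itself. The sketch below rests on one assumption that is not stated in the post: on a 32-bit CLR a string object costs 14 bytes of fixed overhead (object header, method table pointer, length field, null terminator) plus 2 bytes per UTF-16 character. The class and method names are made up for illustration.

```csharp
using System;
using System.Linq;

static class StringSizeCheck
{
    // Assumption: fixed per-string overhead on a 32-bit CLR.
    const int AssumedOverheadBytes = 14;

    // Length of the exact string the demo program adds on every iteration.
    public static int PayloadLength() =>
        string.Join(",", Enumerable.Range(0, 10000)).Length;

    // Estimated heap size of a string holding charCount UTF-16 characters.
    public static long EstimatedObjectSize(int charCount) =>
        AssumedOverheadBytes + 2L * charCount;

    static void Main()
    {
        int chars = PayloadLength();
        long size = EstimatedObjectSize(chars);
        Console.WriteLine($"chars = {chars}, estimated size = {size} bytes");
        Console.WriteLine($"11452 strings = {11452 * size} bytes");
    }
}
```

Under that assumption each string has 48889 characters and an estimated size of 97792 bytes, and 11452 of them add up to 1119913984 bytes, which is the TotalSize that !dumpheap reports for System.String.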
The final !gcroot output confirms that the string is indeed held by the List on line 18 of Program.cs. Mystery solved.
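Once the dump points at a collection that only ever grows, the usual remedy is to put a ceiling on it. The following is a minimal sketch of that idea, not code from the demo project: BoundedCache is a hypothetical helper that evicts the oldest entry once a fixed capacity is reached, and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a cache that never holds more than `capacity` entries.
sealed class BoundedCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, TValue> _map = new Dictionary<TKey, TValue>();
    private readonly Queue<TKey> _order = new Queue<TKey>(); // insertion order, oldest first

    public BoundedCache(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public int Count => _map.Count;

    public void Set(TKey key, TValue value)
    {
        if (!_map.ContainsKey(key))
        {
            // New key: make room first by evicting the oldest entry if full.
            if (_map.Count >= _capacity)
            {
                _map.Remove(_order.Dequeue());
            }
            _order.Enqueue(key);
        }
        _map[key] = value; // insert, or overwrite an existing key in place
    }

    public bool TryGet(TKey key, out TValue value) => _map.TryGetValue(key, out value);
}

static class BoundedCacheDemo
{
    static void Main()
    {
        var cache = new BoundedCache<int, string>(100);
        for (int i = 0; i < 10000; i++)
        {
            cache.Set(i, new string('x', 1000));
        }
        Console.WriteLine(cache.Count); // stays at the capacity of 100
    }
}
```

First-in-first-out eviction is the simplest policy to reason about. A real cache might prefer LRU or time-based expiry, but the point is the same: the GC root stays, while the amount of memory it can pin is capped.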
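CPU spikes
Speaking of CPU spikes, I find that most of them come from the unmanaged side, for example GC collections or lock contention. Very few people are silly enough to burn the CPU from the managed layer.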
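By the way, there is a handy trick for analyzing CPU spikes: capture several dump snapshots in a row and compare what the threads are doing across two of them. That is exactly the kind of job procdump is good at. Let's look at the test code first.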
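using System;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Every parallel worker spins forever, which pins the CPU.
        Parallel.For(0, int.MaxValue, (i) =>
        {
            while (true) { }
        });
        Console.ReadLine();
    }
}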
Now I set the rule: capture a dump whenever CPU stays above 70% for 5 consecutive seconds, and stop after 2 dumps (-c 70 -s 5 -n 2).
C:\Windows\system32>procdump ConsoleApp2 -s 5 -n 2 -c 70 E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug

ProcDump v10.0 - Sysinternals process dump utility
Copyright (C) 2009-2020 Mark Russinovich and Andrew Richards
Sysinternals - www.sysinternals.com

Process:               ConsoleApp2.exe (22152)
Process image:         E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug\ConsoleApp2.exe
CPU threshold:         >= 70% of system
Performance counter:   n/a
Commit threshold:      n/a
Threshold seconds:     5
Hung window check:     Disabled
Log debug strings:     Disabled
Exception monitor:     Disabled
Exception filter:      [Includes]
                       *
                       [Excludes]
Terminate monitor:     Disabled
Cloning type:          Disabled
Concurrent limit:      n/a
Avoid outage:          n/a
Number of dumps:       2
Dump folder:           E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug\
Dump filename/mask:    PROCESSNAME_YYMMDD_HHMMSS
Queue to WER:          Disabled
Kill after dump:       Disabled

Press Ctrl-C to end monitoring without terminating the process.

[22:25:47] CPU: 95% 1s
[22:25:48] CPU: 100% 2s
[22:25:50] CPU: 96% 3s
[22:25:51] CPU: 98% 4s
[22:25:52] CPU: 99% 5s (Trigger)
[22:25:53] Dump 1 initiated: E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug\ConsoleApp2.exe_210323_222553.dmp
[22:25:54] Dump 1 complete: 5 MB written in 0.3 seconds
[22:25:56] CPU: 88% 1s
[22:25:58] CPU: 93% 2s
[22:26:00] CPU: 89% 3s
[22:26:02] CPU: 89% 4s
[22:26:04] CPU: 95% 5s (Trigger)
[22:26:05] Dump 2 initiated: E:\net5\ConsoleApp1\ConsoleApp2\bin\Debug\ConsoleApp2.exe_210323_222605.dmp
[22:26:06] Dump 2 complete: 5 MB written in 0.4 seconds
[22:26:07] Dump count reached.
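The tail of the output shows that each time CPU stayed above 70% for 5 consecutive seconds a dump was captured, and that this happened twice, for 2 dumps in total.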
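The "N consecutive seconds above a threshold" rule is easy to model. The sketch below replays the CPU samples from the log through a small counter. It illustrates the rule as the log presents it and makes no claim about how procdump is implemented internally; the class and method names are made up, and resetting the counter after each dump is an assumption drawn from the log restarting at "1s".

```csharp
using System;
using System.Collections.Generic;

static class ConsecutiveTrigger
{
    // Returns the 0-based sample indexes at which a dump would be triggered.
    public static List<int> FindTriggers(
        IReadOnlyList<int> cpuSamples, int threshold, int seconds, int maxDumps)
    {
        var triggers = new List<int>();
        int streak = 0;
        for (int i = 0; i < cpuSamples.Count && triggers.Count < maxDumps; i++)
        {
            // Extend the streak while above the threshold; any dip resets it.
            streak = cpuSamples[i] >= threshold ? streak + 1 : 0;
            if (streak == seconds)
            {
                triggers.Add(i);
                streak = 0; // assumption: counting restarts after each dump
            }
        }
        return triggers;
    }

    static void Main()
    {
        // CPU percentages copied from the procdump log above, one per sample.
        int[] samples = { 95, 100, 96, 98, 99, 88, 93, 89, 89, 95 };
        List<int> hits = FindTriggers(samples, 70, 5, 2);
        Console.WriteLine(string.Join(",", hits)); // prints "4,9"
    }
}
```

With these samples the fifth and tenth readings fire, which corresponds to the two "(Trigger)" lines in the log.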
Now that we have the dumps, open them in two windbg instances and check when each dump was generated, as shown in the screenshot below:
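The screenshot shows that the two dumps were generated 12s apart (consistent with the 22:25:53 and 22:26:05 timestamps in the procdump log), and !runaway shows that the threads it lists each ran for as long as 10s in between. What does that tell us? These threads are most likely stuck in an infinite loop somewhere, right?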
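The comparison itself is plain arithmetic: subtract each thread's CPU time in the first dump from its time in the second, and flag the threads whose growth is close to the wall-clock gap between the dumps. Here is a minimal sketch of that step. All thread numbers and timings are hypothetical stand-ins for the !runaway output in the screenshot, and the 80% cutoff is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RunawayDiff
{
    // Returns the threads whose CPU time grew by at least `ratio` of the
    // wall-clock time that elapsed between the two dumps.
    public static List<int> BusyThreads(
        IDictionary<int, TimeSpan> first,
        IDictionary<int, TimeSpan> second,
        TimeSpan wallClockGap,
        double ratio)
    {
        return second
            .Where(kv => first.ContainsKey(kv.Key)) // only threads present in both dumps
            .Where(kv => (kv.Value - first[kv.Key]).TotalSeconds
                         >= ratio * wallClockGap.TotalSeconds)
            .Select(kv => kv.Key)
            .OrderBy(id => id)
            .ToList();
    }

    static void Main()
    {
        // Hypothetical per-thread CPU times, as one might read them off !runaway.
        var dump1 = new Dictionary<int, TimeSpan>
        {
            { 0, TimeSpan.FromSeconds(0.1) },
            { 14, TimeSpan.FromSeconds(5.0) },
            { 15, TimeSpan.FromSeconds(5.0) },
        };
        var dump2 = new Dictionary<int, TimeSpan>
        {
            { 0, TimeSpan.FromSeconds(0.1) },
            { 14, TimeSpan.FromSeconds(15.0) },
            { 15, TimeSpan.FromSeconds(15.2) },
        };

        List<int> busy = BusyThreads(dump1, dump2, TimeSpan.FromSeconds(12), 0.8);
        Console.WriteLine("Busy threads: " + string.Join(", ", busy));
    }
}
```

A thread that burned about 10s of CPU inside a 12s window was on a core almost the whole time, which is what makes it a prime suspect for a spin loop.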
Switch to thread 14 and run !clrstack to see its call stack. The threads are all stuck in ConsoleApp2.Program+<>c.<Main>b__0_0(Int32) and never get out.
This post is getting a bit long, so I'll stop here. If you're interested, grab procdump and have a play with it 🤭.