On Windows, we have an excellent tool, WinDbg, for analyzing crashes, resource leaks, deadlocks, and similar problems in .NET programs.
When a .NET Core program runs on Linux, how do we analyze its core dump files? Today I will introduce a debugging tool for Linux and macOS: lldb.
Official site
Debugging .NET Core on Linux (1): installing lldb
Debugging dump files generated under Docker with dotnet core
Debugging .NET Core on Linux with LLDB
.NET Core is designed to be cross-platform, modular and optimized for the cloud.
Here is a cheat sheet of the different tools on Windows and Linux:
Task | Linux | Windows |
---|---|---|
CPU sampling | perf, BCC | ETW |
Dynamic tracing | perf, BCC | X |
Static tracing | LTTng | ETW |
Dump generation | gcore, ProcDump | ProcDump, WER |
Dump analysis | LLDB | VS, WinDBG |
The LLDB debugger is conceptually similar to the native Windows debugging tools in that it is a low-level, command-line-driven debugger.
It is available for a number of different *NIX systems as well as macOS.
Part of the reason the .NET Core team chose the LLDB debugger was its extensibility points, which allowed them to create the SOS plugin that can be used to debug .NET Core applications.
Download and install the correct version of LLDB on the machine.
The SOS Debugging Extension helps you debug managed programs in a debugger by providing information about the internal Common Language Runtime (CLR) environment.
The .NET Core team has also made it available on Linux for LLDB.
Find the pid of the dotnet application, then launch LLDB and type: process attach -p <PID>
to attach the debugger to your dotnet core application.
Microsoft has shipped ProcDump for Linux, which gives Linux developers a convenient way to create core dumps of their applications based on performance triggers. Under the hood, ProcDump calls gcore on Linux to generate the core dump.
It is convenient not only because it installs and sets up gcore for you, but also because it monitors the application and captures a core dump automatically when specific trigger conditions are met.
Installation instructions for ProcDump for Linux.
At the LLDB prompt, type: plugin load libsosplugin.so
Then type: clrstack
You will see clearly what managed code is being executed for that thread.
As with any debug session that involves applications running in production, attaching live to the process is not the first choice.
To enable core dump generation, type: ulimit -c unlimited
in the terminal. This command sets the maximum core file size to unlimited for the current terminal session.
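To confirm that the limit actually applies to the current process, the value can also be read programmatically. The following is a minimal sketch using Python's standard `resource` module; the helper name `describe_core_limit` is made up for this illustration, and the values it prints are in bytes, which may differ from the units your shell's `ulimit -c` reports.

```python
import resource


def describe_core_limit(limit: int) -> str:
    """Return a human-readable description of an RLIMIT_CORE value (bytes)."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    if limit == 0:
        return "disabled (0 bytes)"
    return f"{limit} bytes"


if __name__ == "__main__":
    # getrlimit returns the (soft, hard) pair for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("soft core limit:", describe_core_limit(soft))
    print("hard core limit:", describe_core_limit(hard))
```

If the soft limit prints as disabled, no core file will be written for processes started from that session.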
To generate a core dump using ProcDump, type: sudo procdump [options] -p <PID of the app>
You can use the following ProcDump options:
Usage: procdump [OPTIONS...] TARGET
   OPTIONS
      -C          CPU threshold at which to create a dump of the process from 0 to 100 * nCPU
      -c          CPU threshold below which to create a dump of the process from 0 to 100 * nCPU
      -M          Memory commit threshold in MB at which to create a dump
      -m          Trigger when memory commit drops below specified MB value.
      -n          Number of dumps to write before exiting
      -s          Consecutive seconds before dump is written (default is 10)
   TARGET must be exactly one of these:
      -p          pid of the process
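As an illustration of how these options combine, here is a small sketch that assembles a procdump command line from trigger settings. It only uses the flags listed in the usage text above; the function name `procdump_command` and the sample PID and thresholds are hypothetical, and the script only prints the command rather than running it.

```python
import shlex
from typing import List, Optional


def procdump_command(pid: int,
                     cpu_above: Optional[int] = None,
                     mem_above_mb: Optional[int] = None,
                     dumps: Optional[int] = None,
                     seconds: Optional[int] = None) -> List[str]:
    """Build an argv list for procdump using only the documented flags."""
    argv = ["sudo", "procdump"]
    if cpu_above is not None:
        argv += ["-C", str(cpu_above)]      # dump when CPU rises above this value
    if mem_above_mb is not None:
        argv += ["-M", str(mem_above_mb)]   # dump when memory commit exceeds this many MB
    if dumps is not None:
        argv += ["-n", str(dumps)]          # number of dumps before exiting
    if seconds is not None:
        argv += ["-s", str(seconds)]        # consecutive seconds before a dump is written
    argv += ["-p", str(pid)]                # the target process
    return argv


if __name__ == "__main__":
    # Hypothetical values: PID 4058, dump up to 3 times when CPU goes above 90.
    print(shlex.join(procdump_command(4058, cpu_above=90, dumps=3)))
```

Building the argument list separately from running it makes it easy to review the exact command before using it on a production machine.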
Launch LLDB and type in prompt: target create -c <dump file path>
Load the SOS plugin, then type any command you need for memory analysis. The available commands are listed below:
Type "soshelp <functionname>" for detailed info on that function.

Object Inspection                  Examining code and stacks
-----------------------------      -----------------------------
DumpObj (dumpobj)                  Threads (clrthreads)
DumpArray                          ThreadState
DumpStackObjects (dso)             IP2MD (ip2md)
DumpHeap (dumpheap)                u (clru)
DumpVC                             DumpStack (dumpstack)
GCRoot (gcroot)                    EEStack (eestack)
PrintException (pe)                ClrStack (clrstack)
                                   GCInfo
                                   EHInfo
                                   bpmd (bpmd)

Examining CLR data structures      Diagnostic Utilities
-----------------------------      -----------------------------
DumpDomain                         VerifyHeap
EEHeap (eeheap)                    FindAppDomain
Name2EE (name2ee)                  DumpLog (dumplog)
DumpMT (dumpmt)                    CreateDump (createdump)
DumpClass (dumpclass)
DumpMD (dumpmd)
Token2EE
DumpModule (dumpmodule)
DumpAssembly
DumpRuntimeTypes
DumpIL (dumpil)
DumpSig
DumpSigElem

Examining the GC history           Other
-----------------------------      -----------------------------
HistInit (histinit)                FAQ
HistRoot (histroot)                Help (soshelp)
HistObj (histobj)
HistObjFind (histobjfind)
HistClear (histclear)
To gather detailed information about a performance issue of .NET Core Application on Linux, you can follow the simple instructions here:
curl -OL http://aka.ms/perfcollect
chmod +x perfcollect
sudo ./perfcollect install
export COMPlus_PerfMapEnabled=1
export COMPlus_EnableEventLog=1
./perfcollect collect tracefile
Reference:
Analyzing .NET Core memory on Linux with LLDB
Background:
A .NET Windows project running on Linux in Kubernetes. It’s not as crazy as it sounds.
Goal:
To cover several things about debugging on Linux.
Walkthrough:
Create an Ubuntu 16.04 VM with the help of Vagrant and VirtualBox. vagrant up will bring that VM to life, and we can get into it with the vagrant ssh command. The Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/xenial64"

  config.vm.provider "virtualbox" do |vb|
    vb.memory = "3072"
  end

  config.vm.provision "shell", inline: <<-SHELL
    # Install .net core SDK
    curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg
    mv microsoft.gpg /etc/apt/trusted.gpg.d/microsoft.gpg
    sh -c 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/microsoft-ubuntu-xenial-prod xenial main" > /etc/apt/sources.list.d/dotnetdev.list'
    apt-get update && apt-get install -y dotnet-sdk-2.0.2

    # Dev tools
    apt-get install -y vim gdb lldb-3.6
  SHELL
end

Put the project in it and we can experiment in there.
dotnet new console -o memApp creates an almost sufficient project template, which I improved very slightly by adding an array full of dummy strings:

using System;
using System.Linq;
using System.Text;

namespace memApp
{
    class Program
    {
        static Random random = new Random((int)DateTime.Now.Ticks);

        static char RandomChar() => Convert.ToChar(random.Next(65, 90));

        static string RandomString(int length) =>
            String.Concat(Enumerable.Range(0, length).Select(_ => RandomChar()));

        static void Main(string[] args)
        {
            var dummyStringsCollection = Enumerable.Range(0, 10000)
                .Select(_ => "Random string: " + RandomString(10000)).ToArray();
            Console.WriteLine("Hello World!");
            Console.ReadLine();
        }
    }
}

dotnet build
#...
#Build succeeded.
#    0 Warning(s)
#    0 Error(s)
#
#Time Elapsed 00:00:02.06
dotnet bin/Debug/netcoreapp2.0/memApp.dll
# Hello World!

ps u
#USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
#ubuntu    4058  7.9  7.9 2752512 243908 pts/0  SLl+ 04:10   0:06 dotnet bin/Debug/netcoreapp2.0/memApp.dll
#...
The gcore utility comes with the gdb debugger, and that’s the only reason I had to install it. gcore, however, in most cases requires elevated permissions.
Usually sudo gcore is enough, but inside a Kubernetes pod even that wasn’t enough, and I had to go to the underlying node and add the following option to sysctl.conf:
echo "kernel.yama.ptrace_scope=0" | sudo tee -a /etc/sysctl.conf # Append config line
sudo sysctl -p # Apply changes
After that, sudo gcore works just fine, and I can create a core dump just by providing the target process id (PID):
sudo gcore 4058
# ...
# Saved corefile core.4058
gcServer option in .NET, however, reduced the default address space, and therefore the core file size, down to a more manageable 5 GiB.
ls -lh
#total 2.6G
#-rw-r--r-- 1 root root 2.6G Dec 12 04:25 core.4058
We can use either the gdb or the lldb debugger to work with core files, but only lldb has .NET debugging support, via the SOS plugin called libsosplugin.so.
The plugin is built against a specific version of lldb, so if you don’t want to recompile CoreCLR and libsosplugin.so locally (not that hard), the safest lldb version to use at the moment is 3.6.
find /usr -name libsosplugin.so
#/usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.0/libsosplugin.so
Start lldb, point it to the dotnet executable that started our application and to its core dump, and then load the plugin:
$ lldb-3.6 `which dotnet` -c core.4058
# (lldb) target create "/usr/bin/dotnet" --core "core.4058"
# Core file '/home/ubuntu/core.4058' (x86_64) was loaded.
# (lldb) plugin load /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.0/libsosplugin.so
# (lldb)
The SOS plugin adds a set of commands that are aware of .NET’s managed nature.
The soshelp command prints out all the .NET commands it added to lldb, and soshelp commandname will explain how to use a particular one. Well, except when it won’t.
The DumpHeap command, which is basically the entry point for memory analysis, has no help at all.
(lldb) soshelp
#...
#Object Inspection                  Examining code and stacks
#-----------------------------      -----------------------------
#DumpObj (dumpobj)                  Threads (clrthreads)
#DumpArray                          ThreadState
#..
(lldb) soshelp DumpHeap
-------------------------------------------------------------------------------
(lldb)
We have a working debugger, we have a DumpHeap command, so let’s take a look at managed memory statistics:
(lldb) sos DumpHeap -stat
#Statistics:
#              MT    Count    TotalSize Class Name
#00007f6d32992aa8        1           24 UNKNOWN
#00007f6d329911d8        1           24 UNKNOWN
#....
#00007f6d323defd8        4        17528 System.Object[]
#00007f6d323e08a8       25        40644 System.Int32[]
#00007f6d323e0168       29        82664 System.String[]
#00007f6d323e3440      335       952398 System.Char[]
#000000000223b860    10092      6083604 Free
#00007f6d3242b460   150846    204845172 System.String
#Total 161886 objects
(lldb)
Not surprisingly, System.String objects use most of the memory.
By the way, if you add up the total sizes of all managed objects (like I did), the result comes very close to the physical memory reported by ps u: 202 MiB of managed objects vs 238 MiB of physical memory.
The delta, I suppose, goes to the code itself and the execution environment.
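The comparison above can be reproduced with a few lines of arithmetic. This is a rough sketch, not the author's method: it sums the TotalSize column of only the rows visible in the DumpHeap -stat listing (the elided rows are ignored, and the Free entry is included), and it assumes the RSS column of ps u is in KiB. The function name `total_size_bytes` is made up for the example.

```python
# Rows copied from the `sos DumpHeap -stat` listing; elided rows are not included.
STAT_ROWS = """\
00007f6d32992aa8        1           24 UNKNOWN
00007f6d329911d8        1           24 UNKNOWN
00007f6d323defd8        4        17528 System.Object[]
00007f6d323e08a8       25        40644 System.Int32[]
00007f6d323e0168       29        82664 System.String[]
00007f6d323e3440      335       952398 System.Char[]
000000000223b860    10092      6083604 Free
00007f6d3242b460   150846    204845172 System.String
"""

MIB = 1024 * 1024


def total_size_bytes(stat_text: str) -> int:
    """Sum the TotalSize column (third field) of DumpHeap -stat rows."""
    total = 0
    for line in stat_text.splitlines():
        parts = line.split(None, 3)  # MT, Count, TotalSize, Class Name
        if len(parts) == 4 and parts[1].isdigit() and parts[2].isdigit():
            total += int(parts[2])
    return total


managed_mib = total_size_bytes(STAT_ROWS) / MIB
rss_mib = 243908 / 1024  # RSS from `ps u`, assumed to be in KiB

if __name__ == "__main__":
    print(f"managed objects: {managed_mib:.1f} MiB")
    print(f"resident memory: {rss_mib:.1f} MiB")
```

Even with the small rows left out, the visible entries account for roughly 202 MiB, which matches the figure quoted in the text.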
But we can go further. We know that System.String uses most of the memory. Can we take a closer look at those strings?
(lldb) sos DumpHeap -type System.String
#         Address               MT     Size
#00007f6d0bfff3f0 00007f6d3242b460       26
#00007f6d0bfff4c0 00007f6d3242b460       42
#...
#00007f6d0c099ab0 00007f6d3242b460    20056
#00007f6d0c09e920 00007f6d3242b460    20056
#...
#00007f6d323e0168       29        82664 System.String[]
#00007f6d3242b460   150846    204845172 System.String
#Total 150895 objects
-type works as a mask, so the output also contains System.String[] and a few Dictionaries.
The strings also vary in size, whereas I’m actually interested in the large ones, at least 1000 bytes:
sos DumpHeap -type System.String -min 1000
# ...
# 00007f6d0e8810f0 00007f6d3242b460    20056
# 00007f6d0e885f60 00007f6d3242b460    20056
# 00007f6d0e88add0 00007f6d3242b460    20056
# ...
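The mask behaviour of -type and the size cut-off of -min can be pictured with a small model. This is only a sketch of the filtering logic as described in the text, not SOS code: it assumes -type is a plain substring match and -min is an inclusive lower bound, and the sample objects (other than the three string sizes taken from the listing) are invented for illustration.

```python
from typing import List, Tuple

HeapObject = Tuple[str, int]  # (type name, size in bytes)


def filter_heap(objects: List[HeapObject], type_mask: str, min_size: int = 0) -> List[HeapObject]:
    """Keep objects whose type name contains type_mask and whose size is at least min_size."""
    return [(name, size) for name, size in objects
            if type_mask in name and size >= min_size]


SAMPLE: List[HeapObject] = [
    ("System.String", 26),        # sizes 26, 42 and 20056 appear in the listing above
    ("System.String", 42),
    ("System.String", 20056),
    ("System.String[]", 80024),   # hypothetical array
    ("System.Int32[]", 4024),     # hypothetical array
]

if __name__ == "__main__":
    # The mask also catches System.String[] because the name contains "System.String".
    print(filter_heap(SAMPLE, "System.String"))
    # Adding a minimum size drops the small strings.
    print(filter_heap(SAMPLE, "System.String", min_size=1000))
```

This is why the listing for -type System.String ends with a System.String[] row: the array type name contains the mask.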
Having the list of suspicious objects, we can drill down even more and examine the objects one by one.
DumpObj can look into the details of the managed object at a given memory address.
We have a whole first column of addresses, and I just picked one of them:
(lldb) sos DumpObj 00007f6d0e8810f0
#Name:        System.String
#MethodTable: 00007f6d3242b460
#EEClass:     00007f6d31c49eb8
#Size:        20056(0x4e58) bytes
#File:        /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.0/System.Private.CoreLib.dll
#String:
#Fields:
#              MT    Field   Offset                 Type VT     Attr            Value Name
#00007f6d3244b020  40001c9        8         System.Int32  1 instance            10015 m_stringLength
#00007f6d3242f420  40001ca        c          System.Char  1 instance               52 m_firstChar
#00007f6d3242b460  40001cb       38        System.String  0   shared           static Empty
#                                 >> Domain:Value  00000000022ab050:NotInit  <<
It’s actually pretty cool. We immediately can see the type name (System.String) and what fields it is made of.
I also noticed that for small strings we’d see the value right away (line 7), but not for the large ones.
I was puzzled at first about how to get the value for those.
There is an m_firstChar field, but is it like a linked list or what? Only after checking out the source code for System.String did I realize that m_firstChar can be used as a pointer itself, and that the whole string is stored as one contiguous block of memory.
This means I can use lldb’s native memory read command to get the whole string back!
For that I just need to take the object’s address (00007f6d0e8810f0), add the field offset of m_firstChar (c, third column in the fields table) and then do something like this:
(lldb) memory read 00007f6d0e8810f0+0xc
#0x7f6d0e8810fc: 52 00 61 00 6e 00 64 00 6f 00 6d 00 20 00 73 00  R.a.n.d.o.m. .s.
#0x7f6d0e88110c: 74 00 72 00 69 00 6e 00 67 00 3a 00 20 00 43 00  t.r.i.n.g.:. .C.
Does it look familiar? “R.a.n.d.o.m. .s.t.r.i.n.g.”. A C# char is UTF-16 encoded and therefore takes two bytes, even though one of them is always zero for ASCII characters.
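To see that the dotted output really is UTF-16 text, the hex bytes from the memory read output can be decoded directly. A minimal sketch, assuming little-endian byte order (which is what the `52 00` pattern for "R" suggests on this x86_64 dump):

```python
# The 32 bytes printed by `memory read`, copied from the two output lines above.
HEX_DUMP = (
    "52 00 61 00 6e 00 64 00 6f 00 6d 00 20 00 73 00 "
    "74 00 72 00 69 00 6e 00 67 00 3a 00 20 00 43 00"
)

raw = bytes.fromhex(HEX_DUMP)     # bytes.fromhex skips the spaces between byte pairs
text = raw.decode("utf-16-le")    # two bytes per character, low byte first

if __name__ == "__main__":
    print(len(raw), "bytes ->", len(text), "characters")
    print(repr(text))
```

Every second byte is zero because all of these characters fit in the ASCII range, so decoding halves the byte count.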
We can also experiment with memory read formatting, but even with the default settings we can get an idea of what’s inside.
(lldb) memory read 00007f6d0e8810f0+0xc -f s -c 13
#0x7f6d0e8810fc: "R"
#0x7f6d0e8810fe: "a"
#0x7f6d0e881100: "n"
#0x7f6d0e881102: "d"
#0x7f6d0e881104: "o"
#0x7f6d0e881106: "m"
#0x7f6d0e881108: " "
#0x7f6d0e88110a: "s"
#0x7f6d0e88110c: "t"
#0x7f6d0e88110e: "r"
#0x7f6d0e881110: "i"
#0x7f6d0e881112: "n"
#0x7f6d0e881114: "g"
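The address arithmetic behind these reads is simple enough to check by hand, but a short sketch makes it explicit. It uses the values from the DumpObj output (object address, the m_firstChar offset 0xc, and m_stringLength 10015) and the two-bytes-per-char observation from the text; the function name `string_data_span` is hypothetical.

```python
from typing import Tuple

BYTES_PER_CHAR = 2  # a C# char is a UTF-16 code unit


def string_data_span(object_address: int, first_char_offset: int, length: int) -> Tuple[int, int]:
    """Return (start address, byte count) of the character data of a managed string."""
    start = object_address + first_char_offset
    return start, length * BYTES_PER_CHAR


start, byte_count = string_data_span(0x00007F6D0E8810F0, 0xC, 10015)

if __name__ == "__main__":
    print(f"character data starts at {start:#x}")
    print(f"full string occupies {byte_count} bytes")
```

The start address matches the `0x7f6d0e8810fc` shown by lldb, and the byte count tells you how much memory you would have to read to recover the entire string rather than just its first characters.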
I started to think about what’s happening that deep under the hood.
Will m_firstChar of identical strings point to the same block of memory? Can I check that?
That is what debugging .NET code with lldb looks like.
gdb, so I kind of know the feeling.
https://github.com/Microsoft/ProcDump-for-Linux : A Linux version of the ProcDump Sysinternals tool
CoreCLR is the runtime for .NET Core. It includes the garbage collector, JIT compiler, primitive data types and low-level classes.
https://docs.microsoft.com/dotnet/core/
Background:
I had to open a core dump of a .NET Core application on Linux.
Before you begin, you need to configure your Linux box to generate core dumps in the first place.
# echo core > /proc/sys/kernel/core_pattern
To open the core dump, you’ll need LLDB built with the same architecture as your CoreCLR.
$ find /usr/share/dotnet -name libsosplugin.so
/usr/share/dotnet/shared/Microsoft.NETCore.App/1.1.0/libsosplugin.so
$ ldd $(find /usr/share/dotnet -name libsosplugin.so) | grep lldb
liblldb-3.5.so.1 => /usr/lib/x86_64-linux-gnu/liblldb-3.5.so.1 (0x00007f0a6b2d8000)
Seeing that LLDB 3.5 was required, I installed it with sudo apt install lldb-3.5, but YMMV on other distros, of course.
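The version check above can be scripted if you have to repeat it across machines. This is a minimal sketch that extracts the LLDB version from a line of ldd output like the one shown; the function name `required_lldb_version` is made up, and the `lldb-<version>` package name simply follows the apt example in the text, so it may not hold on other distros.

```python
import re
from typing import Optional

# Sample line copied from the ldd output above.
LDD_LINE = "liblldb-3.5.so.1 => /usr/lib/x86_64-linux-gnu/liblldb-3.5.so.1 (0x00007f0a6b2d8000)"


def required_lldb_version(ldd_output: str) -> Optional[str]:
    """Return the version embedded in a liblldb-<version>.so dependency, if any."""
    match = re.search(r"liblldb-(\d+(?:\.\d+)*)\.so", ldd_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    version = required_lldb_version(LDD_LINE)
    if version is None:
        print("no liblldb dependency found")
    else:
        # Package naming is an assumption based on the apt command in the text.
        print(f"libsosplugin.so expects LLDB {version}; try package lldb-{version}")
```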
Now you’re ready to open the core file in LLDB.
(lldb) plugin load /usr/share/dotnet/shared/Microsoft.NETCore.App/1.1.1/libsosplugin.so
(lldb) setclrpath /usr/share/dotnet/shared/Microsoft.NETCore.App/1.1.1
With that, SOS should be loaded and ready for use.
You’d think you can just start running the SOS commands you know and love, but there’s one final hurdle.
Here’s what happened when I opened a core file generated from a crash, and tried to get the exception information (note that you should prefix SOS commands with ‘sos’):
(lldb) sos PrintException
The current thread is unmanaged
That is surprising, considering that the process crashed as a result of a managed exception.
Looking at the docs, it looks like SOS and LLDB have trouble communicating around the current thread’s identity.
So first, let’s find the thread that encountered an exception:
(lldb) sos Threads
……
XXXX 8 5a15 00007F5AC006A3F0 21020 Preemptive 0x7f5ad594dd10:0x7f5ad594ece8 0000000000C195C0 0 Ukn System.IO.FileNotFoundException 00007f5ad593fa80 (nested exceptions)
Thread #8 looks suspicious, what with the System.IO.FileNotFoundException in the Exception column. Now, let’s see all the LLDB threads:
(lldb) thread list
Process 0 stopped
* thread #1: tid = 0, 0x00007f5c5d83b7ef libc.so.6`__GI_raise(sig=2) + 159 at raise.c:58, name = 'dotnet', stop reason = signal SIGABRT
Here, it looks like thread 1 is the one with the exception being raised.
So we have to map the OS thread ID from the first command, to the LLDB thread id from the second command:
(lldb) setsostid 5a15 1
Mapped sos OS tid 0x5a15 to lldb thread index 1
And now, we’re ready to roll:
(lldb) sos PrintException
Exception object: 00007f5ad593fa80
Exception type: System.IO.FileNotFoundException
Message: Could not load the specified file.
InnerException: <none>
This gives us the exception information and the thread’s current stack, if we want it.
We could similarly inspect other threads by mapping the OS thread id to the LLDB thread id,
but for a thread that didn’t have an exception, where do you get that clue that connects the OS thread id to the debugger thread ID?
Well, it seems that GDB is using the same numbering as LLDB, but in GDB you can actually see the LWP id (on Linux, GDB LWP = kernel pid = thread) using ‘info threads’:
$ gdb $(which dotnet) --core ./core
...
(gdb) info threads
  Id   Target Id         Frame
  5    Thread 0x7f5c5a40f700 (LWP 22531) 0x00007f5c5d9020bd in poll () at ../sysdeps/unix/syscall-template.S:84
  6    Thread 0x7f5c59c0e700 (LWP 22532) 0x00007f5c5e485d8d in __pause_nocancel () at ../sysdeps/unix/syscall-template.S:84
  7    Thread 0x7f5c5940d700 (LWP 22533) 0x00007f5c5e482510 in pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:219
  8    Thread 0x7f5c589b2700 (LWP 22534) 0x00007f5c5e482510 in pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:219
  9    Thread 0x7f5c498ae700 (LWP 22535) 0x00007f5c5e4828b9 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:258
  10   Thread 0x7f5c454ef700 (LWP 22538) 0x00007f5c5e4856ed in __close_nocancel () at ../sysdeps/unix/syscall-template.S:84
  11   Thread 0x7f5ad2324700 (LWP 22540) 0x00007f5c5e4856ed in __close_nocancel () at ../sysdeps/unix/syscall-template.S:84
  12   Thread 0x7f5ad1b23700 (LWP 22541) syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  13   Thread 0x7f5ad2b25700 (LWP 23059) 0x00007f5c5e4828b9 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:258
... more output snipped for brevity ...
So, for example, suppose we wanted to know what managed thread #6 (OS thread id 0x580d from the ‘sos Threads’ output above) was doing when the dump file was generated.
0x580d = 22541, which is thread #12 in the output above.
Going back to LLDB (note the hex notation for both thread ids):
(lldb) setsostid 580d c
Mapped sos OS tid 0x580d to lldb thread index 12
(lldb) clrstack
OS Thread Id: 0x580d (12)
        Child SP               IP Call Site
00007F5AD1B227F8 00007f5c5d907d29 [InlinedCallFrame: 00007f5ad1b227f8] Microsoft.AspNetCore.Server.Kestrel.Internal.Networking.Libuv+NativeMethods.uv_run(Microsoft.AspNetCore.Server.Kestrel.Internal.Networking.UvLoopHandle, Int32)
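The manual mapping from an SOS OS thread id to a debugger thread index can be sketched in a few lines. The example below is an illustration, not part of SOS or GDB: it parses a shortened copy of the `info threads` listing, converts the hex OS thread id to the decimal LWP, and prints the `setsostid` command with both ids in hex, as the text notes. The function names are hypothetical.

```python
import re
from typing import Dict

# Three rows copied (and shortened) from the `info threads` output above.
GDB_THREADS = """\
  11   Thread 0x7f5ad2324700 (LWP 22540) 0x00007f5c5e4856ed in __close_nocancel ()
  12   Thread 0x7f5ad1b23700 (LWP 22541) syscall ()
  13   Thread 0x7f5ad2b25700 (LWP 23059) 0x00007f5c5e4828b9 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
"""

ROW = re.compile(r"^\s*\*?\s*(\d+)\s+Thread\s+0x[0-9a-fA-F]+\s+\(LWP\s+(\d+)\)", re.MULTILINE)


def lwp_to_thread_index(info_threads: str) -> Dict[int, int]:
    """Map each LWP id (decimal) to the debugger thread index on the same row."""
    return {int(m.group(2)): int(m.group(1)) for m in ROW.finditer(info_threads)}


def setsostid_command(os_tid_hex: str, info_threads: str) -> str:
    """Build the setsostid command for a hex OS thread id taken from `sos Threads`."""
    lwp = int(os_tid_hex, 16)                       # e.g. 0x580d -> 22541
    index = lwp_to_thread_index(info_threads)[lwp]  # KeyError if the LWP is not listed
    return f"setsostid {os_tid_hex} {index:x}"      # both ids are written in hex


if __name__ == "__main__":
    print(setsostid_command("580d", GDB_THREADS))
```

For the thread discussed above this prints the same command that was typed by hand, `setsostid 580d c`.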
Other SOS commands that don’t depend on thread context (e.g. listing assemblies, heap objects, finalization queues and so on) do not require any fiddling with thread ids, and you can just run them directly.
So, what we had to do in order to open a .NET Core core dump from a Linux system was: