JDK tools之jps和jstack诊断Java程序

大部分Java开发者可能知道有这么个工具,可是没怎么用过,每次还得百度一下。我也是之一 -_-!!。java

每次遇到👇下面的问题,头是否是有点大:c++

  • OutOfMemoryError
  • 程序忽然卡顿死锁
  • 内存泄露

我的感受jps、jstack、jmap、visualVM、jConsole、MAT都是解决这些问题的利器,今天先介绍jps、jstack。git

一、jps,先man一下看看jps是干什么的

NAME
       jps - Java Virtual Machine Process Status Tool
DESCRIPTION
       The  jps  tool  lists  the  instrumented  HotSpot Java Virtual Machines (JVMs) on the target system. The tool is limited to reporting  information on JVMs for which it has the access permissions.

从名字就大概知道jps是查看Java进程信息的工具,跟LInux/Unix下的ps(process status)相似,只是jps只针对Java程序。数据库

使用语法以下:

SYNOPSIS
       jps [options] [hostid]

主要参数:

-q 不输出类名、Jar名和传入main方法的参数
-m 输出传入main方法的参数
-l 输出main类或Jar的全限名
-v 输出传入JVM的参数

例如:

➜  ~ jps -m -l
2281 /Users/xxxxxx/Applications/data-integration-7.0/launcher/pentaho-application-launcher-7.0.0.0-25.jar -lib ./../libswt/osx64/ 
2312 sun.tools.jps.Jps -m -l
➜  ~ 

包括jps在内一共有两个Java进程,第一列是pid,也是后面咱们要用到的信息。缓存

固然上面的信息还能够用ps命令来查询ps  -ef或ps aux,前者Linux下的标准语法,后者是BSD风格的语法。同样能够查到第二列的pid。网络

➜  ~ ps -ef | grep java
  501  2281  2261   0 11:48PM ttys000    0:52.81 /Library/Java/JavaVirtualMachines/jdk1.8.0_144.jdk/Contents/Home/bin/java -XstartOnFirstThread -Xms1024m -Xmx2048m -XX:MaxPermSize=256m -Dhttps.protocols=TLSv1,TLSv1.1,TLSv1.2 -Djava.library.path=./../libswt/osx64/ -DKETTLE_HOME= -DKETTLE_REPOSITORY= -DKETTLE_USER= -DKETTLE_PASSWORD= -DKETTLE_PLUGIN_PACKAGES= -DKETTLE_LOG_SIZE_LIMIT= -DKETTLE_JNDI_ROOT= -jar /Users/xxxxxx/Applications/data-integration-7.0/launcher/pentaho-application-launcher-7.0.0.0-25.jar -lib ./../libswt/osx64/ 
  501  2361  2285   0 11:51PM ttys001    0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn java
➜  ~ 

二、jstack

NAME
       jstack - stack trace
DESCRIPTION
       jstack  prints  Java  stack  traces  of  Java threads for a given Java process or core file or a remote debug server. 
For each Java frame, the full class name, method name, 'bci' (byte code index) and line number, if available, are printed.
With the -m option,jstack prints both Java and native frames of all threads along with the 'pc' (program counter).
For each native frame, the closest native symbol to 'pc', if available, is printed. C++ mangled names are not demangled.
To demangle C++ names, the output of this command may be piped to c++filt.

说白了jstack就是查看当前Java程序内线程详细堆栈信息的工具。多线程

使用语法以下:

SYNOPSIS
       jstack [ option ] pid
       jstack [ option ] executable core
       jstack [ option ] [server-id@]remote-hostname-or-IP

主要参数:

Options:
    -F  to force a thread dump. Use when jstack <pid> does not respond (process is hung)(强制线程转储,此时线程会被挂起)
    -m  to print both java and native frames (mixed mode)(打印Java和native接口的堆栈信息)
    -l  long listing. Prints additional information about locks(打印关于锁的详细信息,若是有线程死锁可使用jstack -l pid查看锁或资源的持有信息)
    -h or -help to print this help message

 为了配合实战,写了一个会发生死锁的多线程程序代码以下:app

public class MyThread extends Thread {
    private Object object1;
    private Object object2;

    public MyThread(Object object1, Object object2, String threadName) {
        this.object1 = object1;
        this.object2 = object2;
        setName(threadName);
    }

    @Override
    public void run() {
        String name = Thread.currentThread().getName();
        System.out.println(name + " acquiring lock on " + object1);
        synchronized (object1) {
            System.out.println(name + " acquired lock on " + object1);
            sleep();
            System.out.println(name + " acquiring lock on " + object2);
            synchronized (object2) {
                System.out.println(name + " acquired lock on " + object2);
                sleep();
            }
            System.out.println(name + " released lock on " + object2);
        }
        System.out.println(name + " released lock on " + object2);
        System.out.println(name + " finished execution.");
    }

    private void sleep() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

main方法以下:ide

public class MainTest {
    public static void main(String[] args) {
        Object object1 = new Object();
        Object object2 = new Object();
        Object object3 = new Object();
        MyThread myThread1 = new MyThread(object1, object2, "myThread1");
        MyThread myThread2 = new MyThread(object2, object3, "myThread2");
        MyThread myThread3 = new MyThread(object3, object1, "myThread3");

        myThread1.start();
        myThread2.start();
        myThread3.start();
    }
}

运行程序后输出以下:svn

[root@localhost MutipleThreadTest]# java MainTest
myThread1 acquiring lock on java.lang.Object@796523ab
myThread3 acquiring lock on java.lang.Object@4335236e
myThread2 acquiring lock on java.lang.Object@5f08edd0
myThread3 acquired lock on java.lang.Object@4335236e
myThread1 acquired lock on java.lang.Object@796523ab
myThread2 acquired lock on java.lang.Object@5f08edd0
myThread3 acquiring lock on java.lang.Object@796523ab
myThread2 acquiring lock on java.lang.Object@4335236e
myThread1 acquiring lock on java.lang.Object@5f08edd0

从上面信息可知myThread一、myThread二、myThread3三个线程分别拿到一把锁,而后等待其余线程持有的锁。
因为没有任何一个线程释放锁因而整个程序进入死锁的状态。
现场的生产状况每每会更复杂,须要借助工具定位分析发生死锁的位置和缘由,下面上jstack分析一下当前的堆栈信息。
上面的介绍已经知道若是有线程死锁可使用`jstack -l pid`查看锁或资源的持有信息

[root@localhost ~]# jstack -l 3661
2018-01-09 17:24:15
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f03a0001000 nid=0xeab waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"DestroyJavaVM" prio=10 tid=0x00007f03d4008800 nid=0xe4e waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"myThread3" prio=10 tid=0x00007f03d40b5800 nid=0xe5e waiting for monitor entry [0x00007f03d05f4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at MyThread.run(MyThread.java:26)
    - waiting to lock <0x00000000d784b978> (a java.lang.Object)
    - locked <0x00000000d784b998> (a java.lang.Object)

   Locked ownable synchronizers:
    - None

"myThread2" prio=10 tid=0x00007f03d40b4000 nid=0xe5d waiting for monitor entry [0x00007f03d06f5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at MyThread.run(MyThread.java:26)
    - waiting to lock <0x00000000d784b998> (a java.lang.Object)
    - locked <0x00000000d784b988> (a java.lang.Object)

   Locked ownable synchronizers:
    - None

"myThread1" prio=10 tid=0x00007f03d40b1800 nid=0xe5c waiting for monitor entry [0x00007f03d07f6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at MyThread.run(MyThread.java:26)
    - waiting to lock <0x00000000d784b988> (a java.lang.Object)
    - locked <0x00000000d784b978> (a java.lang.Object)

   Locked ownable synchronizers:
    - None

"Service Thread" daemon prio=10 tid=0x00007f03d4097800 nid=0xe5a runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"C2 CompilerThread1" daemon prio=10 tid=0x00007f03d4095000 nid=0xe59 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"C2 CompilerThread0" daemon prio=10 tid=0x00007f03d4092000 nid=0xe58 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"Signal Dispatcher" daemon prio=10 tid=0x00007f03d408f800 nid=0xe57 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"Finalizer" daemon prio=10 tid=0x00007f03d406f000 nid=0xe55 in Object.wait() [0x00007f03d0dfc000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000d7804858> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    - locked <0x00000000d7804858> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

   Locked ownable synchronizers:
    - None

"Reference Handler" daemon prio=10 tid=0x00007f03d406d000 nid=0xe54 in Object.wait() [0x00007f03d0efd000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000d7804470> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x00000000d7804470> (a java.lang.ref.Reference$Lock)

   Locked ownable synchronizers:
    - None

"VM Thread" prio=10 tid=0x00007f03d4068800 nid=0xe53 runnable 

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f03d401e000 nid=0xe4f runnable 

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f03d4020000 nid=0xe50 runnable 

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f03d4022000 nid=0xe51 runnable 

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f03d4024000 nid=0xe52 runnable 

"VM Periodic Task Thread" prio=10 tid=0x00007f03d40a2000 nid=0xe5b waiting on condition 

JNI global references: 114


Found one Java-level deadlock:
=============================
"myThread3":
  waiting to lock monitor 0x00007f03ac0062c8 (object 0x00000000d784b978, a java.lang.Object),
  which is held by "myThread1"
"myThread1":
  waiting to lock monitor 0x00007f0388002258 (object 0x00000000d784b988, a java.lang.Object),
  which is held by "myThread2"
"myThread2":
  waiting to lock monitor 0x00007f03ac004e28 (object 0x00000000d784b998, a java.lang.Object),
  which is held by "myThread3"

Java stack information for the threads listed above:
===================================================
"myThread3":
    at MyThread.run(MyThread.java:26)
    - waiting to lock <0x00000000d784b978> (a java.lang.Object)
    - locked <0x00000000d784b998> (a java.lang.Object)
"myThread1":
    at MyThread.run(MyThread.java:26)
    - waiting to lock <0x00000000d784b988> (a java.lang.Object)
    - locked <0x00000000d784b978> (a java.lang.Object)
"myThread2":
    at MyThread.run(MyThread.java:26)
    - waiting to lock <0x00000000d784b998> (a java.lang.Object)
    - locked <0x00000000d784b988> (a java.lang.Object)

Found 1 deadlock.

简单分析以下:

  • 以上结果末尾可知jstack已经检测到了死锁。
  • "myThread3"在等待"myThread1"释放资源、"myThread1"在等待"myThread2"释放资源、"myThread2"在等待"myThread3"释放资源。简单说就是资源循环依赖了致使了死锁。
  • 现实的状况每每不会发生结构如此简单的死锁,不少状况都是等待网络IO、数据读写等耗时操做,致使线程占用的资源不能释放,从而致使程序假死、卡顿等状况。
  • 好比以前遇到过的一个场景saiku每次刷新缓存的时候都要卡好久,最后经过jstack发现每次刷缓存都会去拿olap数据库链接,而且加锁了。必须等刷缓存的相关操做执行完才能恢复,这其中就包含执行一系列的select COUNT(*)操做,而刚好存在这比样的大表4千万多万行数据,一个select COUNT(*)要执行90多秒,这就悲剧了,所有刷完须要8分钟左右系统才能恢复

堆栈中一些常见值解析

线程状态(java.lang.Thread.State)

  • NEW:至今还没有启动的线程的状态。 
  • RUNNABLE:可运行线程的线程状态。处于可运行状态的某一线程正在 Java 虚拟机中运行,但它可能正在等待操做系统中的其余资源,好比处理器。
  • BLOCKED:受阻塞而且正在等待monitor lock的某一线程的线程状态。处于受阻塞状态的某一线程正在等待monitor lock,以便进入一个同步的块/方法,或者在调用 Object.wait 以后再次进入同步的块/方法。 好比等待执行synchronized代码块或者使用synchronized标记的方法。
  • WAITING:某一等待线程的线程状态。某一线程由于调用下列方法之一而处于等待状态。
    • 不带超时值的 Object.wait
    • 不带超时值的 Thread.join
    • LockSupport.park 处于等待状态的线程正等待另外一个线程,以执行特定操做。 例如,已经在某一对象上调用了 Object.wait() 的线程正等待另外一个线程,以便在该对象上调用 Object.notify() 或 Object.notifyAll()。已经调用了 Thread.join() 的线程正在等待指定线程终止。 
  • TIMED_WAITING:具备指定等待时间的某一等待线程的线程状态。某一线程由于调用如下带有指定正等待时间的方法之一而处于定时等待状态: 
    • Thread.sleep
    • 带有超时值的 Object.wait
    • 带有超时值的 Thread.join
    • LockSupport.parkNanos
    • LockSupport.parkUntil
  • TERMINATED:已终止线程的线程状态。线程已经结束执行。 

各类线程ID

  • prio:线程的优先级;
  • tid:java Thread id;
  • nid:native线程的id, 很是关键,后面再使用jstack时补充;

使用top -Hp pid查看进程的线程状态时线程的id就对应这里的nid,在堆栈中nid是使用16进制表示的所以可使用以下工具作简单的转换

printf "%x\n" 21742

 

实战一下,找当前我的博客系统中最运行时间最长的线程

首先找到博客进程id

$ ps -ef | grep java

结果为:

zhangsan   27952     1  4  2017 ?        1-07:19:12 java -Xms128m -Xmx128m -Dfile.encoding=UTF-8 -jar tale-least.jar

查看当前线程信息

 

$ top -Hp 27952

top - 23:32:06 up 59 days, 10:06,  2 users,  load average: 0.06, 0.04, 0.01
Tasks:  16 total,   1 running,  15 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.7%us,  3.0%sy,  0.0%ni, 91.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1019988k total,   907552k used,   112436k free,   151028k buffers
Swap:        0k total,        0k used,        0k free,   387324k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                       
27965 zhangsan    20   0 2116m 213m  13m R  7.3 21.4   1727:05 java                                                           
27954 zhangsan    20   0 2116m 213m  13m S  0.7 21.4 133:27.59 java                                                           
27952 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:00.00 java                                                           
27953 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:00.22 java                                                           
27955 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:00.06 java                                                           
27956 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:00.55 java                                                           
27957 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:00.00 java                                                           
27958 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:28.53 java                                                           
27959 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:09.86 java                                                           
27960 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:00.00 java                                                           
27961 zhangsan    20   0 2116m 213m  13m S  0.0 21.4  13:00.98 java                                                           
27962 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:00.57 java                                                           
27969 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   2:04.10 java                                                           
27970 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:49.48 java                                                           
27971 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   2:16.69 java                                                           
18940 zhangsan    20   0 2116m 213m  13m S  0.0 21.4   0:00.00 java

 

发现线程pid 27954运行时间最长,转换成16进制

$ printf "%x\n" 27954
6d32

使用jstack该线程的堆栈信息

$ jstack 27952 | grep 6d32 | grep -v grep
"VM Thread" os_prio=0 tid=0x00007fc6ec094000 nid=0x6d32 runnable

发现是Java VM相关的线程怪不得运行时间最长,这里根据实际状况鉴别VM系统线程和本身的业务线程

相关文章
相关标签/搜索