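A few days ago I was helping a colleague track down an intermittent thread pool error in production.
The logic is simple: a thread pool runs an asynchronous task that returns a result. Recently, however, it has been failing intermittently with: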
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@a5acd19 rejected from java.util.concurrent.ThreadPoolExecutor@30890a38[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
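All the simulation code and problems in this article were reproduced on HotSpot Java 8 (1.8.0_221).
Below is the simulation code. It creates a single-threaded pool via Executors.newSingleThreadExecutor, and the caller then fetches the result from the Future.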
```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class ThreadPoolTest {

    public static void main(String[] args) {
        final ThreadPoolTest threadPoolTest = new ThreadPoolTest();
        for (int i = 0; i < 8; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        Future<String> future = threadPoolTest.submit();
                        try {
                            String s = future.get();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        } catch (ExecutionException e) {
                            e.printStackTrace();
                        } catch (Error e) {
                            e.printStackTrace();
                        }
                    }
                }
            }).start();
        }

        // A separate thread calls System.gc() in a loop to simulate occasional GCs
        new Thread(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    System.gc();
                }
            }
        }).start();
    }

    /**
     * Runs a task asynchronously.
     */
    public Future<String> submit() {
        // Key point: a single-threaded pool created via Executors.newSingleThreadExecutor
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        FutureTask<String> futureTask = new FutureTask<>(new Callable<String>() {
            @Override
            public String call() throws Exception {
                Thread.sleep(50);
                return System.currentTimeMillis() + "";
            }
        });
        executorService.execute(futureTask);
        return futureTask;
    }
}
```
The first question is why the pool was shut down at all, since nothing in the code shuts it down manually. Here is the source of Executors.newSingleThreadExecutor:
```java
public static ExecutorService newSingleThreadExecutor() {
    return new FinalizableDelegatedExecutorService
        (new ThreadPoolExecutor(1, 1,
                                0L, TimeUnit.MILLISECONDS,
                                new LinkedBlockingQueue<Runnable>()));
}
```
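What is actually created here is a FinalizableDelegatedExecutorService. This wrapper class overrides finalize, which means that before the wrapper is reclaimed by the GC, the thread pool's shutdown method is executed first.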
Here is the puzzle: the GC only reclaims unreachable objects, and until the stack frame of submit has finished and been popped, executorService should still be reachable.
The conclusion first: finalize may run even while the object is still in scope (still in the stack frame).
The Oracle documentation has a passage about finalize (the quoted text is from section 12.6.1 of the Java Language Specification):
https://docs.oracle.com/javas...
A reachable object is any object that can be accessed in any potential continuing computation from any live thread. Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
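In short: reachability is defined by what the remaining computation of live threads can still access, not by lexical scope, and a Java compiler or code generator may null out a variable that will no longer be used so that its object can be reclaimed sooner.
In other words, under JVM optimization an object can be treated as unreachable, and then finalized and reclaimed, as soon as nothing will use it again, even if the method that declared it has not returned.
Here is an example to verify this (taken from https://stackoverflow.com/questions/24376768/can-java-finalize-an-object-when-it-is-still-in-scope):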
```java
class A {
    @Override
    protected void finalize() {
        System.out.println(this + " was finalized!");
    }

    public static void main(String[] args) throws InterruptedException {
        A a = new A();
        System.out.println("Created " + a);
        for (int i = 0; i < 1_000_000_000; i++) {
            if (i % 100_000 == 0)
                System.gc();
        }
        System.out.println("done.");
    }
}

// Output
// Created A@1be6f5c3
// A@1be6f5c3 was finalized!   <- printed by finalize
// done.
```
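The example shows that because a is never used again after it has been printed, finalize runs before the loop finishes. In terms of scope the method has not completed and its stack frame has not been popped, yet the object is still finalized early.
Now add one line that prints a at the very end, so that the compiler/code generator sees a later use of a: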
```java
...
System.out.println(a);

// Output
// Created A@1be6f5c3
// done.
// A@1be6f5c3
```
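This time finalize never ran at all (the process simply exited once main finished), so there was certainly no early finalization.
Building on that result, test one more case: set a to null before the loop while still printing a at the end, so the variable keeps a later use.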
```java
A a = new A();
System.out.println("Created " + a);
a = null; // set to null by hand
for (int i = 0; i < 1_000_000_000; i++) {
    if (i % 100_000 == 0)
        System.gc();
}
System.out.println("done.");
System.out.println(a);

// Output
// Created A@1be6f5c3
// A@1be6f5c3 was finalized!
// done.
// null
```
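The output shows that setting the variable to null by hand also lets the object be reclaimed early: the variable is still used at the end, but by then it holds null.
Now back to the thread pool problem. By the mechanism described above, an object can be finalized early once analysis shows that nothing will reference it again.
But in the simulation code there is clearly a use right before the return, namely executorService.execute(futureTask). Why is the pool still finalized early?
My guess was that execute calls into the ThreadPoolExecutor, which creates and starts a new thread; this causes an explicit thread switch, after which the object is no longer reachable from a live thread.
Combined with the documentation quoted above ("a reachable object is any object that can be accessed in any potential continuing computation from any live thread"), it seemed plausible that an explicit thread switch made the object count as unreachable, so the thread pool was finalized early.
Let's test that guess: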
```java
// FinalizedTest.java, the entry point
public class FinalizedTest {
    public static void main(String[] args) {
        final FinalizedTest finalizedTest = new FinalizedTest();
        for (int i = 0; i < 8; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        TFutureTask future = finalizedTest.submit();
                    }
                }
            }).start();
        }
        new Thread(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    System.gc();
                }
            }
        }).start();
    }

    public TFutureTask submit() {
        TExecutorService executorService = Executors.create();
        executorService.execute();
        return null;
    }
}

// TExecutorService.java and TFutureTask.java are not shown in the original;
// the two minimal definitions below are assumptions so that the code compiles.
public abstract class TExecutorService {
    public abstract void execute();
    public abstract void shutdown();
}

public class TFutureTask {
}

// Executors.java, mimics the JUC Executors
public class Executors {
    /**
     * Mimics Executors.newSingleThreadExecutor.
     */
    public static TExecutorService create() {
        return new FinalizableDelegatedTExecutorService(new TThreadPoolExecutor());
    }

    static class FinalizableDelegatedTExecutorService extends DelegatedTExecutorService {
        FinalizableDelegatedTExecutorService(TExecutorService executor) {
            super(executor);
        }

        /**
         * Calls shutdown from the finalizer, which changes the pool state.
         */
        @Override
        protected void finalize() throws Throwable {
            super.shutdown();
        }
    }

    static class DelegatedTExecutorService extends TExecutorService {
        protected TExecutorService e;

        public DelegatedTExecutorService(TExecutorService executor) {
            this.e = executor;
        }

        @Override
        public void execute() {
            e.execute();
        }

        @Override
        public void shutdown() {
            e.shutdown();
        }
    }
}

// TThreadPoolExecutor.java, mimics the JUC ThreadPoolExecutor
import java.util.concurrent.atomic.AtomicBoolean;

public class TThreadPoolExecutor extends TExecutorService {
    /**
     * Pool state. false: running, true: shut down.
     */
    private AtomicBoolean ctl = new AtomicBoolean();

    @Override
    public void execute() {
        // Start a new thread, mimicking ThreadPoolExecutor.execute
        new Thread(new Runnable() {
            @Override
            public void run() {
            }
        }).start();

        // After starting the thread, poll the pool state in a loop to see
        // whether finalize has called shutdown in the meantime.
        // If the pool was shut down early, throw an exception.
        for (int i = 0; i < 1_000_000; i++) {
            if (ctl.get()) {
                throw new RuntimeException("reject!!![" + ctl.get() + "]");
            }
        }
    }

    @Override
    public void shutdown() {
        ctl.compareAndSet(false, true);
    }
}
```
After running for a while, it fails with:
Exception in thread "Thread-1" java.lang.RuntimeException: reject!!![true]
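The error shows that this "thread pool" was also shut down early. But is creating a new thread necessarily the cause?
Next, replace the thread creation with a sleep and test again: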
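```java
// TThreadPoolExecutor.java, the modified execute method
// (also requires: import java.util.concurrent.TimeUnit;)
public void execute() {
    try {
        // Explicitly sleep for 1 ns to force a thread switch
        TimeUnit.NANOSECONDS.sleep(1);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }

    // After the sleep, poll the pool state in a loop to see
    // whether finalize has called shutdown in the meantime.
    // If the pool was shut down early, throw an exception.
    for (int i = 0; i < 1_000_000; i++) {
        if (ctl.get()) {
            throw new RuntimeException("reject!!![" + ctl.get() + "]");
        }
    }
}
```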
The result is the same error:
Exception in thread "Thread-3" java.lang.RuntimeException: reject!!![true]
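From this it appears that if an explicit thread switch happens during execution, the compiler/code generator may treat the outer wrapper object as unreachable.
Although the GC only reclaims objects that are unreachable from GC roots, optimizations by the compiler (the documentation does not say which; it may well be the JIT) or code generator can null out a reference early, or leave an object "unreachable early" after a thread switch.
So if you need to do work in finalize, be sure to explicitly reference the object at the very end (toString/hashCode both work) to keep it reachable.
The claim that a thread switch makes the object unreachable is not backed by any official documentation; it is only the result of my own tests, so corrections are welcome. Another reading that fits both the quoted passage and these tests is that the wrapper stops being used as soon as its execute method has read the delegate field e, and the thread start or sleep merely gives the GC and the finalizer thread time to run.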
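The advice to reference the object explicitly at the end can be applied directly to the submit method from the simulation. The following is only a minimal sketch, assuming it is acceptable to shut the pool down right after handing over the task; the class name ExplicitShutdownExample and the trivial "ok" task are made up for illustration. Because executorService is used again after execute returns, the wrapper stays reachable for the whole execute call, and shutdown still lets the already queued task run.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class ExplicitShutdownExample {

    // Hypothetical variant of the article's submit(): same structure,
    // but the pool is shut down explicitly instead of relying on finalize.
    static Future<String> submit() {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        FutureTask<String> futureTask = new FutureTask<>(() -> "ok");
        try {
            executorService.execute(futureTask);
        } finally {
            // This later use keeps the wrapper reachable while execute() runs.
            // shutdown() rejects new tasks but still runs the queued one.
            executorService.shutdown();
        }
        return futureTask;
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 100; i++) {
            if (!"ok".equals(submit().get())) {
                throw new AssertionError("task did not complete");
            }
            if (i % 10 == 0) {
                System.gc(); // occasional GC, as in the simulation
            }
        }
        System.out.println("all tasks completed");
    }
}
```

An explicit shutdown also means the worker thread is released deterministically rather than whenever the finalizer happens to run.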
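To sum up, this reclamation behavior is not a JDK bug; it is an optimization that simply reclaims objects earlier. The implementation of Executors.newSingleThreadExecutor, however, which relies on finalize to shut the pool down automatically, is buggy: after optimization the pool can be shut down early, which causes the exception.
This thread pool problem is also publicly tracked in the JDK bug system, where it was still marked unresolved when I looked: https://bugs.openjdk.java.net/browse/JDK-8145304.
In JDK 11, however, the problem has been fixed: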
```java
// JUC Executors.FinalizableDelegatedExecutorService
public void execute(Runnable command) {
    try {
        e.execute(command);
    } finally {
        reachabilityFence(this);
    }
}
```
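The same pattern can be applied to any wrapper that cleans up in finalize. Below is a minimal sketch, assuming Java 9 or later (Reference.reachabilityFence does not exist in Java 8); FenceDemo, Inner and Wrapper are hypothetical names that mirror the simulated classes earlier in the article rather than any JDK code.

```java
import java.lang.ref.Reference;
import java.util.concurrent.atomic.AtomicBoolean;

public class FenceDemo {

    // Hypothetical stand-in for the real pool: it only holds a "closed" flag.
    static class Inner {
        final AtomicBoolean closed = new AtomicBoolean();

        // Returns false if the flag was set while this call was still running.
        boolean work() {
            for (int i = 0; i < 100_000; i++) {
                if (closed.get()) {
                    return false;
                }
            }
            return true;
        }
    }

    // Hypothetical wrapper that closes the inner object from its finalizer.
    static class Wrapper {
        private final Inner inner = new Inner();

        boolean work() {
            try {
                return inner.work();
            } finally {
                // Keeps this wrapper strongly reachable until inner.work() has returned
                Reference.reachabilityFence(this);
            }
        }

        @Override
        @SuppressWarnings({"deprecation", "removal"})
        protected void finalize() {
            inner.closed.set(true);
        }
    }

    // Runs the wrapper repeatedly with frequent GCs; counts early finalizations.
    static int run(int iterations) {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            if (!new Wrapper().work()) {
                failures++;
            }
            if (i % 100 == 0) {
                System.gc();
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("early finalizations: " + run(2_000));
    }
}
```

With the fence in place the finalizer cannot run before inner.work() returns, so the count stays at 0. Removing the fence only makes an early finalization possible, not certain, which matches the intermittent nature of the original error.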