Android Troubleshooting: TimeoutException

Date: 2019-11-07

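1. Cause analysis

In Android development you often run into problems that cannot be located quickly even with the stack trace in hand, because the trace consists almost entirely of system code with no business code in it, and the frames printed at crash time are not necessarily where the problem was caused. Take the java.util.concurrent.TimeoutException discussed here. One reported stack trace looks like this: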

java.util.concurrent.TimeoutException: android.content.res.AssetManager.finalize() timed out after 10 seconds
android.content.res.AssetManager.destroy(Native Method)
android.content.res.AssetManager.finalize(AssetManager.java:591)
java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:250)
java.lang.Daemons$FinalizerDaemon.runInternal(Daemons.java:237)
java.lang.Daemons$Daemon.run(Daemons.java:103)
java.lang.Thread.run(Thread.java:764)

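As you can see, every frame is system code, so we cannot quickly tell which part of the business code caused the crash. All the trace tells us is that the exception was raised by a timeout while the system was finalizing an AssetManager during resource reclamation.

A search online shows that this is a fairly common problem and that it occurs mostly on OPPO and 360 phones. The cause is as follows:

After startup, Android creates several daemon threads. Two of them are involved in this problem: FinalizerDaemon and FinalizerWatchdogDaemon.
FinalizerDaemon is the finalizer daemon thread. When the GC decides to reclaim an object that overrides finalize(), the object is not reclaimed immediately. It is placed in a queue and is reclaimed only after the FinalizerDaemon thread has called its finalize() method.
FinalizerWatchdogDaemon is the finalizer watchdog daemon thread, which monitors the execution of FinalizerDaemon. Once it detects that an object's finalize() method has been running for longer than a set time, it exits the VM.

From the analysis above, if FinalizerDaemon takes longer than MAX_FINALIZE_NANOS to finalize an object (10 s by default; ROM vendors may well change this value, and on many OPPO devices, for example, it has been changed to 120 s), FinalizerWatchdogDaemon throws a TimeoutException.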

Daemons.java#FinalizerWatchdogDaemon

private static void finalizerTimedOut(Object object) {
    // The current object has exceeded the finalization deadline; abort!
    String message = object.getClass().getName() + ".finalize() timed out after "+ (MAX_FINALIZE_NANOS / NANOS_PER_SECOND) + " seconds";
    Exception syntheticException = new TimeoutException(message);
    ……
}
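
To see how the message in the reported stack trace comes about, the string concatenation above can be reproduced in isolation. The following is a minimal sketch, not AOSP code: the class name FinalizeMessageSketch and the helper timeoutMessage are hypothetical, and the deadline is passed in as a parameter so that both the 10 s default and the 120 s value mentioned for some OPPO devices can be tried.

```java
class FinalizeMessageSketch {
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Hypothetical helper: builds the message the same way the excerpt above does.
    static String timeoutMessage(Class<?> type, long maxFinalizeNanos) {
        return type.getName() + ".finalize() timed out after "
                + (maxFinalizeNanos / NANOS_PER_SECOND) + " seconds";
    }

    public static void main(String[] args) {
        // Default deadline described in the article: 10 seconds.
        System.out.println(timeoutMessage(String.class, 10 * NANOS_PER_SECOND));
        // prints "java.lang.String.finalize() timed out after 10 seconds"

        // Deadline the article reports for some OPPO devices: 120 seconds.
        System.out.println(timeoutMessage(String.class, 120 * NANOS_PER_SECOND));
        // prints "java.lang.String.finalize() timed out after 120 seconds"
    }
}
```

The number of seconds in a crash report therefore reflects the deadline configured on that particular device, which is one way to spot a vendor-modified value.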

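A 10 s timeout is actually a very generous value, and an ordinary finalize method is unlikely to run that long. We can roughly infer the characteristics of this crash:

Judging from the data, the crashes all occur while the app is in the background and not visible
The app has already been in use for a long time when it crashes

A fairly plausible scenario was found on Stack Overflow:

Your app is in the background and some objects need to be released to reclaim memory
A start_time is recorded, and FinalizerDaemon starts finalizing the AssetManager object
During this process the device suddenly goes to sleep, and the finalization is paused
After some time the device wakes up, and the finalization task resumes and runs to completion
When finalization completes, an end_time is obtained
FinalizerWatchdogDaemon takes the difference between end_time and start_time, compares it with MAX_FINALIZE_NANOS, finds that it exceeds MAX_FINALIZE_NANOS, and throws the timeout exception

So the longer the app runs in the background, the more likely the crash should be.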
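
The scenario above comes down to one subtraction and one comparison, which can be sketched as follows. This is a simplified model of the scenario as described, not the actual watchdog code: the class FinalizeDeadlineSketch, the helper exceedsDeadline, and the 50 ms and 30 s durations are all assumptions chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

class FinalizeDeadlineSketch {
    // Default deadline described in the article: 10 seconds.
    static final long MAX_FINALIZE_NANOS = TimeUnit.SECONDS.toNanos(10);

    // Hypothetical helper: true when the elapsed time between start and end exceeds the deadline.
    static boolean exceedsDeadline(long startNanos, long endNanos, long maxNanos) {
        return endNanos - startNanos > maxNanos;
    }

    public static void main(String[] args) {
        long start = 0L;
        long workNanos = TimeUnit.MILLISECONDS.toNanos(50); // assumed real finalize work
        long asleepNanos = TimeUnit.SECONDS.toNanos(30);    // assumed time the device spent asleep

        // Device stays awake: only the real work counts.
        System.out.println(exceedsDeadline(start, start + workNanos, MAX_FINALIZE_NANOS));
        // prints "false"

        // Device sleeps in the middle of finalize: the sleep time is counted as well.
        System.out.println(exceedsDeadline(start, start + workNanos + asleepNanos, MAX_FINALIZE_NANOS));
        // prints "true"
    }
}
```

The finalize work is the same in both cases. Only the time spent asleep pushes the measured duration past the deadline, which fits the observation that the crash shows up in long-lived background processes.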

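2. Solution

Having analyzed the cause of this timeout exception, we know that the real cure is sound coding, especially where memory allocation is concerned. What exactly counts as sound coding, and how memory should be requested, reused, and released, is a matter of opinion and not the focus of this article. Here we offer a compromise remedy instead: once the app process has started, we use reflection to stop the FinalizerWatchdogDaemon thread from monitoring finalization. That way, even if FinalizerDaemon times out while calling an object's finalize method, the timeout exception is no longer thrown.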

private void stopWatchdogDaemon() {
    GLog.i(TAG, "---stopWatchdogDaemon---");
    try {
        // 1. Get INSTANCE, the singleton instance of Daemons$FinalizerWatchdogDaemon
        Class clazz = Class.forName("java.lang.Daemons$FinalizerWatchdogDaemon");
        Field field = clazz.getDeclaredField("INSTANCE");
        field.setAccessible(true);
        Object watchDog = field.get(null);
        try {
            // 2. Set the thread field of Daemon to null
            Field thread = clazz.getSuperclass().getDeclaredField("thread");
            thread.setAccessible(true);
            thread.set(watchDog, null);
        } catch (Throwable throwable) {
            GLog.e(TAG, "set thread null to stop watchDog error, throwable: " + throwable.getMessage());
            try {
                // 3. If setting thread to null in step 2 failed, call Daemon's stop method directly
                Method method = clazz.getSuperclass().getDeclaredMethod("stop");
                method.setAccessible(true);
                method.invoke(watchDog);
            } catch (Throwable error) {
                GLog.e(TAG, "invoke stop method to stop watchDog error, throwable: " + error.getMessage());
            }
        }
    } catch (Throwable throwable) {
        GLog.e(TAG, "get obj to stop watchDog error, throwable: " + throwable.getMessage());
    }
}
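
The reflection pattern used above (read a static INSTANCE field, then null out a field declared in the superclass) can be tried on a plain JVM against stand-in classes. This is only a sketch under stated assumptions: FakeDaemon, FakeWatchdog, clearThreadField, and isWatchdogRunning are hypothetical names that mimic the shape of Daemons$Daemon and Daemons$FinalizerWatchdogDaemon. The sketch does not touch the real classes and does not show whether those fields are reachable on any given Android version.

```java
import java.lang.reflect.Field;

class ReflectiveFieldClearSketch {
    // Hypothetical stand-in for Daemons$Daemon: holds the worker thread in a private field.
    static class FakeDaemon {
        private Thread thread = new Thread(() -> { });

        boolean isRunning() {
            return thread != null;
        }
    }

    // Hypothetical stand-in for Daemons$FinalizerWatchdogDaemon: a singleton subclass.
    static class FakeWatchdog extends FakeDaemon {
        private static final FakeWatchdog INSTANCE = new FakeWatchdog();
    }

    // Reads the static INSTANCE field of clazz and sets the superclass field "thread" to null.
    static boolean clearThreadField(Class<?> clazz) {
        try {
            Field instanceField = clazz.getDeclaredField("INSTANCE");
            instanceField.setAccessible(true);
            Object instance = instanceField.get(null);

            Field threadField = clazz.getSuperclass().getDeclaredField("thread");
            threadField.setAccessible(true);
            threadField.set(instance, null);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Field missing or not accessible: report failure instead of crashing.
            return false;
        }
    }

    static boolean isWatchdogRunning() {
        return FakeWatchdog.INSTANCE.isRunning();
    }

    public static void main(String[] args) {
        boolean cleared = clearThreadField(FakeWatchdog.class);
        System.out.println(cleared + " " + isWatchdogRunning());
        // prints "true false"
    }
}
```

Returning a boolean instead of throwing keeps the caller free to try a fallback, which is the same role the nested try/catch plays in stopWatchdogDaemon.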

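The code above first uses reflection to set the thread field of Daemon, the parent class of FinalizerWatchdogDaemon, to null. If that fails, it uses reflection to call the stop method of Daemon, which likewise sets the thread field to null (comment 1 in the code below) and stops the thread (comment 2 in the code below).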

SDK=28 Daemons.java#Daemon

/**
 * Waits for the runtime thread to stop. This interrupts the thread
 * currently running the runnable and then waits for it to exit.
 */
public void stop() {
    Thread threadToStop;
    synchronized (this) {
        // 1. Set the thread field to null when called from outside
        threadToStop = thread;
        thread = null;
    }
    if (threadToStop == null) {
        throw new IllegalStateException("not running");
    }
    // 2. Stop the thread
    interrupt(threadToStop);
    while (true) {
        try {
            threadToStop.join();
            return;
        } catch (InterruptedException ignored) {
        } catch (OutOfMemoryError ignored) {
            // An OOME may be thrown if allocating the InterruptedException failed.
        }
    }
}

// 3. Thread-safe operation
public synchronized void interrupt(Thread thread) {
    if (thread == null) {
        throw new IllegalStateException("not running");
    }
    thread.interrupt();
}

P.S. Before Android 6.0, stopping the thread through the stop method was an unsafe operation and could cause thread-safety problems.

References