Unveiling ThreadLocal

Date: 2019-12-06


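When I was using C#, I spent quite a while studying its ThreadLocal, as well as LogicalCallContext (ExecutionContext), which can flow across threads. Unfortunately C# was not open source (fortunately there is now .NET Core), so all I could do was hunt for documentation and blog posts. After switching to Java, I finally met another way of investigating a problem: instead of only looking things up, I could read the code and step through it in a debugger. After that, nothing seemed so mysterious any more.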

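Purpose and core principle

As I see it, ThreadLocal provides two main features:

Convenient parameter passing. It offers a handy "shelf": put something on it whenever you like and take it down whenever you need it, without passing a pile of parameters through every level of method calls. (We tend to put shared data on the shelf.)
Thread isolation. Each thread's value is independent of the others, which hides the headaches of multithreading.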

This class provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable. ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread (e.g., a user ID or Transaction ID).

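The Javadoc says it all. ThreadLocal is best rendered as "thread-local variable", in contrast to an ordinary variable. A ThreadLocal is usually a static field, yet the value returned by its get() is independent in each thread.

The core methods of ThreadLocal:

get() returns the variable's value. If this ThreadLocal has been given a value in the current thread, that value is returned; otherwise get() indirectly calls initialValue() to initialize the variable for the current thread and returns the initial value.
set() sets the variable's value in the current thread.
protected initialValue() is the initialization hook. The default implementation returns null.
remove() deletes the variable from the current thread.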
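The four methods above can be tried out in a few lines. The following is only a minimal sketch: the class name ThreadLocalBasics, the field USER_ID and the values "alice" and "anonymous" are made up for illustration, and ThreadLocal.withInitial() is used as a shorthand for overriding initialValue().

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadLocalBasics {
    // Typical declaration: a private static field. withInitial() supplies the
    // value that initialValue() would otherwise return (null by default).
    private static final ThreadLocal<String> USER_ID =
            ThreadLocal.withInitial(() -> "anonymous");

    /** Sets a value on the calling thread, then reads the variable from a second thread. */
    static String readFromOtherThread() {
        USER_ID.set("alice");                      // visible to the calling thread only
        AtomicReference<String> seen = new AtomicReference<>();
        Thread other = new Thread(() -> seen.set(USER_ID.get()));
        other.start();
        try {
            other.join();                          // wait until 'seen' has been filled in
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();                         // the other thread's own initial value
    }

    public static void main(String[] args) {
        System.out.println(readFromOtherThread()); // anonymous
        System.out.println(USER_ID.get());         // alice
        USER_ID.remove();                          // drop this thread's entry
        System.out.println(USER_ID.get());         // anonymous (initialized again)
    }
}
```

The second thread never sees "alice": its first get() finds no entry and falls back to the initial value, which is the isolation described above.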

How it works, in brief

Every thread has a threadLocals field of type ThreadLocalMap. ThreadLocalMap behaves like a Map whose key is the ThreadLocal itself and whose value is the value we set.

public class Thread implements Runnable {
    ThreadLocal.ThreadLocalMap threadLocals = null;
}

When we call threadLocal.set("xxx"), a key-value pair is put into the threadLocals field of the current thread: the key is the ThreadLocal instance (this), and the value is whatever you set.

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

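When we call threadLocal.get(), the current thread's map is looked up first, and then the ThreadLocal instance (this) is used as the key to fetch the value that this thread set.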

public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
             @SuppressWarnings("unchecked")
             T result = (T)e.value;
             return result;
        }
    }
    return setInitialValue();
}

The core: ThreadLocalMap

ThreadLocalMap is a customized hash map suitable only for maintaining thread local values. To help deal with very large and long-lived usages, the hash table entries use WeakReferences for keys. However, since reference queues are not used, stale entries are guaranteed to be removed only when the table starts running out of space.

ThreadLocalMap is a customized hash map that resolves collisions with open addressing.
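To make "open addressing" concrete, here is a minimal linear-probing table. It is a generic sketch and not the JDK's ThreadLocalMap: the class name LinearProbingTable is hypothetical, there is no resizing or removal, keys are compared with equals(), and it assumes fewer than 16 entries so that a probe always reaches an empty slot.

```java
class LinearProbingTable {
    // Fixed power-of-two capacity so that (hash & (length - 1)) is a valid index.
    private final Object[] keys = new Object[16];
    private final Object[] values = new Object[16];

    private int indexFor(Object key) {
        return key.hashCode() & (keys.length - 1);
    }

    /** Stores the pair in the first free (or matching) slot at or after the home index. */
    void put(Object key, Object value) {
        int i = indexFor(key);
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) % keys.length;   // collision: probe the next slot, wrapping around
        }
        keys[i] = key;
        values[i] = value;
    }

    /** Walks the same probe sequence; an empty slot means the key is absent. */
    Object get(Object key) {
        int i = indexFor(key);
        while (keys[i] != null) {
            if (keys[i].equals(key)) {
                return values[i];
            }
            i = (i + 1) % keys.length;
        }
        return null;
    }
}
```

With Integer keys, 1 and 17 share home index 1, so the second key lands in slot 2 and both remain retrievable. No separate chain of nodes is needed, which keeps the structure compact.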

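Its Entry is a WeakReference, or more precisely, it extends WeakReference.
The reference to the ThreadLocal object is passed to WeakReference as the referent, so entry.get() serves as the key of the map element, and Entry adds one more field, value, which holds the actual value of the thread-local variable.
Because the reference is weak, once no ordinary (strong) reference to the ThreadLocal object remains, the GC reclaims the ThreadLocal object and the Entry's key, entry.get(), becomes null. But the GC only clears the referent: the Entry itself is still referenced by the thread's ThreadLocalMap and is not reclaimed, so neither is the value object. Unless the thread exits and its whole ThreadLocalMap is released, the value's memory cannot be freed. That is a memory leak.
The JDK authors naturally thought of this, so many ThreadLocalMap methods call cleanup routines such as expungeStaleEntries() to remove the entries whose entry.get() == null and release their values. As long as the thread keeps using other ThreadLocals, the memory held by ThreadLocals that are no longer valid tends to be cleaned up.
In most of our use cases, however, the ThreadLocal is a static field, so a strong reference to the ThreadLocal always exists and the key of its entry in each thread's ThreadLocalMap never becomes null. The entry therefore never counts as stale, expungeStaleEntries() can do nothing about it, and the value's memory is not released. So once we are really done with a ThreadLocal, we can call remove() ourselves to delete the entry.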
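A common way to "call remove() once we are done" is a try/finally block around the work. The sketch below assumes a single-thread pool so that both tasks are guaranteed to run on the same worker; the class name RemoveAfterUse, the field CONTEXT and the value "request-1" are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RemoveAfterUse {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread and returns what the second task sees. */
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> read = CONTEXT::get;
        try {
            pool.submit(() -> {
                CONTEXT.set("request-1");
                try {
                    // ... handle the request, reading CONTEXT.get() where needed ...
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove();      // delete this thread's entry
                    }
                }
            }).get();
            return pool.submit(read).get();    // runs on the same worker thread
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // request-1
        System.out.println(secondTaskSees(true));  // null
    }
}
```

Without remove(), the entry stays in the worker's ThreadLocalMap after the first task has finished, which is exactly the retained value discussed above.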

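But is it really necessary to call remove()? Our typical scenario is a server, where threads keep handling requests. Each incoming request assigns a new value to the thread-local variable in some thread, and the previous value object naturally loses its reference and is collected by the GC. So with a static ThreadLocal that is never set to null, each thread holds at most its most recent value, and there is no leak!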

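Passing values across threads

A ThreadLocal cannot be passed across threads; that is what thread isolation means. But in some scenarios we do want to pass it on. For example:

We start a new thread to run some method and want the new thread to obtain, through a ThreadLocal, the context held by the current thread (e.g., User ID, Transaction ID).
We submit a task to a thread pool and want the thread that eventually runs the task to inherit the current thread's ThreadLocal values, so that it can use the current context.

Let's look at the options.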

InheritableThreadLocal

How it works: the InheritableThreadLocal class extends ThreadLocal and overrides three methods.

public class InheritableThreadLocal<T> extends ThreadLocal<T> {
    // can be ignored
    protected T childValue(T parentValue) {
        return parentValue;
    }

    /**
     * Get the map associated with a ThreadLocal.
     *
     * @param t the current thread
     */
    ThreadLocalMap getMap(Thread t) {
       return t.inheritableThreadLocals;
    }

    /**
     * Create the map associated with a ThreadLocal.
     *
     * @param t the current thread
     * @param firstValue value for the initial entry of the table.
     */
    void createMap(Thread t, T firstValue) {
        t.inheritableThreadLocals = new ThreadLocalMap(this, firstValue);
    }
}

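As you can see, with InheritableThreadLocal the map lives in the thread's inheritableThreadLocals field rather than the threadLocals field used before.

Since the field is called inheritable, it is naturally passed on when a new thread is created. The code is in Thread's init() method: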

if (inheritThreadLocals && parent.inheritableThreadLocals != null)
            this.inheritableThreadLocals =
                ThreadLocal.createInheritedMap(parent.inheritableThreadLocals);
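The effect of that copy can be observed by reading a plain ThreadLocal and an InheritableThreadLocal from a freshly created child thread. This is a minimal sketch; the class name InheritanceDemo, the field names and the string values are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

class InheritanceDemo {
    private static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    private static final ThreadLocal<String> INHERITABLE = new InheritableThreadLocal<>();

    /** Returns { value of PLAIN, value of INHERITABLE } as seen by a new child thread. */
    static String[] valuesSeenByChild() {
        PLAIN.set("parent-plain");
        INHERITABLE.set("parent-inheritable");
        AtomicReference<String[]> seen = new AtomicReference<>();
        // The parent's inheritable map is copied while the Thread object is constructed.
        Thread child = new Thread(() ->
                seen.set(new String[] { PLAIN.get(), INHERITABLE.get() }));
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        String[] seen = valuesSeenByChild();
        System.out.println(seen[0]); // null
        System.out.println(seen[1]); // parent-inheritable
    }
}
```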

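So far, inheritableThreadLocals lets a parent thread pass its ThreadLocal values to a child thread at the moment the child is created, and this already covers most needs[1]. But a serious problem shows up when threads are reused[2], for example when values are passed through inheritableThreadLocals to a thread pool: the values are copied only when a new thread is created, and reusing an existing thread does not copy them again.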
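The reuse problem is easy to reproduce with a single-thread pool. The sketch below relies on the worker thread being created by the first submit() call, which is how the default executors behave; the class name PoolReuseDemo and the values "request-A" and "request-B" are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolReuseDemo {
    private static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    /** Returns the value a reused worker sees after the submitter has changed it. */
    static String seenByReusedWorker() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> read = CONTEXT::get;
        try {
            CONTEXT.set("request-A");
            pool.submit(read).get();         // the worker is created here and inherits "request-A"
            CONTEXT.set("request-B");        // changes the submitting thread only
            return pool.submit(read).get();  // same worker, nothing is copied again
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            CONTEXT.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(seenByReusedWorker()); // request-A
    }
}
```

The second task runs with the context of whichever request happened to create the worker, not the request that submitted it.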

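At this point the JDK can do no more. C# provides LogicalCallContext (and the ExecutionContext mechanism) for this; in Java we have to extend the threading classes ourselves to get the same feature.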

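Alibaba's open-source transmittable-thread-local

See the project's GitHub repository.

transmittable-thread-local can be used in three ways (the decorator pattern!):

Decorating Runnable and Callable
Decorating the thread pool
Using a Java agent to decorate the JDK thread pool implementation classes (modified at runtime).

The official documentation explains the usage very clearly.

A brief look at how it works:

To solve the problem of passing thread locals when a thread pool is used, the ThreadLocal values that are current when the task is submitted have to be carried over to the thread that runs the task.
As for how to carry them over: before the task is submitted, capture all ThreadLocals of the current thread and store them; then, when the task actually runs, replay the previously captured ThreadLocals in the target thread.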

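At the code level, taking the decoration of Runnable as an example:

When a TtlRunnable is created, capture() is always called first to capture the ThreadLocals of the current thread (the TtlCallable constructor shown here does the same):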

private TtlCallable(@Nonnull Callable<V> callable, boolean releaseTtlValueReferenceAfterCall) {
    this.capturedRef = new AtomicReference<Object>(capture());
    ...
}

capture() is a static method of the Transmitter class:

public static Object capture() {
        Map<TransmittableThreadLocal<?>, Object> captured = new HashMap<TransmittableThreadLocal<?>, Object>();
        for (TransmittableThreadLocal<?> threadLocal : holder.get().keySet()) {
            captured.put(threadLocal, threadLocal.copyValue());
        }
        return captured;
}

In run(), the previously captured ThreadLocals are replayed first, and the thread's earlier state is restored afterwards.

public void run() {
    Object captured = capturedRef.get();
    ...
    Object backup = replay(captured);
    try {
        runnable.run();
    } finally {
        restore(backup); 
    }
}
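The capture, replay and restore cycle can be sketched with nothing but the JDK. This is a deliberately simplified illustration and not TTL's implementation: it handles one plain ThreadLocal instead of a registry of TransmittableThreadLocals, and the names ContextRunnable, CONTEXT and demo() are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

class ContextRunnable implements Runnable {
    // Hypothetical context variable: a plain ThreadLocal, not a TransmittableThreadLocal.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    private final Runnable delegate;
    private final String captured;

    ContextRunnable(Runnable delegate) {
        this.delegate = delegate;
        this.captured = CONTEXT.get();   // capture: executed on the submitting thread
    }

    @Override
    public void run() {
        String backup = CONTEXT.get();   // what the worker thread held before the task
        CONTEXT.set(captured);           // replay the captured value on the worker
        try {
            delegate.run();
        } finally {
            // restore: leave the worker as it was found
            if (backup == null) {
                CONTEXT.remove();
            } else {
                CONTEXT.set(backup);
            }
        }
    }

    /** Returns { value seen inside the task, value left on the worker afterwards }. */
    static String[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicReference<String> seenInTask = new AtomicReference<>();
        Callable<String> read = CONTEXT::get;
        try {
            pool.submit(read).get();     // warm-up: the worker thread now exists
            CONTEXT.set("request-B");    // set on the submitting thread afterwards
            pool.submit(new ContextRunnable(() -> seenInTask.set(CONTEXT.get()))).get();
            String afterTask = pool.submit(read).get();
            return new String[] { seenInTask.get(), afterTask };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            CONTEXT.remove();
        }
    }
}
```

Calling ContextRunnable.demo() yields "request-B" for the task and null for the worker afterwards: the value travels with the task even though the worker already existed, and the restore step keeps it from lingering on the pooled thread.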

[Figure: sequence diagram]

Applications

Spring MVC's static class RequestContextHolder: getRequestAttributes() in fact returns the current thread's value of a thread-bound holder, which is an InheritableThreadLocal<RequestAttributes> when the inheritable flag is set. This also means the value can be passed to threads that the current thread creates, but not to threads that already exist.

As for when it is set, see its Javadoc: Holder class to expose the web request in the form of a thread-bound RequestAttributes object. The request will be inherited by any child threads spawned by the current thread if the inheritable flag is set to true. Use RequestContextListener or org.springframework.web.filter.RequestContextFilter to expose the current web request. Note that org.springframework.web.servlet.DispatcherServlet already exposes the current request by default.
Database connections in Spring, and the session in Hibernate.
The application scenarios summarized by Alibaba's TTL project

...

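Some pitfalls

A good read: 谈谈ThreadLocal的设计及不足 (On the design and shortcomings of ThreadLocal)

It raises two questions that the design of ThreadLocal has to consider, and explains how ThreadLocal answers them.
1. When a Thread exits, how are its resources released so that memory does not leak?
2. The map data could be accessed by multiple threads and be subject to contention, which calls for concurrency control.
It also covers the leak where a ThreadLocal has been garbage-collected but its associated Entry remains, a problem that Lucene solved:
1. When reading ThreadLocal I also wondered: since the Entry's key is a WeakReference, why not make the value a WeakReference too, so that nothing leaks?
2. On second thought, if the value were weakly referenced, there would be no guarantee that it is still there when needed, because the GC could collect it.
3. To get around this, Lucene additionally keeps a WeakHashMap<Thread, T>, so that the value is not cleared as long as the thread is alive.
4. That in turn brings back multi-threaded access, which requires locking.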
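The four points above can be condensed into a small class. This sketch is modeled on the description in this section, not on Lucene's actual source; the class name HardRefThreadLocal and its methods are hypothetical.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

class HardRefThreadLocal<T> {
    // The per-thread slot holds only a weak reference to the value ...
    private final ThreadLocal<WeakReference<T>> slot = new ThreadLocal<>();
    // ... while this map keeps the value strongly reachable for as long as its
    // thread is alive. The Thread keys are weak, so dead threads drop out.
    private final Map<Thread, T> hardRefs = new WeakHashMap<>();

    void set(T value) {
        slot.set(new WeakReference<>(value));
        synchronized (hardRefs) {        // shared by all threads, hence the lock
            hardRefs.put(Thread.currentThread(), value);
        }
    }

    T get() {
        WeakReference<T> ref = slot.get();
        return ref == null ? null : ref.get();
    }

    /** Drops the strong references for every thread so the values become collectable. */
    void close() {
        synchronized (hardRefs) {
            hardRefs.clear();
        }
        slot.remove();                   // also clears the calling thread's slot right away
    }
}
```

The value can now be released explicitly with close() without waiting for each thread to exit, at the price of a lock on every set(), which is the trade-off point 4 refers to.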

You see, nothing is perfect; we just need to find the right balance.