Pitfalls of Using HashMap in Concurrent Scenarios

Date: 2019-11-13

Tags: hashmap, concurrency


Author: 张伟

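Many people and many companies have run into problems with HashMap under concurrency, and many have written about them. We tend to assume that such pitfalls are far away and that we will never fall into one. In reality we can hit this kind of problem at any time; the pit has always been right beside us. Today I ran into a case where a non-thread-safe object was used under concurrency. This post uses that case to analyze the problems of using HashMap concurrently (the case raises other issues worth analyzing and taking as a warning, too). Understanding the cause should help us stay clear of this bug in the future.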

The code is shown in the figure. Everyone should already know that HashMap is not thread-safe. So what problems can HashMap have under concurrency?

Data loss
Duplicate data
Infinite loop

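As for the infinite loop, I personally believe it no longer exists in Java 8. In versions before Java 8, the infinite loop arose because resize reversed the order of the linked list. Java 8 no longer reverses the list during resize, so this infinite loop does not occur.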

Here is what Doug Lea said about this problem:

Doug Lea writes:

"This is a classic symptom of an incorrectly synchronized use of
HashMap. Clearly, the submitters need to use a thread-safe
HashMap. If they upgraded to Java 5, they could just use
ConcurrentHashMap. If they can't do this yet, they can use
either the pre-JSR166 version, or better, the unofficial backport
as mentioned by Martin. If they can't do any of these, they can
use Hashtable or synchronizedMap wrappers, and live with poorer
performance. In any case, it's not a JDK or JVM bug."

I agree that the presence of a corrupted data structure alone
does not indicate a bug in the JDK.
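
The two options named in the quote can be sketched as follows. This is a minimal illustration rather than code from the incident: the class name SafeMapDemo, the helper fillConcurrently, and the thread and key counts are all made up for the example. Each writer thread puts a disjoint range of keys, so a thread-safe map must end up with exactly threads * perThread entries.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names and sizes are assumptions for illustration.
final class SafeMapDemo {

    // Fills the map from several threads, each writing its own key range,
    // then returns the final size once all writers have finished.
    static int fillConcurrently(Map<Integer, Integer> map, int threads, int perThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
        return map.size();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> concurrent = new ConcurrentHashMap<>();
        Map<Integer, Integer> wrapped = Collections.synchronizedMap(new HashMap<>());
        System.out.println("ConcurrentHashMap size: " + fillConcurrently(concurrent, 8, 10_000));
        System.out.println("synchronizedMap size:   " + fillConcurrently(wrapped, 8, 10_000));
    }
}
```

Both calls should report 80000. Passing a plain HashMap to the same helper is exactly the unsafe usage discussed in this post, so the result is then unpredictable.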

First, look at the source code of put:

    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

    void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex);
    }

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

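Using the Java 7 source above, consider why data can be lost. If two threads reach createEntry at the same time and both observe table[bucketIndex] == null, both create an Entry whose next is null and assign it to the same slot. The later write overwrites the earlier one, so one of the two entries is lost.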
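
The interleaving can be replayed deterministically in a single thread. The sketch below is a simplified stand-in, not the JDK code: LostPutReplay, its Entry class, and the keys k1 and k2 are hypothetical, and the two "threads" are just two local variables that read the bucket head before either write happens.

```java
// Hypothetical single-threaded replay of the lost-put interleaving.
final class LostPutReplay {

    // Simplified stand-in for the Java 7 HashMap.Entry (no hash field).
    static final class Entry {
        final String key;
        final String value;
        final Entry next;

        Entry(String key, String value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Both threads read the (empty) bucket first, then both write.
    static Entry replay() {
        Entry[] table = new Entry[1];

        Entry headSeenByT1 = table[0];   // T1: Entry e = table[bucketIndex];  -> null
        Entry headSeenByT2 = table[0];   // T2: Entry e = table[bucketIndex];  -> null

        table[0] = new Entry("k1", "v1", headSeenByT1);  // T1 publishes its entry
        table[0] = new Entry("k2", "v2", headSeenByT2);  // T2 overwrites it; k1 is unreachable

        return table[0];
    }

    // Counts the entries reachable from the bucket head.
    static int length(Entry head) {
        int n = 0;
        for (Entry e = head; e != null; e = e.next) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Entry head = replay();
        System.out.println("entries in bucket: " + length(head) + ", surviving key: " + head.key);
    }
}
```

Two puts were issued, but only one entry (k2) remains reachable from the bucket.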

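Duplicate data arises in a similar way. Suppose two threads both conclude that their key is absent, while the two keys are actually equal. The first thread links its own Entry into the chain; the second thread, which had already walked e.next to the end of the chain without finding the key, still inserts the data it holds. The chain then contains two entries for the same key.

From the put source above, addEntry compares the number of elements against the threshold to decide whether to resize while the new entry is being added. During resize an even stranger problem can appear: an infinite loop. The main cause is that HashMap reverses the linked list while resizing.

Suppose two threads resize at the same time and the bucket holds A->B. The first thread is slow; the second thread has already completed the reversal and turned the chain into B->A. When the first thread continues, a cycle B->A->B forms, and CPU usage spikes.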
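
The A->B scenario can also be replayed step by step in a single thread. This is a simplified sketch under stated assumptions, not the JDK source: ResizeCycleReplay and Node are hypothetical names, both nodes are assumed to land in the same bucket of the new table, and each thread's new bucket is modeled as a local head pointer. The loop body mirrors the head-insertion step of the Java 7 transfer (read next, point e.next at the new head, make e the new head, advance).

```java
// Hypothetical single-threaded replay of two interleaved Java 7 style transfers.
final class ResizeCycleReplay {

    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Returns the head of thread 1's new bucket after the bad interleaving.
    static Node replay() {
        Node b = new Node("B", null);
        Node a = new Node("A", b);          // old bucket: A -> B

        // Thread 1 reads e and next, then is suspended before moving anything.
        Node e1 = a;
        Node next1 = e1.next;               // B

        // Thread 2 runs the whole loop: head insertion reverses the chain to B -> A.
        Node head2 = null;
        for (Node e = a; e != null; ) {
            Node next = e.next;
            e.next = head2;
            head2 = e;
            e = next;
        }

        // Thread 1 resumes with its stale e1/next1 and re-links the same nodes.
        Node head1 = null;
        while (e1 != null) {
            e1.next = head1;                // head insertion into thread 1's bucket
            head1 = e1;
            e1 = next1;
            next1 = (e1 == null) ? null : e1.next;  // sees links written by thread 2
        }
        return head1;
    }

    // Floyd's cycle detection: true if following next never reaches null.
    static boolean hasCycle(Node head) {
        Node slow = head;
        Node fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("cycle after interleaved resize: " + hasCycle(replay()));
    }
}
```

After the replay, A and B point at each other, so any later get or put that walks this bucket for a key that is not in it never terminates.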

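In the afternoon we suddenly received a CPU utilization alert for one of the machines. Analysis of the jstack output showed that an infinite loop and data loss may have occurred; the operations on the linked list are, of course, affected in the same way.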

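PS: As this process shows, the infinite loop comes mainly from reversing the linked list. Java 8 no longer reverses the list, so the infinite-loop problem is greatly improved.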
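
For comparison, the order-preserving idea used by the Java 8 resize can be sketched as below. This is an illustration of the technique, not the JDK source: OrderPreservingSplit, Node, and the sample hashes are hypothetical. A bucket is split into a "low" list (nodes that stay at the same index) and a "high" list (nodes that move to index + oldCap), and nodes are appended at the tail, so their relative order never changes.

```java
// Hypothetical sketch of an order-preserving bucket split during resize.
final class OrderPreservingSplit {

    static final class Node {
        final int hash;
        Node next;

        Node(int hash, Node next) {
            this.hash = hash;
            this.next = next;
        }
    }

    // Splits one bucket into { lowHead, highHead } using the bit (hash & oldCap).
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null;
        Node hiHead = null, hiTail = null;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if ((e.hash & oldCap) == 0) {       // stays at the old index
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {                            // moves to old index + oldCap
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
            e = next;
        }
        if (loTail != null) loTail.next = null; // terminate both lists
        if (hiTail != null) hiTail.next = null;
        return new Node[] { loHead, hiHead };
    }

    // Renders the hashes of a chain in traversal order, e.g. "1,2".
    static String hashes(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node e = head; e != null; e = e.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(e.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node chain = new Node(1, new Node(17, new Node(2, new Node(18, null))));
        Node[] parts = split(chain, 16);
        System.out.println("low:  " + hashes(parts[0]));
        System.out.println("high: " + hashes(parts[1]));
    }
}
```

With an old capacity of 16, the chain 1, 17, 2, 18 splits into 1,2 and 17,18, each in its original order. This removes the reversal that caused the cycle; it does not make HashMap thread-safe, and lost or duplicated entries remain possible without synchronization.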

The figure below shows the load and CPU behavior:

Below is part of the thread stack log:

"DubboServerHandler-10.172.75.33:20880-thread-139" daemon prio=10 tid=0x0000000004a93000 nid=0x76fe runnable [0x00007f0ddaf2d000]
   java.lang.Thread.State: RUNNABLE
	at java.util.HashMap.getEntry(HashMap.java:465)
	at java.util.HashMap.containsKey(HashMap.java:449)

"pool-9-thread-16" prio=10 tid=0x00000000033ef000 nid=0x4897 runnable [0x00007f0dd62cb000]
   java.lang.Thread.State: RUNNABLE
	at java.util.HashMap.put(HashMap.java:494)

"DubboServerHandler-10.172.75.33:20880-thread-189" daemon prio=10 tid=0x00007f0de99df800 nid=0x7722 runnable [0x00007f0dd8b09000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.Thread.yield(Native Method)

"DubboServerHandler-10.172.75.33:20880-thread-157" daemon prio=10 tid=0x00007f0de9a94800 nid=0x7705 runnable [0x00007f0dda826000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.Thread.yield(Native Method)


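This article comes from the NetEase Practitioner Community (网易实践者社区) and is published with the authorization of the author, 张伟.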