ArrayMap Explained: A Source Code Analysis

Date: 2019-11-24


1. Introduction

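In "SparseArray Explained with a Brief Source Analysis" we covered the basic usage, characteristics, and implementation of SparseArray. The same Android SDK utility package contains another equally important data structure, ArrayMap. Its purpose is similar: when the data set is small, say a few hundred entries, it can replace HashMap to use memory more efficiently.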

If you are interested in how HashMap is implemented, see "HashMap Explained with Source Analysis". This article looks at how ArrayMap is used and how it works internally.

2. Source Code Walkthrough

2.1 Demo and brief analysis

Before analyzing the source, let's start with a small demo. The implementation analysis that follows refers back to it.

ArrayMap<String,String> arrayMap = new ArrayMap<>();
arrayMap.put(null,"张大哥");
arrayMap.put("abcd","A大哥");
arrayMap.put("aabb","巴大哥");
arrayMap.put("aacc","牛大哥");
arrayMap.put("aadd","牛大哥");
arrayMap.put("abcd","B大哥");

Set<ArrayMap.Entry<String,String>> sets = arrayMap.entrySet();
for (ArrayMap.Entry<String,String> set : sets) {
    Log.d(TAG, "arrayMapSample: key = " + set.getKey() + ";value = " + set.getValue());
}

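The code calls put() six times, yet only five entries are printed: the key "abcd" is inserted twice, so the second value overwrites the first. Note also that null is accepted as a key. The output is shown below.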

arrayMapSample: key = null;value = 张大哥
arrayMapSample: key = aabb;value = 巴大哥
arrayMapSample: key = aacc;value = 牛大哥
arrayMapSample: key = aadd;value = 牛大哥
arrayMapSample: key = abcd;value = B大哥

With the Android Studio debugger you can also inspect how these entries are stored in memory.

2.2 Source code analysis

Let's first take a quick look at ArrayMap's class structure.

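Unlike HashMap, ArrayMap implements the Map interface directly. It also stores key-value pairs differently: all keys and values live directly in arrays. mHashes stores each key's hash code at position index, while mArray stores the key at index << 1 and the value at (index << 1) + 1. Put simply, keys sit at even positions and their values at the adjacent odd positions.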
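To make the layout concrete, here is a minimal sketch of the two parallel arrays. The names hashes and array are illustrative stand-ins for mHashes and mArray, and the hash values and keys are made up for the example; this is not the actual ArrayMap code.

```java
class LayoutSketch {
    // Hypothetical stand-ins for ArrayMap's mHashes and mArray.
    static final int[] hashes = {10, 20, 30};
    static final Object[] array = {"k0", "v0", "k1", "v1", "k2", "v2"};

    // The key of entry i lives at the even slot 2*i.
    static Object keyAt(int index) {
        return array[index << 1];
    }

    // The value of entry i lives at the adjacent odd slot 2*i + 1.
    static Object valueAt(int index) {
        return array[(index << 1) + 1];
    }

    public static void main(String[] args) {
        for (int i = 0; i < hashes.length; i++) {
            System.out.println(hashes[i] + " -> " + keyAt(i) + " = " + valueAt(i));
        }
    }
}
```

Storing everything in two flat arrays avoids allocating one node object per entry, which is where the memory saving over HashMap comes from.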

Initializing ArrayMap

/**
     * Create a new empty ArrayMap.  The default capacity of an array map is 0, and
     * will grow once items are added to it.
     */
    public ArrayMap() {
        this(0, false);
    }

    /**
     * Create a new ArrayMap with a given initial capacity.
     */
    public ArrayMap(int capacity) {
        this(capacity, false);
    }

    /** {@hide} */
    public ArrayMap(int capacity, boolean identityHashCode) {
        mIdentityHashCode = identityHashCode;

        // If this is immutable, use the sentinal EMPTY_IMMUTABLE_INTS
        // instance instead of the usual EmptyArray.INT. The reference
        // is checked later to see if the array is allowed to grow.
        if (capacity < 0) {
            mHashes = EMPTY_IMMUTABLE_INTS;
            mArray = EmptyArray.OBJECT;
        } else if (capacity == 0) {
            mHashes = EmptyArray.INT;
            mArray = EmptyArray.OBJECT;
        } else {
            allocArrays(capacity);
        }
        mSize = 0;
    }

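Three constructor overloads are listed above. Usually the default constructor is used, which means the initial capacity is 0 and the arrays only grow once the first element is inserted. The other parameter, identityHashCode, controls whether a key's hash code comes from System.identityHashCode() or from the key's own hashCode(). The two agree only for classes that do not override hashCode(): System.identityHashCode() always returns what the default Object.hashCode() would return, whether or not the class overrides it, and it returns 0 for null. put() likewise treats the hash code of a null key as 0. Because the two-argument constructor is marked {@hide}, code that uses the public constructors shown here always gets identityHashCode == false.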
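The difference between the two hash sources can be seen with a short sketch. The helper hashFor() is a hypothetical name that mirrors the selection put() makes; it is not part of ArrayMap.

```java
class IdentityHashSketch {
    // Hypothetical helper mirroring how put() picks the hash code.
    static int hashFor(Object key, boolean identity) {
        if (key == null) {
            return 0; // null keys always hash to 0
        }
        return identity ? System.identityHashCode(key) : key.hashCode();
    }

    public static void main(String[] args) {
        String a = new String("abcd");
        String b = new String("abcd");
        // String overrides hashCode(), so equal contents give equal hashes.
        System.out.println(hashFor(a, false) == hashFor(b, false));
        // Identity hashes belong to the object, not its contents, so two
        // distinct objects usually (though not necessarily) differ here.
        System.out.println(hashFor(a, true) == hashFor(b, true));
        System.out.println(hashFor(null, true));
    }
}
```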

Inserting an element: put()

public V put(K key, V value) {
        final int osize = mSize;
        // 1. Compute the hash code and look up the index
        final int hash;
        int index;
        if (key == null) {
            // A null key always uses hash 0
            hash = 0;
            index = indexOfNull();
        } else {
            // Otherwise use key.hashCode(), or System.identityHashCode(key) in identity mode
            hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
            index = indexOf(key, hash);
        }
        // 2. index >= 0 means an entry with the same hash code and an equal key already exists, so just overwrite its value
        if (index >= 0) {
            index = (index<<1) + 1;
            final V old = (V)mArray[index];
            mArray[index] = value;
            return old;
        }
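        // 3. If the key was not found, indexOf() or indexOfNull() returned a negative number:
        //    the bitwise complement of the position where the entry should be inserted.
        //    Complementing it again yields that insertion position.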
        index = ~index;
        // 4. Check whether the arrays need to grow
        if (osize >= mHashes.length) {
            final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1))
                    : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

            if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);

            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            // 5. Allocate the new arrays
            allocArrays(n);

            if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                throw new ConcurrentModificationException();
            }

            if (mHashes.length > 0) {
                if (DEBUG) Log.d(TAG, "put: copy 0-" + osize + " to 0");
                // Copy the existing data into the new arrays
                System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
                System.arraycopy(oarray, 0, mArray, 0, oarray.length);
            }
            // 6. Release the old arrays
            freeArrays(ohashes, oarray, osize);
        }

        if (index < osize) {
            // 7. If index is within the current size, shift the entries from index onward to index + 1 to free up slot index
            if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (osize-index)
                    + " to " + (index+1));
            System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
            System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
        }

        if (CONCURRENT_MODIFICATION_EXCEPTIONS) {
            if (osize != mSize || index >= mHashes.length) {
                throw new ConcurrentModificationException();
            }
        }
        // 8. Finally, store the hash, key, and value at the computed index
        mHashes[index] = hash;
        mArray[index<<1] = key;
        mArray[(index<<1)+1] = value;
        mSize++;
        return null;
    }

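put() calls several other internal methods. How the arrays are grown, released, and reallocated matters little from an algorithmic point of view; the one thing to remember is that growing copies the data, which costs time. The methods that matter for the algorithm are indexOfNull() and indexOf(). Their implementations are nearly identical, so only indexOf() is analyzed here.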

int indexOf(Object key, int hash) {
        final int N = mSize;

        // Important fast case: if nothing is in here, nothing to look for.
        if (N == 0) {
            return ~0;
        }

        int index = binarySearchHashes(mHashes, N, hash);

        // If the hash code wasn't found, then we have no entry for this key.
        if (index < 0) {
            return index;
        }

        // If the key at the returned index matches, that's what we want.
        if (key.equals(mArray[index<<1])) {
            return index;
        }

        // Search for a matching key after the index.
        int end;
        for (end = index + 1; end < N && mHashes[end] == hash; end++) {
            if (key.equals(mArray[end << 1])) return end;
        }

        // Search for a matching key before the index.
        for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
            if (key.equals(mArray[i << 1])) return i;
        }

        // Key not found -- return negative value indicating where a
        // new entry for this key should go.  We use the end of the
        // hash chain to reduce the number of array entries that will
        // need to be copied when inserting.
        return ~end;
    }


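The original comments are already quite detailed. Step by step:

(1) If the map is empty, return ~0 right away. Note that ~0 is -1, not 0.

(2) Binary-search the mHashes array for hash to get its index.

(3) If index < 0, the hash code was not found, and that negative value is returned as is.

(4) If index >= 0 and the key stored at index << 1 in mArray equals the key being looked up, it is the same key, so the entry has been found.

(5) If the keys differ, only the hash codes match. Search forward and then backward through the neighbouring entries with the same hash code, and return the index if a matching key turns up. If none does, return ~end; the caller complements it again to get the position at which to insert.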
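The "negative means not found, complement gives the insertion point" convention from steps (1) and (3) can be tried out in isolation. The sketch below uses java.util.Arrays.binarySearch, which follows the same convention, as a stand-in for the internal binarySearchHashes(); the class and method names are made up for illustration.

```java
import java.util.Arrays;

class ComplementSketch {
    // Returns the index of hash if present, otherwise ~insertionPoint,
    // which is always negative.
    static int find(int[] sortedHashes, int size, int hash) {
        if (size == 0) {
            return ~0; // -1: not found, insert at position 0
        }
        // Arrays.binarySearch returns -(insertionPoint) - 1, i.e. ~insertionPoint.
        return Arrays.binarySearch(sortedHashes, 0, size, hash);
    }

    public static void main(String[] args) {
        int[] hashes = {1, 5, 9};
        int found = find(hashes, hashes.length, 5);
        int missing = find(hashes, hashes.length, 7);
        System.out.println("found at " + found);
        System.out.println("missing, insert at " + ~missing);
    }
}
```

Encoding the insertion point in the return value lets put() reuse the result of the failed lookup instead of searching a second time.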

Back to put(): each step of its implementation is annotated in the source above, so read through it if you are interested. From put() we can draw the following conclusions:

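(1) The mHashes array stores all hash codes in ascending order.

(2) The index of a hash code in mHashes determines where its key and value are stored in mArray: at (index << 1) and (index << 1) + 1 respectively, or more simply index * 2 and index * 2 + 1.

(3) Hash codes can collide, so how is that handled? The answer lies in steps 3 and 7 above. Step 3 yields the index at which to insert. For a key whose hash code already exists, indexOf() returns the complement of the end of the run of equal hash codes, so the new entry goes directly after them. In step 7, if index < osize, the entries from index onward are shifted back by one slot to make room. The result is that equal hash codes are stored side by side in mHashes, while each key and value gets its own slots in mArray. That is how collisions are resolved.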

If these conclusions still feel a bit abstract, a picture of the two arrays should make them clear.

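Suppose the hash codes at index == 0 and index == 1 are the same. That means key1 and key2 have the same hash code, in other words a hash collision. As described above, the hash code is then stored twice in mHashes, and each key-value pair is stored once in its own slots of mArray.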
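The collision case can be reproduced with a stripped-down sketch of put(). It assumes String keys and grows the arrays by exactly one slot per insert, which is not how the real growth policy works; the field names are stand-ins for mHashes, mArray, and mSize. The strings "Aa" and "BB" are used because they share the same String hash code.

```java
import java.util.Arrays;

class CollisionSketch {
    int[] hashes = new int[0];      // sorted hash codes (stand-in for mHashes)
    Object[] array = new Object[0]; // key at 2*i, value at 2*i + 1 (stand-in for mArray)
    int size = 0;

    // Returns the entry index if the key is present, otherwise ~insertionPoint.
    int indexOf(Object key, int hash) {
        int index = Arrays.binarySearch(hashes, 0, size, hash);
        if (index < 0) return index;
        if (key.equals(array[index << 1])) return index;
        int end;
        for (end = index + 1; end < size && hashes[end] == hash; end++) {
            if (key.equals(array[end << 1])) return end;
        }
        for (int i = index - 1; i >= 0 && hashes[i] == hash; i--) {
            if (key.equals(array[i << 1])) return i;
        }
        return ~end; // insert at the end of the run of equal hashes
    }

    Object put(String key, Object value) {
        int hash = key.hashCode();
        int index = indexOf(key, hash);
        if (index >= 0) { // existing key: overwrite the value
            Object old = array[(index << 1) + 1];
            array[(index << 1) + 1] = value;
            return old;
        }
        index = ~index;
        // Simplification: grow by one entry on every insert.
        hashes = Arrays.copyOf(hashes, size + 1);
        array = Arrays.copyOf(array, (size + 1) << 1);
        // Shift the tail one entry to the right to open slot index.
        System.arraycopy(hashes, index, hashes, index + 1, size - index);
        System.arraycopy(array, index << 1, array, (index + 1) << 1, (size - index) << 1);
        hashes[index] = hash;
        array[index << 1] = key;
        array[(index << 1) + 1] = value;
        size++;
        return null;
    }

    public static void main(String[] args) {
        CollisionSketch m = new CollisionSketch();
        m.put("Aa", 1);
        m.put("BB", 2); // same hash code as "Aa"
        m.put("A", 3);
        System.out.println(Arrays.toString(m.hashes));
        System.out.println(Arrays.toString(m.array));
    }
}
```

After the three inserts, the two equal hash codes sit next to each other in hashes, and the smaller hash code of "A" has been inserted in front of them.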

The get() method

public V get(Object key) {
        final int index = indexOfKey(key);
        return index >= 0 ? (V)mArray[(index<<1)+1] : null;
    }

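get() mainly relies on indexOfKey() to compute the index, and indexOfKey() in turn calls indexOfNull() or indexOf(), which were analyzed above. If the returned index is >= 0, the key was definitely found, and by the layout rule described earlier the value sits at (index << 1) + 1 in mArray.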
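One detail worth checking when writing this slot arithmetic yourself: in Java, + binds more tightly than <<, so the parentheses in (index << 1) + 1 are required. A small sketch (the method names are made up for illustration) shows what happens without them:

```java
class ShiftPrecedenceSketch {
    // Without parentheses this parses as index << (1 + 1), i.e. index * 4.
    static int valueSlotWithoutParens(int index) {
        return index << 1 + 1;
    }

    // The form the source code uses: index * 2 + 1.
    static int valueSlot(int index) {
        return (index << 1) + 1;
    }

    public static void main(String[] args) {
        int index = 3;
        System.out.println(valueSlotWithoutParens(index)); // 12
        System.out.println(valueSlot(index));              // 7
    }
}
```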

The remove() method

public V remove(Object key) {
        final int index = indexOfKey(key);
        if (index >= 0) {
            return removeAt(index);
        }
        return null;
    }

remove() first calls indexOfKey() to find the index and thereby check whether the key exists. If it does, removeAt() is called to delete the corresponding hash code and key-value pair.

public V removeAt(int index) {
        final Object old = mArray[(index << 1) + 1];
        final int osize = mSize;
        final int nsize;
        // If size <= 1, the map is empty after the removal. To save memory, mHashes and mArray are reset to the shared empty arrays
        if (osize <= 1) {
            // Now empty.
            if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            mHashes = EmptyArray.INT;
            mArray = EmptyArray.OBJECT;
            freeArrays(ohashes, oarray, osize);
            nsize = 0;
        } else {
            // size > 1: the new size is the old size minus 1
            nsize = osize - 1;
            if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
                // If the condition above holds, the arrays are shrunk.
                // Shrunk enough to reduce size of arrays.  We don't allow it to
                // shrink smaller than (BASE_SIZE*2) to avoid flapping between
                // that and BASE_SIZE.
                final int n = osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2);

                if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

                final int[] ohashes = mHashes;
                final Object[] oarray = mArray;
                allocArrays(n);

                if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                    throw new ConcurrentModificationException();
                }

                if (index > 0) {
                    if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
                    System.arraycopy(ohashes, 0, mHashes, 0, index);
                    System.arraycopy(oarray, 0, mArray, 0, index << 1);
                }
                if (index < nsize) {
                    if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + nsize
                            + " to " + index);
                    System.arraycopy(ohashes, index + 1, mHashes, index, nsize - index);
                    System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                            (nsize - index) << 1);
                }
            } else {
                if (index < nsize) {
                    // If index is within the new size, shift the following entries forward by one
                    if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + nsize
                            + " to " + index);
                    System.arraycopy(mHashes, index + 1, mHashes, index, nsize - index);
                    System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                            (nsize - index) << 1);
                }
                // Then clear the last key-value slot
                mArray[nsize << 1] = null;
                mArray[(nsize << 1) + 1] = null;
            }
        }
        if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
            throw new ConcurrentModificationException();
        }
        mSize = nsize;
        return (V)old;
    }

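In the common case, removing an entry only requires shifting everything after index one slot toward index and then clearing the last slot. If, however, the length of mHashes is greater than BASE_SIZE*2 and the actual size is less than one third of that length, the arrays are shrunk. The shrunken capacity is never smaller than BASE_SIZE*2.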
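The capacity rules in put() and removeAt() can be pulled out into small functions and tried on their own. The expressions below are copied from the source above; the value BASE_SIZE = 4 is an assumption, since the constant's definition is not shown in this article, and the method names are made up for illustration.

```java
class CapacitySketch {
    static final int BASE_SIZE = 4; // assumed value, not shown in the article

    // Capacity put() picks when the arrays are full and hold osize entries.
    static int grownCapacity(int osize) {
        return osize >= (BASE_SIZE * 2) ? (osize + (osize >> 1))
                : (osize >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
    }

    // Condition removeAt() checks before reallocating smaller arrays.
    static boolean shouldShrink(int capacity, int size) {
        return capacity > (BASE_SIZE * 2) && size < capacity / 3;
    }

    // Capacity removeAt() picks once shouldShrink() holds.
    static int shrunkCapacity(int osize) {
        return osize > (BASE_SIZE * 2) ? (osize + (osize >> 1)) : (BASE_SIZE * 2);
    }

    public static void main(String[] args) {
        // Print the growth sequence starting from an empty map.
        int capacity = 0;
        for (int i = 0; i < 6; i++) {
            capacity = grownCapacity(capacity);
            System.out.println("grow to " + capacity);
        }
        System.out.println(shouldShrink(27, 8) + " -> " + shrunkCapacity(8));
    }
}
```

Under this assumption the arrays grow 4, 8, 12, 18, 27, and so on: two fixed small sizes first, then by half of the current size each time.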

3. Summary

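The most important parts of ArrayMap are the put() and removeAt() implementations; between them they account for nearly all of its key characteristics. To close, here are the main points once more.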

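(1) The mHashes array stores all hash codes in ascending order, and a lookup binary-searches it for the index of the hash code. This is the fundamental reason get() is slower than HashMap's: a binary search takes O(log n) steps, while a HashMap lookup typically takes constant time.
(2) The index of a hash code in mHashes determines where its key and value are stored in mArray: at (index << 1) and (index << 1) + 1 respectively, or more simply index * 2 and index * 2 + 1.
(3) Hash codes can collide. In short, equal hash codes are stored side by side in mHashes, and each one's index is used to compute where its own key-value pair lives in mArray.
(4) A remove operation may, under certain conditions, shrink the arrays to save memory.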