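The previous article, on the internal structure of HashMap, mentioned that HashMap uses a perturbation function to decide which slot of the array an element falls into. This post walks through it with a concrete example.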
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        ...
    }
    return null;
}
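How does the get method determine the key's position in the array? It first computes hash(key), then uses tab[(n - 1) & hash] to pick the slot.
What does (h = key.hashCode()) ^ (h >>> 16) mean?
h >>> 16 shifts the binary form of the hashCode right by 16 bits (an unsigned shift, so the vacated high bits are filled with 0).
^ is bitwise XOR: the two binary values are combined bit by bit, and a result bit is 1 when the two bits differ and 0 otherwise.
Take an example.
XORing the high 16 bits into the low 16 bits mixes the high and low bits to increase randomness. The slot index only looks at the low bits of the hash, so without this step the high bits would have no influence on small tables.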
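To make the mixing step concrete, here is a minimal sketch. The class name HashSpreadDemo, the helpers spread and bits, and the two sample values are all made up for illustration; only the expression h ^ (h >>> 16) comes from the HashMap source quoted above. The two values differ only in their high 16 bits, so masking the raw value puts both in the same slot of a 16-slot table, while the mixed value separates them.

```java
final class HashSpreadDemo {
    // Same expression as in HashMap.hash(), applied to an int directly.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // 32-character binary string, left-padded with zeros.
    static String bits(int x) {
        return String.format("%32s", Integer.toBinaryString(x)).replace(' ', '0');
    }

    public static void main(String[] args) {
        // Hypothetical hash codes: identical low 16 bits, different high 16 bits.
        int a = 0x00010004;
        int b = 0x00020004;
        int mask = 16 - 1; // n - 1 for a table of length 16

        for (int h : new int[] {a, b}) {
            System.out.println("h              = " + bits(h));
            System.out.println("h >>> 16       = " + bits(h >>> 16));
            System.out.println("h ^ (h >>> 16) = " + bits(spread(h)));
            System.out.println("index without mixing = " + (h & mask));
            System.out.println("index with mixing    = " + (spread(h) & mask));
            System.out.println();
        }
    }
}
```

Without mixing, both values land at index 4. With mixing they land at 5 and 6, because the high bits now affect the low bits that the mask keeps.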
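So what does (n - 1) & hash mean?
n - 1 is the length of the map's array minus 1.
& is bitwise AND: the two binary values are combined bit by bit, and a result bit is 1 only when both bits are 1, and 0 otherwise.
Continuing the example above, suppose n is the map's default length, 16.
This simply keeps the last 4 bits and clears all the others. Converted to decimal, 0100 is 4, so the data is read from tab[4]. After one resize the array length grows to 32, and the slot is then determined by the last 5 bits (32 in binary is 100000, 31 is 11111). This is also why the length of the map's array must be a power of two: 2^n - 1 in binary consists only of 1s, and only the number of 1s differs. Masking this way lets every slot be reached, so inserted entries land in the same place as rarely as possible and are spread evenly across the array.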
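The masking step can be sketched as follows. IndexMaskDemo, indexFor and distinctIndices are hypothetical names, and the hash value 0b1011_0100 is made up so that its last four bits are 0100, matching the tab[4] case described above. The last part counts how many distinct slots the mask can reach, which shows what goes wrong when the length is not a power of two.

```java
import java.util.HashSet;
import java.util.Set;

final class IndexMaskDemo {
    // The index expression used by HashMap: keep only the bits covered by n - 1.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    // Number of different slots reached when hashes 0..1023 are masked with n - 1.
    static int distinctIndices(int n) {
        Set<Integer> seen = new HashSet<>();
        for (int h = 0; h < 1024; h++) {
            seen.add(indexFor(h, n));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int hash = 0b1011_0100; // made-up value, 180 in decimal

        System.out.println(indexFor(hash, 16)); // mask 01111 keeps 0100
        System.out.println(indexFor(hash, 32)); // mask 11111 keeps 10100

        // For a power-of-two length and a non-negative hash, masking equals modulo.
        System.out.println(hash % 16);
        System.out.println(hash % 32);

        // Length 10 gives mask 1001, so only slots 0, 1, 8 and 9 are reachable.
        System.out.println(distinctIndices(10));
        System.out.println(distinctIndices(16));
    }
}
```

With length 16 the index is 4, and after growing to 32 it becomes 20, since one more bit of the hash takes part. With length 10 only 4 of the 10 slots are ever used, while length 16 uses all 16.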
http://vanillajava.blogspot.com/2015/09/an-introduction-to-optimising-hashing.html
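The article linked above explains hashing strategies and the probability of hash collisions in detail.
Let's simulate the process with some simple code of my own.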
// hash(...) is the static hash method from the HashMap source shown above.
// JSON.toJSONString comes from a JSON library (the call matches fastjson's JSON class).
String str1 = "abcd";
String str2 = "a";
String str3 = "cc";
String str4 = "d";
String[] table = new String[4];

System.out.println("str1 = " + str1);
System.out.println("str1.hashCode() = " + str1.hashCode());
System.out.println("str1.hashCode() >>> 16 = " + (str1.hashCode() >>> 16));
System.out.println("str1.hashCode() ^ (str1.hashCode() >>> 16) = " + ((str1.hashCode()) ^ (str1.hashCode() >>> 16)));
System.out.println("table.length - 1 = " + (table.length - 1));
System.out.println("(table.length - 1) & hash1 = " + ((table.length - 1) & ((str1.hashCode()) ^ (str1.hashCode() >>> 16))));
System.out.println();

System.out.println("str2 = " + str2);
System.out.println("str2.hashCode() = " + str2.hashCode());
System.out.println("str2.hashCode() >>> 16 = " + (str2.hashCode() >>> 16));
System.out.println("str2.hashCode() ^ (str2.hashCode() >>> 16) = " + ((str2.hashCode()) ^ (str2.hashCode() >>> 16)));
System.out.println("table.length - 1 = " + (table.length - 1));
System.out.println("(table.length - 1) & hash2 = " + ((table.length - 1) & ((str2.hashCode()) ^ (str2.hashCode() >>> 16))));
System.out.println();

System.out.println("str3 = " + str3);
System.out.println("str3.hashCode() = " + str3.hashCode());
System.out.println("str3.hashCode() >>> 16 = " + (str3.hashCode() >>> 16));
System.out.println("str3.hashCode() ^ (str3.hashCode() >>> 16) = " + ((str3.hashCode()) ^ (str3.hashCode() >>> 16)));
System.out.println("table.length - 1 = " + (table.length - 1));
System.out.println("(table.length - 1) & hash3 = " + ((table.length - 1) & ((str3.hashCode()) ^ (str3.hashCode() >>> 16))));
System.out.println();

System.out.println("str4 = " + str4);
System.out.println("str4.hashCode() = " + str4.hashCode());
System.out.println("str4.hashCode() >>> 16 = " + (str4.hashCode() >>> 16));
System.out.println("str4.hashCode() ^ (str4.hashCode() >>> 16) = " + ((str4.hashCode()) ^ (str4.hashCode() >>> 16)));
System.out.println("table.length - 1 = " + (table.length - 1));
System.out.println("(table.length - 1) & hash4 = " + ((table.length - 1) & ((str4.hashCode()) ^ (str4.hashCode() >>> 16))));

// Store each string at its computed index; a later string overwrites an earlier one.
int hash1 = hash(str1);
int index1 = (table.length - 1) & hash1;
table[index1] = str1;

int hash2 = hash(str2);
int index2 = (table.length - 1) & hash2;
table[index2] = str2;

int hash3 = hash(str3);
int index3 = (table.length - 1) & hash3;
table[index3] = str3;

int hash4 = hash(str4);
int index4 = (table.length - 1) & hash4;
table[index4] = str4;

System.out.println(JSON.toJSONString(table));
The code initializes an array of length 4 and uses the perturbation function to insert str1, str2, str3 and str4 into it.
The output is:
str1 = abcd
str1.hashCode() = 2987074
str1.hashCode() >>> 16 = 45
str1.hashCode() ^ (str1.hashCode() >>> 16) = 2987119
table.length - 1 = 3
(table.length - 1) & hash1 = 3

str2 = a
str2.hashCode() = 97
str2.hashCode() >>> 16 = 0
str2.hashCode() ^ (str2.hashCode() >>> 16) = 97
table.length - 1 = 3
(table.length - 1) & hash2 = 1

str3 = cc
str3.hashCode() = 3168
str3.hashCode() >>> 16 = 0
str3.hashCode() ^ (str3.hashCode() >>> 16) = 3168
table.length - 1 = 3
(table.length - 1) & hash3 = 0

str4 = d
str4.hashCode() = 100
str4.hashCode() >>> 16 = 0
str4.hashCode() ^ (str4.hashCode() >>> 16) = 100
table.length - 1 = 3
(table.length - 1) & hash4 = 0
["d","a",null,"abcd"]
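Because the values are printed to the console, they appear in decimal rather than binary, although inside the computer everything is computed in binary. That does not affect the result: the strings "cc" and "d" both map to index 0. Here I simply overwrote the slot, but in a HashMap the colliding entries would be chained into a linked list at this point.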
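For comparison, here is a rough sketch of what chaining looks like for the same four strings. ChainedTableDemo, its Node class, put and describe are hypothetical names and a heavy simplification: this is not HashMap's actual implementation, which also stores values, resizes the table and handles long chains differently. Only the hash expression and the index expression are taken from the source quoted earlier.

```java
final class ChainedTableDemo {
    // Simplified bucket node: a key plus a link to the next node in the same slot.
    static final class Node {
        final String key;
        Node next;

        Node(String key) {
            this.key = key;
        }
    }

    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Insert a key; on a collision, append to the end of the slot's list.
    static void put(Node[] table, String key) {
        int index = (table.length - 1) & hash(key);
        Node node = new Node(key);
        if (table[index] == null) {
            table[index] = node;
            return;
        }
        Node e = table[index];
        while (true) {
            if (e.key.equals(key)) {
                return; // key already present, nothing to add
            }
            if (e.next == null) {
                e.next = node;
                return;
            }
            e = e.next;
        }
    }

    // Render each slot, joining chained keys with " -> ".
    static String describe(Node[] table) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < table.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            if (table[i] == null) {
                sb.append("null");
                continue;
            }
            for (Node e = table[i]; e != null; e = e.next) {
                sb.append(e.key);
                if (e.next != null) {
                    sb.append(" -> ");
                }
            }
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        Node[] table = new Node[4];
        for (String s : new String[] {"abcd", "a", "cc", "d"}) {
            put(table, s);
        }
        System.out.println(describe(table)); // [cc -> d, a, null, abcd]
    }
}
```

Slot 0 now holds both "cc" and "d" instead of only the last one written.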