Reading the Source Doc of HashMap's hash Method

Date: 2019-12-14


/**
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower.  Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading. Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

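In a recent interview I was asked: what is wrong with taking the long value of a key's memory address and computing its remainder (%) against the table length?

So I did some digging.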

first = tab[(n - 1) & hash]

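First, when computing a key's position in the table, HashMap takes the bitwise AND of the hash value with the table length minus 1. It does not use the remainder (%) operation.

If the table length is 8, then n = 8 (binary 1000) and n - 1 = 7 (binary 111). Whatever the hash value is, the result of the AND always falls between 000 and 111, i.e. 0 to 7, which is exactly the range of valid table indices.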
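To make the masking concrete, here is a minimal sketch. The class name `MaskIndexDemo` and the helper `indexFor` are my own names for illustration, not part of the JDK. It prints the masked index next to `Math.floorMod` for a few sample hashes, including negative ones.

```java
class MaskIndexDemo {
    // Index of a hash in a table whose length n is a power of two.
    // Hypothetical helper; HashMap inlines the expression (n - 1) & hash.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 8; // binary 1000, so n - 1 = 7 is binary 111
        int[] hashes = {0, 7, 8, 13, 1024, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : hashes) {
            // For a power-of-two n the mask agrees with the non-negative modulus.
            System.out.println(h + " -> index " + indexFor(h, n)
                    + ", floorMod " + Math.floorMod(h, n));
        }
    }
}
```

One side note on the interview question: in Java the `%` operator keeps the sign of the dividend, so `-1 % 8` is `-1`, which is not a usable array index. The mask never produces a negative result.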

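The comment says: "Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide."

Put plainly: the table length is always a power of two, so if a set of hash values differ from each other only in the bits above the mask (111...1111), their AND with (n-1) will always collide.

In one sentence: only the low bits of the key's hash, as many as there are 1 bits in (n-1), take part in the index computation. The high bits have no effect at all.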
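The Javadoc's own example, Float keys holding consecutive whole numbers, can be reproduced with a small sketch. `HighBitCollisionDemo` and `rawIndex` are hypothetical names; `rawIndex` deliberately skips the spreading step, so it is not what HashMap actually does. `Float.hashCode()` returns the IEEE 754 bit pattern, and for small whole numbers the low mantissa bits of that pattern are all zero.

```java
class HighBitCollisionDemo {
    // Index taken straight from hashCode(), with no bit spreading.
    // Hypothetical: shown only to illustrate the problem the Javadoc describes.
    static int rawIndex(Object key, int n) {
        return (n - 1) & key.hashCode();
    }

    public static void main(String[] args) {
        int n = 16; // a small table, mask = 1111
        for (float f = 1f; f <= 8f; f++) {
            Float key = f;
            System.out.println(key + " hashCode=0x" + Integer.toHexString(key.hashCode())
                    + " rawIndex=" + rawIndex(key, n));
        }
    }
}
```

All eight keys differ only in bits above the mask, so every one of them lands in bucket 0.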

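The JDK authors' solution is (h = key.hashCode()) ^ (h >>> 16). The JDK doc states it right at the start: "spreads (XORs) higher bits of hash to lower".

That is, the influence of the high bits is spread down to the low bits, so that both high and low bits take part in the AND with (n-1).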
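Here is a rough sketch of the effect. The keys are Integers of the form `i << 16`, which I chose so that they differ only in their high 16 bits; `SpreadDemo`, `spread`, `rawIndex` and `index` are illustrative names. The body of `spread` is the same expression as the hash method quoted from the JDK source.

```java
class SpreadDemo {
    // Same expression as the JDK hash method quoted in this post.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Hypothetical: index without spreading, for comparison only.
    static int rawIndex(Object key, int n) {
        return (n - 1) & key.hashCode();
    }

    // Index with spreading, as in tab[(n - 1) & hash].
    static int index(Object key, int n) {
        return (n - 1) & spread(key);
    }

    public static void main(String[] args) {
        int n = 16;
        for (int i = 1; i <= 8; i++) {
            Integer key = i << 16; // only the high 16 bits vary
            System.out.println("key=0x" + Integer.toHexString(key)
                    + " rawIndex=" + rawIndex(key, n)
                    + " index=" + index(key, n));
        }
    }
}
```

For these particular keys the raw index is always 0, while the spread index is i, so the eight keys fall into eight different buckets. How much spreading helps depends on the keys: it only mixes in the bits that are shifted down into the masked range.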

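As we all know, an int is 32 bits wide. hash >>> 16 is an unsigned right shift by 16 bits, filling the left side with zeros. As a result, the low 16 bits are discarded, the original high 16 bits become the new low 16 bits, and the new high 16 bits are all zero.

0 XOR 0 is 0 and 1 XOR 0 is 1, so XORing a bit with 0 leaves it unchanged. Therefore the final result of hash ^ (hash >>> 16) is: the high 16 bits are unchanged, and the low 16 bits become the XOR of the original low 16 bits with the original high 16 bits.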
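A worked example on one concrete value may help. The sample value 0xABCD1234 and the names `XorShiftBits` and `spread` are arbitrary choices of mine for illustration.

```java
class XorShiftBits {
    // The bit-mixing step on a raw int, as in (h = key.hashCode()) ^ (h >>> 16).
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int h = 0xABCD1234;
        int shifted = h >>> 16;   // 0x0000ABCD: high half moved down, zeros on the left
        int s = spread(h);        // high half kept, low half = 0x1234 ^ 0xABCD
        System.out.println("h       = 0x" + Integer.toHexString(h));
        System.out.println("h>>>16  = 0x" + Integer.toHexString(shifted));
        System.out.println("spread  = 0x" + Integer.toHexString(s)); // prints 0xabcdb9f9
    }
}
```

The high half 0xABCD survives untouched, and the low half becomes 0x1234 ^ 0xABCD = 0xB9F9.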

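If the binary representation of (n-1) has 16 bits, then n = 2^16 = 65536. So as long as the HashMap's capacity does not exceed 65536, the index is determined entirely by these 16 mixed bits, which combine the high and low halves of the original hash.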