Using Consistent Hashing for Distributed Redis

Date: 2019-11-13


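A single Redis instance is a single point, yet real projects inevitably end up using several Redis cache servers. How, then, can cache keys be mapped evenly across multiple Redis servers so that, when servers are added or removed, the drop in cache hit rate is kept to a minimum? For that we have to implement the distribution ourselves.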

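Most readers will be familiar with Memcached: keys are mapped to Memcached servers for fast reads. Nodes can be added dynamically without disturbing most of the existing mappings between keys and Memcached servers, and that is thanks to consistent hashing. Because Memcached's hashing strategy is implemented in the client, different clients implement it differently; Spymemcached and Xmemcached, for example, both use Ketama.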

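We can therefore use a consistent hashing algorithm to solve the Redis distribution problem as well. Before introducing it, here is an approach I came up with earlier for spreading keys evenly across multiple Redis servers.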

My knowledge is limited and I have not studied Redis in depth, so please point out anything in this post that is wrong.

Approach 1

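I came up with this approach a few days ago. The idea is to sum the ASCII codes of the letters in the cache key plus the numeric values of its digits, then take that sum modulo the total number of Redis servers; the remainder is the index of the server the key maps to. The big flaw is that when a Redis server is added or removed, almost every key maps to a different server than before. The code is as follows: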

    using System.Collections.Generic;
    using System.Linq;

    /// <summary>
    /// Maps a cache key to the Redis server that should hold it
    /// </summary>
    /// <param name="Key">The cache key</param>
    /// <returns>A client for the server the key maps to</returns>
    public static RedisClient GetRedisClientByKey(string Key)
    {
        List<RedisClientInfo> RedisClientList = new List<RedisClientInfo>();
        RedisClientList.Add(new RedisClientInfo() { Num = 0, IPPort = "127.0.0.1:6379" });
        RedisClientList.Add(new RedisClientInfo() { Num = 1, IPPort = "127.0.0.1:9001" });

        // Sum of the ASCII codes of the letters and the numeric values of the digits in the key
        int KeyNum = 0;
        foreach (char c in Key)
        {
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            {
                // For ASCII letters the char value is the ASCII code
                KeyNum += c;
            }
            else if (c >= '1' && c <= '9')
            {
                KeyNum += c - '0';
            }
        }
        // The remainder selects the server
        int Num = KeyNum % RedisClientList.Count;
        return new RedisClient(RedisClientList.First(it => it.Num == Num).IPPort);
    }

    // Redis client information
    public class RedisClientInfo
    {
        // Redis server number
        public int Num { get; set; }
        // Redis server IP address and port
        public string IPPort { get; set; }
    }
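To see the drawback concretely, the sketch below counts how many of the keys user_0 to user_9 change server index when the server count goes from 2 to 3 under the same character-sum rule. The names ModuloDemo, ServerIndex and CountMoved are made up for this illustration, and no Redis connection is involved.

```csharp
using System;

public static class ModuloDemo
{
    // Same rule as GetRedisClientByKey: ASCII codes of letters plus
    // numeric values of digits, modulo the number of servers.
    public static int ServerIndex(string key, int serverCount)
    {
        int sum = 0;
        foreach (char c in key)
        {
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
                sum += c;
            else if (c >= '1' && c <= '9')
                sum += c - '0';
        }
        return sum % serverCount;
    }

    // Counts how many of user_0 .. user_{keyCount-1} map to a different
    // server index once the server count changes from 'before' to 'after'.
    public static int CountMoved(int keyCount, int before, int after)
    {
        int moved = 0;
        for (int i = 0; i < keyCount; i++)
        {
            string key = "user_" + i;
            if (ServerIndex(key, before) != ServerIndex(key, after))
                moved++;
        }
        return moved;
    }

    public static void Main()
    {
        Console.WriteLine("Keys remapped going from 2 to 3 servers: {0} of 10",
            CountMoved(10, 2, 3));
    }
}
```

For these ten keys, 7 of 10 end up on a different server index, so most cached entries would miss after the change.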

Approach 2

1. Distributed implementation

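Apply consistent hashing to the key to decide which Redis node each key belongs to.

How the consistent hashing is implemented:

Hash computation: the referenced Java implementation supports both MD5 and MurmurHash and defaults to MurmurHash for its speed; the C# port below uses MD5 only.
Consistency: a Java TreeMap simulates the ring structure, which gives an even distribution.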
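The ring lookup itself can be sketched in a few lines: keep the hash points sorted, find the first point at or after the key's hash, and wrap around to the first point when there is none. This is a minimal illustration with hand-picked points on a non-empty ring, not the Ketama code; RingDemo and Locate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class RingDemo
{
    // Returns the node owning the first ring point >= hash, wrapping around.
    // Assumes the ring contains at least one point.
    public static string Locate(SortedList<long, string> ring, long hash)
    {
        IList<long> points = ring.Keys;   // SortedList keeps keys in ascending order
        int lo = 0, hi = points.Count;    // binary search window [lo, hi)
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (points[mid] < hash) lo = mid + 1;
            else hi = mid;
        }
        if (lo == points.Count) lo = 0;   // past the last point: wrap to the start
        return ring[points[lo]];
    }

    public static void Main()
    {
        var ring = new SortedList<long, string> { { 100, "A" }, { 200, "B" }, { 300, "C" } };
        Console.WriteLine(Locate(ring, 150)); // B
        Console.WriteLine(Locate(ring, 200)); // B
        Console.WriteLine(Locate(ring, 301)); // A
    }
}
```

A binary search keeps each lookup logarithmic in the number of ring points, which matters once every server contributes many virtual nodes.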

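Without further ado, here is the code. I only have a surface-level understanding of it and there are still parts I do not fully follow, which I will work through later.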

    using System.Collections.Generic;
    using System.Linq;
    using System.Security.Cryptography;
    using System.Text;

    public class KetamaNodeLocator
    {
        // The Java original uses a TreeMap; for convenience this port uses .NET's SortedList,
        // which keeps its keys ordered by the default Comparer.
        private SortedList<long, string> ketamaNodes = new SortedList<long, string>();
        private int numReps = 160;

        // Unlike the Java version, no HashAlgorithm alg parameter is passed,
        // because the hash methods here are static.
        public KetamaNodeLocator(List<string> nodes /*, int nodeCopies */)
        {
            ketamaNodes = new SortedList<long, string>();
            //numReps = nodeCopies;
            // Generate numReps virtual nodes for every real node
            foreach (string node in nodes)
            {
                // Virtual nodes are created in groups of four
                for (int i = 0; i < numReps / 4; i++)
                {
                    // node + i is the unique name of this group (getKeyForNode in the Java version)
                    byte[] digest = HashAlgorithm.computeMd5(node + i);
                    // An MD5 digest is 16 bytes; each 4-byte slice yields one virtual node,
                    // which is why virtual nodes come in groups of four
                    for (int h = 0; h < 4; h++)
                    {
                        long m = HashAlgorithm.hash(digest, h);
                        ketamaNodes[m] = node;
                    }
                }
            }
        }

        public string GetPrimary(string k)
        {
            byte[] digest = HashAlgorithm.computeMd5(k);
            return GetNodeForKey(HashAlgorithm.hash(digest, 0));
        }

        string GetNodeForKey(long hash)
        {
            long key = hash;
            // If a ring point with exactly this hash exists, use it directly
            if (!ketamaNodes.ContainsKey(key))
            {
                // Otherwise take the ring points greater than the hash; the first one is the
                // nearest point clockwise. See http://www.javaeye.com/topic/684087
                var tail = ketamaNodes.Keys.Where(p => p > hash);
                // No greater point means we are past the end of the ring: wrap to the first point
                key = tail.Any() ? tail.First() : ketamaNodes.Keys.First();
            }
            return ketamaNodes[key];
        }
    }

    public class HashAlgorithm
    {
        // Packs bytes [4 * nTime .. 4 * nTime + 3] of the digest into one value, lowest byte first
        public static long hash(byte[] digest, int nTime)
        {
            long rv = ((long)(digest[3 + nTime * 4] & 0xFF) << 24)
                    | ((long)(digest[2 + nTime * 4] & 0xFF) << 16)
                    | ((long)(digest[1 + nTime * 4] & 0xFF) << 8)
                    | ((long)digest[0 + nTime * 4] & 0xFF);
            return rv & 0xffffffffL; /* Truncate to 32-bits */
        }

        /**
         * Get the md5 of the given key.
         */
        public static byte[] computeMd5(string k)
        {
            using (MD5 md5 = MD5.Create())
            {
                return md5.ComputeHash(Encoding.UTF8.GetBytes(k));
            }
        }
    }
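As a worked example of how one 16-byte digest yields four ring positions, the sketch below applies the same byte packing as HashAlgorithm.hash to a made-up digest whose bytes are 0 through 15. DigestDemo and Slice are illustrative names, and the digest is not a real MD5 value.

```csharp
using System;

public static class DigestDemo
{
    // Packs bytes [4n .. 4n+3] of the digest into an unsigned 32-bit value,
    // with the lowest-index byte as the least significant byte.
    public static long Slice(byte[] digest, int n)
    {
        return ((long)digest[3 + n * 4] << 24)
             | ((long)digest[2 + n * 4] << 16)
             | ((long)digest[1 + n * 4] << 8)
             | digest[n * 4];
    }

    public static void Main()
    {
        byte[] digest = new byte[16];
        for (int i = 0; i < 16; i++) digest[i] = (byte)i; // 00 01 02 ... 0F

        for (int n = 0; n < 4; n++)
            Console.WriteLine("slice {0}: 0x{1:X8}", n, Slice(digest, n));
    }
}
```

The four slices come out as 0x03020100, 0x07060504, 0x0B0A0908 and 0x0F0E0D0C, so a single MD5 computation places four virtual nodes on the ring.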

2. Distributed test

1. Suppose there are two servers, 0001 and 0002. Loop ten times and check whether the keys are mapped evenly to the servers. The code is as follows:

    static void Main(string[] args)
    {
        // Hypothetical servers
        List<string> nodes = new List<string>() { "0001", "0002" };
        KetamaNodeLocator k = new KetamaNodeLocator(nodes);
        string str = "";
        for (int i = 0; i < 10; i++)
        {
            string Key = "user_" + i;
            str += string.Format("Key:{0} assigned to server: {1}\n\n", Key, k.GetPrimary(Key));
        }

        Console.WriteLine(str);

        Console.ReadLine();
    }

Across two runs of the program, the keys were distributed more or less evenly over the server nodes.

2. Now add a server node 0003. The code is as follows:

    static void Main(string[] args)
    {
        // Hypothetical servers
        List<string> nodes = new List<string>() { "0001", "0002", "0003" };
        KetamaNodeLocator k = new KetamaNodeLocator(nodes);
        string str = "";
        for (int i = 0; i < 10; i++)
        {
            string Key = "user_" + i;
            str += string.Format("Key:{0} assigned to server: {1}\n\n", Key, k.GetPrimary(Key));
        }

        Console.WriteLine(str);

        Console.ReadLine();
    }

Running the program twice and comparing with the first test, only user_5, user_7 and user_9 lose their cached values; the other keys still hit.

3. Now remove server 0002. Running the program twice and comparing with the second test, user_0, user_1 and user_6 lose their cached values.
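Ten keys is a small sample, so a sketch that repeats the node-addition test with more keys may be useful. It is self-contained and re-implements the same scheme (160 MD5-derived points per node, first point at or after the key's hash) under assumed names RemapDemo, BuildRing, Locate and CountMoved; the key count of 10000 is arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class RemapDemo
{
    // Builds a ring with 160 points per node, four points per MD5 digest.
    public static SortedList<long, string> BuildRing(IEnumerable<string> nodes)
    {
        var ring = new SortedList<long, string>();
        using (MD5 md5 = MD5.Create())
        {
            foreach (string node in nodes)
                for (int i = 0; i < 40; i++)
                {
                    byte[] d = md5.ComputeHash(Encoding.UTF8.GetBytes(node + i));
                    for (int h = 0; h < 4; h++)
                        ring[Slice(d, h)] = node;
                }
        }
        return ring;
    }

    // Packs four digest bytes into an unsigned 32-bit value, lowest byte first.
    static long Slice(byte[] d, int n)
    {
        return ((long)d[3 + n * 4] << 24) | ((long)d[2 + n * 4] << 16)
             | ((long)d[1 + n * 4] << 8) | d[n * 4];
    }

    // First ring point at or after the key's hash, wrapping to the start.
    public static string Locate(SortedList<long, string> ring, string key)
    {
        long hash;
        using (MD5 md5 = MD5.Create())
            hash = Slice(md5.ComputeHash(Encoding.UTF8.GetBytes(key)), 0);
        foreach (var point in ring)          // iterates in ascending key order
            if (point.Key >= hash) return point.Value;
        return ring.Values[0];
    }

    // Returns { keys that changed node, changed keys that landed on addedNode }.
    public static int[] CountMoved(int keyCount, string[] before, string addedNode)
    {
        var after = new List<string>(before) { addedNode };
        var oldRing = BuildRing(before);
        var newRing = BuildRing(after);
        int moved = 0, movedToNew = 0;
        for (int i = 0; i < keyCount; i++)
        {
            string key = "user_" + i;
            string now = Locate(newRing, key);
            if (Locate(oldRing, key) != now)
            {
                moved++;
                if (now == addedNode) movedToNew++;
            }
        }
        return new[] { moved, movedToNew };
    }

    public static void Main()
    {
        int[] r = CountMoved(10000, new[] { "0001", "0002" }, "0003");
        Console.WriteLine("moved: {0} of 10000, of which to 0003: {1}", r[0], r[1]);
    }
}
```

Adding a node only adds points to the ring, so, barring a hash collision between ring points, every key that moves should land on the new node, and the share of moved keys should be roughly the new node's share of the ring (about a third for the third node).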

Conclusion

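A consistent hashing algorithm handles the Redis distribution problem well: when Redis servers are added or removed, the hit rate for previously cached keys stays fairly high.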

Other articles about Redis

http://www.cnblogs.com/lc-chenlong/p/4194150.html
http://www.cnblogs.com/lc-chenlong/p/4195033.html
http://www.cnblogs.com/lc-chenlong/p/3218157.html

References

1. http://blog.csdn.net/chen77716/article/details/5949166

2. http://www.cr173.com/html/6474_2.html