How to Handle Cache Avalanche and Cache Penetration, and How to Make Sure Redis Caches Only Hot Data

<div class="rich_media_content " id="js_content">Interview
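<p><br></p><ul class="list-paddingleft-2" style="list-style-type: disc;margin-left: 8px;margin-right: 8px;"><li><p class=""><span style="font-size: 15px;">How do you deal with cache avalanche?</span><br></p></li><li><p class=""><span style="font-size: 15px;">How do you deal with cache penetration?</span></p></li><li><p class=""><span style="font-size: 15px;">How do you make sure Redis caches only hot data?</span></p></li><li><p class=""><span style="font-size: 15px;">How do you update cached data?</span></p></li><li><p class=""><span style="font-size: 15px;">How do you handle request skew?</span></p></li><li><p class=""><span style="font-size: 15px;">How do you choose a cache data structure for a real business scenario?</span></p></li></ul><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-weight: 700;font-size: 20px;color: rgb(0, 0, 0);">Cache avalanche</span></section><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Put simply, a cache avalanche is when all requests fail to get their data from the cache, for example because a large batch of entries expires at the same moment. For the mass-expiry case, set each key's expiration to a random value within a time range, for example somewhere between one day and one day plus one hour. This does not help for collection types such as hash, because a TTL is normally set on the whole key, so every field in it still expires together.</span></section><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">There are also less common cases, such as peak traffic bringing down the Redis cluster, or restarting or migrating a Cluster that has no persistence configured and no replica nodes. When the Redis cluster fails, first fall back to an in-process memory cache such as Ehcache, apply rate limiting and degradation as the situation requires, and then restart the cluster quickly. Afterwards, make sure a persistence policy is configured and scale the cluster to match the traffic.</span></section><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-weight: 700;font-size: 20px;color: rgb(0, 0, 0);">Cache penetration</span></section><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Put simply, cache penetration is when the database has no matching record either, so the request can never hit the cache. For example, the table only holds records with ids from 1000 to 100000, and a request asks for id 10000000. This is usually a malicious attack. The best defense is to validate the id against its valid range; in other cases, cache a null value for the queried key and give it a TTL.</span></section><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-weight: bold;color: rgb(0, 0, 0);font-size: 20px;">How to make sure Redis caches only hot data</span></section><section data-role="outer" label="Powered by 135editor.com" style="font-size: 16px;"><section class="" data-role="paragraph"><p><br></p></section></section><section style="text-indent: 2em;margin-left: 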
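8px;margin-right: 8px;"><span style="font-size: 15px;">A. Set a TTL on each key</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">This suits scenarios that do not need real-time data and can tolerate reading stale values. Redis does not refresh a key's TTL on reads, so if the expiration should slide forward on every access, the application has to reset it itself, for example with EXPIRE; note also that overwriting a key with a plain SET clears its TTL unless a new one is supplied. To keep garbage data from lingering in the cache, keys are normally given an expiration, except for data that never changes, is always in use, and is never updated, such as the IP database mentioned in my previous few articles.</span></section><p class=""><br></p><p><br></p><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">B. Choose a cache eviction policy</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Choosing a least-recently-used (LRU) eviction policy keeps the cache filled with hot data, but the policy only kicks in when memory is tight; in practice, the need to guarantee that everything cached is hot arises precisely when Redis is running short of memory. It is better to clean up cached data proactively, because performance also drops somewhat while Redis is relying on eviction.</span></section><p class=""><br></p><p><br></p><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">C. Cache access counts and periodically remove rarely accessed records</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">For example, use a Sorted Set to record how many times each key is read, and periodically delete the keys whose read count is below some threshold. This also works for collection types such as hash, by recording the read count of each field. The drawback is that every request pays the performance cost of updating the counter.</span></section><p class=""><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-weight: bold;color: rgb(0, 0, 0);font-size: 20px;">How to update cached data</span></section><section data-role="outer" label="Powered by 135editor.com" style="font-size: 16px;"><section class="" data-role="paragraph"><p><br></p></section></section><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;font-family: mp-quote, -apple-system-font, BlinkMacSystemFont, 'Helvetica Neue', 'PingFang SC', 'Hiragino Sans GB', 'Microsoft YaHei UI', 'Microsoft YaHei', Arial, sans-serif;">A. Notify the cache through an MQ queue when a database record is modified</span><br></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 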
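15px;">This suits cached records that change infrequently, such as user information, and scenarios where a change must reach the cache promptly, such as configuration changes that need to take effect quickly. It is not suitable for scenarios that demand strict real-time accuracy, such as product inventory.</span></section><p class=""><br></p><p><br></p><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">B. Update the cache directly when modifying the database record</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Both this approach and the previous one can be implemented with AOP. The difference is that the former solves coupling between multiple services and is used for cross-service data updates, whereas the latter only works for updates within a single service. To save costs, small companies do not run a separate Redis cluster for every service. Even when several microservices share one Redis cluster, do not share cached data between them through common keys; the coupling is too tight and problems follow easily.</span></section><p class=""><br></p><p><br></p><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">C. Batch updates with a scheduled task</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Use this together with a TTL, and set the TTL slightly longer than the task interval so that data does not expire before the next run has finished. This suits scenarios where real-time requirements are modest and a large amount of data changes within a short window. For example, the database holds 100,000 rows, 70 to 80 percent of them change every 15 minutes, and the changes all happen within one minute.</span></section><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">For collection types such as Hash, this is usually combined with RENAME: the old data is replaced atomically only after all the new data has been written to Redis successfully. With large data volumes, write in batches through a pipeline and avoid pushing everything through one bulk command such as HMSET. When using collection types such as hash, always account for the problem of dirty (stale) data.</span></section><p class=""><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-weight: bold;color: rgb(0, 0, 0);font-size: 20px;">How to handle request skew</span></section><section data-role="outer" label="Powered by 135editor.com" style="font-size: 16px;"><section class="" data-role="paragraph"><p><br></p></section></section><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Slot sharding in Cluster can skew the cached data, which in turn skews the requests. Suppose a small Cluster of three master-replica groups with slots distributed evenly; if a large number of keys land on the second node, requests all lean toward that node. The main cause is a large number of keys of type hash, set, or sorted set, each holding a lot of data. The second cause is unreasonable use of hash tags.</span></section><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 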
15px;">There are three remedies: split large hashes into segments, reduce the use of hash tags, and reassign slots, moving some of the second node's slots to the other two nodes as the actual load requires.</span></section><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-weight: bold;color: rgb(0, 0, 0);font-size: 20px;">How to choose a cache data structure for a real business scenario</span></section><p><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Here are a few simple examples from the advertising industry, the one I know best.</span></section><p class=""><br></p><p><br></p><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">a. Determine whether an ad order has expired</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Both hash and bitmap can do this. A bitmap fits true-or-false requirements; it reads and writes faster than a hash and uses less memory. Because of other requirements, though, I chose hash. I use bitmap for other needs, such as quickly checking whether an offer's daily impressions have reached the cap.</span></section><p class=""><br></p><p class=""><br></p><p><br></p><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">b. Count how many times each channel pulls ads</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Both plain key-value strings and hashes support atomic increments (INCR and HINCRBY). To reduce the number of keys in the cache, I chose hash; it also supports HGETALL, which is handy for real-time statistics and troubleshooting.</span></section><p class=""><br></p><p class=""><br></p><p><br></p><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">c. Enforce a CAP by tag</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">Capacity here means a limit: the number of times an ad may be shown is restricted by tags such as country, city, channel, and advertiser. One ad may match several tags at once, and as soon as the smallest Capacity is reached, the check returns true. Store all the tags an ad matches in a Sorted Set, call ZCOUNT with the current impression count to get the number of matching tags, and check whether the result is greater than zero.</span></section><p class=""><br></p><p class=""><br></p><p><br></p><section style="text-indent: 2em;margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">d. Filter duplicate IPs each day</span></section><p><br></p><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 
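15px;">For example, to filter out users who click an ad repeatedly within a short period; this is only an illustration. HyperLogLog can be used to store the IPs: it absorbs duplicate values, and although its results are approximate, the impact on the business is negligible.</span></section><p class=""><br></p><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">These are only my personal views. If you were the interviewee, feel free to leave your answer in the comments!</span></section><section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;"><br mpa-from-tpl="t"></span></section>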
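<section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">The HyperLogLog idea for filtering repeated IPs (example d) could be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes a Python client that exposes pfadd and expire the way redis-py does, and the key prefix "ad:click:ip", the two-day retention, and the function names are hypothetical. It relies on PFADD returning 1 only when the HyperLogLog was changed, so a small share of genuinely new IPs may be reported as repeats.</span></section>

```python
from datetime import date
from typing import Optional, Protocol


class HllClient(Protocol):
    """The two Redis calls this sketch needs; redis-py's client provides both."""

    def pfadd(self, name: str, *values: str) -> int: ...

    def expire(self, name: str, time: int) -> bool: ...


def daily_key(day: date, prefix: str = "ad:click:ip") -> str:
    # One HyperLogLog per calendar day; the prefix is a made-up example.
    return f"{prefix}:{day:%Y%m%d}"


def is_first_click_today(client: HllClient, ip: str,
                         day: Optional[date] = None) -> bool:
    """Return True if this IP appears to be new for the given day.

    PFADD returns 1 when the HyperLogLog was altered and 0 otherwise.
    Because the structure is probabilistic, a new IP can occasionally
    leave it unchanged and be treated as a repeat.
    """
    key = daily_key(day or date.today())
    changed = client.pfadd(key, ip)
    if changed:
        # Keep the day's key for two days (an assumed retention),
        # then let Redis drop it on its own.
        client.expire(key, 2 * 24 * 3600)
    return bool(changed)
```

<section style="margin-left: 8px;margin-right: 8px;"><span style="font-size: 15px;">With redis-py this would be called as is_first_click_today(redis.Redis(), "203.0.113.7"). If exact answers are required, a plain Set with SADD gives them at the cost of more memory.</span></section>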
Original article: https://mp.weixin.qq.com/s/-aOHMe3uOqiJt2Km4fkwGg </div>redis