Understanding .NET MemoryCache in Depth

Date: 2019-11-05

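Summary

MemoryCache is an in-memory cache class available since .NET Framework 4.0. It makes it easy to cache data inside a process and to manage how long that data stays valid. It offers functionality similar to the Cache class commonly used in ASP.NET, while fitting a wider range of scenarios. Using MemoryCache tends to raise questions: how is the data organized? Is there a more efficient way to organize and use it? How is expiration controlled? To understand not only what it does but why, this article analyzes the principles and implementation of MemoryCache in depth. Along the way it picks up design ideas from mature industry components, which opens up broader thinking for future work.

This article targets .NET 4.5.1. MemoryCache differs slightly in later .NET versions; additions are welcome.

The article is fairly long; expect about an hour of reading time.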

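The MemoryCache class derives from the abstract ObjectCache class and implements the IEnumerable and IDisposable interfaces. It offers functionality similar to the Cache class commonly used in ASP.NET, but it is more general-purpose: it does not depend on the System.Web library, and multiple MemoryCache instances can be created within the same process.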
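
As a quick illustration of the "multiple instances in one process" point, here is a minimal sketch. The class name TwoCaches and the cache names "reports" and "sessions" are made up for the example; it assumes the System.Runtime.Caching assembly (or NuGet package) is referenced.

```csharp
using System;
using System.Runtime.Caching;

public static class TwoCaches
{
    // Stores the same key in two independent named caches and reads both back.
    public static string Demo()
    {
        using (var reports = new MemoryCache("reports"))
        using (var sessions = new MemoryCache("sessions"))
        {
            // A default CacheItemPolicy means the entry never expires.
            reports.Set("k", "report-value", new CacheItemPolicy());
            sessions.Set("k", "session-value", new CacheItemPolicy());
            return (string)reports.Get("k") + "|" + (string)sessions.Get("k");
        }
    }
}
```

The two instances do not share entries, so the same key can hold different values in each.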

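Using MemoryCache usually raises a few questions. How is the data organized internally? How is cache item expiration handled? Why does it claim to be thread-safe? To answer them, the rest of this article uses the Reference Source to look at how MemoryCache is implemented.

MemoryCache internal data structures

Inside MemoryCache, the way data is organized involves three classes: MemoryCacheStore, MemoryCacheKey, and MemoryCacheEntry. Their roles are:

MemoryCacheStore: holds the data
MemoryCacheKey: builds the lookup key
MemoryCacheEntry: the actual in-memory representation of a cached item

The relationship between MemoryCache and MemoryCacheStore is roughly as follows:

[Figure: relationship between a MemoryCache instance and its MemoryCacheStore objects]

As the figure shows, one MemoryCache instance can contain several MemoryCacheStore objects. How many depends on the hardware the program runs on, specifically the number of CPUs. Inside MemoryCache, each MemoryCacheStore acts like a small database holding part of the data. So to understand the internal data structures of MemoryCache, we first need to understand the role of MemoryCacheStore.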

MemoryCacheStore

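This type is the container that actually holds the data inside MemoryCache. It directly manages the in-memory cache items, so it must have members for storing them. Concretely, MemoryCacheStore has a private field _entries of type Hashtable, which stores all the cache items that the store manages.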

Hashtable _entries = new Hashtable(new MemoryCacheEqualityComparer());

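When data is requested from MemoryCache, the first step is to find the MemoryCacheStore that holds the requested key, rather than looking the key up directly in a single Dictionary or Hashtable as one might expect.

The way MemoryCache finds the MemoryCacheStore is interesting. The main logic is in MemoryCache's GetStore method; the source is as follows (with some comments added for clarity):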

internal MemoryCacheStore GetStore(MemoryCacheKey cacheKey) {
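    int hashCode = cacheKey.Hash; // hash code derived from the key
    if (hashCode < 0) {
        // avoid a negative index
        hashCode = (hashCode == Int32.MinValue) ? 0 : -hashCode;
    }
    int idx = hashCode & _storeMask;
    // _storeMask is derived from the CPU count; the bitwise AND selects the matching store.
    // This is the .NET 4.5 code. .NET Framework 4.7.2 uses % (remainder) instead. The two give
    // the same index for non-negative values only when the store count is a power of two
    // (assuming the mask is the store count minus one).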
    return _stores[idx];
}

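Since there may be several MemoryCacheStore objects, some rule has to decide what goes into each one. The source shows that MemoryCache combines a mask derived from the CPU core count with the key's hash code to decide which store owns a cache item, which is simple and efficient.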
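
To make the index calculation concrete, here is a minimal sketch that compares the mask variant with the remainder variant. StoreIndex, ByMask and ByModulo are hypothetical names, and the mask is assumed to be the store count minus one, which the quoted source does not show.

```csharp
using System;

public static class StoreIndex
{
    // Mirrors the sign handling in GetStore: the result is never negative.
    private static int NonNegative(int hash)
    {
        if (hash >= 0) return hash;
        return hash == int.MinValue ? 0 : -hash;
    }

    // Mask variant (assumption: mask = storeCount - 1).
    public static int ByMask(int hash, int storeCount)
    {
        return NonNegative(hash) & (storeCount - 1);
    }

    // Remainder variant, as described for later framework versions.
    public static int ByModulo(int hash, int storeCount)
    {
        return NonNegative(hash) % storeCount;
    }
}
```

With 8 stores both variants map hash 13 to index 5. With 6 stores the mask is 5 (binary 101), so the mask variant still yields 5 while the remainder yields 1, and indices 2 and 3 can never be produced by the mask.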

MemoryCacheKey

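MemoryCacheKey is a comparatively simple class. It mainly wraps a cache item's key together with some commonly used helper methods.

The initialization of _entries in MemoryCacheStore shown earlier passes a MemoryCacheEqualityComparer object to the constructor. What is it, and what does it do?

MemoryCacheEqualityComparer implements the IEqualityComparer interface, which defines how the hash table decides whether two keys are equal. Let's look at the source: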

internal class MemoryCacheEqualityComparer: IEqualityComparer {

    bool IEqualityComparer.Equals(Object x, Object y) {
        Dbg.Assert(x != null && x is MemoryCacheKey);
        Dbg.Assert(y != null && y is MemoryCacheKey);

        MemoryCacheKey a, b;
        a = (MemoryCacheKey)x;
        b = (MemoryCacheKey)y;
        // The Key property of MemoryCacheKey is the key string we use when getting and setting cache items
        return (String.Compare(a.Key, b.Key, StringComparison.Ordinal) == 0);
    }

    int IEqualityComparer.GetHashCode(Object obj) {
        MemoryCacheKey cacheKey = (MemoryCacheKey) obj;
        return cacheKey.Hash;
    }
}

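As the code shows, the real job of MemoryCacheEqualityComparer is to define how MemoryCacheKey objects are compared. Two MemoryCacheKey objects are equal when their Key properties are equal under ordinal string comparison, and the hash code comes from the key's Hash property. So every get or set on MemoryCache goes through these MemoryCacheKey operations.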
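
The same pattern can be reproduced outside MemoryCache. The sketch below uses hypothetical DemoKey and DemoKeyComparer types (they are not the framework's internal classes) to show a Hashtable finding a value through a different key object that carries the same string.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for MemoryCacheKey: a string key plus a precomputed hash.
public sealed class DemoKey
{
    public string Key { get; }
    public int Hash { get; }

    public DemoKey(string key)
    {
        Key = key;
        Hash = key.GetHashCode();
    }
}

// Hypothetical stand-in for MemoryCacheEqualityComparer.
public sealed class DemoKeyComparer : IEqualityComparer
{
    bool IEqualityComparer.Equals(object x, object y)
    {
        return string.Equals(((DemoKey)x).Key, ((DemoKey)y).Key, StringComparison.Ordinal);
    }

    int IEqualityComparer.GetHashCode(object obj)
    {
        return ((DemoKey)obj).Hash;
    }
}

public static class ComparerDemo
{
    public static object Lookup()
    {
        var table = new Hashtable(new DemoKeyComparer());
        table[new DemoKey("user:42")] = "Alice";
        // A distinct key object with the same string still finds the entry.
        return table[new DemoKey("user:42")];
    }
}
```

Without the comparer, the Hashtable would fall back to reference equality for DemoKey and the second lookup would miss.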

MemoryCacheEntry

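This type is the form in which a cache item actually exists in memory. It derives from MemoryCacheKey and adds many properties and methods on top of it, such as checking whether the item has expired.

First, an overview of the class:

[Figure: overview of the MemoryCacheEntry class]

Overall, the properties and methods of MemoryCacheEntry fall into three groups:

Related to the cached content, such as Key and Value
Related to the state of the cached content, such as State and the HasExpiration method
Related to events on the cached content, such as the CallCacheEntryRemovedCallback and CallNotifyOnChanged methods

Once the way MemoryCache organizes its data is clear, it is easier to follow how a value is looked up step by step.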

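How data is queried from MemoryCache

What happens when data is fetched from MemoryCache? Broadly, the process has two parts: retrieving the data and verifying that it is still valid.

As a flowchart, the steps look like this:

[Figure: flowchart of a MemoryCache lookup]

In detail, the steps are:

Validate the query parameters RegionName and Key
Build a MemoryCacheKey object, used in the following steps to look up and compare against existing data
Get the MemoryCacheStore object to narrow the search
Fetch the MemoryCacheEntry object from the store's Hashtable field to obtain the data for the key
Check that the MemoryCacheEntry object is still valid
Handle access-related logic such as the entry's sliding expiration time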
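
The first step, parameter validation, is observable from the public API. The sketch below (LookupChecks and the cache name "lookup-demo" are made-up names; it assumes System.Runtime.Caching is referenced) probes three cases: a non-null region name, a null key, and a key that is simply absent.

```csharp
using System;
using System.Runtime.Caching;

public static class LookupChecks
{
    public static string Probe()
    {
        using (var cache = new MemoryCache("lookup-demo"))
        {
            var result = "";

            // MemoryCache does not support regions, so a region name is rejected.
            try { cache.Get("k", "some-region"); }
            catch (NotSupportedException) { result += "region-rejected;"; }

            // A null key is rejected before any store is consulted.
            try { cache.Get(null); }
            catch (ArgumentNullException) { result += "null-key-rejected;"; }

            // A valid key with no entry is an ordinary miss and returns null.
            result += cache.Get("absent") == null ? "miss-is-null" : "unexpected";
            return result;
        }
    }
}
```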

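Seeing this brings to mind the designs of other cache systems. Just as history sometimes repeats itself, well-designed cache systems often look alike in places. Studying good designs from others teaches a lot, and the expiration mechanism described next is one example.

MemoryCache expiration mechanism

When a cache item is set, MemoryCache lets you keep it forever or have it disappear automatically after a timeout. The timeout can be either an absolute expiration or a sliding expiration (only one of the two can be chosen; the reason for this rule is explained later).

Managing the expiration of cache items is a must-have feature for cache systems such as Redis and Memcached. Redis uses both active checking and passive triggering, while Memcached relies on passive checks. So how does the in-memory MemoryCache manage expiration internally?

MemoryCache manages expiration in much the same way as Redis, using two mechanisms: periodic deletion and lazy deletion.

Periodic deletion

Since MemoryCache manages its data per MemoryCacheStore, the periodic check is most likely something that happens inside each MemoryCacheStore.

Reading the source carefully shows that the MemoryCacheStore constructor calls InitDisposableMembers(), whose code is as follows: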

private void InitDisposableMembers() {
    // _insertBlock is a private field of MemoryCacheStore,
    // declared as: private ManualResetEvent _insertBlock;
    _insertBlock = new ManualResetEvent(true);
    // _expires is a private field of MemoryCacheStore,
    // declared as: private CacheExpires _expires;
    _expires.EnableExpirationTimer(true);
}

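The member relevant to the expiration mechanism discussed here is _expires. The source of the CacheExpires class was not found in the .NET Reference Source when this article was written, so its exact implementation could not be checked there. Instead, the class of the same name in the Mono project is used below to explore how it works.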

class CacheExpires : CacheEntryCollection
{

    public static TimeSpan MIN_UPDATE_DELTA = new TimeSpan (0, 0, 1);
    public static TimeSpan EXPIRATIONS_INTERVAL = new TimeSpan (0, 0, 20);
    public static CacheExpiresHelper helper = new CacheExpiresHelper ();

    Timer timer;

    public CacheExpires (MemoryCacheStore store)
        : base (store, helper)
    {
    }

    public new void Add (MemoryCacheEntry entry)
    {
        entry.ExpiresEntryRef = new ExpiresEntryRef ();
        base.Add (entry);
    }

    public new void Remove (MemoryCacheEntry entry)
    {
        base.Remove (entry);
        entry.ExpiresEntryRef = ExpiresEntryRef.INVALID;
    }

    public void UtcUpdate (MemoryCacheEntry entry, DateTime utcAbsExp)
    {
        base.Remove (entry);
        entry.UtcAbsExp = utcAbsExp;
        base.Add (entry);
    }

    public void EnableExpirationTimer (bool enable)
    {
        if (enable) {
            if (timer != null)
                return;

            var period = (int) EXPIRATIONS_INTERVAL.TotalMilliseconds;
            timer = new Timer ((o) => FlushExpiredItems (true), null, period, period);
        } else {
            timer.Dispose ();
            timer = null;
        }
    }

    public int FlushExpiredItems (bool blockInsert)
    {
        return base.FlushItems (DateTime.UtcNow, CacheEntryRemovedReason.Expired, blockInsert);
    }
}

The Mono source shows that CacheExpires uses a timer internally to trigger periodic checks. Each tick calls the FlushItems method of the CacheEntryCollection class, which is implemented as follows:

protected int FlushItems (DateTime limit, CacheEntryRemovedReason reason, bool blockInsert, int count = int.MaxValue)
{
    var flushedItems = 0;
    if (blockInsert)
        store.BlockInsert ();

    lock (entries) {
        foreach (var entry in entries) {
            if (helper.GetDateTime (entry) > limit || flushedItems >= count)
                break;

            flushedItems++;
        }

        for (var f = 0; f < flushedItems; f++)
            store.Remove (entries.Min, null, reason);
    }

    if (blockInsert)
        store.UnblockInsert ();

    return flushedItems;
}

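FlushItems walks the entries in order, compares each expiration time against the limit, and stops at the first entry that has not yet expired (the use of entries.Min suggests the collection is kept sorted by expiration time). It then calls Remove on every expired entry it counted, which is how periodic deletion is implemented. Judging by what this class does in Mono, the .NET Framework implementation probably works in a similar way: each MemoryCacheStore, and therefore each MemoryCache instance, has a timed task that clears out expired cache items.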
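
The flush loop can be reduced to a few lines. The following is a simplified sketch of the idea, not the Mono or .NET Framework code: ExpiryQueue is a hypothetical class that keeps (expiration, key) pairs in a sorted set and removes everything due at or before a given time. It uses C# 7 tuples, so it assumes a newer compiler than the one .NET 4.5.1 shipped with.

```csharp
using System;
using System.Collections.Generic;

public sealed class ExpiryQueue
{
    // Ordered by expiration time, then key, so Min is always the next entry to expire.
    private readonly SortedSet<(DateTime Expires, string Key)> _entries =
        new SortedSet<(DateTime Expires, string Key)>();

    public void Add(string key, DateTime expiresUtc)
    {
        lock (_entries) { _entries.Add((expiresUtc, key)); }
    }

    public int Count
    {
        get { lock (_entries) { return _entries.Count; } }
    }

    // Removes every entry that expires at or before nowUtc and returns how many were removed.
    public int Flush(DateTime nowUtc)
    {
        var removed = 0;
        lock (_entries)
        {
            // Because the set is sorted, the loop can stop at the first live entry.
            while (_entries.Count > 0 && _entries.Min.Expires <= nowUtc)
            {
                _entries.Remove(_entries.Min);
                removed++;
            }
        }
        return removed;
    }
}
```

A System.Threading.Timer that calls Flush(DateTime.UtcNow) on a fixed period would turn this into the periodic deletion described above. Passing the current time in as a parameter is a choice made here to keep the sketch easy to test.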

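Lazy deletion

Besides periodic deletion, MemoryCache also implements lazy deletion. It is much simpler to implement than periodic deletion, and very practical.

What does lazy deletion mean? Put simply, whenever a cache item is used, the cache checks whether that item should be deleted, instead of waiting for the dedicated cleanup task to get to it.

The way MemoryCache organizes its data was described earlier. Since this logic is triggered on use, lazy deletion must be tied to the method MemoryCacheStore uses to fetch an entry. Here is the internal logic of its Get method: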

internal MemoryCacheEntry Get(MemoryCacheKey key) {
    MemoryCacheEntry entry = _entries[key] as MemoryCacheEntry;
    // Has the entry expired?
    if (entry != null && entry.UtcAbsExp <= DateTime.UtcNow) {
        Remove(key, entry, CacheEntryRemovedReason.Expired);
        entry = null;
    }
    // Update the sliding expiration time and the related usage counters
    UpdateExpAndUsage(entry);
    return entry;
}

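As the code shows, after MemoryCacheStore finds the cache item for the key, it does not return it right away. It first checks the item's expiration time. If the item has expired, it is removed and null is returned. This is how lazy deletion is implemented in MemoryCache.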
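
The same check-on-read idea in isolation looks like this. LazyCache is a hypothetical, heavily simplified class (a single dictionary, one lock, absolute expiration only), and the current time is passed in explicitly so the behaviour is deterministic.

```csharp
using System;
using System.Collections.Generic;

public sealed class LazyCache
{
    private readonly object _lock = new object();
    private readonly Dictionary<string, (object Value, DateTime ExpiresUtc)> _entries =
        new Dictionary<string, (object Value, DateTime ExpiresUtc)>();

    public void Set(string key, object value, DateTime expiresUtc)
    {
        lock (_lock) { _entries[key] = (value, expiresUtc); }
    }

    public int Count
    {
        get { lock (_lock) { return _entries.Count; } }
    }

    // Returns the value, or null if the key is missing or has expired.
    // An expired entry is removed as a side effect of the read.
    public object Get(string key, DateTime nowUtc)
    {
        lock (_lock)
        {
            if (!_entries.TryGetValue(key, out var entry)) return null;
            if (entry.ExpiresUtc <= nowUtc)
            {
                _entries.Remove(key);
                return null;
            }
            return entry.Value;
        }
    }
}
```

An entry that is never read again stays in memory under this scheme alone, which is why lazy deletion is paired with the periodic sweep.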

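MemoryCache cache expiration policies

When adding a cache item to a MemoryCache instance, you can choose one of three expiration policies:

Never expire
Absolute expiration
Sliding expiration

The policy is specified when a cache item is added or updated (whether through Add or Set), by passing a CacheItemPolicy object with the operation.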
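
For reference, the three policies map onto CacheItemPolicy as sketched below. The helper class Policies and its method names are made up for this example; the properties and the ObjectCache constants are part of System.Runtime.Caching.

```csharp
using System;
using System.Runtime.Caching;

public static class Policies
{
    // Never expire: the defaults are InfiniteAbsoluteExpiration and NoSlidingExpiration.
    public static CacheItemPolicy Never()
    {
        return new CacheItemPolicy();
    }

    // Absolute expiration: the item expires at a fixed point in time.
    public static CacheItemPolicy Absolute(TimeSpan timeToLive)
    {
        return new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.UtcNow.Add(timeToLive) };
    }

    // Sliding expiration: the item expires after this much time without access.
    public static CacheItemPolicy Sliding(TimeSpan idleTime)
    {
        return new CacheItemPolicy { SlidingExpiration = idleTime };
    }
}
```

A call such as cache.Set("key", value, Policies.Sliding(TimeSpan.FromMinutes(5))) then applies the chosen policy to that item.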

The expiration policy cannot be set arbitrarily: MemoryCache has built-in checks on the CacheItemPolicy object. Here is the source:

private void ValidatePolicy(CacheItemPolicy policy) {
    // Check the combination of expiration settings
    if (policy.AbsoluteExpiration != ObjectCache.InfiniteAbsoluteExpiration
        && policy.SlidingExpiration != ObjectCache.NoSlidingExpiration) {
        throw new ArgumentException(R.Invalid_expiration_combination, "policy");
    }
    // Check the range of the sliding expiration
    if (policy.SlidingExpiration < ObjectCache.NoSlidingExpiration || OneYear < policy.SlidingExpiration) {
        throw new ArgumentOutOfRangeException("policy", RH.Format(R.Argument_out_of_range, "SlidingExpiration", ObjectCache.NoSlidingExpiration, OneYear));
    }
    // Check the callback settings
    if (policy.RemovedCallback != null
        && policy.UpdateCallback != null) {
        throw new ArgumentException(R.Invalid_callback_combination, "policy");
    }
    // Check the priority setting
    if (policy.Priority != CacheItemPriority.Default && policy.Priority != CacheItemPriority.NotRemovable) {
        throw new ArgumentOutOfRangeException("policy", RH.Format(R.Argument_out_of_range, "Priority", CacheItemPriority.Default, CacheItemPriority.NotRemovable));
    }
}

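Summarizing the logic in the source, the policy must follow these rules:

Absolute expiration and sliding expiration cannot both be set (this is why the two were described earlier as mutually exclusive)
The sliding expiration must not be negative or longer than one year
RemovedCallback and UpdateCallback cannot both be set
The Priority property must be one of the enum values Default or NotRemovable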
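
The first two rules are easy to observe through the public API. In the sketch below, PolicyRules and the cache name "policy-demo" are made-up names, and System.Runtime.Caching is assumed to be referenced. Both exception types thrown by the validation derive from ArgumentException, so a single catch covers them.

```csharp
using System;
using System.Runtime.Caching;

public static class PolicyRules
{
    // Returns true if MemoryCache refuses to store an item with the given policy.
    public static bool Rejects(CacheItemPolicy policy)
    {
        using (var cache = new MemoryCache("policy-demo"))
        {
            try
            {
                cache.Set("k", "v", policy);
                return false;
            }
            catch (ArgumentException)
            {
                // Also catches ArgumentOutOfRangeException, which derives from ArgumentException.
                return true;
            }
        }
    }
}
```

For example, a policy that sets both AbsoluteExpiration and SlidingExpiration is rejected, as is a SlidingExpiration of 366 days, while a sliding expiration of one minute on its own is accepted.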

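MemoryCache thread-safety mechanism

According to MSDN, MemoryCache is thread-safe. This means that each individual operation on a cache item behaves atomically, and that concurrent access from multiple threads does not corrupt the data.

So how does MemoryCache achieve this?

MemoryCache uses locking internally to keep operations on cache items atomic. The lock is scoped to a MemoryCacheStore: data inside one store shares one lock, while different stores do not affect each other.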
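
The effect of one lock per store can be sketched with a small sharded map. ShardedMap is a hypothetical class, not the framework's code: it locks on each shard's dictionary directly, whereas MemoryCacheStore uses a dedicated lock object.

```csharp
using System;
using System.Collections.Generic;

public sealed class ShardedMap
{
    private readonly Dictionary<string, object>[] _shards;

    public ShardedMap(int shardCount)
    {
        _shards = new Dictionary<string, object>[shardCount];
        for (var i = 0; i < shardCount; i++)
        {
            _shards[i] = new Dictionary<string, object>(StringComparer.Ordinal);
        }
    }

    private Dictionary<string, object> ShardFor(string key)
    {
        // Clear the sign bit so the index is never negative.
        var hash = key.GetHashCode() & int.MaxValue;
        return _shards[hash % _shards.Length];
    }

    public void Set(string key, object value)
    {
        var shard = ShardFor(key);
        lock (shard) { shard[key] = value; } // only this shard is blocked
    }

    public object Get(string key)
    {
        var shard = ShardFor(key);
        lock (shard)
        {
            object value;
            return shard.TryGetValue(key, out value) ? value : null;
        }
    }
}
```

Two threads working on keys that land in different shards never wait on each other, which is the benefit the per-store lock gives MemoryCache over a single global lock.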

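Locking is involved in the following scenarios:

Enumerating the cache items in MemoryCache
Adding or updating a cache item in MemoryCache
Disposing the MemoryCache
Removing a cache item from MemoryCache

Most of these are easy to understand. The one worth a closer look is the first, enumeration. MemoryCache handles it with a lock plus a copy, which guarantees that no exception is thrown while enumerating.

In .NET 4.5.1, enumeration is implemented like this: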

protected override IEnumerator<KeyValuePair<string, object>> GetEnumerator() {
    Dictionary<string, object> h = new Dictionary<string, object>();
    if (!IsDisposed) {
        foreach (MemoryCacheStore store in _stores) {
            store.CopyTo(h);
        }
    }
    return h.GetEnumerator();
}

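store.CopyTo(h) is defined in MemoryCacheStore, which means each store takes and releases its own lock independently. Narrowing the scope of a lock is also an important way to improve performance. The core of CopyTo is a simple iteration under the lock: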

internal void CopyTo(IDictionary h) {
    lock (_entriesLock) {
        if (_disposed == 0) {
            foreach (DictionaryEntry e in _entries) {
                MemoryCacheKey key = e.Key as MemoryCacheKey;
                MemoryCacheEntry entry = e.Value as MemoryCacheEntry;
                if (entry.UtcAbsExp > DateTime.UtcNow) {
                    h[key.Key] = entry.Value;
                }
            }
        }
    }
}

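Somewhat surprisingly, MemoryCache makes enumeration thread-safe by copying the data. It is not a full copy, though: values are stored as object, so only references are copied (value types are already boxed when they are cached). Even so, every enumeration builds a new dictionary with one slot per live entry while holding each store's lock in turn, so with a large cache the extra allocation and lock time are not negligible. It is best to enumerate sparingly.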
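
The snapshot behaviour can be seen by modifying the cache while enumerating it, which would throw for a plain Dictionary. SnapshotDemo and the cache name are made-up names, System.Runtime.Caching is assumed to be referenced, and the cast is only there to make the interface being enumerated explicit.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Caching;

public static class SnapshotDemo
{
    // Returns how many items the enumeration saw.
    public static int CountWhileAdding()
    {
        using (var cache = new MemoryCache("snapshot-demo"))
        {
            cache.Set("a", 1, new CacheItemPolicy());
            cache.Set("b", 2, new CacheItemPolicy());

            var seen = 0;
            foreach (var item in (IEnumerable<KeyValuePair<string, object>>)cache)
            {
                // Adding entries here does not disturb the loop,
                // because the loop runs over a copy taken up front.
                cache.Set("extra-" + item.Key, 0, new CacheItemPolicy());
                seen++;
            }
            return seen;
        }
    }
}
```

The loop sees only the two items that existed when the snapshot was taken, even though the cache holds four by the time it finishes.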

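Conclusion

Following the thread of how MemoryCache organizes, manages and uses data, this article analyzed the implementation of the features that matter most in everyday use. MemoryCache spreads its data across several Hashtable instances through multiple MemoryCacheStore objects and uses a lock inside each store to keep operations thread-safe, which also eases the performance cost of a single global lock. To manage expiration, MemoryCache combines two mechanisms, periodic deletion and lazy deletion, so that expired items are reliably detected and removed promptly to free memory. Working through these features gives a clear picture of MemoryCache's internal data structures and lookup path, and yields experience that can guide future work.

Follow-up articles are planned; stay tuned.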