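OC Internals 05: The Class Structure's cache

The earlier analysis of the class structure touched on cache, which uses a hash table to cache methods. Here we explore cache in more depth.

cache source analysis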

    struct cache_t {
    #if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
        explicit_atomic<struct bucket_t *> _buckets;
        explicit_atomic<mask_t> _mask;
    #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
        explicit_atomic<uintptr_t> _maskAndBuckets;
        mask_t _mask_unused;
        // partially omitted...
    #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
        // _maskAndBuckets stores the mask shift in the low 4 bits, and
        // the buckets pointer in the remainder of the value. The mask
        // shift is the value where (0xffff >> shift) produces the correct
        // mask. This is equal to 16 - log2(cache_size).
        explicit_atomic<uintptr_t> _maskAndBuckets;
        mask_t _mask_unused;
        // partially omitted...
    #else
    #error Unknown cache mask storage type.
    #endif

    #if __LP64__
        uint16_t _flags;     // flag bits, read from outside the struct
    #endif
        uint16_t _occupied;  // number of occupied slots

        // some methods omitted...
    public:
        struct bucket_t *buckets();   // get the buckets
        mask_t mask();                // get the mask
        mask_t occupied();            // get occupied
        void incrementOccupied();     // increment occupied by one
        void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
        void initializeToEmpty();

        unsigned capacity();          // capacity of the cache
        bool isConstantEmptyCache();
        bool canBeFreed();

        // allocate memory
        void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
        // insert sel and imp
        void insert(Class cls, SEL sel, IMP imp, id receiver);
        // ...
    };

1. CACHE_MASK_STORAGE

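CACHE_MASK_STORAGE_OUTLINED: the runtime environment is macOS or the simulator
CACHE_MASK_STORAGE_HIGH_16: the runtime environment is a 64-bit device
CACHE_MASK_STORAGE_LOW_4: the runtime environment is a non-64-bit device

Because the code in this article runs on macOS, CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED is fixed at compile time, so the other branches are compiled out and jumping to them in the editor finds nothing.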
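The comment in the CACHE_MASK_STORAGE_LOW_4 branch describes a small piece of arithmetic: the stored shift is 16 - log2(cache_size), and 0xffff >> shift gives back the mask. The sketch below only works through that arithmetic. The helper names are hypothetical and are not part of the runtime.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; only the arithmetic comes from the source comment.

// log2 of a power-of-two cache size (4 -> 2, 8 -> 3, ...).
unsigned log2_pow2(uint32_t size) {
    unsigned n = 0;
    while (size > 1) {
        size >>= 1;
        ++n;
    }
    return n;
}

// Mask shift as described in the comment: 16 - log2(cache_size).
unsigned mask_shift_for(uint32_t cache_size) {
    return 16 - log2_pow2(cache_size);
}

// Recover the mask from the shift: 0xffff >> shift.
uint32_t mask_from_shift(unsigned shift) {
    return 0xffffu >> shift;
}
```

For a cache of size 4 the shift is 14 and the recovered mask is 3, which is cache_size - 1.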

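explicit_atomic: cache is used for method caching, and caching inevitably involves inserting, deleting, updating and looking up entries. explicit_atomic makes the wrapped value atomic, so individual reads and writes of these fields are safe across threads.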
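To make the idea concrete, here is a minimal sketch of a wrapper in the spirit of explicit_atomic, built on std::atomic. The name explicit_atomic_sketch is hypothetical, and the assumption is that the wrapper does little more than forbid implicit conversions and force callers to name a memory order. The runtime's own definition may differ.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Sketch only: explicit_atomic_sketch is a hypothetical name, not the
// runtime's type.
template <typename T>
struct explicit_atomic_sketch : public std::atomic<T> {
    explicit explicit_atomic_sketch(T initial) noexcept
        : std::atomic<T>(std::move(initial)) {}

    // No implicit conversion to T: callers must go through load().
    operator T() const = delete;

    // load/store hide the base versions, so a memory order must be given.
    T load(std::memory_order order) const noexcept {
        return std::atomic<T>::load(order);
    }
    void store(T desired, std::memory_order order) noexcept {
        std::atomic<T>::store(desired, order);
    }
};
```

With this shape, a call such as a.load() without a memory order does not compile, which keeps every access to the cache fields explicit about its ordering.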

2. _buckets

From the source we can see that _buckets is a value of type struct bucket_t *. The bucket_t source:

    struct bucket_t {
    private:
        // IMP-first is better for arm64e ptrauth and no worse for arm64.
        // SEL-first is better for armv7* and i386 and x86_64.
    #if __arm64__   // 64-bit device
        explicit_atomic<uintptr_t> _imp;
        explicit_atomic<SEL> _sel;
    #else           // everything else
        explicit_atomic<SEL> _sel;
        explicit_atomic<uintptr_t> _imp;
    #endif
        // some methods omitted
    public:
        // get the sel
        inline SEL sel() const {
            // ...
        }
        // get the imp; the class must be passed as an argument
        inline IMP imp(Class cls) const {
            // ...
        }
        // ...
    };

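Whatever the runtime environment, bucket_t has the two data members _imp and _sel; only their order differs.

sel and imp
- sel is the method's selector; think of it as the title of an entry in a table of contents
- imp is the pointer to the function that implements the method; think of it as that entry's page number

Debugging cache

Debug code built on top of the runtime source: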

    @interface LGPerson : NSObject
    @property (nonatomic, copy) NSString *lgName;
    @property (nonatomic, strong) NSString *nickName;

    - (void)sayHello;
    - (void)sayCode;
    - (void)sayMaster;
    - (void)sayNB;
    + (void)sayHappy;

    @end

    // main
    LGPerson *p = [LGPerson alloc];
    [p sayHello];
    [p sayCode];
    [p sayMaster];

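First, stop at a breakpoint right after [LGPerson alloc], before any method has been called.

At this point occupied and capacity are both 0. Step forward and look again after the first method, sayHello, has been called.

Because the code called sayHello, the runtime stores the method in the cache so that later calls are faster. So we see:

occupied = 1, one method is cached
capacity = 4, the cache has 4 slots (room for 4 bucket_t structs)
sel = "sayHello"
imp = 0x0000000100000c00 - [LGPerson sayHello]

We did find the called method in the cache. What happens when more methods are called? Next, put the breakpoint after sayMaster; at that point the cache should hold three methods.

In the actual debugging session, however, the cache holds only one method, while the capacity has grown to 8. Walking over all the entries in buckets, we find the cached method sayMaster, the last method the code called, in the second slot. Why?

Why was the cache emptied after the third method call?
If the cache was emptied, why did capacity still grow?
Why are methods stored in the cache out of order?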

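How a method is inserted into the cache

With these questions in mind, let's study how a method actually gets inserted into the cache. That process should answer the questions above.

Reading the source again, we see these two methods in cache_t: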

    void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
    void insert(Class cls, SEL sel, IMP imp, id receiver);

insert source

    void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver)
    {
    #if CONFIG_USE_CACHE_LOCK
        cacheUpdateLock.assertLocked();
    #else
        runtimeLock.assertLocked();
    #endif

        ASSERT(sel != 0 && cls->isInitialized());

        // Use the cache as-is if it is less than 3/4 full
        mask_t newOccupied = occupied() + 1;
        unsigned oldCapacity = capacity(), capacity = oldCapacity;
        if (slowpath(isConstantEmptyCache())) {
            // Cache is read-only. Replace it.
            if (!capacity) capacity = INIT_CACHE_SIZE;
            reallocate(oldCapacity, capacity, /* freeOld */false);
        }
        else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) {
            // Cache is less than 3/4 full. Use it as-is.
            // (with capacity 4 the threshold is 3)
        }
        else {
            capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE; // double the capacity
            if (capacity > MAX_CACHE_SIZE) {
                capacity = MAX_CACHE_SIZE;
            }
            reallocate(oldCapacity, capacity, true); // memory has been regrown
        }

        bucket_t *b = buckets();
        mask_t m = capacity - 1;
        mask_t begin = cache_hash(sel, m);
        mask_t i = begin;

        // Scan for the first unused slot and insert there.
        // There is guaranteed to be an empty slot because the
        // minimum size is 4 and we resized at 3/4 full.
        do {
            if (fastpath(b[i].sel() == 0)) {
                incrementOccupied();
                b[i].set<Atomic, Encoded>(sel, imp, cls);
                return;
            }
            if (b[i].sel() == sel) {
                // The entry was added to the cache by some other thread
                // before we grabbed the cacheUpdateLock.
                return;
            }
        } while (fastpath((i = cache_next(i, m)) != begin));

        cache_t::bad_cache(receiver, (SEL)sel, cls);
    }

Let's go through it piece by piece:

1. When buckets is empty

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = occupied() + 1;
    unsigned oldCapacity = capacity(), capacity = oldCapacity;
    if (slowpath(isConstantEmptyCache())) {
        // Cache is read-only. Replace it.
        if (!capacity) capacity = INIT_CACHE_SIZE;
        reallocate(oldCapacity, capacity, /* freeOld */false);
    }

    // isConstantEmptyCache source
    bool cache_t::isConstantEmptyCache()
    {
        return
            occupied() == 0 &&
            buckets() == emptyBucketsForCapacity(capacity(), false);
    }

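The branch is guarded by `isConstantEmptyCache`. When the condition holds, that is, when buckets is empty:

    if (!capacity) capacity = INIT_CACHE_SIZE;

capacity gets its initial value INIT_CACHE_SIZE, 0001 << 2 = 0100 = 4, so the initial capacity is 4. Then reallocate is called:

    reallocate(oldCapacity, capacity, false);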

    ALWAYS_INLINE
    void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
    {
        bucket_t *oldBuckets = buckets();
        bucket_t *newBuckets = allocateBuckets(newCapacity);

        // Cache's old contents are not propagated.
        // This is thought to save cache memory at the cost of extra cache fills.
        // fixme re-measure this

        ASSERT(newCapacity > 0);
        ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

        setBucketsAndMask(newBuckets, newCapacity - 1);

        if (freeOld) {
            cache_collect_free(oldBuckets, oldCapacity);
        }
    }

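allocateBuckets

In reallocate, which allocates the memory, first look at bucket_t *newBuckets = allocateBuckets(newCapacity);

Then step into the set method, which stores the method's SEL and IMP.

Now we have newBuckets. Back in reallocate, once newBuckets is obtained the code goes on to call setBucketsAndMask, passing newBuckets and capacity - 1 as arguments.

setBucketsAndMask stores these values with store calls that differ by runtime environment. _buckets, _mask and _occupied are the corresponding members of the cache_t struct.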
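A minimal sketch of what the OUTLINED variant of setBucketsAndMask might look like, written with plain std::atomic. The type names bucket_sketch and outlined_cache_sketch are hypothetical, mask_t is assumed to be a 32-bit integer, and resetting occupied to 0 is an assumption about the real function that is consistent with the debugging output earlier (a single entry left after the cache grew).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using mask_t = uint32_t;   // assumption: 32-bit mask type

// Hypothetical stand-in for bucket_t.
struct bucket_sketch {
    uintptr_t sel;
    uintptr_t imp;
};

// Hypothetical stand-in for the OUTLINED layout of cache_t.
struct outlined_cache_sketch {
    std::atomic<bucket_sketch *> buckets{nullptr};
    std::atomic<mask_t> mask{0};
    uint16_t occupied = 0;

    void setBucketsAndMask(bucket_sketch *newBuckets, mask_t newMask) {
        buckets.store(newBuckets, std::memory_order_release);
        mask.store(newMask, std::memory_order_release);
        occupied = 0;   // assumption: a fresh table starts with nothing counted
    }

    // Capacity is derived from the mask: mask + 1, or 0 when there is no mask.
    unsigned capacity() const {
        mask_t m = mask.load(std::memory_order_acquire);
        return m ? m + 1 : 0;
    }
};
```

Storing capacity - 1 as the mask is what lets the hash step later reduce a selector to a valid index with a single bitwise AND.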

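That completes a quick pass over the first if branch of insert. In short: when the cache is empty, capacity is set to INIT_CACHE_SIZE (4), reallocate allocates newBuckets, and setBucketsAndMask stores them together with the mask capacity - 1.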

2. When buckets is not empty

    else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) {
        // Cache is less than 3/4 full. Use it as-is.
        // CACHE_END_MARKER is a macro: #define CACHE_END_MARKER 1
        // i.e. is newOccupied + 1 less than or equal to three quarters of capacity?
    } else {
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE; // double the capacity
        if (capacity > MAX_CACHE_SIZE) {
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true); // memory has been regrown
    }

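If newOccupied (the current method count plus one) plus CACHE_END_MARKER is less than or equal to three quarters of capacity, execution simply continues.
If it exceeds three quarters, capacity is doubled and reallocate is called again. This time the freeOld argument passed to reallocate is true, so cache_collect_free is called.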
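The capacity decision can be traced with counters alone. The sketch below models only occupied and capacity, not the buckets. It assumes the empty-cache check can be modeled as capacity == 0, that occupied restarts at 0 whenever reallocate runs (the source comment says old contents are not propagated), and that MAX_CACHE_SIZE is 1 << 16. The struct and function names are hypothetical.

```cpp
#include <cassert>

constexpr unsigned kInitCacheSize = 4;         // INIT_CACHE_SIZE, 1 << 2
constexpr unsigned kCacheEndMarker = 1;        // CACHE_END_MARKER
constexpr unsigned kMaxCacheSize = 1u << 16;   // assumed value of MAX_CACHE_SIZE

struct counters {
    unsigned occupied;
    unsigned capacity;
};

// Apply insert()'s capacity decision for one new, distinct selector.
counters after_one_insert(counters c) {
    unsigned newOccupied = c.occupied + 1;
    if (c.capacity == 0) {
        // Empty cache: first allocation.
        c.capacity = kInitCacheSize;
        c.occupied = 0;
    } else if (newOccupied + kCacheEndMarker <= c.capacity / 4 * 3) {
        // Less than 3/4 full: keep the table as it is.
    } else {
        // Grow: double the capacity and start over with an empty table.
        c.capacity = c.capacity * 2;
        if (c.capacity > kMaxCacheSize) c.capacity = kMaxCacheSize;
        c.occupied = 0;
    }
    c.occupied += 1;   // the new selector takes one slot
    return c;
}

// Counters after n distinct selectors have been inserted into an empty cache.
counters after_inserts(unsigned n) {
    counters c{0, 0};
    for (unsigned k = 0; k < n; ++k) c = after_one_insert(c);
    return c;
}
```

Under these assumptions, three inserts leave occupied = 1 and capacity = 8, which matches what the debugger showed after sayMaster.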

cache_collect_free receives oldBuckets and oldCapacity and is responsible for releasing the old buckets.

3. Determining the insert position

    bucket_t *b = buckets();
    mask_t m = capacity - 1;
    mask_t begin = cache_hash(sel, m);
    mask_t i = begin;

    // cache_hash
    static inline mask_t cache_hash(SEL sel, mask_t mask)
    {
        return (mask_t)(uintptr_t)sel & mask;
    }

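The position where a bucket_t is inserted is not sequential: cache_hash computes it from the selector. A sequentially filled table would have to be scanned on lookup, whereas a hashed index leads straight to the slot.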
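A small worked example of the same masking arithmetic as cache_hash, with the selector represented by its address as a plain integer. The function name hash_index and the addresses used with it are hypothetical; mask_t is assumed to be a 32-bit integer.

```cpp
#include <cassert>
#include <cstdint>

using mask_t = uint32_t;   // assumption: 32-bit mask type

// Same arithmetic as cache_hash: keep only the low bits selected by the mask.
mask_t hash_index(uintptr_t sel_address, mask_t mask) {
    return static_cast<mask_t>(sel_address) & mask;
}
```

With mask 3 (capacity 4), a selector at the made-up address 0x1f2d lands in slot 1, and so does one at 0x1f31. The slot depends on the low bits of the selector's address and not on the order of the calls, which is why the cached methods look out of order, and two selectors can collide.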

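We can verify this:

After calling the first method, sayHello, look at where it was stored.

4. Checking the insert position

Different methods can hash to the same index, so before inserting, the code has to check whether the target slot already holds data.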

    do {
        if (fastpath(b[i].sel() == 0)) {
            incrementOccupied();
            b[i].set<Atomic, Encoded>(sel, imp, cls);
            return;
        }
        if (b[i].sel() == sel) {
            // The entry was added to the cache by some other thread
            // before we grabbed the cacheUpdateLock.
            return;
        }
    } while (fastpath((i = cache_next(i, m)) != begin));

    // cache_next
    static inline mask_t cache_next(mask_t i, mask_t mask) {
        return (i+1) & mask;
    }

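The check happens in the do..while loop, whose condition is fastpath((i = cache_next(i, m)) != begin). cache_next does not rehash; it moves to the next index, (i+1) & mask, wrapping around at the end of the table. The loop therefore probes slot after slot, starting at the hashed index, until it finds a usable slot or comes back to begin.

Inside the loop:

When the slot is empty, fastpath(b[i].sel() == 0), incrementOccupied is called first to increase the cached-method count, and then sel, imp and cls are stored together.
When the slot's sel equals the sel being inserted, another thread may already have cached the method, so nothing is stored. The entry was added to the cache by some other thread before we grabbed the cacheUpdateLock.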
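The probing loop can be sketched on a toy table. This is a simplified model, not the runtime's code: slot, next_index and probe_insert are hypothetical names, selectors are plain integers with 0 meaning "empty", and the table size is assumed to be a power of two so that size - 1 works as a mask.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using mask_t = uint32_t;   // assumption: 32-bit mask type

// Hypothetical stand-in for bucket_t; sel == 0 marks an empty slot.
struct slot {
    uintptr_t sel;
    uintptr_t imp;
};

// Same stepping rule as cache_next in the listing: next index, wrapping around.
mask_t next_index(mask_t i, mask_t mask) {
    return (i + 1) & mask;
}

// Insert (sel, imp) by linear probing from the hashed index.
// Returns the index that holds sel afterwards, or -1 if every slot is taken.
// The table size must be a power of two.
int probe_insert(std::vector<slot> &table, uintptr_t sel, uintptr_t imp) {
    mask_t mask = static_cast<mask_t>(table.size() - 1);
    mask_t begin = static_cast<mask_t>(sel) & mask;
    mask_t i = begin;
    do {
        if (table[i].sel == 0) {        // empty slot: store here
            table[i] = slot{sel, imp};
            return static_cast<int>(i);
        }
        if (table[i].sel == sel) {      // already cached: nothing to do
            return static_cast<int>(i);
        }
    } while ((i = next_index(i, mask)) != begin);
    return -1;                          // came back to begin: table is full
}
```

In the runtime the full-table case should not occur, because the cache grows before it passes three quarters full; the toy version returns -1 there only so that the loop has a defined result.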

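A brief summary of insert

With that, the earlier questions are answered. Growing the cache goes through reallocate, which does not carry the old entries over, so only the method that triggered the growth is left. Capacity doubles on growth, from 4 to 8. The slot is chosen by hashing the selector, not by call order.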