iOS 底层探索篇 —— cache_t分析

时间 2020-01-06

标签 ios 底层探索 cache 分析栏目 iOS 繁體版

原文原文链接

前言算法

iOS 底层探索篇 —— alloc、init、new的探索

iOS 底层探索篇 —— 内存字节对齐分析

iOS 底层探索篇 —— 对象的本质

iOS 底层探索篇 —— isa的初始化&指向分析

iOS 底层探索篇 —— 类的结构分析

观察`cache_t`结构体

1. 类结构体

struct objc_class : objc_object {
    // Class ISA;           //8
    Class superclass;       //8
    cache_t cache;          //16        // formerly cache pointer and vtable
    class_data_bits_t bits;  
...省略部分信息...
};
复制代码

2. `cache_t`结构体

struct cache_t {
    struct bucket_t *_buckets;  //8
    mask_t _mask;               //4      typedef uint32_t mask_t;  
    mask_t _occupied;           //4

public:
    struct bucket_t *buckets();
    mask_t mask();
    mask_t occupied();
    void incrementOccupied();
    void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
    void initializeToEmpty();

    mask_t capacity();
    bool isConstantEmptyCache();
    bool canBeFreed();

    static size_t bytesForCapacity(uint32_t cap);
    static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap);

    void expand();
    void reallocate(mask_t oldCapacity, mask_t newCapacity);
    struct bucket_t * find(SEL sel, id receiver);

    static void bad_cache(id receiver, SEL sel, Class isa) __attribute__((noreturn));
};
复制代码

3. `struct bucket_t *_buckets`

struct bucket_t {
private:
#if __arm64__
    uintptr_t _imp;   //typedef unsigned long           uintptr_t;
    SEL _sel;
#else
    SEL _sel;
    uintptr_t _imp;
#endif
    ...省略部分信息...
};
复制代码

_buckets一个指向bucket_t结构体的指针，能够看命名是一个数组；

_imp函数实现地址；

_sel函数；

注意在arm64的平台上_imp在_sel的前面。

4. `mask_t _mask`

一个修饰的值，在接下来的分析能够获得是 hash链表的长度减1。数组

5. `mask_t _occupied`

缓存已占用的空间缓存

`cache_t`缓存的内容

经过bucket_t结构体观察存的就是imp和sel，证明缓存的是OC的方法。bash

`cache_t`缓存的流程

1. 缓存入口

咱们在objc的源码objc_cache.mm文件的开始位置能够看到这么一段注释less

* Cache readers (PC-checked by collecting_in_critical())
 * objc_msgSend*
 * cache_getImp
 *
 * Cache writers (hold cacheUpdateLock while reading or writing; not PC-checked)
 * cache_fill         (acquires lock)
 * cache_expand       (only called from cache_fill)
 * cache_create       (only called from cache_expand)
 * bcopy               (only called from instrumented cache_expand)
 * flush_caches        (acquires lock)
 * cache_flush        (only called from cache_fill and flush_caches)
 * cache_collect_free (only called from cache_expand and cache_flush)
复制代码

既然是缓存就是要写入，咱们直接从Cache writers下面看，cache_fill就是缓存开始的位置。函数

2. `cache_fill`

void cache_fill(Class cls, SEL sel, IMP imp, id receiver)
{
#if !DEBUG_TASK_THREADS
    mutex_locker_t lock(cacheUpdateLock);
    cache_fill_nolock(cls, sel, imp, receiver);
#else
    _collecting_in_critical();
    return;
#endif
}
复制代码

忽略其余信息，只有cache_fill_nolock这个函数是咱们的目标。post

3. `cache_fill_nolock`

static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();

    // Never cache before +initialize is done
    if (!cls->isInitialized()) return;

    // Make sure the entry wasnot added to the cache by some other thread 
    // before we grabbed the cacheUpdateLock.
    if (cache_getImp(cls, sel)) return;

    cache_t *cache = getCache(cls);

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = cache->occupied() + 1;
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
       // Cache is read-only. Replace it.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        // Cache is too full. Expand it.
        cache->expand();
    }

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the 
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(sel, receiver);
    if (bucket->sel() == 0) cache->incrementOccupied();
    bucket->set<Atomic>(sel, imp);
}
复制代码

`cache_t`缓存的流程深刻

从上面的分析咱们知道在函数cache_fill_nolock里面会实现缓存步骤。接下来开始对每一步深刻研究。ui

1. 缓存主线流程分析

1. 条件判断

if (!cls->isInitialized()) return;
 if (cache_getImp(cls, sel)) return;
复制代码

判断类是否已经初始化，尚未就直接返回了。

在类里面是否能够找到sel，能够找到，说明已经缓存了，直接返回。

2. `cache_t *cache = getCache(cls)`

cache_t *getCache(Class cls) 
{
    assert(cls);
    return &cls->cache;
}
复制代码

从当前类获取cache成员。this

3. `mask_t newOccupied = cache->occupied() + 1`

occupied()函数返回的是_occupied，即当前已经占用的空间大小。

_occupied + 1 ，表示新的即将占用的内存空间大小。

4. `mask_t capacity = cache->capacity()`

mask_t cache_t::capacity() 
{
    return mask() ? mask()+1 : 0; 
    //mask() return _mask;
}
复制代码

获取缓存的容量，注意这里 + 1，缘由是由于赋值的时候，用了容量 - 1。atom

5. 缓存容量的处理

if (cache->isConstantEmptyCache()) {
        // Cache is read-only. Replace it.
        //INIT_CACHE_SIZE 4
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        // Cache is too full. Expand it.
        cache->expand();
    }
复制代码

cache是空的，就须要reallocte。

newOccupied小于或者等于总容量的3/4，就走后面的。

若是大于3/4，就须要expand()即缓存扩容。

6. 查找`bucket`以及赋值

bucket_t *bucket = cache->find(sel, receiver);
if (bucket->sel() == 0) cache->incrementOccupied();
bucket->set<Atomic>(sel, imp);
复制代码

缓存的主线流程就是这样的一个过程的，接下来把第5步和第6步进行分析。

2. 缓存主线流程 `容量处理过程`和`查找过程`分析

1. 容量处理过程

cache->isConstantEmptyCache()

bool cache_t::isConstantEmptyCache()
{
    return 
        occupied() == 0  &&  
        buckets() == emptyBucketsForCapacity(capacity(), false);
}
复制代码

第一次进来的时候occupied()返回值_occupoed值为0。

buckets()返回值_buckets是一个尚未初始化的指针为nil。

bucket_t *emptyBucketsForCapacity(mask_t capacity, bool allocate = true)
{
    cacheUpdateLock.assertLocked();

    size_t bytes = cache_t::bytesForCapacity(capacity);
   
    // Use _objc_empty_cache if the buckets is small enough.

    if (bytes <= EMPTY_BYTES) {
        return (bucket_t *)&_objc_empty_cache;
    }
    
    *
    走到这个就会直接返回了
    bucket_t *          8                          1000
    _objc_empty_cache   16                        10000
                        &                         00000
    *
    
    //下面的代码与咱们的流程没有关系 暂不分析
    // Use shared empty buckets allocated on the heap.
    static bucket_t **emptyBucketsList = nil;
    static mask_t emptyBucketsListCount = 0;
    
    mask_t index = log2u(capacity);

    if (index >= emptyBucketsListCount) {
        if (!allocate) return nil;

        mask_t newListCount = index + 1;
        bucket_t *newBuckets = (bucket_t *)calloc(bytes, 1);
        emptyBucketsList = (bucket_t**)
            realloc(emptyBucketsList, newListCount * sizeof(bucket_t *));
        // Share newBuckets for every un-allocated size smaller than index.
        // The array is therefore always fully populated.
        for (mask_t i = emptyBucketsListCount; i < newListCount; i++) {
            emptyBucketsList[i] = newBuckets;
        }
        emptyBucketsListCount = newListCount;

        if (PrintCaches) {
            _objc_inform("CACHES: new empty buckets at %p (capacity %zu)", 
                         newBuckets, (size_t)capacity);
        }
    }

    return emptyBucketsList[index];
}
复制代码

cache_t::bytesForCapacity(capacity)，参数capacity为0，实现为sizeof(bucket_t) * (cap + 1)，bucket_t结构体里面有两个成员都是8字节总共16字节，运算结果为16。

EMPTY_BYTES，一个宏定义，在DEBUG值为(8+1)*16，不然为(1024+1)*16。

if (bytes <= EMPTY_BYTES)显然这个条件是知足的。

(bucket_t *)&_objc_empty_cache，这是一个&运算，OBJC_EXPORT struct objc_cache _objc_empty_cache这是_objc_empty_cache的定义，它的字节占用的值为16，咱们在objc_cache这个结构体里面是能够看到的。 8 & 16 记过计算出来为0。

在if条件里面返回了，后面的流程就不用看了。

则第一次进来的时候这个条件判断是否为空的缓存是存在的。

cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE)

表达的意思为缓存空的时候，就去开辟。

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();
   
    bucket_t *oldBuckets = buckets();

    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache is old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);
  
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}
复制代码

bool freeOld = canBeFreed()，函数的实现就是!isConstantEmptyCache()，则freeOld的值为true。

bucket_t *newBuckets = allocateBuckets(newCapacity)，根据容量系统开辟一个新的buckets。

setBucketsAndMask(newBuckets, newCapacity - 1)。作了三件事，空的_buckets赋值；_mask赋值，注意这里是容量 - 1；_occupied已占用的空间置为空。

if (freeOld)经过上面的取值以及知道是成立的，主要作了清空旧的缓存。

newOccupied <= capacity / 4 * 3

这个意思就是新占用的空间是否小于总容量的3/4。

cache->expand()

同比第7步，若是容量大于3/4的时候，内存须要扩容。

void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can it grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);
}
复制代码

uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;，这里扩容后的容量会是旧容量的两倍。

reallocate(oldCapacity, newCapacity)，这一步如同刚开始没有缓存的时候，从新让系统来开辟。

2. 查找过程

bucket_t *bucket = cache->find(sel, receiver)

bucket_t * cache_t::find(SEL s, id receiver)
{
    assert(s != 0);

    bucket_t *b = buckets(); //获取_buckets
    mask_t m = mask(); //获取_mask

    mask_t begin = cache_hash(s, m);
    mask_t i = begin;
    do {
        if (b[i].sel() == 0  ||  b[i].sel() == s) {
            return &b[i];
        }
    } while ((i = cache_next(i, m)) != begin); 
    
    // hack
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)s, cls);
}
复制代码

cache_hash(sel, _mask)，这是一个hash算法return (mask_t)(uintptr_t)sel & mask，这里也能够知道_mask会比总容量小1的缘由，就是这里计算的时候保证不会越界。

do...while循环，当_sel不存在或者在缓存里面匹配到了，就返回当前的bucket。循环的条件在arm64下面是return i ? i-1 : mask;，实际上就是造成一个闭环来查找。

if (bucket->sel() == 0) cache->incrementOccupied()

void cache_t::incrementOccupied() 
{
    _occupied++;
}
复制代码

表示在缓存里面没有找到的时候，这个时候要存到缓存里面，则已占用的空间须要增长。

bucket->set<Atomic>(sel, imp)

void bucket_t::set(SEL newSel, IMP newImp)
{
    _imp = (uintptr_t)newImp;
    
    if (_sel != newSel) {
        if (atomicity == Atomic) {
            mega_barrier();
        }
        _sel = newSel;
    }
}
复制代码

这一步就是表示当前取出来的bucket来从新赋值_sel和_imp这两个成员。

到此为止cache的缓存流程就结束了。

iOS 底层探索篇 —— cache_t分析

观察cache_t结构体

1. 类结构体

2. cache_t结构体

3. struct bucket_t *_buckets

4. mask_t _mask

5. mask_t _occupied

cache_t缓存的内容

cache_t缓存的流程