iOS Internals Study - The Fantastic Journey from Compilation to Launch (Part 3)

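In the previous two chapters we followed an App from compilation up to the call of main, and we learned that _objc_init registers itself with dyld, the dynamic linker, when images are loaded; the two communicate through that registration. But once dyld has loaded the relevant images, how do those images actually get into memory, and in what form do they live there? That is the core question of this chapter.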

Related articles:

iOS Internals Study - The Fantastic Journey from Compilation to Launch (Part 2)

iOS Internals Study - The Past and Present of Classes (Part 1)

We know that the main job of dyld is to link dynamic libraries and images. So how is the objc image itself read into memory? Let's read the source to find out.

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    lock_init();
    exception_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

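From the source we can draw the following conclusions:

  • _objc_init is called by libSystem
  • This function performs the bootstrap initialization and registers the image notifier with dyld, which is how images end up being loaded into memory

Below we analyze step by step how the objc classes and related data are loaded.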

Preparation

Before any image is loaded, objc performs a series of preparation steps. We go through them one by one below.

environ_init

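As the name suggests, this function reads the environment variables that affect the runtime. You can use export OBJC_HELP=1 to print the available environment variables for debugging, or set them in Xcode to get the output you want. See OBJC_HELP for details.

  • OBJC_DISABLE_NONPOINTER_ISA controls non-pointer isa. When it is set, non-pointer isa is disabled, so the isa value points directly at the class and does not need to be ANDed with a mask
  • OBJC_PRINT_LOAD_METHODS prints the +load methods of classes and categories, which is very helpful for launch-time optimization.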
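
To make the idea of an environment-variable switch concrete, here is a minimal sketch. The helper env_flag_enabled is hypothetical and not part of libobjc, and treating only the exact value "YES" as enabled is an assumption made for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of libobjc): reports whether an
// environment variable is set to exactly "YES". Treating only "YES"
// as enabled is an assumption of this sketch.
bool env_flag_enabled(const char *name) {
    assert(name != nullptr);
    const char *value = std::getenv(name);
    return value != nullptr && std::strcmp(value, "YES") == 0;
}
```

A runtime could call such a helper once during startup and cache the result in a global flag, so later code paths only test a boolean.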

tls_init

This function binds thread keys, for example the destructor for per-thread data.

static_init

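According to its comment, this function does the following:

  • Runs the C++ static constructor functions
  • libc calls _objc_init() before dyld would call our static constructors, so libobjc has to run them itself at this point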
/***********************************************************************
* static_init
* Run C++ static constructor functions.
* libc calls _objc_init() before dyld would call our static constructors, 
* so we have to do it ourselves.
**********************************************************************/
static void static_init()
{
    size_t count;
    auto inits = getLibobjcInitializers(&_mh_dylib_header, &count);
    for (size_t i = 0; i < count; i++) {
        inits[i]();
    }
}
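
The loop in static_init is simply "fetch a table of function pointers and call each one". The sketch below reproduces that shape with a toy table. get_demo_initializers is a hypothetical stand-in for getLibobjcInitializers, and the two initializers exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Initializer = void (*)();

// Records the order in which the demo initializers ran.
static std::vector<int> g_ran;

static void init_a() { g_ran.push_back(1); }
static void init_b() { g_ran.push_back(2); }

// Hypothetical stand-in for getLibobjcInitializers(): returns a table of
// initializer function pointers and writes its length to *count.
static Initializer *get_demo_initializers(std::size_t *count) {
    static Initializer table[] = { init_a, init_b };
    *count = sizeof(table) / sizeof(table[0]);
    return table;
}

// Same loop shape as static_init(): call every initializer in table order.
void run_demo_initializers() {
    std::size_t count = 0;
    Initializer *inits = get_demo_initializers(&count);
    for (std::size_t i = 0; i < count; i++) {
        assert(inits[i] != nullptr);
        inits[i]();
    }
}
```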

lock_init

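The implementation is empty, which suggests that objc relies on C++ locking mechanisms (such as the mutex_locker_t seen later in map_images) rather than on setup done here.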

exception_init

Initializes libobjc's exception handling system, which is used to catch crashes such as calls to unimplemented methods.

_dyld_objc_notify_register

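We already met this function in the previous chapter. It registers callbacks with dyld so that dyld can notify objc as it links and loads images.

  • This function is for use by the objc runtime only
  • It registers handlers to be called when objc images are mapped, unmapped, and initialized
  • dyld calls back the mapped function with an array of the images that contain an objc-image-info section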

//
// Note: only for use by objc runtime
// Register handlers to be called when objc images are mapped, unmapped, and initialized.
// Dyld will call back the "mapped" function with an array of images that contain an objc-image-info section.
// Those images that are dylibs will have the ref-counts automatically bumped, so objc will no longer need to
// call dlopen() on them to keep them from being unloaded.  During the call to _dyld_objc_notify_register(),
// dyld will call the "mapped" function with already loaded objc images.  During any later dlopen() call,
// dyld will also call the "mapped" function.  Dyld will call the "init" function when dyld would be called
// initializers in that image.  This is when objc calls any +load methods in that image.
//
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped);
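
The contract in the comment above can be modeled with a toy loader: on registration, the mapped callback is replayed for the images that are already loaded, and each later open calls mapped and then init. Everything in this sketch (ToyLoader, notify_register, open) is hypothetical and nothing here calls into real dyld.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model of the registration contract; all names are hypothetical.
struct ToyLoader {
    using Mapped = std::function<void(const std::vector<std::string> &)>;
    using Init = std::function<void(const std::string &)>;

    std::vector<std::string> loaded;  // images mapped so far
    Mapped mapped;
    Init init;

    // On registration, replay the already-loaded images to "mapped".
    void notify_register(Mapped m, Init i) {
        assert(m != nullptr && i != nullptr);
        mapped = std::move(m);
        init = std::move(i);
        if (!loaded.empty()) mapped(loaded);
    }

    // A later load (think dlopen) reports the new image to "mapped",
    // then calls "init" at the point its initializers would run.
    void open(const std::string &image) {
        loaded.push_back(image);
        if (mapped) mapped(std::vector<std::string>{image});
        if (init) init(image);
    }
};
```

In the real runtime the init callback is where +load methods are called, as the header comment states. The toy model only records the ordering.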

map_images

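This is the main function triggered when images are loaded into memory, so we focus on how it loads data, classes, categories, methods and so on, and in what form they end up in memory.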

/***********************************************************************
* map_images
* Process the given images which are being mapped in by dyld.
* Calls ABI-agnostic code after taking ABI-specific locks.
*
* Locking: write-locks runtimeLock
**********************************************************************/
void
map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    mutex_locker_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}

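Reading the source of map_images_nolock, we find that hCount is the number of images and that, when hCount is greater than 0, _read_images is called to load them. So the actual loading into memory must happen there.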

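_read_images in detail

The code is long, so we first give an overview and then study it step by step. Roughly, it does the following:

  • On the first call, create the tables that will hold all classes

    gdb_objc_realized_classes is the table of all classes, both realized and unrealized

    allocatedClasses is the table of all classes (and metaclasses) allocated with objc_allocateClassPair

  • Remap all classes

  • Register all SELs in the namedSelector table

  • Fix up old objc_msgSend_fixup call sites, which would otherwise leave some messages unhandled

  • Add all Protocols to the protocol_map table

  • Remap all Protocols to obtain their references

  • Realize all non-lazy classes, setting up rw, ro and so on

  • Walk the resolved future classes that were marked earlier and realize them as well

  • Process all Categories, for both classes and metaclasses

  • Realize any classes that are still unrealized

Below we concentrate on how classes are loaded.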

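doneOnce

The doneOnce variable means this work is done only once. Because it creates the tables, it needs to run only a single time. The main code is as follows: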

if (!doneOnce) {
        doneOnce = YES;
        ...
        initializeTaggedPointerObfuscator();
        // namedClasses
        // Preoptimized classes don't go in this table. // 4/3 is NXMapTable's load factor
        //✅ Size the hash table that stores classes, based on the current class count
        int namedClassesSize = 
            (isPreoptimized() ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
        //✅ Create the table that contains all classes and metaclasses
        gdb_objc_realized_classes =
            NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);
        //✅ Create the table that contains only classes that have already been initialized
        allocatedClasses = NXCreateHashTable(NXPtrPrototype, 0, nil);
        
        ts.log("IMAGE TIMES: first time tasks");
    }

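
The sizing expression is plain integer arithmetic, so a tiny worked example may help. The function named_classes_size below is a hypothetical wrapper around the same expression; the sample numbers are made up.

```cpp
#include <cassert>

// Hypothetical wrapper around the namedClassesSize expression: pick the
// class count that applies, then scale it by 4/3 using integer math.
int named_classes_size(bool preoptimized, int totalClasses,
                       int unoptimizedTotalClasses) {
    assert(totalClasses >= 0 && unoptimizedTotalClasses >= 0);
    return (preoptimized ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
}
```

With 300 classes and no preoptimization the table is sized for 400 entries. With preoptimization only the unoptimized classes count, so 1000 total classes of which 90 are unoptimized gives 120.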

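Remapping classes

The code for remapping classes is shown below. It walks the class list, processes each class, and adds it to the corresponding tables.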

// Discover classes. Fix up unresolved future classes. Mark bundle classes.

    for (EACH_HEADER) {
        // ✅ Fetch all classes from the compiled class list; the result is a pointer to classref_t
        classref_t *classlist = _getObjc2ClassList(hi, &count);
        
        if (! mustReadClasses(hi)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->isPreoptimized();
        
        for (i = 0; i < count; i++) {
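             // ✅ The array yields system classes such as OS_dispatch_queue_concurrent, OS_xpc_object and NSRunLoop (classes from CF, Foundation, libdispatch and so on) as well as the classes we created ourselves
             // ✅ At this point a class is only an address and carries no other information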
            Class cls = (Class)classlist[i];
            
            // ✅ Get the processed class from readClass; readClass is analyzed below
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            //✅ Reserve memory for classes that resolved a future class: their data is not loaded yet and the class itself is not initialized
            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs 
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.

                //✅ Append the resolved future class to the array
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses, 
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

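Now let's see how readClass processes a class.

readClass contains a branch (shown in a figure in the original post) that sets up the rw of cls and related data. We know that rw stores the class's methods and so on, so is this where that data gets filled in?

Setting a breakpoint inside that branch shows that it is never hit. Neither the classes we created nor the system classes go through it, so the rw data is not populated here.

Judging from the condition highlighted in the figure, this branch is special handling reserved for future classes that are still waiting to be resolved.

Reading on, we reach the following code, which mainly calls two functions: addNamedClass and addClassTableEntry.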

if (headerIsPreoptimized  &&  !replacing) {
        // class list built in shared cache
        // fixme strict assert does not work because of duplicates
        // assert(cls == getClass(name));
        assert(getClassExceptSomeSwift(mangledName));
    } else {
        addNamedClass(cls, mangledName, replacing);
        addClassTableEntry(cls);
}

Looking at addNamedClass, it mainly adds the class to the overall low-level hash table.

/***********************************************************************
* addNamedClass
* Adds name => cls to the named non-meta class map.
* Warns about duplicate class names and keeps the old mapping.
* Locking: runtimeLock must be held by the caller
**********************************************************************/
static void addNamedClass(Class cls, const char *name, Class replacing = nil)
{
    runtimeLock.assertLocked();
    Class old;
    if ((old = getClassExceptSomeSwift(name))  &&  old != replacing) {
        inform_duplicate(name, old, cls);

        // getMaybeUnrealizedNonMetaClass uses name lookups.
        // Classes not found by name lookup must be in the
        // secondary meta->nonmeta table.
        addNonMetaClass(cls);
    } else {
        //✅ Add the class to the overall table
        NXMapInsert(gdb_objc_realized_classes, name, cls);
    }
    assert(!(cls->data()->flags & RO_META));

    // wrong: constructed classes are already realized when they get here
    // assert(!cls->isRealized());
}

Looking at addClassTableEntry: because the class now has an address and has been initialized, it is also added to the allocatedClasses hash table.

/***********************************************************************
* addClassTableEntry
* Add a class to the table of all classes. If addMeta is true,
* automatically adds the metaclass of the class as well.
* Locking: runtimeLock must be held by the caller.
**********************************************************************/
static void addClassTableEntry(Class cls, bool addMeta = true) {
    runtimeLock.assertLocked();

    // This class is allowed to be a known class via the shared cache or via
    // data segments, but it is not allowed to be in the dynamic table already.
    assert(!NXHashMember(allocatedClasses, cls));

    if (!isKnownClass(cls))
        NXHashInsert(allocatedClasses, cls);
    if (addMeta)
        addClassTableEntry(cls->ISA(), false);
}

At this point the initialized classes have been added to both tables.
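
The two-table bookkeeping can be sketched with standard containers. ToyClass and ToyRuntime are hypothetical types for illustration: a name-to-class map plays the role of gdb_objc_realized_classes and a pointer set plays the role of allocatedClasses. The real NXMapTable and NXHashTable are different data structures.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical toy class: a name plus a pointer to its metaclass.
struct ToyClass {
    std::string name;
    ToyClass *isa;  // metaclass, or nullptr for a metaclass in this toy
};

struct ToyRuntime {
    std::unordered_map<std::string, ToyClass *> namedClasses;  // by name
    std::unordered_set<ToyClass *> allocatedClasses;           // by pointer

    // Like addNamedClass: a duplicate name keeps the old mapping.
    // Returns true when the class was inserted.
    bool addNamedClass(ToyClass *cls) {
        assert(cls != nullptr);
        return namedClasses.emplace(cls->name, cls).second;
    }

    // Like addClassTableEntry: record the class and, when addMeta is
    // true, its metaclass as well.
    void addClassTableEntry(ToyClass *cls, bool addMeta = true) {
        assert(cls != nullptr);
        allocatedClasses.insert(cls);
        if (addMeta && cls->isa) addClassTableEntry(cls->isa, false);
    }
};
```

Note how the name table holds one entry per class name, while the pointer set grows by two per class because the metaclass is recorded too.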

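Adding SELs to the table

The SEL handling is shown below. It is again a table-write operation: the selectors are written into the namedSelector table, which is a different table from the one used for classes.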

//✅ Register every SEL in a hash table, which is a separate hash table
    // Fix up @selector references
    static size_t UnfixedSelectors;
    {
        mutex_locker_t lock(selLock);
        for (EACH_HEADER) {
            if (hi->isPreoptimized()) continue;
            
            bool isBundle = hi->isBundle();
            SEL *sels = _getObjc2SelectorRefs(hi, &count);
            UnfixedSelectors += count;
            for (i = 0; i < count; i++) {
                const char *name = sel_cname(sels[i]);
                //✅ Register the SEL
                sels[i] = sel_registerNameNoLock(name, isBundle);
            }
        }
    }

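
The effect of the fix-up loop is that every selector reference with the same name ends up holding the same canonical pointer. A minimal sketch of that uniquing step, using a hypothetical ToySelectorTable instead of the runtime's real selector table:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical selector table: one canonical string per selector name.
class ToySelectorTable {
public:
    // Returns the canonical pointer for name, registering it on first use.
    // Elements of std::unordered_set never move, so the pointer stays valid.
    const char *registerName(const char *name) {
        assert(name != nullptr);
        return names_.emplace(name).first->c_str();
    }

private:
    std::unordered_set<std::string> names_;
};

// Same loop shape as the @selector fix-up: overwrite each reference with
// the canonical pointer for its name.
void fixupSelectorRefs(ToySelectorTable &table, const char **refs,
                       std::size_t count) {
    for (std::size_t i = 0; i < count; i++) {
        refs[i] = table.registerName(refs[i]);
    }
}
```

After the fix-up, comparing two selectors is a pointer comparison rather than a string comparison.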

Adding all Protocols to the protocol_map table

The relevant code is as follows.

// Discover protocols. Fix up protocol refs.
    //✅ Walk every protocol list and load the protocols into the Protocol hash table
    for (EACH_HEADER) {
        extern objc_class OBJC_CLASS_$_Protocol;
        //✅ cls is the Protocol class; protocols are laid out much like objects, and every protocol's isa points to the Protocol class
        Class cls = (Class)&OBJC_CLASS_$_Protocol;
        assert(cls);
        //✅ Get the protocol hash table
        NXMapTable *protocol_map = protocols();
        bool isPreoptimized = hi->isPreoptimized();
        bool isBundle = hi->isBundle();

        //✅ Read the protocols emitted by the compiler and initialize them
        protocol_t **protolist = _getObjc2ProtocolList(hi, &count);
        for (i = 0; i < count; i++) {
            readProtocol(protolist[i], cls, protocol_map, 
                         isPreoptimized, isBundle);
        }
    }


Realizing all non-lazy classes and setting up rw, ro and so on

// Realize non-lazy classes (for +load methods and static instances)
    //✅ Realize the non-lazy classes (those with +load methods and static instances)
    for (EACH_HEADER) {
        //✅ Get the list of non-lazy classes
        classref_t *classlist = 
            _getObjc2NonlazyClassList(hi, &count);
        for (i = 0; i < count; i++) {
            //✅ Remap the class taken from the image's list
            Class cls = remapClass(classlist[i]);
            // printf("non-lazy Class:%s\n",cls->mangledName());
            if (!cls) continue;

            // hack for class __ARCLite__, which did not get this above
#if TARGET_OS_SIMULATOR
            if (cls->cache._buckets == (void*)&_objc_empty_cache  &&  
                (cls->cache._mask  ||  cls->cache._occupied)) 
            {
                cls->cache._mask = 0;
                cls->cache._occupied = 0;
            }
            if (cls->ISA()->cache._buckets == (void*)&_objc_empty_cache  &&  
                (cls->ISA()->cache._mask  ||  cls->ISA()->cache._occupied)) 
            {
                cls->ISA()->cache._mask = 0;
                cls->ISA()->cache._occupied = 0;
            }
#endif
            //✅ Insert into the allocatedClasses table again
            addClassTableEntry(cls);

            if (cls->isSwiftStable()) {
                if (cls->swiftMetadataInitializer()) {
                    _objc_fatal("Swift class %s with a metadata initializer "
                                "is not allowed to be non-lazy",
                                cls->nameForLogging());
                }
                // fixme also disallow relocatable classes
                // We can not disallow all Swift classes because of
                // classes like Swift.__EmptyArrayStorage
            }
            //✅ Realize every non-lazy class (set up parts of the class object, such as rw)
            realizeClassWithoutSwift(cls);
        }
    }


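Looking at realizeClassWithoutSwift

ro stands for read-only data and is already filled in at compile time. rw, however, has not been assigned at this point, so this step mainly initializes rw.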

// fixme verify class is not in an un-dlopened part of the shared cache?

    ro = (const class_ro_t *)cls->data();
    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        rw = cls->data();
        ro = cls->data()->ro;
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Normal class. Allocate writeable class data.
        rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1);
        rw->ro = ro;
        rw->flags = RW_REALIZED|RW_REALIZING;
        cls->setData(rw);
    }

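The code below shows that the class's superclass and isa are handled here. The function recurses along the inheritance chain until the class is nil, that is, until it reaches the superclass of NSObject.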

if (!cls) return nil;
...
supercls = realizeClassWithoutSwift(remapClass(cls->superclass));
metacls = realizeClassWithoutSwift(remapClass(cls->ISA()));

cls->superclass = supercls;
cls->initClassIsa(metacls);
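
The recursion can be modeled with a toy class graph. DemoClass and realize are hypothetical. The early return for an already realized class and the choice to set the flag before recursing are assumptions of this sketch; they are what lets the walk terminate even though the root metaclass's isa points back to itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy class with the two links the recursion follows.
struct DemoClass {
    std::string name;
    DemoClass *superclass;  // nullptr above the root class
    DemoClass *isa;         // metaclass; the root metaclass points to itself
    bool realized;
};

// Realizes cls, then its superclass chain and its metaclass chain.
// Each class is appended to visited when it is first realized.
DemoClass *realize(DemoClass *cls, std::vector<std::string> &visited) {
    if (!cls) return nullptr;       // recursion ends above the root class
    if (cls->realized) return cls;  // assumption: skip realized classes
    cls->realized = true;           // set before recursing so cycles end
    visited.push_back(cls->name);
    realize(cls->superclass, visited);
    realize(cls->isa, visited);
    assert(!cls->superclass || cls->superclass->realized);
    return cls;
}
```

Realizing a single subclass therefore pulls in its whole ancestry: the class, its superclasses, and the matching metaclasses.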

Finally, we notice that the function ends by calling methodizeClass. Judging by its name, it probably sets up methods and related data.

static Class realizeClassWithoutSwift(Class cls)
{
    ...
    // Attach categories
    methodizeClass(cls);
    return cls;
}


In the implementation of methodizeClass we find where the freshly initialized rw gets its values: the relevant data is taken out of ro and attached to rw.

// Install methods and properties that the class implements itself.
    method_list_t *list = ro->baseMethods();
    if (list) {
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls));
        rw->methods.attachLists(&list, 1);
    }

    property_list_t *proplist = ro->baseProperties;
    if (proplist) {
        rw->properties.attachLists(&proplist, 1);
    }

    protocol_list_t *protolist = ro->baseProtocols;
    if (protolist) {
        rw->protocols.attachLists(&protolist, 1);
    }

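So how does attachLists insert the data?

The code shows that the old lists are shifted back by addedCount positions and the new addedLists are then copied in as a block at the front of the array. This is why a category method appears to override a same-named method of the class itself: the category's method is found first and is the one that gets called, while the original method is still there and has not been overwritten.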

void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count; // e.g. 10 existing lists
            uint32_t newCount = oldCount + addedCount; // e.g. 10 + 4 when addedCount is 4
            setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
            array()->count = newCount; // e.g. 14
   
            memmove(array()->lists + addedCount, array()->lists,
                    oldCount * sizeof(array()->lists[0]));
            
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
        } 
        else {
            // 1 list -> many lists
            List* oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
    }
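
The front-insertion behaviour is easy to check with a toy model. The sketch below uses std::vector instead of the runtime's raw arrays, and Method, MethodList and lookup are hypothetical names. It assumes that lookup scans the lists from front to back and stops at the first match.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical method record: selector name plus a label for the IMP.
struct Method {
    std::string sel;
    std::string imp;
};
using MethodList = std::vector<Method>;

// Toy attachLists: newly attached lists go to the front, existing lists
// move back by added.size() positions.
void attachLists(std::vector<MethodList> &lists,
                 const std::vector<MethodList> &added) {
    if (added.empty()) return;
    std::size_t oldCount = lists.size();
    lists.insert(lists.begin(), added.begin(), added.end());
    assert(lists.size() == oldCount + added.size());
}

// Assumed lookup order: front to back, first match wins.
const Method *lookup(const std::vector<MethodList> &lists,
                     const std::string &sel) {
    for (const MethodList &list : lists) {
        for (const Method &m : list) {
            if (m.sel == sel) return &m;
        }
    }
    return nullptr;
}
```

If the class's own list is attached first and a category's list second, a lookup for a shared selector returns the category's entry, while the class's entry remains in the later list untouched.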

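The main flow of loading a class is as follows

read_images first creates a global class table, gdb_objc_realized_classes, and a table of already initialized classes, allocatedClasses. It then initializes the classes and adds them to those tables, and maps SELs, Protocols and so on into their own tables in memory, which are separate from the class tables. When it processes the non-lazy classes, realizeClassWithoutSwift reads ro and initializes rw, and methodizeClass then fills rw from ro, which completes the loading of the class data.

At this point everything a class needs has been assigned and loaded.

To be continued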
