Using CMA memory in Linux

CMA stands for contiguous memory allocator. It is a region set aside to make it easier to allocate physically contiguous memory, and it is usually defined as reserved-memory.

Early Linux kernels had no CMA implementation. If a driver wanted a large block of physically contiguous memory, the only option was to carve out a dedicated reserved region and map it in the driver with ioremap as private memory. The downside is that part of memory is permanently reserved and can no longer serve as general-purpose system memory. Devices such as camera and audio need large contiguous buffers for DMA while they are working, but when they are idle the reserved memory cannot be used by any other module.

How can the operating system make full use of physical memory? Ideally, when a device needs a large contiguous block of physical memory it can obtain one easily, and when the device is idle that memory can be allocated as ordinary memory by other parts of the system. CMA was introduced to solve exactly this problem. Memory defined as a CMA region is still managed by the operating system: when a driver requests a large contiguous block, the memory management subsystem migrates the pages currently occupying the CMA region to free up contiguous space for the driver, and when the driver releases the block it is handed back to the OS and can be allocated to other users again.

In an earlier article, 《对于MIGRATE_MOVABLE的理解》 (on understanding MIGRATE_MOVABLE), I described how the buddy system manages memory blocks of different sizes by migration type. One of those types is MIGRATE_CMA: pages of this type must remain movable, so that they can be migrated out of the way and the allocation can succeed when the region is needed for DMA.

enum {
    MIGRATE_UNMOVABLE,
    MIGRATE_RECLAIMABLE,
    MIGRATE_MOVABLE,
    MIGRATE_PCPTYPES,   /* the number of types on the pcp lists */
    MIGRATE_RESERVE = MIGRATE_PCPTYPES,
#ifdef CONFIG_CMA
    /*
     * MIGRATE_CMA migration type is designed to mimic the way
     * ZONE_MOVABLE works.  Only movable pages can be allocated
     * from MIGRATE_CMA pageblocks and page allocator never
     * implicitly change migration type of MIGRATE_CMA pageblock.
     *
     * The way to use it is to change migratetype of a range of
     * pageblocks to MIGRATE_CMA which can be done by
     * __free_pageblock_cma() function.  What is important though
     * is that a range of pageblocks must be aligned to
     * MAX_ORDER_NR_PAGES should biggest page be bigger then
     * a single pageblock.
     */
    MIGRATE_CMA,
#endif
#ifdef CONFIG_MEMORY_ISOLATION
    MIGRATE_ISOLATE,    /* can't allocate from here */
#endif
    MIGRATE_TYPES
};
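As a small illustration (a hedged sketch for kernel context, not part of the code above; page_in_cma_pageblock() is a hypothetical helper), checking whether a page belongs to a CMA pageblock boils down to reading the pageblock's migratetype:

#include <linux/mm.h>
#include <linux/mmzone.h>

/*
 * Hypothetical helper: returns true when the page sits in a pageblock whose
 * migratetype is MIGRATE_CMA. Only movable allocations may land in such
 * pageblocks, so they can be migrated away when cma_alloc() needs the range.
 */
static bool page_in_cma_pageblock(struct page *page)
{
    return get_pageblock_migratetype(page) == MIGRATE_CMA;
}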

Defining CMA regions

Depending on how it is used, a CMA region can be one of two types: a general-purpose (shared) CMA region, which is available to the whole system, and a dedicated CMA region, defined for a single module that does not want to share the area with other modules. Different CMA regions can be defined in the dts; each one is simply a reserved-memory node. A shared CMA region is configured like this:

reserved_memory: reserved-memory {
    #address-cells = <2>;
    #size-cells = <2>;
    ranges;

    /* global autoconfigured region for contiguous allocations */
    linux,cma {
        compatible = "shared-dma-pool";
        alloc-ranges = <0x0 0x00000000 0x0 0xffffffff>;
        reusable;
        alignment = <0x0 0x400000>;
        size = <0x0 0x2000000>;
        linux,cma-default;
    };
};

There are three key points in the dts configuration of a CMA region:

First, the node must contain the reusable property, which says that besides being used for DMA the region can be reused by the memory management subsystem.
Second, it must not contain the no-map property. no-map controls whether a kernel page-table mapping is created; memory must be mapped before it can be used as general-purpose memory, and since CMA memory is handed out as general-purpose memory, the mapping is required.
Third, a shared CMA region needs the linux,cma-default property, which marks it as the shared (default) CMA.
A dedicated CMA region is configured as follows:



reserved_memory: reserved-memory {
    #address-cells = <2>;
    #size-cells = <2>;
    ranges;

    priv_mem: priv_region {
        compatible = "shared-dma-pool";
        alloc-ranges = <0x0 0x00000000 0x0 0xffffffff>;
        reusable;
        alignment = <0x0 0x400000>;
        size = <0x0 0xC00000>;
    };
};


First define the dedicated CMA region under reserved-memory. Note that the only difference from the shared region above is that the dedicated region does not carry the linux,cma-default; property. How is it used then? See the following:

qcom,testmodule {
    compatible = "qcom,testmodule";
    memory-region = <&priv_mem>;
};

Add a memory-region property to the module that needs the region, so that the corresponding CMA handle is passed to the module through the dts. The module can then do:

struct page   *page = NULL;
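/*
 * Note: the second argument of cma_alloc() is a page count, not a byte
 * count, so mem_size is assumed here to already be expressed in pages.
 */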
page = cma_alloc(dev_get_cma_area(dev), mem_size, 0, GFP_KERNEL);

dev_get_cma_area() returns the matching CMA handle. If none is found, for example because the module defines no memory-region, the shared CMA handle is returned instead. Remember the linux,cma-default; property above? The shared CMA region is used as the default CMA.
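To show how the pieces fit together, here is a hedged sketch of a driver probe function (the driver, the testmodule_probe() name and the buffer size are illustrative assumptions). It binds the memory-region from the dts to the device with of_reserved_mem_device_init() and then allocates from the device's CMA area:

#include <linux/mm.h>
#include <linux/platform_device.h>
#include <linux/of_reserved_mem.h>
#include <linux/dma-contiguous.h>
#include <linux/cma.h>
#include <linux/sizes.h>

static int testmodule_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct page *page;
    size_t nr_pages = SZ_4M >> PAGE_SHIFT;  /* illustrative size, in pages */
    int ret;

    /* Bind the memory-region node (priv_mem above) to this device. */
    ret = of_reserved_mem_device_init(dev);
    if (ret)
        dev_warn(dev, "no private CMA region, using the default one\n");

    /* dev_get_cma_area() returns the device CMA or falls back to the default. */
    page = cma_alloc(dev_get_cma_area(dev), nr_pages, 0, GFP_KERNEL);
    if (!page)
        return -ENOMEM;

    /* ... set up DMA to/from these pages ... */

    cma_release(dev_get_cma_area(dev), page, nr_pages);
    return 0;
}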


CMA memory allocation and release

When a kernel module wants to use CMA memory, the interfaces are still the DMA ones:

extern void *
dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
           gfp_t flag);

extern void
dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
            dma_addr_t dma_handle);
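A typical use in a driver looks like the hedged sketch below (example_dma_buffer() and the buffer size are assumptions for illustration); on a CMA-enabled platform the pages behind the buffer can come from the CMA area:

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/sizes.h>

/* Allocate a coherent DMA buffer, use it, then free it again. */
static int example_dma_buffer(struct device *dev)
{
    size_t buf_size = SZ_1M;    /* illustrative size */
    dma_addr_t dma_handle;
    void *cpu_addr;

    cpu_addr = dma_alloc_coherent(dev, buf_size, &dma_handle, GFP_KERNEL);
    if (!cpu_addr)
        return -ENOMEM;

    /* program dma_handle into the device, access the buffer via cpu_addr */

    dma_free_coherent(dev, buf_size, cpu_addr, dma_handle);
    return 0;
}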

On a platform with CMA enabled, both of the interfaces above eventually end up in the following implementation:

struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
                       unsigned int align, bool no_warn)
{
    if (align > CONFIG_CMA_ALIGNMENT)
        align = CONFIG_CMA_ALIGNMENT;

    return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
}

bool dma_release_from_contiguous(struct device *dev, struct page *pages,
                 int count)
{
    return cma_release(dev_get_cma_area(dev), pages, count);
}

In the end it is the CMA implementation that allocates and releases the memory.
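Note that cma_alloc() takes a page count and an alignment order (capped at CONFIG_CMA_ALIGNMENT in the wrapper above), and cma_release() must be given the same page count. As a purely illustrative, hedged sketch (the example_* helpers are hypothetical), a byte-based wrapper could look like this:

#include <linux/cma.h>
#include <linux/dma-contiguous.h>
#include <linux/mm.h>

/* Convert a byte size into the page count that cma_alloc()/cma_release() expect. */
static struct page *example_cma_alloc_bytes(struct device *dev, size_t bytes)
{
    return cma_alloc(dev_get_cma_area(dev),
                     PAGE_ALIGN(bytes) >> PAGE_SHIFT, 0, GFP_KERNEL);
}

static bool example_cma_release_bytes(struct device *dev, struct page *pages,
                                      size_t bytes)
{
    return cma_release(dev_get_cma_area(dev), pages,
                       PAGE_ALIGN(bytes) >> PAGE_SHIFT);
}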

Generic DMA coherent

For the DMA framework, memory is allocated from CMA once we enable and configure a CMA region, but the kernel still supports the older scheme for compatibility; it is selected with CONFIG_HAVE_GENERIC_DMA_COHERENT.

obj-$(CONFIG_DMA_CMA)           += contiguous.o
obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += coherent.o removed.o

The generic dma coherent implementation also uses a reserved-memory node in the dts, but that node carries the no-map property. The memory is therefore carved out of the system and never given to the buddy allocator; instead the DMA core remaps it, i.e. creates a page-table mapping for it, and uses it from there.


CMA in the Linux kernel, the contiguous memory allocator, is configured through CONFIG_CMA and CONFIG_CMA_DEBUG. As the name says, it manages blocks of memory that are contiguous in physical address space, which makes it different from the buddy allocator and from virtual memory. Although kmalloc, backed by the buddy allocator, can also return physically contiguous memory, under long-running workloads such an allocation may eventually fail. The kernel designers therefore added CMA, contiguous physical memory management: a dedicated range of contiguous physical memory for the scenarios that genuinely need it, such as DMA.

Because physical memory is limited and the set of users is limited too, this region is tightly constrained: the total size of a CMA area, its base address and its alignment are all restricted. The function cma_declare_contiguous() declares a CMA area, i.e. its base, size, limit and so on. The function cma_init_reserved_mem() takes a block out of the reserved memory and turns it into a CMA area. Note that at most MAX_CMA_AREAS areas can be defined; in other words the number of CMA areas (or users) is capped at MAX_CMA_AREAS, and CMA manages exactly these MAX_CMA_AREAS areas. Afterwards cma_init_reserved_areas() activates them.

In normal use, cma_alloc() allocates CMA memory and cma_release() frees it again.

Let us first look at the global constraints the kernel places on CMA memory, i.e. the implementation of cma_declare_contiguous():

/**
 * cma_declare_contiguous() - reserve custom contiguous area
 * @base: Base address of the reserved area optional, use 0 for any
 * @size: Size of the reserved area (in bytes),
 * @limit: End address of the reserved memory (optional, 0 for any).
 * @alignment: Alignment for the CMA area, should be power of 2 or zero
 * @order_per_bit: Order of pages represented by one bit on bitmap.
 * @fixed: hint about where to place the reserved area
 * @res_cma: Pointer to store the created cma region.
 *
 * This function reserves memory from early allocator. It should be
 * called by arch specific code once the early allocator (memblock or bootmem)
 * has been activated and all other subsystems have already allocated/reserved
 * memory. This function allows to create custom reserved areas.
 *
 * If @fixed is true, reserve contiguous area at exactly @base.  If false,
 * reserve in range from @base to @limit.
 */
int __init cma_declare_contiguous(phys_addr_t base,
            phys_addr_t size, phys_addr_t limit,
            phys_addr_t alignment, unsigned int order_per_bit,
            bool fixed, struct cma **res_cma)
{
    phys_addr_t memblock_end = memblock_end_of_DRAM();
    phys_addr_t highmem_start;
    int ret = 0;

#ifdef CONFIG_X86
    /*
     * high_memory isn't direct mapped memory so retrieving its physical
     * address isn't appropriate.  But it would be useful to check the
     * physical address of the highmem boundary so it's justifiable to get
     * the physical address from it.  On x86 there is a validation check for
     * this case, so the following workaround is needed to avoid it.
     */
    highmem_start = __pa_nodebug(high_memory);
#else
    highmem_start = __pa(high_memory);
#endif

    pr_debug("%s(size %pa, base %pa, limit %pa alignment %pa)\n",
        __func__, &size, &base, &limit, &alignment);

    if (cma_area_count == ARRAY_SIZE(cma_areas)) {
        pr_err("Not enough slots for CMA reserved regions!\n");
        return -ENOSPC;
    }

    if (!size)
        return -EINVAL;

    if (alignment && !is_power_of_2(alignment))
        return -EINVAL;

    /*
     * Sanitise input arguments.
     * Pages both ends in CMA area could be merged into adjacent unmovable
     * migratetype page by page allocator's buddy algorithm. In the case,
     * you couldn't get a contiguous memory, which is not what we want.
     */
    alignment = max(alignment, (phys_addr_t)PAGE_SIZE <<
              max_t(unsigned long, MAX_ORDER - 1, pageblock_order));
    base = ALIGN(base, alignment);
    size = ALIGN(size, alignment);
    limit &= ~(alignment - 1);

    if (!base)
        fixed = false;

    /* size should be aligned with order_per_bit */
    if (!IS_ALIGNED(size >> PAGE_SHIFT, 1 << order_per_bit))
        return -EINVAL;

    /*
     * If allocating at a fixed base the request region must not cross the
     * low/high memory boundary.
     */
    if (fixed && base < highmem_start && base + size > highmem_start) {
        ret = -EINVAL;
        pr_err("Region at %pa defined on low/high memory boundary (%pa)\n",
            &base, &highmem_start);
        goto err;
    }

    /*
     * If the limit is unspecified or above the memblock end, its effective
     * value will be the memblock end. Set it explicitly to simplify further
     * checks.
     */
    if (limit == 0 || limit > memblock_end)
        limit = memblock_end;

    /* Reserve memory */
    if (fixed) {
        if (memblock_is_region_reserved(base, size) ||
            memblock_reserve(base, size) < 0) {
            ret = -EBUSY;
            goto err;
        }
    } else {
        phys_addr_t addr = 0;

        /*
         * All pages in the reserved area must come from the same zone.
         * If the requested region crosses the low/high memory boundary,
         * try allocating from high memory first and fall back to low
         * memory in case of failure.
         */
        if (base < highmem_start && limit > highmem_start) {
            addr = memblock_alloc_range(size, alignment,
                            highmem_start, limit,
                            MEMBLOCK_NONE);
            limit = highmem_start;
        }

        if (!addr) {
            addr = memblock_alloc_range(size, alignment, base,
                            limit,
                            MEMBLOCK_NONE);
            if (!addr) {
                ret = -ENOMEM;
                goto err;
            }
        }

        /*
         * kmemleak scans/reads tracked objects for pointers to other
         * objects but this address isn't mapped and accessible
         */
        kmemleak_ignore_phys(addr);
        base = addr;
    }

    ret = cma_init_reserved_mem(base, size, order_per_bit, res_cma);
    if (ret)
        goto err;

    pr_info("Reserved %ld MiB at %pa\n", (unsigned long)size / SZ_1M,
        &base);
    return 0;

err:
    pr_err("Failed to reserve %ld MiB\n", (unsigned long)size / SZ_1M);
    return ret;
}

/**
 * cma_init_reserved_mem() - create custom contiguous area from reserved memory
 * @base: Base address of the reserved area
 * @size: Size of the reserved area (in bytes),
 * @order_per_bit: Order of pages represented by one bit on bitmap.
 * @res_cma: Pointer to store the created cma region.
 *
 * This function creates custom contiguous area from already reserved memory.
 */
int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
                 unsigned int order_per_bit,
                 struct cma **res_cma)
{
    struct cma *cma;
    phys_addr_t alignment;

    /* Sanity checks */
    if (cma_area_count == ARRAY_SIZE(cma_areas)) {
        pr_err("Not enough slots for CMA reserved regions!\n");
        return -ENOSPC;
    }

    if (!size || !memblock_is_region_reserved(base, size))
        return -EINVAL;

    /* ensure minimal alignment required by mm core */
    alignment = PAGE_SIZE <<
            max_t(unsigned long, MAX_ORDER - 1, pageblock_order);

    /* alignment should be aligned with order_per_bit */
    if (!IS_ALIGNED(alignment >> PAGE_SHIFT, 1 << order_per_bit))
        return -EINVAL;

    if (ALIGN(base, alignment) != base || ALIGN(size, alignment) != size)
        return -EINVAL;

    /*
     * Each reserved area must be initialised later, when more kernel
     * subsystems (like slab allocator) are available.
     */
    cma = &cma_areas[cma_area_count];
    cma->base_pfn = PFN_DOWN(base);
    cma->count = size >> PAGE_SHIFT;
    cma->order_per_bit = order_per_bit;
    *res_cma = cma;
    cma_area_count++;
    totalcma_pages += (size / PAGE_SIZE);

    return 0;
}

The reserved areas are stored in the cma_areas[] array. Note that this reserved memory is accounted in totalcma_pages. Because all of the reserved areas are managed through cma_areas[], the number of areas that can be managed is quite limited. The function cma_init_reserved_areas() then releases the early-reserved memory into the zone's MIGRATE_CMA free lists.

static int __init cma_init_reserved_areas(void)
{
    int i;

    for (i = 0; i < cma_area_count; i++) {
        int ret = cma_activate_area(&cma_areas[i]);

        if (ret)
            return ret;
    }

    return 0;
}
core_initcall(cma_init_reserved_areas);

static int __init cma_activate_area(struct cma *cma)
{
    int bitmap_size = BITS_TO_LONGS(cma_bitmap_maxno(cma)) * sizeof(long);
    unsigned long base_pfn = cma->base_pfn, pfn = base_pfn;
    unsigned i = cma->count >> pageblock_order;
    struct zone *zone;

    cma->bitmap = kzalloc(bitmap_size, GFP_KERNEL);

    if (!cma->bitmap)
        return -ENOMEM;

    WARN_ON_ONCE(!pfn_valid(pfn));
    zone = page_zone(pfn_to_page(pfn));

    do {
        unsigned j;

        base_pfn = pfn;
        for (j = pageblock_nr_pages; j; --j, pfn++) {
            WARN_ON_ONCE(!pfn_valid(pfn));
            /*
             * alloc_contig_range requires the pfn range
             * specified to be in the same zone. Make this
             * simple by forcing the entire CMA resv range
             * to be in the same zone.
             */
            if (page_zone(pfn_to_page(pfn)) != zone)
                goto err;
        }
        init_cma_reserved_pageblock(pfn_to_page(base_pfn));
    } while (--i);

    mutex_init(&cma->lock);

#ifdef CONFIG_CMA_DEBUGFS
    INIT_HLIST_HEAD(&cma->mem_head);
    spin_lock_init(&cma->mem_head_lock);
#endif

    return 0;

err:
    kfree(cma->bitmap);
    cma->count = 0;
    return -EINVAL;
}

#ifdef CONFIG_CMA
/* Free whole pageblock and set its migration type to MIGRATE_CMA. */
void __init init_cma_reserved_pageblock(struct page *page)
{
    unsigned i = pageblock_nr_pages;
    struct page *p = page;

    do {
        __ClearPageReserved(p);
        set_page_count(p, 0);
    } while (++p, --i);

    set_pageblock_migratetype(page, MIGRATE_CMA);

    if (pageblock_order >= MAX_ORDER) {
        i = pageblock_nr_pages;
        p = page;

        do {
            set_page_refcounted(p);
            __free_pages(p, MAX_ORDER - 1);
            p += MAX_ORDER_NR_PAGES;
        } while (i -= MAX_ORDER_NR_PAGES);
    } else {
        set_page_refcounted(page);
        __free_pages(page, pageblock_order);
    }
    adjust_managed_page_count(page, pageblock_nr_pages);
}
#endif

void adjust_managed_page_count(struct page *page, long count)
{
    spin_lock(&managed_page_count_lock);
    page_zone(page)->managed_pages += count;
    totalram_pages += count;
#ifdef CONFIG_HIGHMEM
    if (PageHighMem(page))
        totalhigh_pages += count;
#endif
    spin_unlock(&managed_page_count_lock);
}
EXPORT_SYMBOL(adjust_managed_page_count);
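As a rough illustration of the declaration path (a hedged sketch of what architecture setup code might do, not taken from the kernel; the size and the example_* names are assumptions), an early-boot caller could reserve a CMA area like this:

#include <linux/cma.h>
#include <linux/init.h>
#include <linux/printk.h>
#include <linux/sizes.h>

/* Hypothetical: declare a 32 MiB CMA area using the signature shown above. */
static struct cma *example_cma_area;

static void __init example_reserve_cma(void)
{
    int ret;

    /* base = 0 and limit = 0 let memblock pick any suitable range. */
    ret = cma_declare_contiguous(0, 32 * SZ_1M, 0, 0, 0, false,
                                 &example_cma_area);
    if (ret)
        pr_warn("example CMA reservation failed: %d\n", ret);
}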
