text, data and bss: Code and Data Size Explained

In “Code Size Information with gcc for ARM/Kinetis” I used an option in the ARM gcc tool chain for Eclipse to show me the code size:

text       data        bss        dec        hex    filename
0x1408       0x18      0x81c       7228       1c3c    size.elf

A reader of this blog asked me what these numbers really mean, and especially: what the heck is ‘bss’?

Note: I’m using the ARM GNU ‘printsize’ utility for gcc, with an example for Kinetis-L (KL25Z).

text

‘text’ is what ends up in FLASH memory. I can show this by adding the following function to my program:

void foo(void) {
  /* dummy function to show how this adds to 'text' */
}

With this function added, the ‘text’ size increases:

text       data        bss
 0x1414       0x18      0x81c

My new function ‘foo’ gets added to the .text segment, as I can see in the map file generated by the linker:

*(.text*)
 .text.foo      0x000008c8        0x8 ./Sources/main_c.o
                0x000008c8                foo

But ‘text’ does not only contain functions, it holds constant data as well. If I have a constant table like

const int table[] = {5,0,1,5,6,7,9,10};

then this adds to ‘text’ too. The variable ‘table’ will be in FLASH, initialized with the values specified in the source.
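As a rough way to estimate how much such a table adds, its size can be taken with sizeof. The sketch below is only an illustration: the helper function names are hypothetical, and the actual growth of ‘text’ can differ slightly because of alignment padding added by the linker. With a 4-byte int, as on the Cortex-M0+ used here, the eight entries take 32 bytes.

```c
#include <stddef.h>

/* Same table as in the text; 'const' allows it to stay in FLASH. */
const int table[] = {5, 0, 1, 5, 6, 7, 9, 10};

/* Number of entries in the table (hypothetical helper). */
size_t table_entry_count(void)
{
    return sizeof(table) / sizeof(table[0]);
}

/* Number of bytes the table itself occupies (hypothetical helper). */
size_t table_size_bytes(void)
{
    return sizeof(table);
}
```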

Another thing which is included in ‘text’ is the interrupt vector table (more on this later).

In summary: ‘text’ is what typically ends up in FLASH, and it contains code and constant data.

data

‘data’ is used for initialized data. This is best explained with the following (global/extern) variable:

int32_t myVar = 0x12345678;

Adding the above variable to my application increases the ‘data’ portion by 4 bytes:

text       data        bss
 0x1414       0x1c      0x81c

The variable ‘myVar’ is not constant, so it ends up in RAM. But its initialization value (0x12345678) is constant and can live in FLASH memory. The variable is initialized by the normal ANSI startup code, which copies the initialization value from FLASH to the variable in RAM. This is sometimes named ‘copy-down’. In the startup code used by CodeWarrior for MCU10.3 for Kinetis-L (ARM Cortex-M0+), this is performed in __copy_rom_sections_to_ram():

ARM Startup Code Initializing Variables

One thing to consider: my variable ‘myVar’ uses space in RAM (4 bytes in my case), plus space in FLASH/ROM for the initialization value (0x12345678). So I need to count the ‘data’ size twice: once for RAM and once for FLASH/ROM. The FLASH copy of the data is not counted in the ‘text’ portion.

In summary: ‘data’ only has the initialization data (in my example 0x12345678), and not the variable (myVar).
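The copy-down idea can be sketched on a host machine with two arrays standing in for FLASH and RAM. This is only an illustration under stated assumptions: flash_image, ram_data, copy_down() and init_my_var() are hypothetical names, and this is not the implementation of __copy_rom_sections_to_ram(). On a real target the source and destination regions come from the linker file.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the initialization image in FLASH:
 * the value 0x12345678 stored in little-endian byte order. */
static const uint8_t flash_image[4] = {0x78, 0x56, 0x34, 0x12};

/* Hypothetical stand-in for the RAM location of 'myVar'. */
static uint8_t ram_data[4];

/* Copy 'size' bytes of initial values from the FLASH image to RAM. */
void copy_down(void *ram_dst, const void *flash_src, size_t size)
{
    assert(ram_dst != NULL && flash_src != NULL);
    memcpy(ram_dst, flash_src, size);
}

/* Run the copy-down and read the variable back from the simulated RAM.
 * The bytes are combined explicitly, so the result does not depend on
 * the byte order of the host. */
uint32_t init_my_var(void)
{
    copy_down(ram_data, flash_image, sizeof(ram_data));
    return (uint32_t)ram_data[0]
         | ((uint32_t)ram_data[1] << 8)
         | ((uint32_t)ram_data[2] << 16)
         | ((uint32_t)ram_data[3] << 24);
}
```

The sketch makes the double counting visible: the four bytes exist once in flash_image and once in ram_data.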

bss

The ‘bss’ contains all the uninitialized data.

bss (or .bss, or BSS) is the abbreviation for ‘Block Started by Symbol’, a name that comes from an old assembler.

This is best explained with the following (global/extern) variable:

int32_t myGlobal;

Adding this variable increases the ‘bss’ portion by 4 bytes:

text       data        bss
 0x1414       0x18      0x820

I like to remember ‘bss’ as ‘Better Save Space’ 😃. As bss ends up in RAM, and RAM is very valuable on a microcontroller, I want to keep the number of variables which end up in .bss at the absolute minimum.

The bss segment is initialized in the startup code by the zero_fill_bss() function:

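#include <string.h> /* memset */

static void zero_fill_bss(void)
{
    /* Symbols marking the start and end of .bss, provided by the linker file */
    extern char __START_BSS[];
    extern char __END_BSS[];

    /* Clear the whole region so uninitialized variables start at zero */
    memset(__START_BSS, 0, (size_t)(__END_BSS - __START_BSS));
}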

dec

The ‘dec’ (as a decimal number) is the sum of text, data and bss:

dec = text + data + bss
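The sum can be checked against the first output shown in this article (text 0x1408, data 0x18, bss 0x81c, dec 7228). The sketch below is only an illustration: the struct and function names are hypothetical, and flash_bytes() and ram_bytes() simply apply the rule of thumb from the sections above (‘data’ counts for both FLASH and RAM), not an exact memory map.

```c
#include <stdint.h>

/* The three size fields of the 'Berkeley' format (hypothetical type name). */
typedef struct {
    uint32_t text;
    uint32_t data;
    uint32_t bss;
} berkeley_sizes_t;

/* 'dec' as printed by the size utility: text + data + bss. */
uint32_t berkeley_dec(berkeley_sizes_t s)
{
    return s.text + s.data + s.bss;
}

/* Rule of thumb: FLASH holds the code/constants plus the init values. */
uint32_t flash_bytes(berkeley_sizes_t s)
{
    return s.text + s.data;
}

/* Rule of thumb: RAM holds the initialized plus the zeroed variables. */
uint32_t ram_bytes(berkeley_sizes_t s)
{
    return s.data + s.bss;
}
```

For the values above, berkeley_dec() gives 7228, which is 0x1c3c in the ‘hex’ column.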

Size – GNU Utility

The size (or printsize) GNU utility has more options:

size [-A|-B|--format=compatibility]
          [--help]
          [-d|-o|-x|--radix=number]
          [--common]
          [-t|--totals]
          [--target=bfdname] [-V|--version]
          [objfile...]

The ‘System V’ option can be set directly in the Eclipse panel:

GNU Print Size Option in CodeWarrior for MCU10.3

It produces similar information to what is shown above, but in greater detail.

To illustrate this, I use the following array variable:

int table[] = {1,2,3,4,5};

In ‘Berkeley’ mode I get:

text       data        bss        dec        hex    filename
 0x140c       0x2c      0x81c       7252       1c54    size.elf

In ‘System V’ mode I get:

section                size         addr
.interrupts            0xc0          0x0
.text                0x134c        0x800
.data                  0x14   0x1ffff000
.bss                   0x1c   0x1ffff014
.romp                  0x18   0x1ffff030
._user_heap_stack     0x800   0x1ffff048
.ARM.attributes        0x31          0x0
.debug_info          0x2293          0x0
.debug_abbrev         0xe66          0x0
.debug_loc           0x27df          0x0
.debug_aranges        0x318          0x0
.debug_macinfo      0x53bf3          0x0
.debug_line          0x1866          0x0
.debug_str            0xc23          0x0
.comment               0x79          0x0
.debug_frame          0x594          0x0
Total               0x5defe

I’m using an ARM Cortex-M0+ (KL25Z) in my example, where addresses from 0x1ffff000 upward are in RAM.

The lines from .ARM.attributes to .debug_frame do not end up on the target: they are debug and other information.

.interrupts is my interrupt vector table, and .text is my code plus constants. Both are in FLASH memory. Together they make up the 0xc0+0x134c=0x140c reported as text in ‘Berkeley’ format.

.bss is my uninitialized (zeroed) variable area. Additionally there is ._user_heap_stack: this is the heap defined in the ANSI library for malloc() calls. Together they make up the total of 0x1c+0x800=0x81c shown as bss in ‘Berkeley’ format.

.data is for my initialized ‘table[]’ variable in RAM (5*4 bytes=0x14).

.romp is used by the linker for the ‘copy-down’ and initialization of .data. But it looks confusing: why is it shown with an address in RAM? Checking the linker map file shows:

.romp           0x1ffff030       0x18 load address 0x00001b60
                0x00001b60                __S_romp = _romp_at
                0x1ffff030        0x4 LONG 0x1b4c ___ROM_AT
                0x1ffff034        0x4 LONG 0x1ffff000 _sdata
                0x1ffff038        0x4 LONG 0x14 ___data_size
                0x1ffff03c        0x4 LONG 0x0
                0x1ffff040        0x4 LONG 0x0
                0x1ffff044        0x4 LONG 0x0

Ah! That is actually not in RAM, but in FLASH: the linker places it at the FLASH load address 0x1b60. So this size of 0x18 really needs to be added to the FLASH size too!
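The way the ‘System V’ sections fold into the three ‘Berkeley’ numbers can be checked with a little arithmetic. The section sizes below are copied from the listing above; the constant and function names are hypothetical. Note one assumption: the article states the groupings for text and bss, while the grouping of .data plus .romp into ‘data’ is inferred here from the fact that 0x14+0x18 matches the 0x2c reported in ‘Berkeley’ mode.

```c
#include <stdint.h>

/* Section sizes copied from the 'System V' listing in this article. */
enum {
    SEC_INTERRUPTS      = 0xc0,
    SEC_TEXT            = 0x134c,
    SEC_DATA            = 0x14,
    SEC_BSS             = 0x1c,
    SEC_ROMP            = 0x18,
    SEC_USER_HEAP_STACK = 0x800
};

/* 'text' in Berkeley format: vector table plus code and constants. */
uint32_t berkeley_text(void)
{
    return SEC_INTERRUPTS + SEC_TEXT;
}

/* 'data' in Berkeley format (inferred grouping): .data plus .romp. */
uint32_t berkeley_data(void)
{
    return SEC_DATA + SEC_ROMP;
}

/* 'bss' in Berkeley format: zeroed variables plus the heap/stack area. */
uint32_t berkeley_bss(void)
{
    return SEC_BSS + SEC_USER_HEAP_STACK;
}
```

The three results add up to 7252, the ‘dec’ value of the ‘Berkeley’ output for this example.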

Summary

I hope I have sorted things out correctly. The way the initialized data is reported might be confusing. But with the right knowledge (and the .map file in mind), things get much clearer:

‘text’ is my code, the vector table and constants.

‘data’ is for initialized variables, and it counts for both RAM and FLASH. The linker allocates the initialization data in FLASH, and the startup code then copies it from ROM to RAM.

‘bss’ is for the uninitialized data in RAM, which is initialized with zero in the startup code.

Happy Sizing