The Nature and Applications of Struct Bit-Fields in C

Date: 2019-11-20

Basic form

For example:

struct bits{
    uint32_t low:9;      // low 9 bits
    uint32_t middle: 13; // middle 13 bits
    uint32_t high: 10;   // high 10 bits
};

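Suppose the 32-bit value is 0xABCDEF09, which is 10101011110011011110111100001001 in binary. Assuming the compiler allocates bit-fields starting from the least significant bit (the order is implementation-defined; GCC on x86-64 does this), the fields split as follows: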

1010101111   0011011110111  100001001
high(0x2AF)  middle(0x6F7)  low(0x109)
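
The split above can be checked with plain shifts and masks. The following is a minimal sketch, not taken from the original article: the helper names (`field_low`, `field_middle`, `field_high`, `join_fields`, `check_example`) are made up for illustration, and it assumes the LSB-first layout shown above (low in bits 0-8, middle in bits 9-21, high in bits 22-31).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: extract each field from a 32-bit word,
 * assuming low = bits 0-8, middle = bits 9-21, high = bits 22-31. */
uint32_t field_low(uint32_t word)    { return word & 0x1FFu; }
uint32_t field_middle(uint32_t word) { return (word >> 9) & 0x1FFFu; }
uint32_t field_high(uint32_t word)   { return (word >> 22) & 0x3FFu; }

/* Reassemble the 32-bit word from its three fields. */
uint32_t join_fields(uint32_t low, uint32_t middle, uint32_t high)
{
    return (low & 0x1FFu)
         | ((middle & 0x1FFFu) << 9)
         | ((high & 0x3FFu) << 22);
}

/* Self-check against the values worked out in the text. */
void check_example(void)
{
    uint32_t word = 0xABCDEF09u;
    assert(field_low(word) == 0x109u);
    assert(field_middle(word) == 0x6F7u);
    assert(field_high(word) == 0x2AFu);
    assert(join_fields(0x109u, 0x6F7u, 0x2AFu) == word);
}
```

Calling `check_example()` passes silently when the three masks and shifts agree with the figures in the diagram.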

Essence

At bottom, a bit-field is just bit manipulation (shift / AND / OR).
Operating on a bit-field is not a simple assignment. For example:

struct bits{
    uint32_t low:9;      // low 9 bits
    uint32_t middle: 13; // middle 13 bits
    uint32_t high: 10;   // high 10 bits
};

struct bits b = {1, 2, 3};
int x = b.middle;

The corresponding assembly is as follows:

struct bits b = {1, 2, 3};
  4004e1:	0f b7 45 f0          	movzwl -0x10(%rbp),%eax
  4004e5:	66 25 00 fe          	and    $0xfe00,%ax
  4004e9:	83 c8 01             	or     $0x1,%eax
  4004ec:	66 89 45 f0          	mov    %ax,-0x10(%rbp)
  4004f0:	8b 45 f0             	mov    -0x10(%rbp),%eax
  4004f3:	25 ff 01 c0 ff       	and    $0xffc001ff,%eax
  4004f8:	80 cc 04             	or     $0x4,%ah
  4004fb:	89 45 f0             	mov    %eax,-0x10(%rbp)
  4004fe:	0f b7 45 f2          	movzwl -0xe(%rbp),%eax
  400502:	83 e0 3f             	and    $0x3f,%eax
  400505:	0c c0                	or     $0xc0,%al
  400507:	66 89 45 f2          	mov    %ax,-0xe(%rbp)
	int x = b.middle;
  40050b:	8b 45 f0             	mov    -0x10(%rbp),%eax
  40050e:	c1 e8 09             	shr    $0x9,%eax
  400511:	66 25 ff 1f          	and    $0x1fff,%ax
  400515:	0f b7 c0             	movzwl %ax,%eax
  400518:	89 45 fc             	mov    %eax,-0x4(%rbp)

As the assembly above shows, the compiler automatically generates the shift / AND / OR operations for us.
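
Written by hand, the same read-modify-write pattern might look like the sketch below. The function names are hypothetical; the masks for `middle` (`0xFFC001FF`, `0x1FFF`) and the shift count 9 are the ones visible in the listing, while the masks for `low` and `high` are the 32-bit equivalents of the 16-bit operations the compiler chose.

```c
#include <assert.h>
#include <stdint.h>

/* Store into low (bits 0-8): clear the field, then OR in the value. */
uint32_t store_low(uint32_t word, uint32_t value)
{
    assert(value <= 0x1FFu);            /* must fit in 9 bits */
    return (word & 0xFFFFFE00u) | value;
}

/* Store into middle (bits 9-21): and $0xffc001ff, then OR in value << 9. */
uint32_t store_middle(uint32_t word, uint32_t value)
{
    assert(value <= 0x1FFFu);           /* must fit in 13 bits */
    return (word & 0xFFC001FFu) | (value << 9);
}

/* Store into high (bits 22-31): clear the top 10 bits, then OR in value << 22. */
uint32_t store_high(uint32_t word, uint32_t value)
{
    assert(value <= 0x3FFu);            /* must fit in 10 bits */
    return (word & 0x003FFFFFu) | (value << 22);
}

/* Load middle: shr $0x9, then and $0x1fff. */
uint32_t load_middle(uint32_t word)
{
    return (word >> 9) & 0x1FFFu;
}
```

Applying `store_low(…, 1)`, `store_middle(…, 2)` and `store_high(…, 3)` to a zeroed word reproduces the effect of `struct bits b = {1, 2, 3};` under the layout assumed here.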

If we do not use bit-fields and use an ordinary struct instead:

struct bits{
    uint32_t low;    
    uint32_t middle; 
    uint32_t high;  
};

struct bits b = {1, 2, 3};
int x = b.middle;

The corresponding assembly looks like this:

struct bits b = {1, 2, 3};
  4004e1:	c7 45 f0 01 00 00 00 	movl   $0x1,-0x10(%rbp)
  4004e8:	c7 45 f4 02 00 00 00 	movl   $0x2,-0xc(%rbp)
  4004ef:	c7 45 f8 03 00 00 00 	movl   $0x3,-0x8(%rbp)
	int x = b.middle;
  4004f6:	8b 45 f4             	mov    -0xc(%rbp),%eax
  4004f9:	89 45 ec             	mov    %eax,-0x14(%rbp)

These are just plain assignments.
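
Besides the generated code, the two definitions also differ in storage size. The sketch below renames the structs (`packed_bits`, `plain_fields`) so both can live in one file; these names and the two helper functions are illustrative only. With GCC or Clang the bit-field version typically occupies 4 bytes, but the exact packing is implementation-defined, so only the plain struct's size is asserted at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-field version: 9 + 13 + 10 = 32 bits in total. */
struct packed_bits {
    uint32_t low : 9;
    uint32_t middle : 13;
    uint32_t high : 10;
};

/* Plain version: one full 32-bit member per field. */
struct plain_fields {
    uint32_t low;
    uint32_t middle;
    uint32_t high;
};

/* Three 32-bit members take 12 bytes on platforms with 8-bit bytes. */
static_assert(sizeof(struct plain_fields) == 12, "three uint32_t members");

size_t packed_size(void) { return sizeof(struct packed_bits); }
size_t plain_size(void)  { return sizeof(struct plain_fields); }
```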

Applications

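Shrinking struct size is the use mentioned most often online. That undersells the feature: with the memory available on today's devices, a few bytes hardly matter. The use that suits bit-fields best is protocol parsing. Take the AAC ADTS header (7 bytes in total when no CRC is present), which is defined as follows: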

Name	Bits	Description
syncword	12	must be 0xFFF
ID	1	0 for MPEG-4, 1 for MPEG-2
layer	2	must be 00
protect	1
profile	2	0 for main profile, 1 for low complexity profile, 2 for scalable sampling rate profile, 3 reserved
frequency	4
private	1
channel	3	0: defined in AOT Specific Config, 1-6 for channel count, 7 for 8 channels
copy	1
home	1
copyright	1
copyright-start	1
frame-len	13
adts-fullness	11
blocks	2
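
Read by hand, the table translates into one shift and one mask per field. The following sketch is an illustration only: the `adts_fields` struct and `parse_adts` function are hypothetical names, and the shift counts are simply the running bit offsets from the table (56 bits in total, counted down from the most significant bit).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded header fields. */
struct adts_fields {
    unsigned syncword, id, layer, protect, profile, frequency, priv;
    unsigned channel, copy, home, copyright, copyright_start;
    unsigned frame_len, fullness, blocks;
};

struct adts_fields parse_adts(const uint8_t *bytes)
{
    assert(bytes);  /* caller must supply the 7 header bytes */

    /* Pack the 7 bytes into 56 bits, first byte most significant. */
    uint64_t v = 0;
    for (int i = 0; i < 7; i++)
        v = (v << 8) | bytes[i];

    struct adts_fields f;
    f.syncword        = (unsigned)((v >> 44) & 0xFFF);
    f.id              = (unsigned)((v >> 43) & 0x1);
    f.layer           = (unsigned)((v >> 41) & 0x3);
    f.protect         = (unsigned)((v >> 40) & 0x1);
    f.profile         = (unsigned)((v >> 38) & 0x3);
    f.frequency       = (unsigned)((v >> 34) & 0xF);
    f.priv            = (unsigned)((v >> 33) & 0x1);
    f.channel         = (unsigned)((v >> 30) & 0x7);
    f.copy            = (unsigned)((v >> 29) & 0x1);
    f.home            = (unsigned)((v >> 28) & 0x1);
    f.copyright       = (unsigned)((v >> 27) & 0x1);
    f.copyright_start = (unsigned)((v >> 26) & 0x1);
    f.frame_len       = (unsigned)((v >> 13) & 0x1FFF);
    f.fullness        = (unsigned)((v >> 2) & 0x7FF);
    f.blocks          = (unsigned)(v & 0x3);
    return f;
}
```

This version does not depend on host byte order or on how the compiler lays out bit-fields, at the cost of spelling out every offset by hand.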

Now define a bit-field struct to parse it.

typedef union {
    struct {
        uint64_t padding:8;
        uint64_t block:2;
        uint64_t fullness: 11;
        uint64_t frame_len:13;
        uint64_t coypr_s:2;
        uint64_t copy_home:2;
        uint64_t channel:3;
        uint64_t priv:1;
        uint64_t freq:4;
        uint64_t profile:2;
        uint64_t protect:1;
        uint64_t layer:2;
        uint64_t id:1;
        uint64_t syncword:12;
    } bits;
    uint64_t value;
} adts;

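adts h;
h.value = 0;  // clear all 8 bytes so the byte not covered by the header is defined
// The 7-byte header.
uint8_t bytes[7] = {0xFF, 0xF1, 0x5C, 0x80, 0x05, 0x7F, 0xFC};
memcpy(&h.value, bytes, 7);
// Most systems are little-endian, so after the copy above value is 0x00FC7F05805CF1FF.
// The bytes must be reversed. The system provides no ntoh64; it is a user-defined helper.
h.value = ntoh64(h.value);
// ... header processing (omitted)
// Cast before printing: the fields are declared as uint64_t bit-fields.
printf("syncword: %u\n", (unsigned)h.bits.syncword);
printf("channel: %u\n", (unsigned)h.bits.channel);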

The code above leaves out a large amount of bit-manipulation code; the compiler does all of it for us.
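
The snippet calls `ntoh64` without showing it. One possible definition is sketched below; this is an assumption about what such a helper could look like, not the author's actual code. It reads the 8 bytes of the argument in memory order and treats the first byte as most significant, so it reverses the bytes on a little-endian host and leaves the value unchanged on a big-endian one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uint64_t) == 8, "expects 8-bit bytes");

/* Hypothetical helper: convert a 64-bit value from network (big-endian)
 * byte order to host byte order. */
uint64_t ntoh64(uint64_t net)
{
    uint8_t b[8];
    memcpy(b, &net, sizeof b);      /* bytes exactly as they sit in memory */

    uint64_t host = 0;
    for (int i = 0; i < 8; i++)
        host = (host << 8) | b[i];  /* first byte becomes most significant */
    return host;
}
```

Note that the union approach as a whole still relies on the compiler allocating bit-fields from the least significant bit, which the C standard leaves implementation-defined, so it is tied to a particular compiler and ABI.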