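A ziplist (compressed list) is a sequential data structure that Redis developed to save memory. It consists of a series of specially encoded entries stored in one contiguous block of memory. In Redis 2.8 it served as the underlying implementation for lists that hold a small amount of data. The ziplist itself has no struct definition; the figure below is the illustration from 《redis设计与实现》 (Redis Design and Implementation).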
Below is the logical definition of a ziplist entry. It is not the actual on-memory encoding, but it helps in understanding what an entry is made of.
/* We use this function to receive information about a ziplist entry.
 * Note that this is not how the data is actually encoded, is just what we
 * get filled by a function in order to operate more easily. */
typedef struct zlentry {
    unsigned int prevrawlensize; /* Bytes needed to store the previous entry's length. */
    unsigned int prevrawlen;     /* Length of the previous entry. */
    unsigned int lensize;        /* Bytes needed to store this entry's length: 1, 2 or 5
                                    for strings; integers always use a single byte. */
    unsigned int len;            /* Length of this entry. */
    unsigned int headersize;     /* prevrawlensize + lensize. */
    unsigned char encoding;      /* Marks whether the entry is a byte array or an integer. */
    unsigned char *p;            /* The ziplist is stored as a byte string; this pointer
                                    points to the start of the current entry. */
} zlentry;
The data that makes up an entry is shown in the figure below:
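The first part of an entry is prevrawlen, which encodes the length of the previous element. The second part is encoding, which encodes the element's own length and type. The third part is value, which holds the element itself; some entries do not have this part.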
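To make the three-part layout concrete, here is a minimal sketch in C that lays out one entry by hand. The helper name write_small_str_entry is hypothetical and not part of Redis. It only covers the simplest case (previous entry shorter than 254 bytes, string of at most 63 bytes), where prevrawlen and encoding each take a single byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Redis: writes one entry into 'out' as
 * [prevlen (1 byte)] [encoding (1 byte)] [value bytes].
 * Only covers prevlen < 254 and strings of at most 63 bytes.
 * Returns the number of bytes written. */
size_t write_small_str_entry(unsigned char *out, unsigned char prevlen,
                             const char *s, size_t slen) {
    assert(out != NULL && s != NULL);
    assert(prevlen < 254 && slen <= 0x3f);
    out[0] = prevlen;              /* part 1: length of the previous entry */
    out[1] = (unsigned char)slen;  /* part 2: top bits 00 = string, low 6 bits = length */
    memcpy(out + 2, s, slen);      /* part 3: the value itself */
    return 2 + slen;
}
```

For example, writing "hi" as the first entry (previous length 0) produces the four bytes 00 02 68 69.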
Every zlentry stores the length of the previous element, in the field prevrawlen.
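The source below shows that when the previous element is shorter than 254 bytes, prevlensize is 1 byte; otherwise prevlensize is 5 bytes. When prevlensize is 5 bytes, the first byte of the prevrawlen field is set to 0xFE, and only the following four bytes hold the length of the previous element.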
/* Max number of bytes of the previous entry, for the "prevlen" field
 * prefixing each entry, to be represented with just a single byte.
 * Otherwise it is represented as FE AA BB CC DD, where AA BB CC DD are a
 * 4 bytes unsigned integer representing the previous entry len. */
#define ZIP_BIG_PREVLEN 254
/* Return the number of bytes used to encode the length of the previous
 * entry. The length is returned by setting the var 'prevlensize'. */
#define ZIP_DECODE_PREVLENSIZE(ptr, prevlensize) do {    \
    if ((ptr)[0] < ZIP_BIG_PREVLEN) {                    \
        (prevlensize) = 1;                               \
    } else {                                             \
        (prevlensize) = 5;                               \
    }                                                    \
} while(0)
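The macro only reports how large the field is. As a complement, here is a rough sketch of writing and reading the field itself. The names prevlen_store and prevlen_load are made up for this example and are not Redis functions. The 4-byte length is written least significant byte first, which is how I understand Redis stores it; treat that byte order as an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIG_PREVLEN 254 /* same threshold as ZIP_BIG_PREVLEN */

/* Hypothetical helper: writes the prevlen field at 'p'.
 * Returns the number of bytes used (1 or 5). */
unsigned int prevlen_store(unsigned char *p, uint32_t len) {
    assert(p != NULL);
    if (len < BIG_PREVLEN) {
        p[0] = (unsigned char)len;      /* short form: the length itself */
        return 1;
    }
    p[0] = BIG_PREVLEN;                 /* 0xFE marker byte */
    p[1] = len & 0xff;                  /* assumed little-endian 4-byte length */
    p[2] = (len >> 8) & 0xff;
    p[3] = (len >> 16) & 0xff;
    p[4] = (len >> 24) & 0xff;
    return 5;
}

/* Hypothetical helper: reads the prevlen field at 'p' into '*len'.
 * Returns the number of bytes consumed (1 or 5). */
unsigned int prevlen_load(const unsigned char *p, uint32_t *len) {
    assert(p != NULL && len != NULL);
    if (p[0] < BIG_PREVLEN) {
        *len = p[0];
        return 1;
    }
    *len = (uint32_t)p[1] | ((uint32_t)p[2] << 8) |
           ((uint32_t)p[3] << 16) | ((uint32_t)p[4] << 24);
    return 5;
}
```

Note the boundary: a previous entry of exactly 254 bytes already needs the 5-byte form, because the single byte 0xFE is reserved as the marker.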
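Once the prevrawlen field has been handled, the encoding field comes next. The first two bits of the encoding field identify the type. If the element is an integer, the encoding field is fixed at one byte and its first two bits are fixed at 11. The size of the value then determines which integer width is used, and that choice is stored in the remaining 6 bits of the encoding field. As a special case, when the value is greater than or equal to 0 and less than or equal to 12, the value is written directly into the encoding field and no value field follows.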
/* Check if string pointed to by 'entry' can be encoded as an integer.
 * Stores the integer value in 'v' and its encoding in 'encoding'. */
int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long *v, unsigned char *encoding) {
    long long value;
    if (entrylen >= 32 || entrylen == 0) return 0;
    if (string2ll((char*)entry,entrylen,&value)) {
        /* Great, the string can be encoded. Check what's the smallest
         * of our encoding types that can hold this value. */
        if (value >= 0 && value <= 12) {
            *encoding = ZIP_INT_IMM_MIN+value;
        } else if (value >= INT8_MIN && value <= INT8_MAX) {
            *encoding = ZIP_INT_8B;
        } else if (value >= INT16_MIN && value <= INT16_MAX) {
            *encoding = ZIP_INT_16B;
        } else if (value >= INT24_MIN && value <= INT24_MAX) {
            *encoding = ZIP_INT_24B;
        } else if (value >= INT32_MIN && value <= INT32_MAX) {
            *encoding = ZIP_INT_32B;
        } else {
            *encoding = ZIP_INT_64B;
        }
        *v = value;
        return 1;
    }
    return 0;
}
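zipTryEncoding relies on constants that this article does not show. The values below are what I recall from ziplist.c and should be checked against your Redis version. The two helpers int_payload_size and imm_value are hypothetical names for this sketch. They show how many value bytes follow each integer encoding, and how a 0 to 12 immediate is recovered from the encoding byte alone.

```c
#include <assert.h>

/* Values as recalled from Redis' ziplist.c; verify against your version. */
#define ZIP_INT_16B     0xc0
#define ZIP_INT_32B     0xd0
#define ZIP_INT_64B     0xe0
#define ZIP_INT_24B     0xf0
#define ZIP_INT_8B      0xfe
#define ZIP_INT_IMM_MIN 0xf1 /* 11110001 -> value 0 */
#define ZIP_INT_IMM_MAX 0xfd /* 11111101 -> value 12 */

/* Hypothetical helper: number of bytes in the value field that follows
 * an integer encoding byte. */
unsigned int int_payload_size(unsigned char encoding) {
    assert((encoding & 0xc0) == 0xc0); /* top two bits must be 11 */
    switch (encoding) {
    case ZIP_INT_8B:  return 1;
    case ZIP_INT_16B: return 2;
    case ZIP_INT_24B: return 3;
    case ZIP_INT_32B: return 4;
    case ZIP_INT_64B: return 8;
    default:          return 0; /* treated as immediate: no value field */
    }
}

/* Hypothetical helper: recovers a small immediate integer (0..12)
 * that is stored inside the encoding byte itself. */
int imm_value(unsigned char encoding) {
    assert(encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX);
    return encoding - ZIP_INT_IMM_MIN;
}
```

With these values, the integer 5 would be stored as the single encoding byte 0xf6 with no value field after it.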
When the element is a string, the encoding field is likewise set according to the length of the element:
- encoding field of 1 byte: the first two bits are 00, and the remaining 6 bits store the element length
- encoding field of 2 bytes: the first two bits are 01, and the remaining 14 bits store the element length
- encoding field of 5 bytes: the first two bits of the first byte are 10, and the following 4 bytes store the element length
/* Write the encoding header of the entry in 'p'. If p is NULL it just returns
 * the amount of bytes required to encode such a length. Arguments:
 *
 * 'encoding' is the encoding we are using for the entry. It could be
 * ZIP_INT_* or ZIP_STR_* or between ZIP_INT_IMM_MIN and ZIP_INT_IMM_MAX
 * for single-byte small immediate integers.
 *
 * 'rawlen' is only used for ZIP_STR_* encodings and is the length of the
 * string that this entry represents.
 *
 * The function returns the number of bytes used by the encoding/length
 * header stored in 'p'. */
unsigned int zipStoreEntryEncoding(unsigned char *p, unsigned char encoding, unsigned int rawlen) {
    unsigned char len = 1, buf[5];
    if (ZIP_IS_STR(encoding)) {
        /* Although encoding is given it may not be set for strings,
         * so we determine it here using the raw length. */
        if (rawlen <= 0x3f) {
            if (!p) return len;
            buf[0] = ZIP_STR_06B | rawlen;
        } else if (rawlen <= 0x3fff) {
            len += 1;
            if (!p) return len;
            buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f);
            buf[1] = rawlen & 0xff;
        } else {
            len += 4;
            if (!p) return len;
            buf[0] = ZIP_STR_32B;
            buf[1] = (rawlen >> 24) & 0xff;
            buf[2] = (rawlen >> 16) & 0xff;
            buf[3] = (rawlen >> 8) & 0xff;
            buf[4] = rawlen & 0xff;
        }
    } else {
        /* Implies integer encoding, so length is always 1. */
        if (!p) return len;
        buf[0] = encoding;
    }
    /* Store this length at p. */
    memcpy(p,buf,len);
    return len;
}
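Reading the header back is the mirror image of zipStoreEntryEncoding. The sketch below is a hypothetical decoder named str_header_decode (not a Redis function). It assumes the byte at p starts an encoding field, and it reads the length most significant byte first, matching the order in which the store function above writes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder: reads a string encoding header at 'p', stores the
 * string length in '*len' and returns the header size (1, 2 or 5 bytes).
 * Returns 0 if the header is an integer encoding rather than a string. */
unsigned int str_header_decode(const unsigned char *p, uint32_t *len) {
    assert(p != NULL && len != NULL);
    switch (p[0] & 0xc0) {      /* the top two bits select the form */
    case 0x00:                  /* 00: length in the low 6 bits */
        *len = p[0] & 0x3f;
        return 1;
    case 0x40:                  /* 01: 14-bit length across two bytes */
        *len = ((uint32_t)(p[0] & 0x3f) << 8) | p[1];
        return 2;
    case 0x80:                  /* 10: 32-bit length in the next 4 bytes */
        *len = ((uint32_t)p[1] << 24) | ((uint32_t)p[2] << 16) |
               ((uint32_t)p[3] << 8) | (uint32_t)p[4];
        return 5;
    default:                    /* 11: integer encoding, not a string */
        *len = 0;
        return 0;
    }
}
```

For instance, a 300-byte string is stored with the two header bytes 41 2C: the top bits 01 select the 2-byte form, and the remaining 14 bits hold 0x012C.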
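The value field is very simple: the element is placed directly after the encoding field. It can be an integer or a byte array, and its length and type are already stored in the encoding field.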
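As you can see, the ziplist implementation is quite painful, and Redis goes to extreme lengths to save a little memory. The memory is saved, but the CPU has more work to do, so Redis only uses this data structure when the amount of data is small. As a Java programmer I find a data structure like this rather odd; ArrayList is still nicer to use :)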