Redis 6.0 Multithreading: Performance Test Results and Analysis

Date: 2021-01-02

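Single-threaded Redis has always been known for being simple and efficient, but it has an Achilles' heel: blocking. Until the single thread finishes a request that hits the most bottleneck-prone area, network reads and writes (large keys, and also heavyweight operations such as SORT/SUNION/ZUNIONSTORE, bulk cleanup of expired keys, eviction under the maxmemory-policy when memory is full, and so on), every other request is blocked, which seriously hurts throughput. That is why calls for a multithreaded Redis kept growing. Because Redis operates on memory, latency is very low, so even in single-threaded mode the CPU is not the bottleneck; the most likely bottleneck is network I/O. Redis 6.0 adds multithreading, but only at the socket level: the core in-memory reads and writes are still single-threaded.
Once the nature of this multithreading is clear, a series of questions follows. How much faster is it than a single thread? How many threads should be configured? Some experts have already tested this (so far the only test I could find online is Meitu's test of the unstable branch, and every repost traces back to it), and the results were very good. Just reading about it is not satisfying, though, so I decided to try it myself. This article runs a rough verification test of Redis multithreading. It also raises another question: given multithreaded Redis and Redis Cluster, which should you choose?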

Multithreaded Redis

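The "multithreading" feature of Redis 6.0 has excited a lot of clickbait writers. The reference diagram comes from the Meitu technical team (it will be removed on request). The core thread (Execute Command) is still single-threaded; multithreading refers only to network I/O (socket) reads and writes.
As the diagram below shows, data on network sockets can be read and written by multiple threads, which is why Redis multithreading is also called I/O threading. The relevant parameter is io-threads. A second parameter, io-threads-do-reads, involves another detail: threaded I/O is mainly used when returning results after a request completes, that is, for socket writes. Socket reads already use I/O multiplexing on a single thread and do not become a bottleneck, so Redis provides io-threads-do-reads to decide whether socket reads are threaded as well. Whether io-threads-do-reads is enabled has little effect on performance, which is verified at the end of this article.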

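Test Environment and Strategy

Machine: CentOS 7, 16 cores + 32 GB RAM

Redis version: 6.0.6

Redis was configured with 1, 2, 4, 6, 8 and 10 threads in turn, and each configuration was tested with 200 concurrent connections issuing 1 million requests (./bin/redis-benchmark -d 128 -c 200 -n 1000000 -t set -q, shown here for SET). To avoid the effect of network latency, redis-benchmark ran on the same machine as the Redis instance. Both GET and SET performance were measured.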

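Mishaps

Before the test could even start, I hit two mishaps.

Mishap 1
The default gcc on CentOS 7 is version 4.*, which cannot compile Redis 6.0, so gcc has to be upgraded. Because this machine cannot install packages with yum, I followed a guide and built gcc from a source package. That build was painful: on this 16-core, 32 GB machine it failed several times because of missing dependency packages, and the one build that finally succeeded took about 1 hour 10 minutes.

Mishap 2
I did not read the notes in the configuration file carefully. After setting io-threads and restarting the Redis service, I went straight at it with redis-benchmark, and the results were about the same as single-threaded, which was a shock.
In the end I found this passage in the original configuration file: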
If you want to test the Redis speedup using redis-benchmark, make sure you also run the benchmark itself in threaded mode, using the --threads option to match the number of Redis threads, otherwise you'll not be able to notice the improvements.

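In other words, redis-benchmark must be run with the --threads option, and its value should match the thread setting in Redis. --threads is a redis-benchmark option added in Redis 6.0; only with it does the benefit of multithreaded Redis show up.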
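To avoid repeating this mistake, the benchmark command can be derived from the server configuration. The following is a minimal sketch, not part of the original test: the helper names read_io_threads and benchmark_command and the sample configuration text are hypothetical, and the flags are the ones used in this article's command.

```python
import re
import shlex


def read_io_threads(conf_text: str) -> int:
    """Return the io-threads value from redis.conf text (1 if unset or commented out)."""
    for line in conf_text.splitlines():
        match = re.match(r"^\s*io-threads\s+(\d+)\s*$", line)
        if match:
            return int(match.group(1))
    return 1


def benchmark_command(conf_text: str, test: str = "set") -> list:
    """Build the article's redis-benchmark command, adding --threads when Redis is threaded."""
    cmd = ["./bin/redis-benchmark", "-d", "128", "-c", "200",
           "-n", "1000000", "-t", test, "-q"]
    threads = read_io_threads(conf_text)
    if threads > 1:
        cmd += ["--threads", str(threads)]
    return cmd


# Hypothetical configuration text: one commented-out line and one active line.
sample_conf = "# io-threads 4\nio-threads 2\nio-threads-do-reads no\n"
print(shlex.join(benchmark_command(sample_conf)))
```

With the sample text above, the commented-out line and the io-threads-do-reads line are ignored, so the command ends with --threads 2.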

Notes on Threaded I/O

After the second mishap I decided to read the threaded I/O comments in redis.conf properly.

################################ THREADED I/O #################################

# Redis is mostly single threaded, however there are certain threaded
# operations such as UNLINK, slow I/O accesses and other things that are
# performed on side threads.
#
# Now it is also possible to handle Redis clients socket reads and writes
# in different I/O threads. Since especially writing is so slow, normally
# Redis users use pipelining in order to speed up the Redis performances per
# core, and spawn multiple instances in order to scale more. Using I/O
# threads it is possible to easily speedup two times Redis without resorting
# to pipelining nor sharding of the instance.
#
# By default threading is disabled, we suggest enabling it only in machines
# that have at least 4 or more cores, leaving at least one spare core.
# Using more than 8 threads is unlikely to help much. We also recommend using
# threaded I/O only if you actually have performance problems, with Redis
# instances being able to use a quite big percentage of CPU time, otherwise
# there is no point in using this feature.
#
# So for instance if you have a four cores boxes, try to use 2 or 3 I/O
# threads, if you have a 8 cores, try to use 6 threads. In order to
# enable I/O threads use the following configuration directive:
#
# io-threads 4
#
# Setting io-threads to 1 will just use the main thread as usual.
# When I/O threads are enabled, we only use threads for writes, that is
# to thread the write(2) syscall and transfer the client buffers to the
# socket. However it is also possible to enable threading of reads and
# protocol parsing using the following configuration directive, by setting
# it to yes:
#
# io-threads-do-reads no
#
# Usually threading reads doesn't help much.
#
# NOTE 1: This configuration directive cannot be changed at runtime via
# CONFIG SET. Aso this feature currently does not work when SSL is
# enabled.
#
# NOTE 2: If you want to test the Redis speedup using redis-benchmark, make
# sure you also run the benchmark itself in threaded mode, using the
# --threads option to match the number of Redis threads, otherwise you'll not
# be able to notice the improvements.

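The rough meaning is as follows:
Redis is mostly single-threaded. However, some operations, such as UNLINK (asynchronous key deletion), slow I/O accesses and other tasks, are performed on side threads (background threads inside the Redis process).

It is now also possible to handle Redis client socket reads and writes in separate I/O threads. Because writing (to the socket) is particularly slow, Redis users normally use pipelining to raise per-core performance and run multiple instances to scale out. With I/O threads, Redis can easily be sped up about two times without resorting to pipelining or sharding the instance.

Threading is disabled by default in Redis 6.0. Enabling it is suggested only on machines with at least 4 cores, leaving at least one core spare. Using more than 8 threads is unlikely to help much.

Threaded I/O is recommended only if you actually have performance problems, with Redis instances consuming a fairly large share of CPU time; otherwise there is no point in using this feature.

For example, on a 4-core server try 2 or 3 I/O threads, and on an 8-core server try 6 threads. To enable I/O threads, use the following configuration directive: io-threads 4

Setting io-threads to 1 uses just the main thread, as in traditional Redis. When I/O threads are enabled, they are used only for writes (translator's note: this means socket writes; socket reads use I/O multiplexing and are not a bottleneck in themselves), that is, to thread the write(2) syscall and transfer the client buffers to the socket. Threading of reads and protocol parsing can also be enabled by setting the following directive to yes (it is shown with its default, no): io-threads-do-reads no

Usually, threading reads does not help much.

Note 1: This directive cannot be changed at runtime via CONFIG SET; it takes effect only after editing the configuration file and restarting. The feature also currently does not work when SSL is enabled.

Note 2: To measure the Redis speedup with redis-benchmark, be sure to run redis-benchmark itself in threaded mode, using the --threads option to match the number of Redis threads; otherwise the improvement will not be visible.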
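The sizing advice in the comment can be written down as a small rule of thumb. This is a sketch under an assumption of mine: the formula "cores minus two, capped at 8" is only an interpolation of the two examples in redis.conf (4 cores and 8 cores), not an official rule, and the function name suggest_io_threads is hypothetical.

```python
def suggest_io_threads(cores: int) -> int:
    """Rough io-threads suggestion; an interpolation of the redis.conf examples, not an official rule."""
    if cores < 4:
        return 1  # threading is suggested only on machines with at least 4 cores
    # Leave two cores free (4 cores -> 2, 8 cores -> 6) and cap at 8 threads.
    return min(cores - 2, 8)


for cores in (2, 4, 8, 16):
    print(cores, "cores -> io-threads", suggest_io_threads(cores))
```

For a 4-core machine the comment allows 2 or 3 threads; the sketch picks the more conservative value. Any such number is only a starting point for measurement.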

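Test Results and Analysis

Below is a side-by-side comparison of requests per second at the different thread counts.

The results show:

1. With 1 thread, the traditional single-threaded mode, GET reaches about 90,000 QPS.

2. With 2 threads, GET reaches about 180,000 QPS, an improvement of 100% or more over a single thread.

3. With 4 threads, throughput is about 30% higher than with 2 threads, but it no longer doubles as it did going from 1 thread to 2.

4. With 6 threads, there is no clear gain over 4 threads. For SET, QPS shows no clear difference from 4 threads through 6 to 8 threads; all fall between 230,000 and 240,000.

5. With 8 threads, there is roughly a 10% improvement over 4 and 6 threads.

6. With 10 threads, performance actually starts to drop from the 8-thread peak, to about the level of 4 or 6 threads.
So on this machine, setting io-threads to 2 or 4 is fine, and it should not exceed 8, beyond which performance falls. It must also not exceed the number of CPU cores; as the comment in the configuration file says, at least one core should be left spare.

Below is a comparison of the average GET and SET requests per second over 10 test runs at each thread count: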
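The percentages above are plain ratios, and the arithmetic can be checked quickly. The sketch below uses only the rounded GET figures quoted in the list (about 90,000 and about 180,000 QPS); the 4-thread value is derived from the "about 30%" statement rather than measured, and the helper name percent_gain is hypothetical.

```python
def percent_gain(before: float, after: float) -> float:
    """Relative throughput gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


get_qps_1_thread = 90_000    # "about 90,000" from the list above
get_qps_2_threads = 180_000  # "about 180,000" from the list above
# "About 30% higher than with 2 threads" implies roughly this figure for 4 threads:
get_qps_4_threads = get_qps_2_threads * 1.30

print(percent_gain(get_qps_1_thread, get_qps_2_threads))   # 100.0
print(round(get_qps_4_threads))                            # implied 4-thread GET QPS
print(round(get_qps_4_threads / get_qps_1_thread, 2))      # overall speedup vs 1 thread
```

The implied 4-thread figure lands in the same range as the 230,000 to 240,000 QPS reported for SET, so going from 1 to 4 threads gives well under a fourfold gain.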

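The io-threads-do-reads Parameter

As mentioned above, io-threads-do-reads decides whether socket reads are threaded. Redis socket reads use I/O multiplexing and do not become a bottleneck on their own, so this parameter has little effect on the multithreaded test results. Referring again to the diagram mentioned earlier (it will be removed on request), my understanding is that io-threads-do-reads controls the threads for socket reads, and enabling it makes little difference.
The comment on this parameter in redis.conf reads: When I/O threads are enabled, we only use threads for writes, that is to thread the write(2) syscall and transfer the client buffers to the socket. However it is also possible to enable threading of reads and protocol parsing using the following configuration directive, by setting it to yes. Usually threading reads doesn't help much.
The following is a comparison test with the parameter enabled and disabled.

As the screenshots below show, with two threads configured, io-threads-do-reads was enabled and then disabled; overall, the difference in performance is very small. Experts are of course welcome to analyze this from the source code.
io-threads set to 2, with io-threads-do-reads enabled

io-threads set to 2, with io-threads-do-reads disabled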

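Choosing Between Multithreaded Redis and Redis Cluster

Redis offers two multi-node modes, Sentinel and Cluster. Setting Sentinel aside for now, consider the choice between multithreaded Redis and Redis Cluster.
Redis Cluster addresses scalability and the blocking risk of a single-threaded single node. If Redis supports multithreading (currently more than 8 threads is not recommended), then, ignoring the NIC bottleneck of a single node, both problems are already solved: a single node can run multiple threads, make full use of a multi-core CPU, and handle about 250,000 QPS. What more could you ask for?
Redis Cluster's pain points also need to be considered:
1. No support for multi-key or pipelined operations that span slots (third-party plugins are not necessarily stable).
2. Every master node in the cluster needs a replica attached. Whether that replica is an independent physical node or one instance among several on a single machine, it still adds maintenance cost.
3. Only one database can be used.
4. Scaling the cluster out or in brings slot migration and the performance problems that come with it, along with cluster management overhead.
With multithreaded Redis, these pain points no longer apply. That leaves a fresh choice: multithreaded Redis or Redis Cluster? It is a question worth thinking about.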

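Open Questions

The benchmark command ./bin/redis-benchmark -d 128 -c 200 -n 1000000 -t set -q --threads 2 involves two parameters, -c and --threads, that I do not fully understand.
-c is documented as: the number of concurrent connections.
Is --threads the number of threads in redis-benchmark?
I asked Lao Qian about this (we have never met; thank you). He said the parameter concerns epoll on the redis-benchmark client side. The server-side I/O multiplexing already had my head spinning, and the client uses epoll too, so I still do not really understand how the two relate.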
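One way to picture the relationship, offered as an assumption rather than something confirmed in this article: -c is the total number of client connections, and --threads is the number of benchmark threads, each running its own event loop (epoll on Linux) over a share of those connections. The sketch below models an even round-robin split; the function name connections_per_thread is hypothetical, and the actual assignment inside redis-benchmark may differ.

```python
from collections import Counter


def connections_per_thread(connections: int, threads: int) -> list:
    """Assign connection i to thread i % threads and count the connections per thread."""
    counts = Counter(i % threads for i in range(connections))
    return [counts[t] for t in range(threads)]


print(connections_per_thread(200, 2))  # [100, 100]
print(connections_per_thread(200, 6))  # uneven split when 200 is not divisible by 6
```

Under this model, -c 200 --threads 2 means two client threads that each multiplex 100 connections, which would explain why a single benchmark thread cannot generate enough load to show the server-side gain.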

redis-benchmark Test Runs

Below are some screenshots from the redis-benchmark test runs.

There are too many screenshots to post them all.
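Instead of reading numbers off screenshots, the per-run results can be averaged from saved output, as was done for the 10-run averages earlier. This is a minimal sketch, assuming each run's -q output was saved as text and that a result line looks like "SET: <number> requests per second"; the function name average_rps is hypothetical, and the sample values are placeholders, not the measurements from this article.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches the final result line of each test; intermediate progress output is ignored.
RESULT_RE = re.compile(r"(\w+): (\d+(?:\.\d+)?) requests per second")


def average_rps(outputs):
    """Average requests per second per test name across several captured runs."""
    samples = defaultdict(list)
    for text in outputs:
        for test, rps in RESULT_RE.findall(text):
            samples[test].append(float(rps))
    return {test: mean(values) for test, values in samples.items()}


# Placeholder text in the assumed format; these are NOT the article's measurements.
runs = [
    "SET: 100000.00 requests per second\nGET: 110000.00 requests per second\n",
    "SET: 102000.00 requests per second\nGET: 108000.00 requests per second\n",
]
print(average_rps(runs))
```

The pattern only requires the "requests per second" suffix, so extra fields after it on the same line would not break the parsing.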