Redis 3.x Cluster Installation and Configuration
Official documentation:
phpredis extension
Environment:
CentOS6.5 x64
redis-3.0.6
master node1: 192.168.192.10
master node2: 192.168.192.11
master node3: 192.168.192.12
slave node1: 192.168.192.20
slave node2: 192.168.192.21
slave node3: 192.168.192.22
3 masters, 3 slaves
1. Install build dependencies
yum -y install gcc gcc-c++ make tcl-devel
2. Installation
tar -xvf redis-3.0.6.tar.gz -C /usr/local/src
cd /usr/local/src/redis-3.0.6
make -j4 && make PREFIX=/opt/redis install
cp /usr/local/src/redis-3.0.6/src/redis-trib.rb /opt/redis/bin
echo 'export PATH=$PATH:/opt/redis/bin' >>/etc/profile
source /etc/profile
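3. SysV init script
The interactive tool shipped in the utils directory of the source tree can generate the script, although the result needs a small adjustment afterwards.
a. Interactive
root@master:redis-3.0.6# /usr/local/src/redis-3.0.6/utils/install_server.sh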
Welcome to the redis service installer
This script will help you easily set up a running redis server
Please select the redis port for this instance: [6379]
Selecting default: 6379
Please select the redis config file name [/etc/redis/6379.conf]
/opt/redis/conf/redis.conf
Please select the redis log file name [/var/log/redis_6379.log]
/opt/redis/log/redis.log
Please select the data directory for this instance [/var/lib/redis/6379]
/opt/redis/data
Please select the redis executable path []
/opt/redis/bin/redis-server
Selected config:
Port           : 6379
Config file    : /opt/redis/conf/redis.conf
Log file       : /opt/redis/log/redis.log
Data dir       : /opt/redis/data
Executable     : /opt/redis/bin/redis-server
Cli Executable : /opt/redis/bin/redis-cli
Is this ok? Then press ENTER to go on or Ctrl-C to abort.
Copied /tmp/6379.conf => /etc/init.d/redis_6379
Installing service...
Successfully added to chkconfig!
Successfully added to runlevels 345!
Starting Redis server...
Installation successful!
As the output shows, the script sets up the relevant paths and files automatically.
b. Silent (non-interactive)
bash /usr/local/src/redis-3.0.6/utils/install_server.sh <<EOF
6379
/opt/redis/conf/redis.conf
/opt/redis/log/redis.log
/opt/redis/data
/opt/redis/bin/redis-server
EOF
The script does not let you choose the name of the init script, however. If the port-suffixed name bothers you, add a symlink:
ln -s /etc/init.d/redis_6379 /etc/init.d/redis
4. Kernel parameter tuning
echo 'net.core.somaxconn = 511' >> /etc/sysctl.conf
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf
cat >>/etc/rc.d/rc.local <<HERE
echo never > /sys/kernel/mm/transparent_hugepage/enabled
HERE
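The entries above are only written to files; they take effect after `sysctl -p` (or a reboot) and, for transparent hugepages, after rc.local runs at the next boot. A minimal sketch for checking the live values follows. The helper names `thp_active` and `check_kernel_settings` are hypothetical, not part of Redis, and the hugepage path is the one used above; some RHEL/CentOS 6 kernels may expose it under a different directory, in which case pass the actual path as the first argument.

```shell
# Hypothetical helpers (not part of Redis) for checking the settings above.

# Print the active transparent hugepage mode from the contents of the
# "enabled" file, e.g. "always madvise [never]". The kernel marks the
# active mode with square brackets.
thp_active() (
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
)

# Print the live values next to the values this guide configures.
# $1 (optional): path of the transparent hugepage "enabled" file.
check_kernel_settings() (
    thp_file=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    echo "net.core.somaxconn = $(sysctl -n net.core.somaxconn) (expected 511)"
    echo "vm.overcommit_memory = $(sysctl -n vm.overcommit_memory) (expected 1)"
    echo "transparent hugepages = $(thp_active "$(cat "$thp_file")") (expected never)"
)
```

Run `sysctl -p` to load /etc/sysctl.conf without rebooting, then call `check_kernel_settings` as root and compare each line with its expected value.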
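5. Configure the cluster
Overview
The key space of a Redis cluster is split into 16384 (2^14) slots. Since every slot is served by exactly one master, a cluster can have at most 16384 master nodes (the recommended maximum is about 1000 nodes), and each master can serve anywhere from 1 to 16384 slots.
Every node is identified in the cluster by a unique ID: a 160-bit random number in hexadecimal notation, obtained from /dev/urandom the first time the node starts. The node saves its ID in its node configuration file and keeps using it for as long as that file is not deleted. A node can change its IP address and port without changing its node ID. The cluster detects the change of IP/port automatically and broadcasts it to the other nodes through the Gossip protocol.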
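How a key ends up in one of the 16384 slots can be sketched in a few lines of shell. According to the Redis Cluster specification, the slot is CRC16(key) mod 16384, where CRC16 is the XMODEM variant (polynomial 0x1021, initial value 0). The function names `crc16` and `key_slot` below are illustrative only, and this sketch deliberately ignores hash tags (the `{...}` part of a key), which a real client has to handle.

```shell
# Illustrative sketch only: the slot of a key without hash tags.

# CRC16, XMODEM variant (polynomial 0x1021, initial value 0), computed
# bit by bit over the bytes of $1. Prints the result in decimal.
crc16() (
    crc=0
    # od prints each byte of the key as an unsigned decimal number
    for byte in $(printf '%s' "$1" | od -An -v -tu1); do
        crc=$(( crc ^ (byte << 8) ))
        i=0
        while [ "$i" -lt 8 ]; do
            if [ $(( crc & 0x8000 )) -ne 0 ]; then
                crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
            else
                crc=$(( (crc << 1) & 0xFFFF ))
            fi
            i=$(( i + 1 ))
        done
    done
    echo "$crc"
)

# Slot of a key: CRC16(key) mod 16384.
key_slot() (
    echo $(( $(crc16 "$1") % 16384 ))
)
```

For example, `key_slot foo` prints 12182, the same value `redis-cli cluster keyslot foo` reports on a cluster node.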
Preparation
master node1: 192.168.192.10
master node2: 192.168.192.11
master node3: 192.168.192.12
slave node1: 192.168.192.20
slave node2: 192.168.192.21
slave node3: 192.168.192.22
Make sure Redis has been installed successfully on all of the nodes above, using the same procedure as before.
Note that the minimal cluster that works as expected requires to contain at least three master nodes. For your first tests it is strongly suggested to start a six nodes cluster with three masters and three slaves.
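A. Cluster configuration file
Master configuration
To enable cluster mode, add (or modify) the following lines in the existing configuration file. Keep comments on their own lines: redis.conf does not accept a trailing comment after a directive.
# enable cluster mode
cluster-enabled yes
# node configuration file of the cluster; generated and updated by the node itself, while cluster state is propagated to the other nodes through the Gossip protocol
cluster-config-file nodes-6379.conf
cluster-node-timeout 15000
cluster-slave-validity-factor 10
cluster-migration-barrier 1
cluster-require-full-coverage yes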
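Slave configuration
Reuse the master's configuration file unchanged; each slave is assigned to a specific master by hand in a later step.
Note: Redis Cluster only works when all nodes run in cluster mode.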
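Since the same six directives have to be applied on every node, the edit can be scripted. The following is a minimal sketch, assuming the configuration file path used in this guide; `set_conf` and `enable_cluster` are hypothetical helper names. It removes any active (uncommented) line for a directive and appends the new value, so running it twice leaves a single entry per directive.

```shell
# Hypothetical helpers for applying the cluster directives listed above.

# set_conf FILE KEY VALUE
# Drop any active (uncommented) line for KEY, then append "KEY VALUE".
set_conf() (
    file=$1
    key=$2
    value=$3
    # grep -v exits non-zero when it prints nothing; that is not an error here
    grep -v "^[[:space:]]*$key[[:space:]]" "$file" > "$file.tmp" || true
    printf '%s %s\n' "$key" "$value" >> "$file.tmp"
    mv "$file.tmp" "$file"
)

# enable_cluster FILE PORT
# Apply the cluster settings from this guide to one configuration file.
enable_cluster() (
    conf=$1
    port=$2
    set_conf "$conf" cluster-enabled yes
    set_conf "$conf" cluster-config-file "nodes-$port.conf"
    set_conf "$conf" cluster-node-timeout 15000
    set_conf "$conf" cluster-slave-validity-factor 10
    set_conf "$conf" cluster-migration-barrier 1
    set_conf "$conf" cluster-require-full-coverage yes
)
```

On each node this would be called as `enable_cluster /opt/redis/conf/redis.conf 6379`, followed by a restart of redis. Note that `mv` replaces the file, so check its owner and permissions afterwards.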
B. Add the master nodes
redis-cli cluster meet 192.168.192.10 6379
redis-cli cluster meet 192.168.192.11 6379
redis-cli cluster meet 192.168.192.12 6379
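Note: in this 3-node cluster every node is a master by default. The nodes have been added successfully, but the cluster is still in the fail state because no slots have been assigned yet.
C. Enable the cluster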
master node1: 192.168.192.10
redis-cli cluster addslots {0..5500}
master node2: 192.168.192.11
redis-cli cluster addslots {5501..11000}
master node3: 192.168.192.12
redis-cli cluster addslots {11001..16383}
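Meaning of the fields in each line of the cluster nodes output:
node id, address:port, flags, master id (or - for a master), last ping sent, last pong received, configuration epoch, link state, slots
The cluster layout is written to the node configuration file defined in the main configuration (/opt/redis/data/nodes-6379.conf). It is therefore also possible to edit the slot ranges directly in that file, sync it to every node and restart redis, although the Redis documentation describes this file as not intended to be edited by hand.
Note: prerequisites for the cluster to be available
1. Slots have been assigned to the nodes
2. The cluster members, after voting, consider the cluster available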
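The slot ranges in step C were picked by hand; any split works as long as all 16384 slots are covered. As an illustration, the sketch below computes an even split for an arbitrary number of masters. `slot_ranges` is a hypothetical helper name, and the ranges it prints differ slightly from the hand-picked ones above.

```shell
# Hypothetical helper: split the 16384 slots as evenly as possible.
# slot_ranges N prints one line per master: "index first_slot last_slot".
slot_ranges() (
    n=$1
    total=16384
    base=$(( total / n ))     # slots every master gets
    extra=$(( total % n ))    # the first "extra" masters get one more
    start=0
    i=0
    while [ "$i" -lt "$n" ]; do
        count=$base
        if [ "$i" -lt "$extra" ]; then
            count=$(( base + 1 ))
        fi
        end=$(( start + count - 1 ))
        echo "$i $start $end"
        start=$(( end + 1 ))
        i=$(( i + 1 ))
    done
)
```

`slot_ranges 3` prints the ranges 0 to 5461, 5462 to 10922 and 10923 to 16383. On each master the range can then be passed to addslots, for example `redis-cli cluster addslots $(seq 0 5461)` on the first one.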
D. Online resharding of slots
This requires the Ruby tool redis-trib.rb:
yum -y install ruby rubygems
gem install redis
redis-trib.rb reshard --from ca5fb0605fa3efbf62d1c8367489101cccfe0883 --to c16ba7f364b038e532c398d92b24d35ad5e23369 --slots 5 --yes 192.168.192.10:6379
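Without these parameters, redis-trib.rb asks interactively what to move and how to move it. As a test, only 5 slots were moved here, from 192.168.192.11 to 192.168.192.12; the change is clearly visible in the slots column of the cluster nodes output.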
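The reshard command takes node IDs rather than addresses. A small sketch for looking up an ID from the `cluster nodes` output is shown below; `node_id_for` is a hypothetical helper name. It relies only on the first two fields of each line (node id and address:port) and also strips the `@cport` suffix that newer Redis versions append to the address.

```shell
# Hypothetical helper: read "redis-cli cluster nodes" output on stdin and
# print the ID of the node whose address equals $1, e.g. 192.168.192.11:6379.
node_id_for() (
    awk -v addr="$1" '{
        split($2, parts, "@")          # drop "@cport" if present
        if (parts[1] == addr) print $1
    }'
)
```

For example, `redis-cli -h 192.168.192.10 cluster nodes | node_id_for 192.168.192.11:6379` prints the ID to use for --from, and the same call with 192.168.192.12:6379 gives the ID for --to.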
E. Add the slave nodes
redis-trib.rb add-node --slave --master-id 19bcc3b19b1325fac5c7647c1431f46299609079 192.168.192.20:6379 192.168.192.10:6379
redis-trib.rb add-node --slave --master-id ca5fb0605fa3efbf62d1c8367489101cccfe0883 192.168.192.21:6379 192.168.192.11:6379
redis-trib.rb add-node --slave --master-id c16ba7f364b038e532c398d92b24d35ad5e23369 192.168.192.22:6379 192.168.192.12:6379
To make sure each slave is in sync with its master, run the matching replicate command on each slave node (one command per slave, in the order 192.168.192.20, 192.168.192.21, 192.168.192.22):
redis-cli cluster replicate 19bcc3b19b1325fac5c7647c1431f46299609079
redis-cli cluster replicate ca5fb0605fa3efbf62d1c8367489101cccfe0883
redis-cli cluster replicate c16ba7f364b038e532c398d92b24d35ad5e23369
F. Failover test
1. A master goes down
redis-cli -h 192.168.192.12 -p 6379 debug segfault
redis-cli cluster info
redis-cli cluster nodes
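As the output shows, when one master goes down, its slave promotes itself to master shortly afterwards and takes over all of the old master's slots. In this test the promotion came about 16 seconds after the connection was lost (11:30:08 to 11:30:24 in the log), which is consistent with cluster-node-timeout 15000.
The detailed log of the slave switching to master follows.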
2782:S 25 Dec 11:29:29.384 # Connection with master lost.
2782:S 25 Dec 11:29:29.384 * Caching the disconnected master state.
2782:S 25 Dec 11:29:29.384 * Discarding previously cached master state.
2782:S 25 Dec 11:29:29.649 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:29:29.650 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:29:29.650 * Non blocking connect for SYNC fired the event.
2782:S 25 Dec 11:29:29.650 * Master replied to PING, replication can continue...
2782:S 25 Dec 11:29:29.651 * Partial resynchronization not possible (no cached master)
2782:S 25 Dec 11:29:29.651 * Full resync from master: cbb2f04d66e95d799f8dabbeaa90c3a293ca9e28:1121
2782:S 25 Dec 11:29:29.758 * MASTER <-> SLAVE sync: receiving 18 bytes from master
2782:S 25 Dec 11:29:29.758 * MASTER <-> SLAVE sync: Flushing old data
2782:S 25 Dec 11:29:29.758 * MASTER <-> SLAVE sync: Loading DB in memory
2782:S 25 Dec 11:29:29.758 * MASTER <-> SLAVE sync: Finished with success
2782:S 25 Dec 11:30:08.012 # Connection with master lost.
2782:S 25 Dec 11:30:08.012 * Caching the disconnected master state.
2782:S 25 Dec 11:30:08.193 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:08.193 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:08.193 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:09.212 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:09.212 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:09.212 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:10.222 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:10.222 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:10.222 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:11.232 * Connecting to MASTER 192.168.192.12:6379
... ...
2782:S 25 Dec 11:30:15.284 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:16.296 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:16.296 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:19.329 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:20.349 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:20.349 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:20.350 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:21.370 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:21.370 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:21.371 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:22.396 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:22.396 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:22.396 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:23.411 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:23.412 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:23.412 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:23.674 * FAIL message received from ca5fb0605fa3efbf62d1c8367489101cccfe0883 about c16ba7f364b038e532c398d92b24d35ad5e23369
2782:S 25 Dec 11:30:23.674 # Cluster state changed: fail
2782:S 25 Dec 11:30:23.715 # Start of election delayed for 690 milliseconds (rank #0, offset 1163).
2782:S 25 Dec 11:30:24.425 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:24.425 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:24.425 # Starting a failover election for epoch 7.
2782:S 25 Dec 11:30:24.469 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:24.470 # Failover election won: I'm the new master.
2782:S 25 Dec 11:30:24.471 # configEpoch set to 7 after successful failover
2782:M 25 Dec 11:30:24.471 * Discarding previously cached master state.
2782:M 25 Dec 11:30:24.471 # Cluster state changed: ok
2. The failed master is repaired and restarted
Here the cluster automatically turns the old master (192.168.192.12) into a slave of the new master (192.168.192.22).
2411:M 25 Dec 11:44:44.139 # Server started, Redis version 3.0.6
2411:M 25 Dec 11:44:44.139 * DB loaded from disk: 0.000 seconds
2411:M 25 Dec 11:44:44.140 * The server is now ready to accept connections on port 6379
2411:M 25 Dec 11:44:44.194 # Configuration change detected. Reconfiguring myself as a replica of e3bacaf98d2eee3259275c8751cd4757f8ca0b64
2411:S 25 Dec 11:44:44.195 # Cluster state changed: ok
2411:S 25 Dec 11:44:45.215 * Connecting to MASTER 192.168.192.22:6379
2411:S 25 Dec 11:44:45.216 * MASTER <-> SLAVE sync started
2411:S 25 Dec 11:44:45.216 * Non blocking connect for SYNC fired the event.
2411:S 25 Dec 11:44:45.216 * Master replied to PING, replication can continue...
2411:S 25 Dec 11:44:45.217 * Partial resynchronization not possible (no cached master)
2411:S 25 Dec 11:44:45.217 * Full resync from master: 163f91176e281346fefb0935221343b174b40843:1
2411:S 25 Dec 11:44:45.239 * MASTER <-> SLAVE sync: receiving 18 bytes from master
2411:S 25 Dec 11:44:45.239 * MASTER <-> SLAVE sync: Flushing old data
2411:S 25 Dec 11:44:45.240 * MASTER <-> SLAVE sync: Loading DB in memory
2411:S 25 Dec 11:44:45.240 * MASTER <-> SLAVE sync: Finished with success
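The "Cluster state changed: fail" and "Cluster state changed: ok" lines in the logs can also be followed from the outside by polling `cluster info`. The sketch below is one way to do that; `cluster_info_field` and `wait_cluster_ok` are hypothetical helper names, and the one-second polling interval and default of 30 attempts are arbitrary choices.

```shell
# Hypothetical helpers for watching the cluster state during a failover test.

# Read "redis-cli cluster info" output on stdin and print the value of one
# field (default: cluster_state). Lines look like "cluster_state:ok" and
# may end in a carriage return, which is stripped first.
cluster_info_field() (
    field=${1:-cluster_state}
    tr -d '\r' | awk -F: -v f="$field" '$1 == f { print $2 }'
)

# wait_cluster_ok HOST PORT [TRIES]
# Poll once per second until the node reports cluster_state:ok.
# Returns 0 on success, 1 if the state is still not ok after TRIES attempts.
wait_cluster_ok() (
    host=$1
    port=$2
    tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        state=$(redis-cli -h "$host" -p "$port" cluster info | cluster_info_field)
        if [ "$state" = "ok" ]; then
            return 0
        fi
        sleep 1
        tries=$(( tries - 1 ))
    done
    return 1
)
```

After crashing a master with debug segfault, `wait_cluster_ok 192.168.192.22 6379 60` returns once the surviving nodes report the cluster as ok again.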
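Appendix: common management commands
Cluster
cluster info    Print information about the cluster.
cluster nodes    List all nodes currently known to the cluster, together with their details.
Node
cluster meet <ip> <port>    Add the node at the given IP and port to the cluster, making it a member.
cluster forget <node_id>    Remove the node identified by node_id from the cluster.
cluster replicate <node_id>    Make the current node a slave of the node identified by node_id.
cluster saveconfig    Save the node's configuration file to disk.
Slot
cluster addslots <slot> [slot ...]    Assign one or more slots to the current node.
cluster delslots <slot> [slot ...]    Remove the assignment of one or more slots from the current node.
cluster flushslots    Remove all slots assigned to the current node, leaving it with no slots at all.
cluster setslot <slot> node <node_id>    Assign the slot to the node identified by node_id. If the slot is already assigned to another node, that node drops it first and the assignment is made afterwards.
cluster setslot <slot> migrating <node_id>    Migrate the slot from this node to the node identified by node_id.
cluster setslot <slot> importing <node_id>    Import the slot from the node identified by node_id into this node.
cluster setslot <slot> stable    Cancel the import or migration of the slot.
Key
cluster keyslot <key>    Compute the slot the key should be placed in.
cluster countkeysinslot <slot>    Return the number of key-value pairs currently in the slot.
cluster getkeysinslot <slot> <count>    Return count keys from the slot.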