ZooKeeper Installation Modes
- Standalone mode: ZooKeeper runs on a single server; suitable for test environments.
- Pseudo-cluster mode: multiple ZooKeeper instances run on one physical machine.
- Cluster mode: ZooKeeper runs on a group of machines, called an ensemble; suitable for production.
ZooKeeper achieves high availability through replication: the service keeps running as long as more than half of the machines in the ensemble are up, because ZooKeeper's replication strategy guarantees that every modification to the znode tree is copied to a majority of the ensemble. For example, a 5-server ensemble tolerates 2 failures, while a 6-server ensemble still tolerates only 2, which is why odd ensemble sizes are preferred.
Preparation
Configuration on Windows
Standalone Mode (suitable for development)
1. Extract the downloaded archive zookeeper-3.4.11.tar.gz into the directory C:\solrCloud\zk_server_single (referred to below as %ZK_HOME%).
2. Save %ZK_HOME%/conf/zoo_sample.cfg as zoo.cfg and edit it as follows:
# ----------------------------------------------------------------------
# Basic configuration (minimum required)
# ----------------------------------------------------------------------
# the port at which the clients will connect
clientPort=2181
# The number of milliseconds of each tick
# Heartbeat interval between servers, and between clients and servers;
# the minimum session timeout is twice the tickTime.
tickTime=2000
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
# Where the in-memory database snapshots are stored; unless configured
# otherwise, the transaction log of updates is kept here as well.
dataDir=../data
3. Then start %ZK_HOME%/bin/zkServer.cmd.
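A minimal sketch of this step, assuming the server is launched from the bin directory so that the relative dataDir=../data resolves to C:\solrCloud\zk_server_single\data (ZooKeeper 3.4 will generally create the directory itself, but creating it up front makes the relative path explicit):

cd /d C:\solrCloud\zk_server_single\bin
mkdir ..\data
zkServer.cmd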
4. Because this is standalone mode, there are no other machines for ZooKeeper to replicate update transactions to, so if the ZooKeeper process fails the whole service goes down; this setup is therefore only suitable as a development environment.
5. Connect to the ZooKeeper server: %ZK_HOME%/bin/zkCli.cmd can be used as a client to connect.
bin\zkCli.cmd -server 127.0.0.1:2181
When "Welcome to ZooKeeper!" and "JLine support is enabled" appear, the connection has succeeded.
You can also verify that ZooKeeper started correctly by checking with netstat whether port 2181 is in use, or by inspecting the running Java processes with jps.
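For example (a sketch, assuming the JDK's bin directory is on PATH; PIDs and addresses will differ on your machine):

:: check that something is listening on the client port
netstat -ano | findstr "2181"
:: list running Java processes; look for QuorumPeerMain
jps -l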
6. Type help to list the commands zkCli supports:
[zk: 127.0.0.1:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
stat path [watch]
set path data [version]
ls path [watch]
delquota [-n|-b] path
ls2 path [watch]
setAcl path acl
setquota -n|-b val path
history
redo cmdno
printwatches on|off
delete path [version]
sync path
listquota path
rmr path
get path [watch]
create [-s] [-e] path data acl
addauth scheme auth
quit
getAcl path
close
connect host:port
Here are a few examples of these commands in action:
[zk: 127.0.0.1:2181(CONNECTED) 1] ls /
[zookeeper]
[zk: 127.0.0.1:2181(CONNECTED) 2] create /zk_test my_data
Created /zk_test
[zk: 127.0.0.1:2181(CONNECTED) 3] ls /
[zookeeper, zk_test]
[zk: 127.0.0.1:2181(CONNECTED) 4] get /zk_test
my_data
cZxid = 0x2a
ctime = Wed Apr 11 10:49:31 CST 2018
mZxid = 0x2a
mtime = Wed Apr 11 10:49:31 CST 2018
pZxid = 0x2a
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 0
[zk: 127.0.0.1:2181(CONNECTED) 5] set /zk_test junk
cZxid = 0x2a
ctime = Wed Apr 11 10:49:31 CST 2018
mZxid = 0x2b
mtime = Wed Apr 11 10:50:33 CST 2018
pZxid = 0x2a
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 0
[zk: 127.0.0.1:2181(CONNECTED) 6] delete /zk_test
[zk: 127.0.0.1:2181(CONNECTED) 7] ls /
[zookeeper]
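As the first line of the help text above shows, zkCli also accepts a single command as an argument, which is handy for quick non-interactive checks (a sketch; the client prints its connection log before the result and then exits):

bin\zkCli.cmd -server 127.0.0.1:2181 ls /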
ZooKeeper also answers a set of four-letter administrative commands:

| Command | Description |
| --- | --- |
| conf | Prints details of the server's configuration. |
| cons | Lists the connections/sessions of all clients connected to this server: packets sent/received, session id, operation latencies, last operation performed, etc. |
| crst | Resets the connection/session statistics for all connections. |
| dump | Lists outstanding sessions and ephemeral nodes; only works on the leader. |
| envi | Prints details of the server's environment. |
| ruok | Tests whether the server is running without errors; replies "imok" if so, otherwise gives no response at all. A reply of "imok" only means the server process is alive and bound to the configured client port; it does not mean the server has joined the quorum. |
| srst | Resets the server's statistics. |
| srvr | Lists full details of the server. |
| stat | Lists brief details of the server and its connected clients. |
| wchs | Lists brief information on the watches held by the server. |
| wchc | Lists detailed information on the server's watches by session: outputs a list of sessions (connections) with their associated watches. |
| wchp | Lists detailed information on the server's watches by path: outputs a list of paths (znodes) with their associated watches. |
| mntr | Outputs a list of variables useful for monitoring the health of the cluster. |
To use them on Windows, download netcat for Windows and add the directory containing nc.exe to the PATH environment variable.
C:\solrCloud\zk_server_fake\bin>echo mntr | nc localhost 2181
zk_version 3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0, built on 11/01/2017 18:06 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 7
zk_packets_sent 6
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count 4
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 27
C:\solrCloud\zk_server_fake\bin>echo ruok | nc localhost 2181
imok
Pseudo-Cluster Mode
1. Copy the C:\solrCloud\zk_server_single folder configured above to C:\solrCloud\zk_server_fake (referred to below as %ZK_HOME%).
2. Pseudo-cluster mode simulates one server per configuration file, so copy %ZK_HOME%\conf\zoo.cfg into three files, zoo1.cfg, zoo2.cfg and zoo3.cfg, configured as follows:
zoo1.cfg
# ----------------------------------------------------------------------
# Basic configuration (minimum required)
# ----------------------------------------------------------------------
# the port at which the clients will connect
clientPort=2181
# The number of milliseconds of each tick
# Heartbeat interval between servers, and between clients and servers;
# the minimum session timeout is twice the tickTime.
tickTime=2000
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
# Where the in-memory database snapshots are stored; unless configured
# otherwise, the transaction log of updates is kept here as well.
dataDir=../data1
# ----------------------------------------------------------------------
# Advanced configuration
# ----------------------------------------------------------------------
# Where the transaction log is stored; splitting the update transaction
# log out of the default dataDir avoids contention between logging and
# snapshotting.
dataLogDir=../log1
# Java property: zookeeper.globalOutstandingLimit
# Clients can submit requests much faster than ZooKeeper can process
# them, especially when there are many clients. To keep ZooKeeper from
# running out of memory under a flood of requests, it throttles clients
# so that no more than globalOutstandingLimit requests are outstanding
# in the system. Default: 1000.
# globalOutstandingLimit=1000
# Java property: zookeeper.preAllocSize
# To avoid disk seeks, ZooKeeper pre-allocates space in the transaction
# log file in blocks of preAllocSize kilobytes; the default block size
# is 64 MB. If snapshots are taken frequently, the block size can be
# reduced by lowering this value.
# preAllocSize
# Java property: zookeeper.snapCount
# ZooKeeper records its transactions using snapshots and a transaction
# log. snapCount determines how many transactions may be recorded in the
# log before a snapshot is taken. To keep all machines in the ensemble
# from snapshotting at the same moment, each server takes its snapshot
# once the number of transactions in its log reaches a runtime-generated
# random value in the range [snapCount/2+1, snapCount]. Default: 100000.
# snapCount=100000
# the maximum number of client connections.
# increase this if you need to handle more clients
# Limits the number of concurrent connections that a single client,
# identified by IP address, may make to one member of the ensemble.
# This can help ward off certain DoS attacks, including file descriptor
# exhaustion. Default: 60. Setting it to 0 removes the limit.
# maxClientCnxns=60
# New in 3.3.0:
# Minimum session timeout; defaults to 2 * tickTime.
# minSessionTimeout
# Maximum session timeout; defaults to 20 * tickTime.
# maxSessionTimeout
# New in 3.4.0:
# The number of snapshots to retain in dataDir
# autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
# autopurge.purgeInterval=1
# ----------------------------------------------------------------------
# Ensemble configuration
# ----------------------------------------------------------------------
# The number of ticks that the initial
# synchronization phase can take
# Time allowed for followers to connect and sync to the leader, in
# ticks: initLimit * tickTime ms in total (here 10 * 2000 ms = 20 s).
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
# Time allowed between a request and its acknowledgement when the leader
# and a follower exchange messages, in ticks: syncLimit * tickTime ms in
# total (here 5 * 2000 ms = 10 s).
syncLimit=5
# A: a positive integer, the server's id
# B: the server's IP address or hostname
# C: the port for communication between ZooKeeper servers
# D: the port used for leader election
# server.A=B:C:D
server.1=localhost:2287:3387
server.2=localhost:2288:3388
server.3=localhost:2289:3389
zoo2.cfg is identical to zoo1.cfg except for the following settings:
clientPort=2182
dataDir=../data2
dataLogDir=../log2
zoo3.cfg is identical to zoo1.cfg except for the following settings:
clientPort=2183
dataDir=../data3
dataLogDir=../log3
Pay attention to the clientPort, dataDir and dataLogDir settings: each ZooKeeper instance needs its own values.
At this point the six folders data1, data2, data3, log1, log2 and log3 must be created by hand.
3. In each data directory, create a myid file containing 1, 2 or 3 respectively, matching the x in the corresponding server.x entry; this is how each instance learns its own server id. A sketch of this step and the previous one follows.
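A minimal command-prompt sketch, assuming %ZK_HOME% is C:\solrCloud\zk_server_fake. The parentheses around echo avoid two cmd quirks: `echo 1> file` would be parsed as a redirect of stream 1, and `echo 1 > file` would write a trailing space into myid:

cd /d C:\solrCloud\zk_server_fake
mkdir data1 data2 data3 log1 log2 log3
(echo 1)> data1\myid
(echo 2)> data2\myid
(echo 3)> data3\myid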
4. Then copy %ZK_HOME%/bin/zkServer.cmd three times as zkServer1.cmd, zkServer2.cmd and zkServer3.cmd to simulate three ZooKeeper servers. Each copy must point at its own configuration file from step 2 by setting ZOOCFG, i.e. set ZOOCFG=..\conf\zooX.cfg, where X is the matching server number; a sketch of one such copy follows.
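Roughly what zkServer1.cmd ends up looking like, based on the 3.4.x zkServer.cmd (a sketch rather than a verbatim copy; the surrounding lines vary slightly between releases). The override must come after the call to zkEnv.cmd, which otherwise sets ZOOCFG to the default zoo.cfg:

@echo off
setlocal
call "%~dp0zkEnv.cmd"
:: override the config chosen by zkEnv.cmd for this instance
set ZOOCFG=..\conf\zoo1.cfg
set ZOOMAIN=org.apache.zookeeper.server.quorum.QuorumPeerMain
call %JAVA% "-Dzookeeper.log.dir=%ZOO_LOG_DIR%" "-Dzookeeper.root.logger=%ZOO_LOG4J_PROP%" ^
     -cp "%CLASSPATH%" %ZOOMAIN% "%ZOOCFG%" %*
endlocal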
5. Finally, start the three ZooKeeper servers.
First start zkServer1.cmd:
C:\solrCloud\zk_server_fake\bin>zkServer1.cmd
C:\solrCloud\zk_server_fake\bin>call "C:\Program Files\Java\jdk1.8.0_162"\bin\java "-Dzookeeper.log.dir=C:\solrCloud\zk_server_fake\bin\.." "-Dzookeeper.root.logger=INFO,CONSOLE" -cp "C:\solrCloud\zk_server_fake\bin\..\build\classes;C:\solrCloud\zk_server_fake\bin\..\build\lib\*;C:\solrCloud\zk_server_fake\bin\..\*;C:\solrCloud\zk_server_fake\bin\..\lib\*;C:\solrCloud\zk_server_fake\bin\..\conf" org.apache.zookeeper.server.quorum.QuorumPeerMain "..\conf\zoo1.cfg"
2018-04-11 11:46:40,470 [myid:] - INFO [main:QuorumPeerConfig@136] - Reading configuration from: ..\conf\zoo1.cfg
2018-04-11 11:46:40,489 [myid:] - INFO [main:QuorumPeer$QuorumServer@184] - Resolved hostname: localhost to address: localhost/127.0.0.1
2018-04-11 11:46:40,489 [myid:] - INFO [main:QuorumPeer$QuorumServer@184] - Resolved hostname: localhost to address: localhost/127.0.0.1
2018-04-11 11:46:40,491 [myid:] - INFO [main:QuorumPeer$QuorumServer@184] - Resolved hostname: localhost to address: localhost/127.0.0.1
2018-04-11 11:46:40,491 [myid:] - INFO [main:QuorumPeerConfig@398] - Defaulting to majority quorums
2018-04-11 11:46:40,503 [myid:1] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2018-04-11 11:46:40,503 [myid:1] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2018-04-11 11:46:40,503 [myid:1] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2018-04-11 11:46:40,560 [myid:1] - INFO [main:QuorumPeerMain@130] - Starting quorum peer
2018-04-11 11:46:40,746 [myid:1] - INFO [main:ServerCnxnFactory@117] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2018-04-11 11:46:40,747 [myid:1] - INFO [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:2181
2018-04-11 11:46:40,753 [myid:1] - INFO [main:QuorumPeer@1158] - tickTime set to 2000
2018-04-11 11:46:40,753 [myid:1] - INFO [main:QuorumPeer@1204] - initLimit set to 10
2018-04-11 11:46:40,753 [myid:1] - INFO [main:QuorumPeer@1178] - minSessionTimeout set to -1
2018-04-11 11:46:40,753 [myid:1] - INFO [main:QuorumPeer@1189] - maxSessionTimeout set to -1
2018-04-11 11:46:40,760 [myid:1] - INFO [main:QuorumPeer@1467] - QuorumPeer communication is not secured!
2018-04-11 11:46:40,761 [myid:1] - INFO [main:QuorumPeer@1496] - quorum.cnxn.threads.size set to 20
2018-04-11 11:46:40,764 [myid:1] - INFO [main:QuorumPeer@668] - currentEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2018-04-11 11:46:40,771 [myid:1] - INFO [main:QuorumPeer@683] - acceptedEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2018-04-11 11:46:40,781 [myid:1] - INFO [ListenerThread:QuorumCnxManager$Listener@736] - My election bind port: localhost/127.0.0.1:3387
2018-04-11 11:46:40,789 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@909] - LOOKING
2018-04-11 11:46:40,790 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@820] - New election. My id = 1, proposed zxid=0x0
2018-04-11 11:46:40,792 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@602] - Notification: 1 (message format version), 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
At this point it warns that it cannot open channels to servers 2 and 3, because those two servers have not been started yet:
2018-04-11 11:48:16,324 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@584] - Cannot open channel to 2 at election address localhost/127.0.0.1:3388
java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:610)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:845)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:957)
2018-04-11 11:48:32,332 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@184] - Resolved hostname: localhost to address: localhost/127.0.0.1
2018-04-11 11:48:33,338 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@584] - Cannot open channel to 3 at election address localhost/127.0.0.1:3389
java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:610)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:845)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:957)
2018-04-11 11:48:33,338 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@184] - Resolved hostname: localhost to address: localhost/127.0.0.1
2018-04-11 11:48:33,340 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@854] - Notification time out: 51200
Next start zkServer2.cmd. zkServer1 still warns that it cannot reach server 3, but now reports a connection to server 2; zkServer2 likewise warns that server 3 is unreachable, then holds a leader election with server 1, producing one leader and one follower.
Finally, start zkServer3.cmd. The warnings stop; since servers 1 and 2 already form a quorum with an elected leader, server 3 simply joins it, leaving one leader and two followers.
Note that on Windows zkServer.cmd cannot report a server's status. One option is to install Cygwin and run zkServer.sh status against each configuration file; alternatively, the four-letter commands shown earlier work just as well, as sketched below.
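A sketch using the nc setup from above: each instance's reply to the srvr command includes a "Mode:" line showing whether that server is the leader or a follower (the surrounding counters and zxid will differ on your machine).

echo srvr | nc localhost 2181
echo srvr | nc localhost 2182
echo srvr | nc localhost 2183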
Cluster Mode
The configuration is the same as for the pseudo-cluster; the only difference is that the configuration files (zoo.cfg) are deployed to different physical servers, as sketched below.
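For example (the addresses are hypothetical), all three machines could share one zoo.cfg whose server entries point at real hosts instead of localhost, with each machine keeping its own id in its local myid file:

server.1=192.168.1.101:2888:3888
server.2=192.168.1.102:2888:3888
server.3=192.168.1.103:2888:3888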
by. Memento