原理很简单一个primary,secondary至少是一个,也能够是多个secondary,除了多个secondary以外,还能够加一个Arbiter,Arbiter叫作仲裁,当Primary宕机后,Arbiter能够很准确的告知Primary宕掉了,但可能Primary认为本身没有宕掉,这样的话就会出现脑裂,为了防止脑裂就增长了Arbiter这个角色,尤为是数据库坚定不能出现脑裂的状态,脑裂会致使数据会紊乱,数据一旦紊乱恢复就很是麻烦.linux
说明:Primary宕机后,其中secondary就成为一个新的Primary,另一个secondary依然是secondary的角色. 对于MySQL主历来讲,即便作一主多从,万一master宕机后,可让从成为新的主,但这过程是须要手动的更改的. 可是在MongoDB副本集架构当中呢,它彻底都是自动的,rimary宕机后,其中secondary就成为一个新的Primary,另一个secondary能够自动识别新的primary.mongodb
准备三台机器: 192.168.2.115 (primary)
192.168.2.116 (secondary)
192.168.2.117 (secondary)
三台机器都须要安装MongoDB,primary已安装过,两台secondary须要安装,因步骤同样,在此不作演示.shell
Primary机器: [root@root-01 ~]# vim /etc/mongod.conf net: port: 27017 bindIp: 127.0.0.1,192.168.2.115 # Listen to local interface only, comment to listen on all interfaces. 说明:作副本集bindIp 要监听本机IP和内网IP #replication: //把#去掉,并增两行 replication: oplogSizeMB: 20 replSetName: annalinux //定义副本集的名字 重启MongoDB服务: [root@root-01 ~]# systemctl restart mongod [root@root-01 ~]# ps aux |grep mongod mongod 39020 7.3 4.8 1016828 48880 ? Sl 17:18 0:00 /usr/bin/mongod -f /etc/mongod.conf root 39050 0.0 0.0 112664 964 pts/0 S+ 17:19 0:00 grep --color=auto mongod Secondary1机器: [root@root-02 ~]# vim /etc/mongod.conf net: port: 27017 bindIp: 127.0.0.1,192.168.2.116 # Listen to local interface only, comment to listen on all interfaces. 说明:作副本集bindIp 要监听本机IP和内网IP #replication: //把#去掉,并增两行 replication: oplogSizeMB: 20 replSetName: annalinux //定义副本集的名字 重启MongoDB服务: [root@root-02 ~]# systemctl restart mongod [root@root-02 ~]# ps aux |grep mongod mongod 2303 2.2 4.6 1016336 46404 ? Sl 17:28 0:00 /usr/bin/mongod -f /etc/mongod.conf root 2331 0.0 0.0 112664 968 pts/0 S+ 17:28 0:00 grep --color=auto mongod Secondary2机器: [root@root-03 ~]# vim /etc/mongod.conf net: port: 27017 bindIp: 127.0.0.1,192.168.2.117 # Listen to local interface only, comment to listen on all interfaces. 说明:作副本集bindIp 要监听本机IP和内网IP #replication: //把#去掉,并增两行 replication: oplogSizeMB: 20 replSetName: annalinux //定义副本集的名字 重启MongoDB服务: [root@root-03 ~]# systemctl restart mongod [root@root-03 ~]# ps aux |grep mongod mongod 2735 3.5 4.3 1016340 43932 ? Sl 17:46 0:00 /usr/bin/mongod -f /etc/mongod.conf root 2761 0.0 0.0 112664 968 pts/0 R+ 17:47 0:00 grep --color=auto mongod
[root@root-01 ~]# mongo MongoDB shell version v3.4.9 connecting to: mongodb://127.0.0.1:27017 MongoDB server version: 3.4.9 Server has startup warnings: 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database. 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted. 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never' 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never' 2017-10-20T17:18:53.740+0800 I CONTROL [initandlisten]
说明:在哪台机器上执行这一步,那么哪台机器就会成为primary数据库
> config={_id:"annalinux",members:[{_id:0,host:"192.168.2.115:27017"},{_id:1,host:"192.168.2.116:27017"},{_id:2,host:"192.168.2.117:27017"}]} { "_id" : "annalinux", "members" : [ { "_id" : 0, "host" : "192.168.2.115:27017" }, { "_id" : 1, "host" : "192.168.2.116:27017" }, { "_id" : 2, "host" : "192.168.2.117:27017" } ] } config={_id:"annalinux" --> annalinux(副本集的名字) members --> 指定成员
> rs.initiate(config) { "ok" : 1 }
说明:能够看到192.168.2.115 显示:"stateStr" : "PRIMARY"
192.168.2.116和192.168.2.117 分别显示: "stateStr" : "SECONDARY"vim
annalinux:SECONDARY> rs.status() { "set" : "annalinux", "date" : ISODate("2017-10-20T10:02:15.484Z"), "myState" : 1, "term" : NumberLong(1), "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1508493732, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1508493732, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1508493732, 1), "t" : NumberLong(1) } }, "members" : [ { "_id" : 0, "name" : "192.168.2.115:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 2602, "optime" : { "ts" : Timestamp(1508493732, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2017-10-20T10:02:12Z"), "infoMessage" : "could not find member to sync from", "electionTime" : Timestamp(1508493631, 1), "electionDate" : ISODate("2017-10-20T10:00:31Z"), "configVersion" : 1, "self" : true }, { "_id" : 1, "name" : "192.168.2.116:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 115, "optime" : { "ts" : Timestamp(1508493732, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1508493732, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2017-10-20T10:02:12Z"), "optimeDurableDate" : ISODate("2017-10-20T10:02:12Z"), "lastHeartbeat" : ISODate("2017-10-20T10:02:13.519Z"), "lastHeartbeatRecv" : ISODate("2017-10-20T10:02:13.875Z"), "pingMs" : NumberLong(0), "syncingTo" : "192.168.2.117:27017", "configVersion" : 1 }, { "_id" : 2, "name" : "192.168.2.117:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 115, "optime" : { "ts" : Timestamp(1508493732, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1508493732, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2017-10-20T10:02:12Z"), "optimeDurableDate" : ISODate("2017-10-20T10:02:12Z"), "lastHeartbeat" : ISODate("2017-10-20T10:02:13.519Z"), "lastHeartbeatRecv" : ISODate("2017-10-20T10:02:13.881Z"), "pingMs" : NumberLong(0), "syncingTo" : "192.168.2.115:27017", "configVersion" : 1 } ], "ok" : 1 } annalinux:PRIMARY>
切换到admin库: annalinux:PRIMARY> use admin switched to db admin 建立mydb库: annalinux:PRIMARY> use mydb switched to db mydb 建立acc集合,并插入一些数据: annalinux:PRIMARY> db.acc.insert({AccountID:1,USerName:"123",password:"123456"}) WriteResult({ "nInserted" : 1 }) 查看全部库: annalinux:PRIMARY> show dbs admin 0.000GB db1 0.000GB local 0.000GB mydb 0.000GB test 0.000GB 切换到mydb库: annalinux:PRIMARY> show dbs admin 0.000GB db1 0.000GB local 0.000GB mydb 0.000GB test 0.000GB 进入mydb库: annalinux:PRIMARY> use mydb switched to db mydb 查看集合: annalinux:PRIMARY> show tables acc
说明: 和primary机器数据一致,说明搭建副本集成功了bash
登陆: [root@root-02 ~]# mongo MongoDB shell version v3.4.9 connecting to: mongodb://127.0.0.1:27017 MongoDB server version: 3.4.9 Welcome to the MongoDB shell. For interactive help, type "help". For more comprehensive documentation, see http://docs.mongodb.org/ Questions? Try the support group http://groups.google.com/group/mongodb-user Server has startup warnings: 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database. 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted. 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never' 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never' 2017-10-20T17:28:02.614+0800 I CONTROL [initandlisten] 查看全部库: annalinux:SECONDARY> show dbs; 2017-10-20T18:18:50.600+0800 E QUERY [thread1] Error: listDatabases failed:{ "ok" : 0, "errmsg" : "not master and slaveOk=false", //提示不是master,slaveOK=false "code" : 13435, "codeName" : "NotMasterNoSlaveOk" } : _getErrorWithCode@src/mongo/shell/utils.js:25:13 Mongo.prototype.getDBs@src/mongo/shell/mongo.js:62:1 shellHelper.show@src/mongo/shell/utils.js:769:19 shellHelper@src/mongo/shell/utils.js:659:15 @(shellhelp2):1:1 须要执行: annalinux:SECONDARY> rs.slaveOk() 再show dbs就没问题: annalinux:SECONDARY> show dbs; admin 0.000GB db1 0.000GB local 0.000GB mydb 0.000GB test 0.000GB 进入mydb库: annalinux:SECONDARY> use mydb switched to db mydb 查看集合: annalinux:SECONDARY> show tables acc
说明: 和primary机器数据一致,说明搭建副本集成功了架构
[root@root-03 ~]# mongo MongoDB shell version v3.4.9 connecting to: mongodb://127.0.0.1:27017 MongoDB server version: 3.4.9 Welcome to the MongoDB shell. For interactive help, type "help". For more comprehensive documentation, see http://docs.mongodb.org/ Questions? Try the support group http://groups.google.com/group/mongodb-user Server has startup warnings: 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database. 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted. 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never' 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] 2017-10-20T17:46:53.757+0800 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. 2017-10-20T17:46:53.758+0800 I CONTROL [initandlisten] ** We suggest setting it to 'never' 2017-10-20T17:46:53.758+0800 I CONTROL [initandlisten] 查看全部库: annalinux:SECONDARY> show dbs; 2017-10-20T18:18:50.600+0800 E QUERY [thread1] Error: listDatabases failed:{ "ok" : 0, "errmsg" : "not master and slaveOk=false", //提示不是master,slaveOK=false "code" : 13435, "codeName" : "NotMasterNoSlaveOk" } : _getErrorWithCode@src/mongo/shell/utils.js:25:13 Mongo.prototype.getDBs@src/mongo/shell/mongo.js:62:1 shellHelper.show@src/mongo/shell/utils.js:769:19 shellHelper@src/mongo/shell/utils.js:659:15 @(shellhelp2):1:1 须要执行: annalinux:SECONDARY> rs.slaveOk() 再show dbs就没问题: annalinux:SECONDARY> show dbs; admin 0.000GB db1 0.000GB local 0.000GB mydb 0.000GB test 0.000GB 进入mydb库: annalinux:SECONDARY> use mydb switched to db mydb 查看集合: annalinux:SECONDARY> show tables acc
说明:在primary机器添加防火墙规则,伪装primary宕机,看看是如何变成一个新的primary。app
说明:能够看到三个节点都是"priority" : 1, 也就是说三个节点权重都是同样负载均衡
annalinux:PRIMARY> rs.config() { "_id" : "annalinux", "version" : 1, "protocolVersion" : NumberLong(1), "members" : [ { "_id" : 0, "host" : "192.168.2.115:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 1, "host" : "192.168.2.116:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 2, "host" : "192.168.2.117:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 } ], "settings" : { "chainingAllowed" : true, "heartbeatIntervalMillis" : 2000, "heartbeatTimeoutSecs" : 10, "electionTimeoutMillis" : 10000, "catchUpTimeoutMillis" : 60000, "getLastErrorModes" : { }, "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 }, "replicaSetId" : ObjectId("59e9c9342025e71e7bcebc7b") } }
[root@root-01 ~]# iptables -I INPUT -p tcp --dport 27017 -j DROP
说明: 本来192.168.2.115是primary,如今192.168.2.117变成了primarytcp
查看副本集状态: annalinux:SECONDARY> rs.status() { "set" : "annalinux", "date" : ISODate("2017-10-20T10:40:27.907Z"), "myState" : 2, "term" : NumberLong(2), "syncingTo" : "192.168.2.117:27017", "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1508496022, 1), "t" : NumberLong(2) }, "appliedOpTime" : { "ts" : Timestamp(1508496022, 1), "t" : NumberLong(2) }, "durableOpTime" : { "ts" : Timestamp(1508496022, 1), "t" : NumberLong(2) } }, "members" : [ { "_id" : 0, "name" : "192.168.2.115:27017", "health" : 0, "state" : 8, "stateStr" : "(not reachable/healthy)", //显示不能获取状态值 "uptime" : 0, "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) }, "optimeDurable" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) }, "optimeDate" : ISODate("1970-01-01T00:00:00Z"), "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"), "lastHeartbeat" : ISODate("2017-10-20T10:40:26.805Z"), "lastHeartbeatRecv" : ISODate("2017-10-20T10:40:27.796Z"), "pingMs" : NumberLong(0), "lastHeartbeatMessage" : "Couldn't get a connection within the time limit", "configVersion" : -1 }, { "_id" : 1, "name" : "192.168.2.116:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 4345, "optime" : { "ts" : Timestamp(1508496022, 1), "t" : NumberLong(2) }, "optimeDate" : ISODate("2017-10-20T10:40:22Z"), "syncingTo" : "192.168.2.117:27017", "configVersion" : 1, "self" : true }, { "_id" : 2, "name" : "192.168.2.117:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 2405, "optime" : { "ts" : Timestamp(1508496022, 1), "t" : NumberLong(2) }, "optimeDurable" : { "ts" : Timestamp(1508496022, 1), "t" : NumberLong(2) }, "optimeDate" : ISODate("2017-10-20T10:40:22Z"), "optimeDurableDate" : ISODate("2017-10-20T10:40:22Z"), "lastHeartbeat" : ISODate("2017-10-20T10:40:26.709Z"), "lastHeartbeatRecv" : ISODate("2017-10-20T10:40:26.089Z"), "pingMs" : NumberLong(0), "electionTime" : Timestamp(1508495934, 1), "electionDate" : ISODate("2017-10-20T10:38:54Z"), "configVersion" : 1 } ], "ok" : 1 }
说明:把在原来的primary机器设置的iptables规则删除掉,它也不能变会primary,除非设置了权重
[root@root-01 ~]# iptables -D INPUT -p tcp --dport 27017 -j DROP
设置变量: annalinux:PRIMARY> cfg=rs.config() { "_id" : "annalinux", "version" : 1, "protocolVersion" : NumberLong(1), "members" : [ { "_id" : 0, "host" : "192.168.2.115:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 1, "host" : "192.168.2.116:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 2, "host" : "192.168.2.117:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 } ], "settings" : { "chainingAllowed" : true, "heartbeatIntervalMillis" : 2000, "heartbeatTimeoutSecs" : 10, "electionTimeoutMillis" : 10000, "catchUpTimeoutMillis" : 60000, "getLastErrorModes" : { }, "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 }, "replicaSetId" : ObjectId("59e9c9342025e71e7bcebc7b") } } 设置权重: annalinux:PRIMARY> cfg.members[0].priority=3 3 annalinux:PRIMARY> cfg.members[1].priority=2 2 annalinux:PRIMARY> cfg.members[2].priority=1 1 让设置的权重生效: annalinux:PRIMARY> rs.reconfig(cfg) { "ok" : 1 } 查看权重: annalinux:PRIMARY> rs.config() 2017-10-20T18:54:07.529+0800 E QUERY [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetGetConfig' on host '127.0.0.1:27017' : DB.prototype.runCommand@src/mongo/shell/db.js:132:1 DB.prototype.adminCommand@src/mongo/shell/db.js:150:16 rs.conf@src/mongo/shell/utils.js:1271:16 @(shell):1:1 2017-10-20T18:54:07.533+0800 I NETWORK [thread1] trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed 2017-10-20T18:54:07.534+0800 I NETWORK [thread1] reconnect 127.0.0.1:27017 (127.0.0.1) ok annalinux:SECONDARY> 说明: 提示这台机器不是primary,已经发生了改变,这台机器变成了Secondary了 从新执行rs.config(): annalinux:SECONDARY> rs.config() { "_id" : "annalinux", "version" : 2, "protocolVersion" : NumberLong(1), "members" : [ { "_id" : 0, "host" : "192.168.2.115:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 3, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 1, "host" : "192.168.2.116:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 2, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 2, "host" : "192.168.2.117:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 } ], "settings" : { "chainingAllowed" : true, "heartbeatIntervalMillis" : 2000, "heartbeatTimeoutSecs" : 10, "electionTimeoutMillis" : 10000, "catchUpTimeoutMillis" : 60000, "getLastErrorModes" : { }, "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 }, "replicaSetId" : ObjectId("59e9c9342025e71e7bcebc7b") } } 说明:能够看到192.168.2.115的权重显示: "priority" : 3 192.168.2.116的权重显示: "priority" : 2 192.168.2.117的权重显示: "priority" : 1