下图显示了应用程序查询三节点副本集的典型环境。node
在正常操做期间,副本集只有一个节点做为PRIMARY而全部其余节点都是SECONDARY。PRIMARY成员是惟一接收写入的成员。它更新其本地集合和文档以及oplog。而后,oplog事件经过复制通道发送到全部SECONDARY节点。每一个SECONDARY节点在本地和异步上应用对本地数据和oplog的相同修改。算法
下图在内部显示了副本集的工做原理。每一个节点都链接到全部其余节点,而且有一个心跳机制来ping任何其余节点。心跳具备用于ping节点的可配置时间,默认值为10秒。sql
若是全部节点都响应心跳确认,则群集继续工做。若是其中一个节点崩溃,例如PRIMARY(最坏的状况),则发生涉及剩余节点的选举。mongodb
当SECONDARY在配置的超时后没有收到对心跳的响应时,它会要求进行选举。仍然存活的节点投票支持新的PRIMARY。选举阶段一般不须要很长时间,选举算法足够复杂,让他们选择最佳的次要成为新的主要。让咱们说这是次要的,与死亡的初级主要是最新的。shell
除了主要崩溃以外,还有一些节点要求进行选举的状况:将节点添加到副本集时,在“启动副本集”期间或在某些维护活动期间。这种选举不是本文的目的。数据库
副本集在选举成功完成以前没法处理写入操做,但若是将此类查询配置为在辅助节点上运行,则能够继续提供读取查询(稍后咱们将对此进行讨论)。选举正确完成后,群集将恢复正常操做。bash
要正常工做,副本集须要具备奇数个成员。在网络分裂的状况下,只有奇数个成员确保咱们在其中一个子集中拥有大多数投票。在具备大多数节点的子集中选择新的PRIMARY。服务器
所以,三个是副本集的最小节点数,以确保高可用性。网络
因为每一个节点都须要拥有完整的数据副本,所以若是您拥有庞大的数据库,则须要提供至少三台具备大量磁盘,内存和CPU资源的计算机。这可能很昂贵。app
幸运的是,您能够将其中一个节点配置为Arbiter,这是一个不复制数据的特殊成员。它是空的,但它能够在选举期间投票。
使用仲裁节点是维持奇数成员的一个很好的解决方案,而不须要花费不少钱来使第三个节点像其余节点同样强大。仲裁节点不能被选为新主节点,由于它没有数据。
ip | 主机名 | 系统 |
---|---|---|
172.18.11.142 | nodejs1 | Centos 7.6 |
172.18.11.143 | nodejs2 | Centos 7.6 |
172.18.11.144 | nodejs3 | Centos 7.6 |
# 下载安装包并安装 mkdir -p /opt/mongodb/ cat <<EOF > /opt/mongodb/mongodb_down.sh cd /opt/mongodb/ wget https://www.percona.com/downloads/percona-server-mongodb-LATEST/percona-server-mongodb-4.0.9-4/binary/redhat/7/x86_64/percona-server-mongodb-shell-4.0.9-4.el7.x86_64.rpm wget https://www.percona.com/downloads/percona-server-mongodb-LATEST/percona-server-mongodb-4.0.9-4/binary/redhat/7/x86_64/percona-server-mongodb-mongos-4.0.9-4.el7.x86_64.rpm wget https://www.percona.com/downloads/percona-server-mongodb-LATEST/percona-server-mongodb-4.0.9-4/binary/redhat/7/x86_64/percona-server-mongodb-tools-4.0.9-4.el7.x86_64.rpm wget https://www.percona.com/downloads/percona-server-mongodb-LATEST/percona-server-mongodb-4.0.9-4/binary/redhat/7/x86_64/percona-server-mongodb-server-4.0.9-4.el7.x86_64.rpm wget https://www.percona.com/downloads/percona-server-mongodb-LATEST/percona-server-mongodb-4.0.9-4/binary/redhat/7/x86_64/percona-server-mongodb-4.0.9-4.el7.x86_64.rpm EOF bash -x /opt/mongodb/mongodb_down.sh yum localinstall /opt/mongodb/*.rpm -y systemctl enable mongod
主机名
cat <<EOF >> /etc/hosts 192.168.0.249 k8s-m1 192.168.0.250 k8s-n1 192.168.0.251 k8s-n2 EOF
配置文件
cat <<EOF > /etc/mongod.conf storage: dbPath: /var/lib/mongo journal: enabled: true systemLog: destination: file logAppend: true path: /var/log/mongo/mongod.log processManagement: fork: true pidFilePath: /var/run/mongod.pid net: port: 27017 bindIp: 0.0.0.0 EOF
启动
systemctl start mongod
副本集的名称
副本集的名称这里使用 rs-smy,将副本集名称放置于每一台主机的 /etc/mongod.conf
replication: replSetName: "rs-smy"
重启全部服务器
systemctl restart mongod
初始化集群
随便链接到一个节点,发出 rs.initiate() 让副本集直到有哪些成员
rs.initiate( { _id: "rs-smy", members: [ { _id: 0, host: "172.18.11.142:27017" }, { _id: 1, host: "172.18.11.143:27017" }, { _id: 2, host: "172.18.11.144:27017" } ] })
发出命令后,MongoDB使用默认配置启动复制过程。选择PRIMARY节点,如今将建立的全部文档将在SECONDARY节点上异步复制。
咱们能够经过查看mongo shell提示符来验证复制是否正常。一旦副本集启动并运行,提示应该在PRIMARY节点上以下:
rs-smy:PRIMARY>
在SECONDARY节点上这样:
rs-test:SECONDARY>
有几个命令能够调查并在副本集上执行一些管理任务。这里有几个。
要调查副本集配置,您能够在任何节点上发出rs.conf()
rs-smy:SECONDARY> rs.conf() { "_id" : "rs-smy", "version" : 1, "protocolVersion" : NumberLong(1), "writeConcernMajorityJournalDefault" : true, "members" : [ { "_id" : 0, "host" : "172.18.11.142:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 1, "host" : "172.18.11.143:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 }, { "_id" : 2, "host" : "172.18.11.144:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 } ], "settings" : { "chainingAllowed" : true, "heartbeatIntervalMillis" : 2000, "heartbeatTimeoutSecs" : 10, "electionTimeoutMillis" : 10000, "catchUpTimeoutMillis" : -1, "catchUpTakeoverDelayMillis" : 30000, "getLastErrorModes" : { }, "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 }, "replicaSetId" : ObjectId("5ce54845de4f73c1e97ea042") } }
咱们能够看到有关已配置节点的信息,不管是仲裁仍是隐藏,优先级以及有关心跳过程的其余详细信息。
要调查副本集状态,您能够在任何节点上发出rs.status()
s-smy:SECONDARY> rs.status() { "set" : "rs-smy", "date" : ISODate("2019-05-23T10:12:44.663Z"), "myState" : 2, "term" : NumberLong(7), "syncingTo" : "172.18.11.144:27017", "syncSourceHost" : "172.18.11.144:27017", "syncSourceId" : 2, "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) }, "readConcernMajorityOpTime" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) }, "appliedOpTime" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) }, "durableOpTime" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) } }, "lastStableCheckpointTimestamp" : Timestamp(1558606350, 1), "members" : [ { "_id" : 0, "name" : "172.18.11.142:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 24733, "optime" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) }, "optimeDate" : ISODate("2019-05-23T10:12:40Z"), "syncingTo" : "172.18.11.144:27017", "syncSourceHost" : "172.18.11.144:27017", "syncSourceId" : 2, "infoMessage" : "", "configVersion" : 1, "self" : true, "lastHeartbeatMessage" : "" }, { "_id" : 1, "name" : "172.18.11.143:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 24523, "optime" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) }, "optimeDurable" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) }, "optimeDate" : ISODate("2019-05-23T10:12:40Z"), "optimeDurableDate" : ISODate("2019-05-23T10:12:40Z"), "lastHeartbeat" : ISODate("2019-05-23T10:12:44.463Z"), "lastHeartbeatRecv" : ISODate("2019-05-23T10:12:44.448Z"), "pingMs" : NumberLong(0), "lastHeartbeatMessage" : "", "syncingTo" : "172.18.11.144:27017", "syncSourceHost" : "172.18.11.144:27017", "syncSourceId" : 2, "infoMessage" : "", "configVersion" : 1 }, { "_id" : 2, "name" : "172.18.11.144:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 24519, "optime" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) }, "optimeDurable" : { "ts" : Timestamp(1558606360, 1), "t" : NumberLong(7) }, "optimeDate" : ISODate("2019-05-23T10:12:40Z"), "optimeDurableDate" : ISODate("2019-05-23T10:12:40Z"), "lastHeartbeat" : ISODate("2019-05-23T10:12:44.464Z"), "lastHeartbeatRecv" : ISODate("2019-05-23T10:12:44.433Z"), "pingMs" : NumberLong(0), "lastHeartbeatMessage" : "", "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "", "electionTime" : Timestamp(1558594808, 1), "electionDate" : ISODate("2019-05-23T07:00:08Z"), "configVersion" : 1 } ], "ok" : 1, "operationTime" : Timestamp(1558606360, 1), "$clusterTime" : { "clusterTime" : Timestamp(1558606360, 1), "signature" : { "hash" : BinData(0,"eo6zfAdAzzCTw/rQj+OWbd7Vots="), "keyId" : NumberLong("6693835933885661186") } } }
能够看到哪一个是PRIMARY,哪一个是SECONDARY
链接到PRIMARY节点并建立例子:
rs-smy:PRIMARY> use test switched to db test rs-test:PRIMARY> db.foo.insert( {name:"Bruce", surname:"Dickinson"} ) WriteResult({ "nInserted" : 1 }) rs-smy:PRIMARY> db.foo.find().pretty() { "_id" : ObjectId("5ae05ac27e6680071caf94b7") "name" : "Bruce" "surname" : "Dickinson" }
而后链接到SECONDARY节点并查找相同的文档。
请记住,您没法链接到SECONDARY节点以读取数据。默认状况下,只容许在PRIMARY上进行读写操做。所以,若是要读取SECONDARY节点上的数据,首先须要发出rs.slaveOK()命令。若是您不这样作,您将收到错误。
rs-test:SECONDARY> rs.slaveOK() rs-test:SECONDARY> show collections local <strong>foo</strong> rs-test:SECONDARY> db.foo.find().pretty() { "_id" : ObjectId("5ae05ac27e6680071caf94b7") "name" : "Bruce" "surname" : "Dickinson" }
SECONDARY节点已经复制了集合foo和插入文档的建立。