The MySQL Group Replication (MGR) framework gives MySQL automatic primary/secondary failover and fault-recovery capability. This article tests how the DB system recovers after the master crashes in single-primary mode.
Role | IP | port | server-id |
---|---|---|---|
DB-1 | 192.110.103.41 | 3106 | 103413106 |
DB-2 | 192.110.103.42 | 3106 | 103423106 |
DB-3 | 192.110.103.43 | 3106 | 103433106 |
Note: port 3306 was already in use on these machines, so port 3106 is used instead.
For the installation steps, see the previous article: MySQL 5.7 MGR installation and introduction.
MGR manages membership on the basis of group views (Group View, "view" for short). A view is the membership state of the group over a period of time: if no member joins or leaves during that period, the whole continuous interval is one view; once a join or leave occurs, the view changes accordingly. MGR uses a View ID to track view changes and to order views in time.
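The current view can be observed from performance_schema; a minimal sketch (VIEW_ID is the column MySQL 5.7 exposes for this):

```sql
-- Each member reports the ID of the group view it currently belongs to.
SELECT MEMBER_ID, VIEW_ID
FROM performance_schema.replication_group_member_stats;
```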
MySQL Group Replication can be configured in one of two working modes: single-primary and multi-primary. Briefly: in single-primary mode exactly one member (the primary) accepts writes while the others are kept read-only via super_read_only; in multi-primary mode every member accepts writes, with update-everywhere conflict checks enabled.
MySQL Group Replication uses a Paxos-based distributed algorithm (XCom) for coordination between nodes. Because of this, a majority of the group's members must be online to reach quorum and agree on any decision.
A majority means N/2+1, where N is the current number of members in the group. For example, with 5 members in the group, 3 are needed to reach a majority. The number of node failures each group size can tolerate is shown in the table below (with a quick quorum-check sketch after it):
Group size | Majority | Tolerated failures |
---|---|---|
1 | 1 | 0 |
2 | 2 | 0 |
3 | 2 | 1 |
4 | 3 | 1 |
5 | 3 | 2 |
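As a quick arithmetic check, the majority requirement can be derived directly from the membership table; a minimal sketch:

```sql
-- Count the current members and compute the majority (N/2+1) needed for quorum.
SELECT COUNT(*) AS group_size,
       FLOOR(COUNT(*) / 2) + 1 AS majority_needed
FROM performance_schema.replication_group_members;
```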
[mysqld]
datadir=/data
socket=/data/mysql.sock
server-id=100 # required
gtid_mode=on # required
enforce_gtid_consistency=on # required
log-bin=/data/master-bin # required
binlog_format=row # required
binlog_checksum=none # required
master_info_repository=TABLE # required
relay_log_info_repository=TABLE # required
relay_log=/data/relay-log # required; the default value is used if not set
log_slave_updates=ON # required
sync-binlog=1 # recommended
log-error=/data/error.log
pid-file=/data/mysqld.pid
transaction_write_set_extraction=XXHASH64 # required
loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" # required; must be a valid UUID
loose-group_replication_start_on_boot=off # recommended: OFF
loose-group_replication_member_weight = 40 # optional; supported since MySQL 5.7.20
loose-group_replication_local_address = "192.110.103.41:31061" # required, as is the next line
loose-group_replication_group_seeds = "192.110.103.41:31061,192.110.103.42:31061,192.110.103.43:31061"
loose-group_replication_bootstrap_group = OFF
loose-group_replication_single_primary_mode = TRUE # TRUE = single-primary (the mode tested here); FALSE = multi-primary
loose-group_replication_enforce_update_everywhere_checks = OFF # OFF for single-primary; ON for multi-primary
A few notes on the options above: the loose- prefix lets mysqld start even if the group_replication plugin is not yet installed (the option is then simply ignored); group_replication_group_name must be a valid UUID shared by all members; group_replication_local_address and group_replication_group_seeds use the dedicated XCom communication port (31061 here), not the MySQL client port 3106; and transaction_write_set_extraction=XXHASH64 produces the write-set hashes that certification and conflict detection rely on.
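For reference (the full procedure is in the installation article linked above), the plugin itself must be installed once on each member before any of the statements below can work; a minimal sketch:

```sql
-- One-time step per server, after the my.cnf settings above are in place.
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
SHOW PLUGINS;  -- group_replication should be listed as ACTIVE
```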
The statements for operating group replication:
SELECT * FROM performance_schema.replication_group_members;
# Find the current primary
SELECT ta.* ,tb.MEMBER_HOST,tb.MEMBER_PORT,tb.MEMBER_STATE FROM performance_schema.global_status ta,performance_schema.replication_group_members tb WHERE ta.VARIABLE_NAME='group_replication_primary_member' and ta.VARIABLE_VALUE=tb.MEMBER_ID;
SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='group_replication_primary_member';
>SHOW STATUS LIKE 'group_replication_primary_member';
start group_replication;
stop group_replication;
To stop group replication on a single member, run the stop group_replication; statement on that node. Be careful: before executing it, make sure the node is no longer serving MySQL traffic; otherwise new data may still be written to it (for example, while the primary is being stopped) or stale data may be read from it. Consequently, the safest way to restart the entire group is: first stop the MySQL instances of all non-primary nodes (not merely their group replication), then stop the primary's MySQL instance; then restart the primary first, bootstrap the group on it and start its group replication; and finally rejoin each secondary node, as shown in the sketch below.
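A minimal sketch of that bootstrap-and-rejoin sequence, using the standard bootstrap flag:

```sql
-- On the primary only: bootstrap a fresh group.
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;  -- turn it back off at once, or another node could bootstrap a split group

-- On each remaining node afterwards:
START GROUP_REPLICATION;
```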
In group replication, a member can leave the group in two ways: voluntarily or involuntarily.
Voluntary leave:
Executing the stop group_replication; statement.
Involuntary leave:
Every departure other than the voluntary case above is involuntary, for example a node crash or a network outage.
1) When a node leaves involuntarily, the failure-detection mechanism notices the problem and reports it to the group. This triggers automatic reconfiguration of the group view, and a majority of the nodes must agree on the new view.
2) An involuntary leave does not change the group size: no matter how many members the group has, after a member drops out involuntarily a 5-node group is still of size 5; the departed members are simply no longer marked ONLINE.
3) An involuntary leave therefore costs quorum votes. When too many members leave involuntarily, the remaining nodes can no longer form a majority and the group becomes blocked.
4) For example: in a 5-node group, after node A involuntarily leaves, the group size is still 5, but A is marked unreachable (or some other non-online state) in the new view. If two more nodes then leave involuntarily, only 2 ONLINE nodes remain, a majority can no longer be reached, and the group blocks; see the sketch below for how to unblock it.
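If a group does become blocked, MySQL provides the group_replication_force_members variable to force a new membership consisting of the surviving nodes. A minimal sketch, using this article's XCom addresses; note that it overrides the quorum rule, so run it on only one surviving member, and only once the dead nodes are confirmed down:

```sql
-- Force the view down to the reachable members (XCom addresses, not client ports).
SET GLOBAL group_replication_force_members = '192.110.103.41:31061,192.110.103.43:31061';
-- After the group unblocks, clear the setting:
SET GLOBAL group_replication_force_members = '';
```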
The current state is single-primary mode, as shown below:
select * from performance_schema.global_variables where VARIABLE_NAME
in ('group_replication_single_primary_mode','group_replication_enforce_update_everywhere_checks');
+----------------------------------------------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+----------------------------------------------------+----------------+
| group_replication_enforce_update_everywhere_checks | OFF |
| group_replication_single_primary_mode | ON |
+----------------------------------------------------+----------------+
2 rows in set (0.00 sec)
select * from performance_schema.global_variables where VARIABLE_NAME like '%read_only';
+-----------------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+-----------------------+----------------+
| innodb_read_only | OFF |
| read_only | OFF |
| super_read_only | OFF |
+-----------------------+----------------+
> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_applier | 509810ee-f3d7-11e9-a7d5-a0369fac2de4 | 192.110.103.41 | 3106 | ONLINE |
| group_replication_applier | 53e462be-f3d7-11e9-9125-a0369fa6cce4 | 192.110.103.42 | 3106 | ONLINE |
| group_replication_applier | ee4a9cec-f3d5-11e9-9ded-a0369fa6cd30 | 192.110.103.43 | 3106 | ONLINE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
Current master: only single-primary mode requires looking up the primary node; multi-primary mode has no master/slave concept, so there is nothing to look up.
SELECT * FROM performance_schema.global_status WHERE VARIABLE_NAME='group_replication_primary_member';
+----------------------------------+--------------------------------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 509810ee-f3d7-11e9-a7d5-a0369fac2de4 |
+----------------------------------+--------------------------------------+
# Find the primary
SELECT ta.* ,tb.MEMBER_HOST,tb.MEMBER_PORT,tb.MEMBER_STATE FROM performance_schema.global_status ta,performance_schema.replication_group_members tb WHERE ta.VARIABLE_NAME='group_replication_primary_member' and ta.VARIABLE_VALUE=tb.MEMBER_ID;
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
| VARIABLE_NAME | VARIABLE_VALUE | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_primary_member | 509810ee-f3d7-11e9-a7d5-a0369fac2de4 | 192.110.103.41 | 3106 | ONLINE |
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
1 row in set (0.00 sec)
1) Gracefully stopping the master
Stop the master with the kill command (SIGTERM makes mysqld shut down cleanly).
# On the master machine, 192.110.103.41, kill mysqld.
$ps aux|grep 3106
mysql 122456 0.0 0.0 106252 1444 ? S Oct23 0:00 /bin/sh /usr/local/mysql-5.7.23/bin/mysqld_safe --defaults-file=/data1/mysql_3106/etc/my.cnf
mysql 123471 0.1 0.8 11575792 1069584 ? Sl Oct23 1:45 /usr/local/mysql-5.7.23/bin/mysqld --defaults-file=/data1/mysql_3106/etc/my.cnf --basedir=/usr/local/mysql-5.7.23 --datadir=/data1/mysql_3106/data --plugin-dir=/usr/local/mysql-5.7.23/lib/plugin --log-error=/data1/mysql_3106/logs/mysqld.err --open-files-limit=8192 --pid-file=/data1/mysql_3106/tmp/mysql.pid --socket=/data1/mysql_3106/tmp/mysql.sock --port=3106
$kill 122456 123471 ; tail -f /data1/mysql_3106/logs/mysqld.err
Logs:
The old master's log:
$tail -f /data1/mysql_3106/logs/mysqld.err
2019-10-24T03:10:32.746843Z 0 [Warning] /usr/local/mysql-5.7.23/bin/mysqld: Forcing close of thread 31 user: 'root'
2019-10-24T03:10:32.746873Z 0 [Note] Plugin group_replication reported: 'Plugin 'group_replication' is stopping.'
2019-10-24T03:10:32.746901Z 0 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2019-10-24T03:10:35.797258Z 0 [Note] Plugin group_replication reported: 'Group membership changed: This member has left the group.'
2019-10-24T03:10:40.799923Z 0 [Note] Plugin group_replication reported: 'auto_increment_increment is reset to 1'
2019-10-24T03:10:40.799954Z 0 [Note] Plugin group_replication reported: 'auto_increment_offset is reset to 1'
2019-10-24T03:10:40.800110Z 7 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2019-10-24T03:10:40.800431Z 7 [Note] Slave SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 65
2019-10-24T03:10:40.800652Z 4 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'
2019-10-24T03:10:40.800787Z 0 [Note] Plugin group_replication reported: 'Plugin 'group_replication' has been stopped.'
2019-10-24T03:10:40.800799Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2019-10-24T03:10:40.801278Z 0 [Note] Binlog end
2019-10-24T03:10:40.802272Z 0 [Note] Shutting down plugin 'group_replication'
2019-10-24T03:10:40.802322Z 0 [Note] Plugin group_replication reported: 'All Group Replication server observers have been successfully unregistered'
...
2019-10-24T03:10:42.804477Z 0 [Note] Shutting down plugin 'binlog'
2019-10-24T03:10:42.805238Z 0 [Note] /usr/local/mysql-5.7.23/bin/mysqld: Shutdown complete
2019-10-24T03:10:42.814933Z mysqld_safe mysqld from pid file /data1/mysql_3106/tmp/mysql.pid ended
The new master's log:
$tail -n 30 /data1/mysql_3106/logs/mysqld.err
2019-10-23T11:11:00.671705Z 0 [Note] Plugin group_replication reported: 'XCom protocol version: 3'
2019-10-23T11:11:00.671736Z 0 [Note] Plugin group_replication reported: 'XCom initialized and ready to accept incoming connections on port 31061'
2019-10-23T11:11:05.400823Z 2 [Note] Plugin group_replication reported: 'This server is working as secondary member with primary member address 192.110.103.41:3106.'
2019-10-23T11:11:05.401138Z 20 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'
2019-10-23T11:11:05.401143Z 0 [Note] Plugin group_replication reported: 'Group membership changed to 192.110.103.41:3106, 192.110.103.42:3106 on view 15718289704352993:2.'
2019-10-23T11:11:05.402757Z 20 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='192.110.103.41', master_port= 3106, master_log_file='', master_log_pos= 4, master_bind=''.
2019-10-23T11:11:05.404717Z 20 [Note] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor 509810ee-f3d7-11e9-a7d5-a0369fac2de4 at 192.110.103.41 port: 3106.'
2019-10-23T11:11:05.404998Z 22 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2019-10-23T11:11:05.406423Z 22 [Note] Slave I/O thread for channel 'group_replication_recovery': connected to master 'repl@192.110.103.41:3106',replication started in log 'FIRST' at position 4
2019-10-23T11:11:05.442349Z 23 [Note] Slave SQL thread for channel 'group_replication_recovery' initialized, starting replication in log 'FIRST' at position 0, relay log './mysql-relay-bin-group_replication_recovery.000001' position: 4
2019-10-23T11:11:05.461483Z 20 [Note] Plugin group_replication reported: 'Terminating existing group replication donor connection and purging the corresponding logs.'
2019-10-23T11:11:05.461910Z 23 [Note] Slave SQL thread for channel 'group_replication_recovery' exiting, replication stopped in log 'mysql-bin.000002' at position 934
2019-10-23T11:11:05.462119Z 22 [Note] Slave I/O thread killed while reading event for channel 'group_replication_recovery'
2019-10-23T11:11:05.462143Z 22 [Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'mysql-bin.000002', position 934
2019-10-23T11:11:05.523357Z 20 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='192.110.103.41', master_port= 3106, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2019-10-23T11:11:05.526137Z 0 [Note] Plugin group_replication reported: 'This server was declared online within the replication group'
2019-10-23T11:15:33.426684Z 0 [Note] Plugin group_replication reported: 'Members joined the group: 192.110.103.43:3106'
2019-10-23T11:15:33.426832Z 0 [Note] Plugin group_replication reported: 'Group membership changed to 192.110.103.41:3106, 192.110.103.42:3106, 192.110.103.43:3106 on view 15718289704352993:3.'
2019-10-23T11:15:34.094942Z 0 [Note] Plugin group_replication reported: 'The member with address 192.110.103.43:3106 was declared online within the replication group'
2019-10-24T03:10:32.839967Z 0 [Warning] Plugin group_replication reported: 'Members removed from the group: 192.110.103.41:3106'
2019-10-24T03:10:32.839985Z 0 [Note] Plugin group_replication reported: 'Primary server with address 192.110.103.41:3106 left the group. Electing new Primary.'
2019-10-24T03:10:32.840052Z 0 [Note] Plugin group_replication reported: 'A new primary with address 192.110.103.42:3106 was elected, enabling conflict detection until the new primary applies all relay logs.'
2019-10-24T03:10:32.840086Z 41 [Note] Plugin group_replication reported: 'This server is working as primary member.'
2019-10-24T03:10:32.840107Z 0 [Note] Plugin group_replication reported: 'Group membership changed to 192.110.103.42:3106, 192.110.103.43:3106 on view 15718289704352993:4.'
2019-10-24T03:12:01.677869Z 4 [Note] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection'
Automatic failover:
As the output below shows, the DB failover completed automatically.
root@192.110.103.42 : (none) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_applier | 53e462be-f3d7-11e9-9125-a0369fa6cce4 | 192.110.103.42 | 3106 | ONLINE |
| group_replication_applier | ee4a9cec-f3d5-11e9-9ded-a0369fa6cd30 | 192.110.103.43 | 3106 | ONLINE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
2 rows in set (0.00 sec)
root@192.110.103.42 : (none) > SELECT ta.* ,tb.MEMBER_HOST,tb.MEMBER_PORT,tb.MEMBER_STATE FROM performance_schema.global_status ta,performance_schema.replication_group_members tb WHERE ta.VARIABLE_NAME='group_replication_primary_member' and ta.VARIABLE_VALUE=tb.MEMBER_ID;
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
| VARIABLE_NAME | VARIABLE_VALUE | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_primary_member | 53e462be-f3d7-11e9-9125-a0369fa6cce4 | 192.110.103.42 | 3106 | ONLINE |
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
1 row in set (0.00 sec)
The old master recovers and rejoins the group:
Once the old master is back up, run START GROUP_REPLICATION; on it.
> select @@group_replication_bootstrap_group;
+-------------------------------------+
| @@group_replication_bootstrap_group |
+-------------------------------------+
| 0 |
+-------------------------------------+
> START GROUP_REPLICATION;
> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_applier | 509810ee-f3d7-11e9-a7d5-a0369fac2de4 | 192.110.103.41 | 3106 | ONLINE |
| group_replication_applier | 53e462be-f3d7-11e9-9125-a0369fa6cce4 | 192.110.103.42 | 3106 | ONLINE |
| group_replication_applier | ee4a9cec-f3d5-11e9-9ded-a0369fa6cd30 | 192.110.103.43 | 3106 | ONLINE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
2) Crashing the master
kill -9 the master node, 192.110.103.42. The log shows an automatic election, after which 192.110.103.41 becomes the new primary.
# kill -9 the master's mysqld; executed on 192.110.103.42.
$kill -9 mysql_pid
# Log on machine 192.110.103.41
tail -f /data1/mysql_3106/logs/mysqld.err
2019-10-24T06:17:31.473849Z 0 [Warning] Plugin group_replication reported: 'Member with address 192.110.103.42:3106 has become unreachable.'
2019-10-24T06:17:32.479299Z 0 [Warning] Plugin group_replication reported: 'Members removed from the group: 192.110.103.42:3106'
2019-10-24T06:17:32.479323Z 0 [Note] Plugin group_replication reported: 'Primary server with address 192.110.103.42:3106 left the group. Electing new Primary.'
2019-10-24T06:17:32.479395Z 0 [Note] Plugin group_replication reported: 'A new primary with address 192.110.103.41:3106 was elected, enabling conflict detection until the new primary applies all relay logs.'
2019-10-24T06:17:32.479439Z 37 [Note] Plugin group_replication reported: 'This server is working as primary member.'
2019-10-24T06:17:32.479465Z 0 [Note] Plugin group_replication reported: 'Group membership changed to 192.110.103.41:3106, 192.110.103.43:3106 on view 15718289704352993:6.'
root@192.110.103.41 : (none) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_applier | 509810ee-f3d7-11e9-a7d5-a0369fac2de4 | 192.110.103.41 | 3106 | ONLINE |
| group_replication_applier | ee4a9cec-f3d5-11e9-9ded-a0369fa6cd30 | 192.110.103.43 | 3106 | ONLINE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
2 rows in set (0.00 sec)
root@192.110.103.41 : (none) > SELECT ta.* ,tb.MEMBER_HOST,tb.MEMBER_PORT,tb.MEMBER_STATE FROM performance_schema.global_status ta,performance_schema.replication_group_members tb WHERE ta.VARIABLE_NAME='group_replication_primary_member' and ta.VARIABLE_VALUE=tb.MEMBER_ID;
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
| VARIABLE_NAME | VARIABLE_VALUE | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_primary_member | 509810ee-f3d7-11e9-a7d5-a0369fac2de4 | 192.110.103.41 | 3106 | ONLINE |
+----------------------------------+--------------------------------------+---------------+-------------+--------------+
1 row in set (0.00 sec)
Removing a node from the group is simple: just execute the stop group_replication statement on it.
stop group_replication;
select * from performance_schema.replication_group_members;
1) A former MGR node restarts and rejoins: if an MGR node was shut down and its service has since recovered, running **start group_replication;** is usually all that is needed. The commands:
set global group_replication_enforce_update_everywhere_checks=OFF;
set global group_replication_single_primary_mode=ON;
START GROUP_REPLICATION;
select * from performance_schema.replication_group_members;
2) A brand-new node joins: if a new node is added, or a recovering node was down so long that the master's binary logs have been purged, it cannot be recovered by START GROUP_REPLICATION; alone. You must restore it manually from a backup of the MGR cluster: MGR has no SST/IST concept (as Galera does); synchronizing and catching up are done entirely through GTIDs and the binlog.
Example: 192.110.103.42:3106 was down for too long and the required master logs had been purged within MGR, so its state stayed RECOVERING or ERROR and it could not join. Explanation: a recovering MGR node asks the surviving cluster whether its missing binlogs can still be supplied. If they can, they are streamed normally and the node catches up; if the logs the node needs no longer exist, it cannot rejoin the cluster.
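Whether recovery is possible comes down to GTID set containment: every GTID in the donor's gtid_purged must already be in the joiner's gtid_executed. A minimal check (the '<donor_gtid_purged>' placeholder stands for the value copied from the donor):

```sql
-- On a surviving (donor) member: GTIDs no longer present in its binlogs.
SELECT @@GLOBAL.gtid_purged;

-- On the joiner: GTIDs it has already applied.
SELECT @@GLOBAL.gtid_executed;

-- On the joiner, paste the donor's value; recovery can only succeed if this returns 1.
SELECT GTID_SUBSET('<donor_gtid_purged>', @@GLOBAL.gtid_executed);
```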
root@192.110.103.42 : (none) > show variables like 'group_replication_group_seeds';
+--------------------------------+----------------------------------------------------------------+
| Variable_name                  | Value                                                          |
+--------------------------------+----------------------------------------------------------------+
| group_replication_group_seeds  | 192.110.103.41:31061,192.110.103.42:31061,192.110.103.43:31061 |
+--------------------------------+----------------------------------------------------------------+
1 row in set (0.00 sec)
root@192.110.103.42 : (none) > start group_replication;
Query OK, 0 rows affected (3.35 sec)
root@192.110.103.42 : (none) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_applier | 509810ee-f3d7-11e9-a7d5-a0369fac2de4 | 192.110.103.41 | 3106 | ONLINE |
| group_replication_applier | 53e462be-f3d7-11e9-9125-a0369fa6cce4 | 192.110.103.42 | 3106 | RECOVERING |
| group_replication_applier | ee4a9cec-f3d5-11e9-9ded-a0369fa6cd30 | 192.110.103.43 | 3106 | ONLINE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
3 rows in set (0.00 sec)
root@192.110.103.42 : (none) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_applier | 53e462be-f3d7-11e9-9125-a0369fa6cce4 | 192.110.103.42 | 3106 | ERROR |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
1 row in set (0.00 sec)
Error log:
2019-10-24T08:15:11.015032Z 198 [ERROR] Error reading packet from server for channel 'group_replication_recovery': The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires. (server_errno=1236)
2019-10-24T08:15:11.015060Z 198 [ERROR] Slave I/O for channel 'group_replication_recovery': Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Error_code: 1236
2019-10-24T08:15:11.015076Z 198 [Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'FIRST', position 4
2019-10-24T08:15:11.015131Z 196 [Note] Plugin group_replication reported: 'Terminating existing group replication donor connection and purging the corresponding logs.'
2019-10-24T08:15:11.015164Z 199 [Note] Error reading relay log event for channel 'group_replication_recovery': slave SQL thread was killed
2019-10-24T08:15:11.015530Z 199 [Note] Slave SQL thread for channel 'group_replication_recovery' exiting, replication stopped in log 'FIRST' at position 0
2019-10-24T08:15:11.017239Z 196 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='192.110.103.41', master_port= 3106, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2019-10-24T08:15:11.019165Z 196 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 2/10'
In this situation, rebuild the replica from a backup and proceed as follows:
1) Make a backup
/usr/local/mysql/bin/mysqldump -h 127.0.0.1 -P3106 --all-databases --default-character-set=utf8 -R -q --triggers --master-data=2 --single-transaction > mysql3106_online.sql ;
2) Stop the replication threads, disable read-only mode, and clear the local GTID state (RESET MASTER empties gtid_executed so that the GTID_PURGED set recorded in the dump can be applied on import).
show variables like 'group_replication_group_seeds';
STOP GROUP_REPLICATION;
set global super_read_only=0;
show master logs;
reset master;
show master logs;
3) Import the backup data:
mysql -h 127.0.0.1 -P3106 --default-character-set=utf8 < mysql3106_online.sql
4) Re-enable the MGR settings and start recovery:
set global group_replication_enforce_update_everywhere_checks=OFF;
set global group_replication_single_primary_mode=ON;
START GROUP_REPLICATION;
5) Result:
Because the replica was rebuilt from a fresh backup, it can synchronize and catch up normally, so START GROUP_REPLICATION; succeeds and the node joins the MGR group.
>START GROUP_REPLICATION;
> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+
| group_replication_applier | 509810ee-f3d7-11e9-a7d5-a0369fac2de4 | 192.110.103.41 | 3106 | ONLINE |
| group_replication_applier | 53e462be-f3d7-11e9-9125-a0369fa6cce4 | 192.110.103.42 | 3106 | ONLINE |
| group_replication_applier | ee4a9cec-f3d5-11e9-9ded-a0369fa6cd30 | 192.110.103.43 | 3106 | ONLINE |
+---------------------------+--------------------------------------+---------------+-------------+--------------+