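【Background】
As described by the party reporting the fault: while refreshing privileges, a user accidentally deleted records from the database user table. After noticing the mistake, they recreated the users and killed the process. When MGR (MySQL Group Replication) was started again, it reported errors.
【Error messages】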
2018-06-13T12:47:41.405593Z 32 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2018-06-13T12:47:41.405820Z 32 [Note] Plugin group_replication reported: '[GCS] Added automatically IP ranges 127.0.0.1/8,172.xx.xxx.xxx/26,192.xxx.xx.xxx/24 to the whitelist'
2018-06-13T12:47:41.406172Z 32 [Note] Plugin group_replication reported: '[GCS] SSL was not enabled'
2018-06-13T12:47:41.406216Z 32 [Note] Plugin group_replication reported: 'Initialized group communication with configuration: group_replication_group_name: "b47a8cea-6cf5-4ea4-933f-a8c20905f900"; group_replication_local_address: "172.xx.xxx.xxx:xxx1"; group_replication_group_seeds: "172.xx.xxx.xxx:xxx1,172.xx.xxx.xxx:24901,172.xx.xxx.xxx:xxxx1"; group_replication_bootstrap_group: true; group_replication_poll_spin_loops: 0; group_replication_compression_threshold: 100; group_replication_ip_whitelist: "AUTOMATIC"'
2018-06-13T12:47:41.406944Z 34 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2018-06-13T12:47:41.434136Z 34 [ERROR] Slave SQL for channel 'group_replication_applier': Slave failed to initialize relay log info structure from the repository, Error_code: 1872
2018-06-13T12:47:41.434183Z 34 [ERROR] Plugin group_replication reported: 'Error while starting the group replication applier thread'
2018-06-13T12:47:41.434323Z 34 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'
2018-06-13T12:47:41.434389Z 32 [ERROR] Plugin group_replication reported: 'Unable to initialize the Group Replication applier module.'
2018-06-13T12:47:41.434551Z 32 [Note] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member'
2018-06-13T12:47:41.434588Z 32 [ERROR] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.'
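【Analysis】
The error log shows that the database failed while identifying the relay log (Error_code: 1872 on the group_replication_applier channel). Oracle's official documentation confirms that terminating the MGR service abnormally hits Bug 25534078.
The information queried on the instance matched the symptoms of that bug.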
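
As an illustration of the log reading described in this section, here is a minimal Python sketch that filters the [ERROR] lines out of an error log in the format shown earlier and checks for Error_code 1872. The function names `find_errors` and `hits_relay_log_init_failure` are hypothetical helpers made up for this example, and the regular expression assumes the timestamp / thread id / [level] layout seen in the log excerpt.

```python
import re

# Assumed layout, taken from the log excerpt: "<timestamp> <thread id> [<level>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<thread>\d+)\s+\[(?P<level>\w+)\]\s+(?P<msg>.*)$"
)
ERROR_CODE = re.compile(r"Error_code: (\d+)")


def find_errors(lines):
    """Return (timestamp, error_code or None, message) for every [ERROR] line."""
    found = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None or m.group("level") != "ERROR":
            continue  # skip [Note]/[Warning] lines and anything unparseable
        code = ERROR_CODE.search(m.group("msg"))
        found.append(
            (m.group("ts"), int(code.group(1)) if code else None, m.group("msg"))
        )
    return found


def hits_relay_log_init_failure(lines):
    """True if any [ERROR] line carries Error_code 1872, as in this incident."""
    return any(code == 1872 for _, code, _ in find_errors(lines))
```

Feeding it the lines of the error log (for example `hits_relay_log_init_failure(open(path))`, where `path` is wherever the instance writes its error log) gives a quick yes/no before digging further.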
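【Solution】
Before cleaning up mysql.slave_relay_log_info, record the log information it contains, then repair the instance with the following method. It is also recommended to specify the relay log path parameters explicitly in the configuration file.
However, when the users were recreated, binary logging was not turned off, so the changes were replicated to the other nodes, and those nodes then reported errors when joining the cluster.
Option 1: run reset master (a very dangerous operation). Because the MGR problem had existed for a long time, the binlogs had already expired and been lost, so the group could not be repaired this way.
Option 2: serve from a single node for now, and once a plan has been worked out, repair the MGR architecture in a separate maintenance window.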
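【Summary】
Proceed with caution when touching critical database configuration.
When a problem occurs, take extra care not to make a second mistake that makes recovery even harder.
Monitor critical database processes so that problems are detected and fixed promptly.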
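
The monitoring point can be sketched as a small health check, shown here in Python under stated assumptions. `performance_schema.replication_group_members` and its `MEMBER_STATE` column are standard in MySQL Group Replication. The function name `unhealthy_members`, the `expected_size` parameter, and the idea of passing in rows that were already fetched are choices made for this example, not part of the original incident handling.

```python
# Query an operator could run on any member to list the group as that member sees it.
MEMBERS_QUERY = (
    "SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE "
    "FROM performance_schema.replication_group_members"
)


def unhealthy_members(rows, expected_size):
    """Return a list of human-readable problems; an empty list means healthy.

    rows: iterable of (host, port, state) tuples, as returned by MEMBERS_QUERY.
    expected_size: how many members the group should have (supplied by the caller).
    """
    rows = list(rows)
    # Any state other than ONLINE (RECOVERING, OFFLINE, ERROR, UNREACHABLE) is flagged.
    problems = [
        f"{host}:{port} is {state}" for host, port, state in rows if state != "ONLINE"
    ]
    if len(rows) < expected_size:
        problems.append(f"only {len(rows)} of {expected_size} members are visible")
    return problems
```

A scheduler or existing monitoring agent could run the query periodically and alert whenever the returned list is non-empty. How the rows are fetched (client library, credentials) is left out on purpose.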