(MHA + MySQL 5.7 Enhanced Semi-Synchronous Replication) High-Availability Architecture: Design and Implementation

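       This architecture uses MySQL 5.7 with GTID-based, enhanced semi-synchronous, parallel replication in a one-master, two-slave topology. The MHA suite manages the whole replication topology and provides automatic failover for high availability.
       Advantages:
          1) Enhanced semi-synchronous replication with AFTER_SYNC improves data safety and master-slave consistency.
          2) MHA improves master-slave data consistency after a failure, switches over automatically and reconfigures replication, and business writes continue normally after the switch.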
 
 
 
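1. MHA overview
    1) Introduction
MHA (Master High Availability) is an open-source MySQL high-availability suite written in Perl. It provides automated master failover for MySQL master-slave replication. When MHA detects that the master has failed, it promotes the slave with the most recent data to be the new master; during this process it collects additional information from the other slaves to avoid consistency problems. MHA also supports online master switching, that is, switching the master/slave roles on demand.
Compared with other HA software, MHA focuses on keeping the master in a MySQL replication setup highly available. Its most distinctive feature is that it can repair the differential logs between slaves so that all slaves end up with consistent data, then choose one of them as the new master and point the other slaves at it.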
 
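    2) Roles
 
The MHA service has two roles, MHA Manager (management node) and MHA Node (data node):
MHA Manager: usually deployed on its own dedicated machine, or directly on one of the slaves (the latter is not recommended). One Manager can manage multiple master/slave clusters. It runs:
(1) the commands for automatic master switching and failover
(2) the other helper scripts: manual master switching; master/slave status checks
MHA Node: runs on every MySQL server (master/slave/manager). It speeds up failover by providing scripts that parse and purge logs. Its functions are:
(1) copying the master's binlog data
(2) comparing the slaves' relay log files
(3) periodically deleting relay logs without stopping the slave SQL thread
  Note: in an MHA cluster relay logs must be deleted by hand when needed, because MySQL's automatic purge is turned off (relay_log_purge = 0). They can be removed by:
    (1) temporarily enabling the global parameter
         SET GLOBAL relay_log_purge=1; FLUSH LOGS; SET GLOBAL relay_log_purge=0;
    (2) deleting the relay log files
    (3) using the MHA script (see purge_relay_logs --help)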
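
Option (3) is usually scheduled from cron. The Python sketch below only generates one crontab line per slave, staggered so that the slaves do not all purge at the same moment. The function name, start time, interval, working directory and log path are illustrative assumptions; the password option is left out on purpose and has to be supplied in whatever way suits your environment.

```python
def purge_cron_lines(hosts, start_hour=4, step_minutes=20,
                     port=7066, workdir="/tmp"):
    """Return {host: crontab line} with a different start time per host.

    Only builds text; nothing is executed. purge_relay_logs and the
    --user/--port/--workdir/--disable_relay_log_purge options come from
    MHA Node, everything else here is an assumption for illustration.
    """
    lines = {}
    for i, host in enumerate(hosts):
        total = start_hour * 60 + i * step_minutes
        hour, minute = divmod(total, 60)
        lines[host] = (
            f"{minute} {hour % 24} * * * purge_relay_logs --user=root "
            f"--port={port} --disable_relay_log_purge --workdir={workdir} "
            ">> /var/log/purge_relay_logs.log 2>&1"
        )
    return lines


if __name__ == "__main__":
    for host, line in purge_cron_lines(
            ["172.16.40.201", "172.16.40.202", "172.16.40.203"]).items():
        print(host, "->", line)
```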
        
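    3) Architecture diagram
 
During a failover, MHA performs the following steps:
(1) Save the binary log events (binlog events) from the crashed master;
(2) Identify the slave that has the latest updates;
(3) Apply the differential relay logs to the other slaves;
(4) Apply the binary log events saved from the master;
(5) Promote one slave to be the new master;
(6) Make the other slaves connect to the new master and replicate from it.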
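
Step (2) can be pictured as comparing how far each slave's I/O thread has read from the old master. The Python sketch below is a simplified illustration under stated assumptions, not MHA's actual code: SlavePos and plan_recovery are hypothetical names, the positions are made-up sample values, and binlog file names are assumed to end in a numeric suffix.

```python
from typing import NamedTuple


class SlavePos(NamedTuple):
    host: str
    master_log_file: str      # Master_Log_File from SHOW SLAVE STATUS
    read_master_log_pos: int  # Read_Master_Log_Pos from SHOW SLAVE STATUS


def binlog_index(name: str) -> int:
    """Numeric suffix of a binlog file name, e.g. 'mysql-bin.000012' -> 12."""
    return int(name.rsplit(".", 1)[1])


def received_upto(slave: SlavePos) -> tuple:
    """Sortable key: (binlog file index, byte offset) the slave has received."""
    return (binlog_index(slave.master_log_file), slave.read_master_log_pos)


def plan_recovery(slaves):
    """Return (host of the most advanced slave, hosts that are behind it)."""
    latest = max(slaves, key=received_upto)
    behind = [s.host for s in slaves
              if received_upto(s) < received_upto(latest)]
    return latest.host, behind


if __name__ == "__main__":
    sample = [
        SlavePos("172.16.40.201", "mysql-bin.000003", 900),
        SlavePos("172.16.40.202", "mysql-bin.000003", 450),
    ]
    print(plan_recovery(sample))
```

The hosts returned in the second element are the ones that would need the differential relay logs from step (3).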
 
Environment

Host            Role               Service                        Port   MHA role
172.16.40.201   slave              mysql-5.7.25 (PerconaServer)   7066   node
172.16.40.202   slave (master-b)   mysql-5.7.25 (PerconaServer)   7066   manager/node
172.16.40.203   master             mysql-5.7.25 (PerconaServer)   7066   node
 
2. Install MySQL on all three servers
        1) Version: Percona-Server-5.7.25-28-Linux.x86_64.ssl101.tar.gz
    
             MySQL installation: omitted
 
        2) Configure master-slave replication
           
  【172.16.40.203 (master)】:
            mysql> grant replication slave on *.* to 'repl'@'172.16.40.%' identified by 'replpasswod';
            mysql> flush privileges;
      
  【172.16.40.202 (slave (master-b))】:
            mysql> CHANGE MASTER TO MASTER_HOST='172.16.40.203',MASTER_PORT=7066,MASTER_USER='repl',MASTER_PASSWORD='replpasswod',MASTER_AUTO_POSITION=1;
            mysql> start slave;
            mysql> show slave status\G
            
  【172.16.40.201 (slave)】:
            mysql> CHANGE MASTER TO MASTER_HOST='172.16.40.203',MASTER_PORT=7066,MASTER_USER='repl',MASTER_PASSWORD='replpasswod',MASTER_AUTO_POSITION=1;
            mysql> start slave;
            mysql> show slave status\G
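
When reading the show slave status\G output, the fields that matter first are Slave_IO_Running, Slave_SQL_Running and Auto_Position. The Python sketch below shows one way to check them from the vertical output text; the helper names are hypothetical and the sample text is an abbreviated, made-up excerpt rather than real output from this cluster.

```python
def parse_vertical(text: str) -> dict:
    """Turn 'Key: value' lines of mysql's \\G output into a dict."""
    fields = {}
    for line in text.splitlines():
        stripped = line.strip()
        if ":" not in stripped or stripped.startswith("*"):
            continue  # skip blank lines and the '*** 1. row ***' separator
        key, _, value = stripped.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replication_healthy(fields: dict) -> bool:
    """True when both replication threads run and GTID auto-positioning is on."""
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes"
            and fields.get("Auto_Position") == "1")


if __name__ == "__main__":
    sample = """
    *************************** 1. row ***************************
                      Master_Host: 172.16.40.203
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                    Auto_Position: 1
    """
    print(replication_healthy(parse_vertical(sample)))
```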
 
 
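        3) Enable MySQL semi-synchronous replication
# mysql -S /tmp/7066.sock -p
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
mysql> SET GLOBAL rpl_semi_sync_master_enabled=1;
mysql> SET GLOBAL rpl_semi_sync_slave_enabled=1;
 
# Restart MySQL. SET GLOBAL values are lost on restart, so first add the settings above
# (and rpl_semi_sync_master_wait_point=AFTER_SYNC) to my.cnf.
# /etc/init.d/mysqld-7066 restart
 
# Confirm that semi-synchronous replication is enabled
mysql> show global variables like '%semi%';   # or: show global status like '%semi%';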
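
In the status output, Rpl_semi_sync_master_status and Rpl_semi_sync_master_clients on the master, and Rpl_semi_sync_slave_status on each slave, show whether semi-synchronous replication is really in effect. The Python sketch below classifies a dict of those status values; the function names and return strings are assumptions made for illustration.

```python
def master_semisync_state(status: dict) -> str:
    """Classify the master side from SHOW GLOBAL STATUS LIKE '%semi%' values."""
    enabled = status.get("Rpl_semi_sync_master_status") == "ON"
    clients = int(status.get("Rpl_semi_sync_master_clients", 0))
    if enabled and clients > 0:
        return "semi-sync"
    if enabled:
        return "enabled, but no semi-sync slave connected"
    return "asynchronous"


def slave_semisync_on(status: dict) -> bool:
    """True when the slave side reports semi-sync as active."""
    return status.get("Rpl_semi_sync_slave_status") == "ON"


if __name__ == "__main__":
    print(master_semisync_state({
        "Rpl_semi_sync_master_status": "ON",
        "Rpl_semi_sync_master_clients": "2",
    }))
```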
 
 
                
 
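3. Set up MHA
 
 
1) Download the MHA suite
 
 
 
2) Configure passwordless SSH login between the servers
    (Note: if the Manager is installed on one of the MySQL servers, that server also needs passwordless login to itself; this is not needed when the Manager runs on a separate server.)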
 
【172.16.40.202(manager)】:
    # ssh-keygen -t rsa
    # cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
    #  ssh-copy-id root@172.16.40.203
    #  ssh-copy-id root@172.16.40.201
【172.16.40.203(node)】:
    #  ssh-keygen -t rsa
    #  ssh-copy-id root@172.16.40.202
    #  ssh-copy-id root@172.16.40.201
【172.16.40.201(node)】:
    #  ssh-keygen -t rsa
    #  ssh-copy-id root@172.16.40.202
    #  ssh-copy-id root@172.16.40.203
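
The commands above form a full mesh: every host must be able to log in to every other host. For a larger cluster the list can be generated instead of typed. This Python sketch only prints the ssh-copy-id commands to run on each host; the function name is hypothetical and nothing is executed.

```python
from itertools import permutations


def ssh_copy_id_commands(hosts, user="root"):
    """Return {source host: [ssh-copy-id commands to run on that host]}."""
    plan = {}
    for src, dst in permutations(hosts, 2):
        plan.setdefault(src, []).append(f"ssh-copy-id {user}@{dst}")
    return plan


if __name__ == "__main__":
    hosts = ["172.16.40.201", "172.16.40.202", "172.16.40.203"]
    for src, commands in ssh_copy_id_commands(hosts).items():
        print(f"[{src}]")
        for command in commands:
            print("    #", command)
```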
 
 
 
 
3) Install MHA
    (Note: when the manager is not installed on a dedicated server, as here, every node needs the node package installed.)
        
 
     (1) Upload mha4mysql-node-0.58.tar.gz to all servers and install it
           # Installation requires perl, perl-DBD-MySQL and perl-devel; install them with yum, e.g. yum install perl-DBD-MySQL
        # tar zxvf mha4mysql-node-0.58.tar.gz
        # cd mha4mysql-node-0.58
        # perl Makefile.PL
        # make && make install
 
    (2) Upload mha4mysql-manager-0.58.tar.gz to the manager server and install it
    
     
       # Install dependencies
       # yum install perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
       # tar zxvf mha4mysql-manager-0.58.tar.gz
       # cd mha4mysql-manager-0.58
       # perl Makefile.PL
       # make && make install
 
 
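       Tool overview:
 
  Manager tools:
- masterha_check_ssh : checks MHA's SSH configuration.
- masterha_check_repl : checks MySQL replication.
- masterha_manager : starts MHA.
- masterha_check_status : reports the current MHA running state.
- masterha_master_monitor : monitors whether the master is down.
- masterha_master_switch : controls failover (automatic or manual).
- masterha_conf_host : adds or removes server entries in the configuration.
  Node tools:
- save_binary_logs : saves and copies the master's binary logs.
- apply_diff_relay_logs : identifies differential relay log events and applies them to the other slaves.
- filter_mysqlbinlog : removes unnecessary ROLLBACK events (MHA no longer uses this tool).
- purge_relay_logs : purges relay logs (does not block the SQL thread).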
 
 
            
Error encountered and how to fix it:
[root@localhost authors]# masterha_check_ssh
"NI_NUMERICHOST" is not exported by the Socket module
"getaddrinfo" is not exported by the Socket module
"getnameinfo" is not exported by the Socket module
Can't continue after import errors at /usr/local/share/perl5/MHA/NodeUtil.pm line 29
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/NodeUtil.pm line 29.
Compilation failed in require at /usr/local/share/perl5/MHA/SlaveUtil.pm line 28.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/SlaveUtil.pm line 28.
Compilation failed in require at /usr/local/share/perl5/MHA/DBHelper.pm line 26.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/DBHelper.pm line 26.
Compilation failed in require at /usr/local/share/perl5/MHA/HealthCheck.pm line 30.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/HealthCheck.pm line 30.
Compilation failed in require at /usr/local/share/perl5/MHA/Server.pm line 28.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/Server.pm line 28.
Compilation failed in require at /usr/local/share/perl5/MHA/Config.pm line 29.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/Config.pm line 29.
Compilation failed in require at /usr/local/share/perl5/MHA/SSHCheck.pm line 32.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/SSHCheck.pm line 32.
Compilation failed in require at /usr/local/bin/masterha_check_ssh line 25.
BEGIN failed--compilation aborted at /usr/local/bin/masterha_check_ssh line 25.
 
 
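Install the missing dependencies with cpan
cpan[1]> install ExtUtils::Constant
cpan[1]> install Socket
 
Tips: if the server has no Internet access, download the dependency packages manually from the URLs shown in cpan's messages, put them in the corresponding directory, and then run the install command again.
 
After the fix: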
[root@localhost authors]# masterha_check_ssh --help
Usage:
    masterha_check_ssh --global_conf=/etc/masterha_default.cnf
    --conf=/etc/conf/masterha/app1.cnf
 
 
    See online reference
    (http://code.google.com/p/mysql-master-ha/wiki/Requirements#SSH_public_k
    ey_authentication) for details.
 
 
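4) Configure MHA
 
        (1) Create the working directories on 【172.16.40.202(manager)】
 
            # mkdir -p /home/mysql/app/mha/masterha/{conf,logs}
 
        (2) Copy the configuration file to /home/mysql/app/mha/masterha/conf/app1.cnf and edit it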
   
[server default]
manager_workdir=/home/mysql/app/mha/masterha
manager_log=/home/mysql/app/mha/masterha/logs/manager.log
master_binlog_dir=/home/mysql/app/mha/7066/logs/binlog
# user/password: the MySQL account MHA uses for monitoring
password=romysqladmint
user=root
ping_interval=1
remote_workdir=/opt/TMHA2/mha4mysql-node-master
repl_password=replpasswod
repl_user=repl
ssh_user=root
shutdown_script=""
log_level=debug
 
 
#master node
[server1]
hostname=172.16.40.203
port=7066
ssh_port=22
#slave node
[server2]
hostname=172.16.40.202
port=7066
ssh_port=22
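# candidate_master=1 marks this server as a candidate master: on failover this slave is promoted to master even if it does not hold the latest events in the cluster.
#candidate_master=1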
#slave node
[server3]
hostname=172.16.40.201
port=7066
ssh_port=22
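
For more than a handful of servers, a file with this layout can be generated rather than edited by hand. The Python sketch below uses the standard configparser module to write a [server default] block plus one [serverN] block per host; the function name is hypothetical, and the port and ssh_port defaults simply mirror the values used in this article.

```python
import configparser
import io


def build_mha_conf(defaults: dict, hosts: list,
                   port: int = 7066, ssh_port: int = 22) -> str:
    """Render an MHA-style ini file as text (all values are written as strings)."""
    cp = configparser.ConfigParser(interpolation=None)
    cp.optionxform = str  # keep option names exactly as given
    cp["server default"] = {k: str(v) for k, v in defaults.items()}
    for i, host in enumerate(hosts, start=1):
        cp[f"server{i}"] = {
            "hostname": host,
            "port": str(port),
            "ssh_port": str(ssh_port),
        }
    buf = io.StringIO()
    cp.write(buf, space_around_delimiters=False)  # key=value, no spaces
    return buf.getvalue()


if __name__ == "__main__":
    print(build_mha_conf(
        {"user": "root", "repl_user": "repl", "ssh_user": "root",
         "ping_interval": 1},
        ["172.16.40.203", "172.16.40.202", "172.16.40.201"],
    ))
```

Passwords are deliberately left out of the example call; add them to the defaults dict or append them to the generated file.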
 
 
# Grant privileges to the monitoring user in the database
 
 
    (3) Manager status checks:
[root@fuzhou202 conf]# masterha_check_ssh  --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
Sun Mar 24 19:30:26 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Mar 24 19:30:26 2019 - [info] Reading application default configuration from /home/mysql/app/mha/masterha/conf/app1.cnf..
Sun Mar 24 19:30:26 2019 - [info] Reading server configuration from /home/mysql/app/mha/masterha/conf/app1.cnf..
Sun Mar 24 19:30:26 2019 - [info] Starting SSH connection tests..
Sun Mar 24 19:30:26 2019 - [debug]
Sun Mar 24 19:30:26 2019 - [debug]  Connecting via SSH from root@172.16.40.202(172.16.40.202:22) to root@172.16.40.203(172.16.40.203:22)..
Sun Mar 24 19:30:26 2019 - [debug]   ok.
Sun Mar 24 19:30:26 2019 - [debug]  Connecting via SSH from root@172.16.40.202(172.16.40.202:22) to root@172.16.40.201(172.16.40.201:22)..
Sun Mar 24 19:30:26 2019 - [debug]   ok.
Sun Mar 24 19:30:27 2019 - [debug]
Sun Mar 24 19:30:26 2019 - [debug]  Connecting via SSH from root@172.16.40.203(172.16.40.203:22) to root@172.16.40.202(172.16.40.202:22)..
Sun Mar 24 19:30:26 2019 - [debug]   ok.
Sun Mar 24 19:30:26 2019 - [debug]  Connecting via SSH from root@172.16.40.203(172.16.40.203:22) to root@172.16.40.201(172.16.40.201:22)..
Sun Mar 24 19:30:26 2019 - [debug]   ok.
Sun Mar 24 19:30:27 2019 - [debug]
Sun Mar 24 19:30:27 2019 - [debug]  Connecting via SSH from root@172.16.40.201(172.16.40.201:22) to root@172.16.40.202(172.16.40.202:22)..
Sun Mar 24 19:30:27 2019 - [debug]   ok.
Sun Mar 24 19:30:27 2019 - [debug]  Connecting via SSH from root@172.16.40.201(172.16.40.201:22) to root@172.16.40.203(172.16.40.203:22)..
Sun Mar 24 19:30:27 2019 - [debug]   ok.
Sun Mar 24 19:30:27 2019 - [info] All SSH connection tests passed successfully.
 
---------
# masterha_check_repl  --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
# masterha_check_status  --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
 
Once all of the checks above pass, start the manager monitor.
 
(4) Start the manager monitoring service
# Start the manager
# nohup masterha_manager --conf=/home/mysql/app/mha/masterha/conf/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /home/mysql/app/mha/masterha/logs/manager.log 2>&1 &
 
# Check the status
[root@fuzhou202 logs]# masterha_check_status --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
app1 (pid:9163) is running(0:PING_OK), master:172.16.40.203
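
For alerting it can help to turn this status line into structured data. The Python sketch below parses exactly the "is running" format shown above and returns None for anything else; the regular expression and function name are assumptions based only on that one sample line, not on an exhaustive list of masterha_check_status outputs.

```python
import re

# Matches lines such as:
#   app1 (pid:9163) is running(0:PING_OK), master:172.16.40.203
STATUS_RE = re.compile(
    r"^(?P<app>\S+) \(pid:(?P<pid>\d+)\) is running"
    r"\((?P<code>\d+):(?P<state>\w+)\), master:(?P<master>\S+)$"
)


def parse_check_status(line: str):
    """Return a dict for a 'running' status line, or None if it does not match."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    info["code"] = int(info["code"])
    return info


if __name__ == "__main__":
    print(parse_check_status(
        "app1 (pid:9163) is running(0:PING_OK), master:172.16.40.203"))
```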
 
# Stop the manager
# masterha_stop --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
 
 
(5) Manage the VIP with a script
 
# Bind the VIP manually on the master node 【172.16.40.203 (master)】
# ifconfig eth0:1 172.16.40.99/24
 
# Create the Perl master_ip_failover script and register it in the configuration file
# vim /home/mysql/app/mha/masterha/conf/app1.cnf   # add:
master_ip_failover_script=/usr/local/bin/master_ip_failover
 
# Edit the script with the following content
# vim /usr/local/bin/master_ip_failover
---------------------------------------------------
#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

# VIP and interface alias. The interface name must be the one that carries
# the VIP on your hosts (the manual bind above used eth0:1).
my $vip = '172.16.40.99/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth1:$key down";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    # stop / stopssh: take the VIP down on the old master
    if ( $command eq "stop" || $command eq "stopssh" ) {

        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    # start: bring the VIP up on the new master
    elsif ( $command eq "start" ) {

        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    # status: called by masterha_check_repl and at manager startup
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# Bring the VIP up on the new master over SSH
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# Take the VIP down on the old master over SSH (skipped when no ssh_user is given)
sub stop_vip() {
    return 0 unless ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
--------------------------------------
 
# chmod +x /usr/local/bin/master_ip_failover
 
 
Verify the automatic master-failover script
 
 
# masterha_check_repl --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
...
Sun Mar 24 20:17:11 2019 - [info] Checking master_ip_failover_script status:
Sun Mar 24 20:17:11 2019 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.16.40.203 --orig_master_ip=172.16.40.203 --orig_master_port=7066
IN SCRIPT TEST====/sbin/ifconfig eth1:1 down==/sbin/ifconfig eth1:1 172.16.40.99/24===
 
 
4. Testing
    1) Automatic master-failover
        Generate test data with sysbench
        
# Generate data on the master
# sysbench --test=oltp --oltp-table-size=1000000 --oltp-read-only=off --init-rng=on --num-threads=4 --max-requests=0 --oltp-dist-type=uniform --max-time=1800 --mysql-user=root --mysql-socket=/tmp/7066.sock --mysql-password=mysqladmin --db-driver=mysql --mysql-table-engine=innodb --oltp-test-mode=complex prepare
 
# Stop the slave io_thread on one of the MySQL slaves to simulate replication lag
 
mysql> stop slave io_thread;
 
# sysbench --test=oltp --oltp-table-size=1000000 --oltp-read-only=off --init-rng=on --num-threads=4 --max-requests=0 --oltp-dist-type=uniform --max-time=180 --mysql-user=root --mysql-socket=/tmp/7066.sock --mysql-password=mysqladmin --db-driver=mysql --mysql-table-engine=innodb --oltp-test-mode=complex run
 
 
# Kill MySQL on the master
# pkill -9 mysqld
 
# Watch the manager log
 
...
Mon Mar 25 11:07:42 2019 - [info]  172.16.40.201: Resetting slave info succeeded.
Mon Mar 25 11:07:42 2019 - [info] Master failover to 172.16.40.201(172.16.40.201:7066) completed successfully.
Mon Mar 25 11:07:42 2019 - [info] Deleted server1 entry from /home/mysql/app/mha/masterha/conf/app1.cnf .
Mon Mar 25 11:07:42 2019 - [debug]  Disconnected from 172.16.40.202(172.16.40.202:7066)
Mon Mar 25 11:07:42 2019 - [debug]  Disconnected from 172.16.40.201(172.16.40.201:7066)
Mon Mar 25 11:07:42 2019 - [info]
 
 
----- Failover Report -----
 
 
app1: MySQL Master failover 172.16.40.203(172.16.40.203:7066) to 172.16.40.201(172.16.40.201:7066) succeeded
 
 
Master 172.16.40.203(172.16.40.203:7066) is down!
 
 
Check MHA Manager logs at fuzhou202:/home/mysql/app/mha/masterha/logs/manager.log for details.
 
 
Started automated(non-interactive) failover.
Invalidated master IP address on 172.16.40.203(172.16.40.203:7066)
Selected 172.16.40.201(172.16.40.201:7066) as a new master.
172.16.40.201(172.16.40.201:7066): OK: Applying all logs succeeded.
172.16.40.201(172.16.40.201:7066): OK: Activated master IP address.
172.16.40.202(172.16.40.202:7066): OK: Slave started, replicating from 172.16.40.201(172.16.40.201:7066)
172.16.40.201(172.16.40.201:7066): Resetting slave info succeeded.
Master failover to 172.16.40.201(172.16.40.201:7066) completed successfully.
 
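# MHA's automatic failover identified the most up-to-date slave, promoted it to master, and updated the configuration successfully
# The VIP has also moved to the new master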
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:b7:29:df brd ff:ff:ff:ff:ff:ff
     inet 172.16.40.201/24 brd 172.16.40.255 scope global eth0
     inet 172.16.40.99/24 brd 172.16.40.255 scope global secondary eth0:1
    inet6 fe80::250:56ff:feb7:29df/64 scope link
       valid_lft forever preferred_lft forever
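Note:
    After an automatic master-failover the manager monitoring process stops by itself. Because it was started with --remove_dead_master_conf, the failed master's entry has also been removed from the configuration file. To add the recovered MySQL node back into MHA:
    1) Add the node's entry back to the configuration file /home/mysql/app/mha/masterha/conf/app1.cnf
    2) Reconfigure replication on the recovered MySQL instance: CHANGE MASTER TO MASTER_HOST='172.16.40.201', MASTER_PORT=7066, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='replpasswod'; where MASTER_HOST is the IP of the new master after the switch (172.16.40.201 in the test above). Then run start slave; and check the replication state with show slave status\G
    3) Restart the manager monitoring process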
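
The first of these steps, putting the removed server entry back into app1.cnf, can be scripted. The Python sketch below reads the configuration text, picks the lowest unused serverN section name and appends the recovered host. The function name is hypothetical, the port and ssh_port values follow this article, and note that configparser drops comments when it rewrites the file, so keep a backup of the original.

```python
import configparser
import io


def readd_server(conf_text: str, host: str, port: int = 7066) -> str:
    """Return conf_text with a new [serverN] section for the recovered host."""
    cp = configparser.ConfigParser(interpolation=None)
    cp.optionxform = str  # keep option names exactly as written
    cp.read_string(conf_text)
    used = {name for name in cp.sections()
            if name.startswith("server") and name[len("server"):].isdigit()}
    n = 1
    while f"server{n}" in used:
        n += 1
    cp[f"server{n}"] = {"hostname": host, "port": str(port), "ssh_port": "22"}
    buf = io.StringIO()
    cp.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    after_failover = (
        "[server default]\nuser=root\n\n"
        "[server2]\nhostname=172.16.40.202\nport=7066\n\n"
        "[server3]\nhostname=172.16.40.201\nport=7066\n"
    )
    print(readd_server(after_failover, "172.16.40.203"))
```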
     
 
2) Manual master-failover
        Manual master-failover does not need the manager monitoring process to be running. When the master fails, an operator invokes MHA by hand to perform the failover.
        
# masterha_master_switch --master_state=dead --conf=/home/mysql/app/mha/masterha/conf/app1.cnf --dead_master_host=172.16.40.202 --dead_master_port=7066 --new_master_host=172.16.40.203 --new_master_port=7066 --ignore_last_failover
 
# Interactive prompts appear at this point; confirm to continue
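
Because this command line is long and a stray space or missing dash is easy to introduce, it can help to assemble it programmatically. The Python sketch below only builds the command string from the options used above and refuses identical dead and new hosts; the function name is hypothetical and nothing is executed.

```python
import shlex


def manual_failover_cmd(conf: str, dead_host: str, new_host: str,
                        port: int = 7066) -> str:
    """Build the masterha_master_switch command line for a dead master."""
    if dead_host == new_host:
        raise ValueError("dead master and new master must be different hosts")
    args = [
        "masterha_master_switch",
        "--master_state=dead",
        f"--conf={conf}",
        f"--dead_master_host={dead_host}",
        f"--dead_master_port={port}",
        f"--new_master_host={new_host}",
        f"--new_master_port={port}",
        "--ignore_last_failover",
    ]
    return shlex.join(args)  # requires Python 3.8+


if __name__ == "__main__":
    print(manual_failover_cmd(
        "/home/mysql/app/mha/masterha/conf/app1.cnf",
        "172.16.40.202", "172.16.40.203"))
```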