IV 12 MySQL+drbd+heartbeat

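One master with multiple slaves is the most common database architecture. It is simple to deploy and easy to maintain. Read/write splitting can be done through a proxy or in the application, and several slaves behind LVS or haproxy load-balance the read traffic, which removes the single point of failure for reads. The lone master is still a single point of failure, however: if it fails, writes stop. The simplest remedy is manual intervention: monitor the master, and when it goes down have an administrator promote the semi-synchronous slave to master and point the other slaves at the new master. This works, but it is not suitable for environments with strict availability requirements.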

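[Figure: wKioL1etHKqx1LTcAABq7Heph2s082.jpg]

Note:

Under normal conditions MySQL-M-active handles writes and MySQL-M-inactive is not visible to clients, while the MySQL slaves handle reads; the slaves can additionally be load-balanced. Master-slave synchronization uses MySQL's own replication mechanism and goes through the VIP. Read/write splitting for the web servers is implemented in the application itself, or with open-source software such as mysql proxy or amoeba.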

[Figure: wKiom1etHMSyCtg1AAB-0XSKIpc824.jpg]

Note: dual-master hot-standby mode

 

 

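1. Install and configure heartbeat

Environment:

VIP (10.96.20.8)

master: eth0 (10.96.20.113), eth1 (172.16.1.113, no gateway or DNS configured), hostname (test-master)

backup: eth0 (10.96.20.114), eth1 (172.16.1.114, no gateway or DNS configured), hostname (test-backup)

Each node has two NICs and two disks.

Note: eth0 carries the management IP; eth1 is the heartbeat link and the drbd replication channel. If heartbeat and data traffic share one NIC in production, rate-limit the data traffic so that bandwidth is reserved for the heartbeat.

Note: keep the labels in vmware and in Xshell consistent. In a production environment every host should have a record in /etc/hosts, which simplifies distribution, management and maintenance.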
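Because the setup relies on both node names resolving through /etc/hosts, a quick check can be useful before going further. The following is a minimal sketch and not part of the original procedure: hosts_has_entry is a hypothetical helper name, and the temporary sample file stands in for the real /etc/hosts.

```shell
#!/bin/bash
# Return 0 if the hosts-format file $1 maps IP $2 to host name $3.
hosts_has_entry() {
    local file=$1 ip=$2 name=$3
    awk -v ip="$ip" -v name="$name" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++) if ($i == name) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Demo against a sample file that uses the addresses from this article.
sample=$(mktemp)
printf '%s\n' '10.96.20.113 test-master' '10.96.20.114 test-backup' > "$sample"
hosts_has_entry "$sample" 10.96.20.113 test-master && echo "test-master: ok"
hosts_has_entry "$sample" 10.96.20.114 test-backup && echo "test-backup: ok"
rm -f "$sample"
```

On a real node the first argument would be /etc/hosts, and the same check could be run on both machines.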

 

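test-master (on both nodes, set the hostname in /etc/sysconfig/network so that it matches the output of uname -n, and configure /etc/hosts, mutual ssh trust, time synchronization, iptables and selinux):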

[[email protected] ~]# cat /etc/redhat-release

Red Hat Enterprise Linux Server release 6.5(Santiago)

[[email protected] ~]# uname -rm

2.6.32-431.el6.x86_64 x86_64

[[email protected] ~]# uname -n

test-master

[root@test-master ~]# ifconfig | grep eth0 -A 1

eth0     Link encap:Ethernet  HWaddr 00:0C:29:1F:B6:AC 

         inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0

[root@test-master ~]# ifconfig | grep eth1 -A 1

eth1     Link encap:Ethernet  HWaddr 00:0C:29:1F:B6:B6 

         inet addr:172.16.1.113 Bcast:172.16.1.255 Mask:255.255.255.0

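[root@test-master ~]# route add -host 172.16.1.114 dev eth1   #(adds a host route so that heartbeat traffic leaves through the designated NIC; this command can be appended to /etc/rc.local, or a static route can be configured instead: #vim /etc/sysconfig/network-scripts/route-eth1 and add 172.16.1.114/24 via 172.16.1.113)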

[root@test-master ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''

Generating public/private rsa key pair.

Your identification has been saved in ./.ssh/id_rsa.

Your public key has been saved in ./.ssh/id_rsa.pub.

The key fingerprint is:

29:c3:a3:68:81:43:59:2f:0a:ad:8a:54:56:b0:1e:[email protected]

The key's randomart image is:

+--[ RSA 2048]----+

| E o..           |

| .+ +            |

|.+.* .           |

|oo* o.  .       |

|+o.. = S        |

|+. o . +         |

|o o .            |

| .               |

|                 |

+-----------------+

[root@test-master ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-backup

The authenticity of host 'test-backup (10.96.20.114)' can't be established.

RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'test-backup' (RSA) to the list of known hosts.

root@test-backup's password:

Now try logging into the machine, with "ssh 'root@test-backup'", and check in:

 

 .ssh/authorized_keys

 

to make sure we haven't added extra keys that you weren't expecting.

[[email protected] ~]# crontab -l

*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null

[[email protected] ~]# service crond restart

Stopping crond:                                           [  OK  ]

Starting crond:                                            [  OK  ]

[[email protected] ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

[[email protected] ~]# rpm -ivh epel-release-6-8.noarch.rpm

warning: epel-release-6-8.noarch.rpm:Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY

Preparing...               ########################################### [100%]

  1:epel-release          ########################################### [100%]

[[email protected] ~]# yum search heartbeat

……

heartbeat-devel.i686 : Heartbeat development package

heartbeat-devel.x86_64 : Heartbeat development package

heartbeat-libs.i686 : Heartbeat libraries

heartbeat-libs.x86_64 : Heartbeat libraries

heartbeat.x86_64 : Messaging and membership subsystem for High-Availability Linux

[[email protected] ~]# yum -y install heartbeat

[[email protected] ~]# chkconfig heartbeat off

[[email protected] ~]# chkconfig --list heartbeat

heartbeat          0:off 1:off 2:off 3:off 4:off 5:off 6:off

 

test-backup

[[email protected] ~]# uname -n

test-backup

[root@test-backup ~]# ifconfig | grep eth0 -A 1

eth0     Link encap:Ethernet  HWaddr 00:0C:29:15:E6:BB 

         inet addr:10.96.20.114 Bcast:10.96.20.255 Mask:255.255.255.0

[root@test-backup ~]# ifconfig | grep eth1 -A 1

eth1     Link encap:Ethernet  HWaddr 00:0C:29:15:E6:C5 

         inet addr:172.16.1.114 Bcast:172.16.1.255 Mask:255.255.255.0

[[email protected] ~]# route add -host 172.16.1.113 dev eth1

[root@test-backup ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''

Generating public/private rsa key pair.

Your identification has been saved in ./.ssh/id_rsa.

Your public key has been saved in ./.ssh/id_rsa.pub.

The key fingerprint is:

08:ea:6a:44:7f:1a:c9:bf:ff:01:d5:32:e5:39:1b:[email protected]

The key's randomart image is:

+--[ RSA 2048]----+

|           .    |

|         = .    |

|   .    = *     |

| . . . .. + +    |

|. + . ..SE .     |

| o = . .        |

|. . =   .       |

| o . .   .      |

|o   .o...       |

+-----------------+

[root@test-backup ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-master

The authenticity of host 'test-master (10.96.20.113)' can't be established.

RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'test-master' (RSA) to the list of known hosts.

root@test-master's password:

Now try logging into the machine, with "ssh 'root@test-master'", and check in:

 

 .ssh/authorized_keys

 

to make sure we haven't added extra keys that you weren't expecting.

[[email protected] ~]# crontab -l

*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null

[[email protected] ~]# service crond restart

Stopping crond:                                           [  OK  ]

Starting crond:                                            [  OK  ]

[[email protected] ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

[[email protected] ~]# rpm -ivh epel-release-6-8.noarch.rpm

[[email protected] ~]# yum -y install heartbeat

[[email protected] ~]# chkconfig heartbeat off

[[email protected] ~]# chkconfig --list heartbeat

heartbeat          0:off 1:off 2:off 3:off 4:off 5:off 6:off

 

test-master

[[email protected] ~]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/

[[email protected] ~]# cd /etc/ha.d

[[email protected] ha.d]# ls

authkeys ha.cf  harc  haresources rc.d  README.config  resource.d shellfuncs

[root@test-master ha.d]# vim authkeys   #(generate a random string with #dd if=/dev/random count=1 bs=512 | md5sum and put it after sha1)

auth 1

1 sha1 912d6402295ac8d47109e56b177073b9

[root@test-master ha.d]# chmod 600 authkeys   #(this file must have mode 600, otherwise the service reports an error at startup)

[[email protected] ha.d]# ll !$

ll authkeys

-rw-------. 1 root root 692 Aug  7 21:51 authkeys
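The key-generation step described above can be scripted. This is only a sketch under assumptions: make_authkeys is a hypothetical helper name, and it reads /dev/urandom rather than /dev/random so that the read cannot block while waiting for entropy.

```shell
#!/bin/bash
# Print an authkeys file body with a freshly generated sha1 key.
# Assumption: /dev/urandom is used instead of /dev/random to avoid blocking.
make_authkeys() {
    local key
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{print $1}')
    printf 'auth 1\n1 sha1 %s\n' "$key"
}

# Demo: print a candidate file body to stdout.
make_authkeys
# To install it (not run here):
#   make_authkeys > /etc/ha.d/authkeys && chmod 600 /etc/ha.d/authkeys
```

Both nodes need the identical file, so it would be generated once and then copied to the peer rather than generated on each node.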

[[email protected] ha.d]# vim ha.cf

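debugfile /var/log/ha-debug   #(debug log)

logfile /var/log/ha-log

logfacility     local1  #(configure rsyslog to receive these logs through local1)

keepalive 2   #(heartbeat interval: one heartbeat is sent every 2s)

deadtime 30   #(if the backup node receives no heartbeat from the primary node for 30s, it takes over the peer's resources immediately)

warntime 10   #(late-heartbeat threshold of 10s: if the backup node receives no heartbeat from the primary node for 10s, a warning is written to the log but no failover takes place)

initdead 120   #(after heartbeat first starts, wait 120s before starting the resources on the primary node; this gives the peer's heartbeat service time to come up, and the value should be at least twice deadtime)

udpport 694

#bcast eth0   #(send heartbeats as Ethernet broadcasts on eth0; to send heartbeats over two physical networks use bcast eth0 eth1)

mcast eth0 225.0.0.11 694 1 0   #(multicast parameters; the multicast address must be unique within the LAN because several heartbeat clusters may coexist; multicast uses class D addresses (224.0.0.0--239.255.255.255); the format is mcast dev mcast_group port ttl loop)

auto_failback on   #(fail back to the primary node once it recovers)

node test-master   #(hostname of the primary node, as printed by uname -n)

node test-backup   #(hostname of the backup node)

crm no   #(whether to enable CRM)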
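The timing comments in ha.cf imply two relationships: warntime should be smaller than deadtime, and initdead should be at least twice deadtime. A minimal sketch of a sanity check follows. hacf_get and check_ha_timing are hypothetical helper names, and the sketch assumes the values are plain integers in seconds (no "ms" suffix).

```shell
#!/bin/bash
# Print the value of a single-valued ha.cf directive ($2) from file $1.
hacf_get() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Return 0 when warntime < deadtime and initdead >= 2 * deadtime.
# Return 2 when one of the three directives is missing.
check_ha_timing() {
    local file=$1 warn dead init
    warn=$(hacf_get "$file" warntime)
    dead=$(hacf_get "$file" deadtime)
    init=$(hacf_get "$file" initdead)
    [ -n "$warn" ] && [ -n "$dead" ] && [ -n "$init" ] || return 2
    [ "$warn" -lt "$dead" ] && [ "$init" -ge $((dead * 2)) ]
}

# Demo with the values used in this article.
sample=$(mktemp)
printf '%s\n' 'keepalive 2' 'deadtime 30' 'warntime 10' 'initdead 120' > "$sample"
if check_ha_timing "$sample"; then
    echo "timing values look consistent"
else
    echo "timing values need review"
fi
rm -f "$sample"
```

On a node the function would be pointed at /etc/ha.d/ha.cf before heartbeat is started.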

[[email protected] ha.d]# vim haresources

test-master     IPaddr::10.96.20.8/24/eth0   #(equivalent to running #/etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 stop|start; IPaddr is a script under /etc/ha.d/resource.d/)

[root@test-master ha.d]# scp authkeys ha.cf haresources root@test-backup:/etc/ha.d/

authkeys                                                                                           100%  692     0.7KB/s  00:00   

ha.cf                                                                                              100%   10KB  10.3KB/s  00:00   

haresources                                                                                        100% 5944     5.8KB/s   00:00   

[[email protected] ha.d]# service heartbeat start

Starting High-Availability services: INFO:  Resource is stopped

Done.

 

[root@test-master ha.d]# ssh test-backup 'service heartbeat start'

Starting High-Availability services: 2016/08/07_22:39:00 INFO:  Resource is stopped

Done.

[[email protected] ha.d]# ps aux | grep heartbeat

root     63089  0.0  3.1 50124  7164 ?        SLs 22:38   0:00 heartbeat: mastercontrol process

root     63093  0.0  3.1 50076  7116 ?        SL  22:38   0:00 heartbeat: FIFOreader       

root     63094  0.0  3.1 50072  7112 ?        SL  22:38   0:00 heartbeat: write:mcast eth0 

root     63095  0.0  3.1 50072  7112 ?        SL  22:38   0:00 heartbeat: read:mcast eth0  

root     63136  0.0  0.3 103264  836 pts/0    S+   22:39  0:00 grep heartbeat

[[email protected] ha.d]# ssh test-backup 'ps aux | grep heartbeat'

root      3050  0.0 3.1  50124  7164 ?       SLs  22:39   0:00 heartbeat: master control process

root      3054  0.0  3.1 50076  7116 ?        SL  22:39   0:00 heartbeat: FIFOreader       

root      3055  0.0  3.1 50072  7112 ?        SL  22:39   0:00 heartbeat: write:mcast eth0 

root      3056  0.0  3.1 50072  7112 ?        SL  22:39   0:00 heartbeat: read:mcast eth0  

root      3094  0.0  0.5 106104 1368 ?        Ss   22:39  0:00 bash -c ps aux | grep heartbeat

root      3108  0.0  0.3 103264  832 ?        S    22:39  0:00 grep heartbeat

[[email protected] ha.d]# netstat -tnulp |grep heartbeat

udp       0      0 225.0.0.11:694              0.0.0.0:*                               63094/heartbeat:wr

udp       0      0 0.0.0.0:50268               0.0.0.0:*                               63094/heartbeat:wr

[[email protected] ha.d]# ssh test-backup 'netstat -tnulp | grep heartbeat'

udp       0      0 0.0.0.0:58019               0.0.0.0:*                               3055/heartbeat:wri

udp        0     0 225.0.0.11:694             0.0.0.0:*                              3055/heartbeat: wri

[root@test-master ha.d]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

[root@test-master ha.d]# service heartbeat stop

Stopping High-Availability services: Done.

 

[root@test-master ha.d]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[[email protected] ha.d]# service heartbeat start

Starting High-Availability services:INFO:  Resource is stopped

Done.

 

[[email protected] ha.d]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[[email protected] ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

[[email protected] ~]# service heartbeat stop

Stopping High-Availability services: Done.

 

[[email protected] ~]# ssh test-backup 'service heartbeat stop'

Stopping High-Availability services: Done.
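The failover test above checks by eye which node holds the VIP. A small helper can make that check exact. This is a sketch only: has_vip is a hypothetical name, and it parses the text of ip addr from stdin, so it also avoids the loose match of grep 10.96.20, where the dots match any character and the pattern matches every address in the subnet.

```shell
#!/bin/bash
# Return 0 if the `ip addr` output read from stdin contains IPv4 address $1.
has_vip() {
    local vip=$1
    awk -v vip="$vip" '
        $1 == "inet" {
            split($2, a, "/")          # strip the prefix length
            if (a[1] == vip) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Demo with output copied from the transcript above.
sample='   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0'
if printf '%s\n' "$sample" | has_vip 10.96.20.8; then
    echo "VIP is held by this node"
fi
# On a live node: ip addr | has_vip 10.96.20.8
```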

 

 

 

2. Install and configure drbd

test-master

[[email protected] ~]# fdisk -l

……

Disk /dev/sdb: 2147 MB, 2147483648 bytes

255 heads, 63 sectors/track, 261 cylinders

Units = cylinders of 16065 * 512 = 8225280bytes

Sector size (logical/physical): 512 bytes /512 bytes

I/O size (minimum/optimal): 512 bytes / 512bytes

Disk identifier: 0x00000000

[root@test-master ~]# parted /dev/sdb  #(parted supports disks larger than 2T; split the new disk into two partitions, one for data and one for the drbd meta data)

GNU Parted 2.1

Using /dev/sdb

Welcome to GNU Parted! Type 'help' to view a list of commands.

(parted) h                                                               

 align-check TYPE N                       check partition N for TYPE(min|opt) alignment

 check NUMBER                            do a simple check on the file system

  cp [FROM-DEVICE] FROM-NUMBER TO-NUMBER  copy file system to another partition

 help [COMMAND]                           print general help, or help on COMMAND

  mklabel,mktable LABEL-TYPE               create a new disklabel (partition table)

 mkfs NUMBER FS-TYPE                     make a FS-TYPE file system on partition NUMBER

  mkpart PART-TYPE [FS-TYPE] START END     make a partition

 mkpartfs PART-TYPE FS-TYPE START END    make a partition with a file system

 move NUMBER START END                   move partition NUMBER

 name NUMBER NAME                        name partition NUMBER as NAME

  print [devices|free|list,all|NUMBER]     display the partition table, available devices, free space, all found partitions, or a

       particular partition

 quit                                    exit program

 rescue START END                        rescue a lost partition near START and END

 resize NUMBER START END                 resize partition NUMBER and its file system

  rm NUMBER                               delete partition NUMBER

 select DEVICE                           choose the device to edit

  set NUMBER FLAG STATE                   change the FLAG on partition NUMBER

 toggle [NUMBER [FLAG]]                  toggle the state of FLAG on partition NUMBER

 unit UNIT                               set the default unit to UNIT

 version                                 display the version number and copyright information of GNU Parted

(parted) mklabel gpt                                                     

(parted) mkpart primary 0 1024

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore

(parted) mkpart primary 1025 2147                                        

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore

(parted) p                                                                

Model: VMware, VMware Virtual S (scsi)

Disk /dev/sdb: 2147MB

Sector size (logical/physical): 512B/512B

Partition Table: gpt

 

Number Start   End     Size   File system  Name     Flags

 1     17.4kB  1024MB  1024MB               primary

 2     1025MB  2147MB  1122MB               primary

[[email protected] ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

[[email protected] ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm

warning:elrepo-release-6-6.el6.elrepo.noarch.rpm: Header V4 DSA/SHA1 Signature, key IDbaadae52: NOKEY

Preparing...               ########################################### [100%]

  1:elrepo-release        ########################################### [100%]

[[email protected] ~]# yum -y install drbd kmod-drbd84

[[email protected] ~]# modprobe drbd

FATAL: Module drbd not found.

[root@test-master ~]# yum -y install kernel*   #(reboot the system after updating the kernel)

[[email protected] ~]# uname -r

2.6.32-642.3.1.el6.x86_64

[[email protected] ~]# depmod

[[email protected] ~]# lsmod | grep drbd

drbd                  372759  0

libcrc32c               1246  1 drbd

[[email protected] ~]# ll /usr/src/kernels/

total 12

drwxr-xr-x. 22 root root 4096 Mar 31 06:46 2.6.32-431.el6.x86_64

drwxr-xr-x. 22 root root 4096 Aug  8 03:40 2.6.32-642.3.1.el6.x86_64

drwxr-xr-x. 22 root root 4096 Aug  8 03:40 2.6.32-642.3.1.el6.x86_64.debug

[root@test-master ~]# echo "modprobe drbd > /dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules

[[email protected] ~]# cat !$

cat /etc/sysconfig/modules/drbd.modules

modprobe drbd > /dev/null 2>&1

 

test-backup

[[email protected] ~]# parted /dev/sdb

(parted) mklabel gpt

(parted) mkpart primary 0 4096                                           

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore                                                    

(parted) mkpart primary 4097 5368                                        

(parted) p                                                                

Model: VMware, VMware Virtual S (scsi)

Disk /dev/sdb: 5369MB

Sector size (logical/physical): 512B/512B

Partition Table: gpt

 

Number Start   End     Size   File system  Name     Flags

 1     17.4kB  4096MB  4096MB               primary

 2     4097MB  5368MB  1271MB               primary

[[email protected] ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

[[email protected] ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm

[[email protected] ~]# ll /etc/yum.repos.d/

total 20

-rw-r--r--. 1 root root 1856 Jul 19 00:28 CentOS6-Base-163.repo

-rw-r--r--. 1 root root 2150 Feb  9  2014 elrepo.repo

-rw-r--r--. 1 root root  957 Nov 4  2012 epel.repo

-rw-r--r--. 1 root root 1056 Nov  4  2012 epel-testing.repo

-rw-r--r--. 1 root root  529 Mar 30 23:00 rhel-source.repo.bak

[[email protected] ~]# yum -y install drbd kmod-drbd84

[[email protected] ~]# yum -y install kernel*

[[email protected] ~]# depmod

[[email protected] ~]# lsmod | grep drbd

drbd                  372759  0

libcrc32c               1246  1 drbd

[[email protected] ~]# chkconfig drbd off

[[email protected] ~]# chkconfig --list drbd

drbd              0:off 1:off 2:off 3:off 4:off 5:off 6:off

[root@test-backup ~]# echo "modprobe drbd > /dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules

[[email protected] ~]# cat !$

cat /etc/sysconfig/modules/drbd.modules

modprobe drbd > /dev/null 2>&1

 

test-master

[[email protected] ~]# vim /etc/drbd.d/global_common.conf

[[email protected] ~]# egrep -v "#|^$" /etc/drbd.d/global_common.conf

global {

         usage-count no;

}

common {

         handlers {

         }

         startup {

         }

         options {

         }

         disk {

                on-io-error detach;

         }

         net {

         }

         syncer {

                   rate 50M;

                   verify-alg crc32c;

         }

}

[root@test-master ~]# vim /etc/drbd.d/data.res

resource data {

       protocol C;

       on test-master {

                device  /dev/drbd0;

                disk    /dev/sdb1;

                address 172.16.1.113:7788;

                meta-disk       /dev/sdb2[0];

       }

       on test-backup {

                device  /dev/drbd0;

                disk    /dev/sdb1;

                address 172.16.1.114:7788;

                meta-disk       /dev/sdb2[0];

       }

}

[[email protected] ~]# cd /etc/drbd.d

[root@test-master drbd.d]# scp global_common.conf data.res root@test-backup:/etc/drbd.d/

global_common.conf                                                                                     100% 2144     2.1KB/s   00:00   

data.res                                                                                                100%  251    0.3KB/s   00:00   

 

[[email protected] drbd.d]# drbdadm --help

USAGE: drbdadm COMMAND [OPTION...]{all|RESOURCE...}

GENERAL OPTIONS:

 --stacked, -S

 --dry-run, -d

 --verbose, -v

  --config-file=...,-c ...

 --config-to-test=..., -t ...

 --drbdsetup=..., -s ...

 --drbdmeta=..., -m ...

 --drbd-proxy-ctl=..., -p ...

 --sh-varname=..., -n ...

 --peer=..., -P ...

 --version, -V

 --setup-option=..., -W ...

 --help, -h

 

COMMANDS:

 attach                             disk-options                      

 detach                             connect                           

 net-options                        disconnect                        

 up                                 resource-options                  

 down                               primary                           

 secondary                          invalidate                        

 invalidate-remote                  outdate                           

 resize                             verify                            

 pause-sync                         resume-sync                       

 adjust                            adjust-with-progress              

 wait-connect                       wait-con-int                      

 role                               cstate                            

 dstate                             dump                              

 dump-xml                           create-md                          

 show-gi                           get-gi                            

 dump-md                            wipe-md                           

 apply-al                           hidden-commands    

[[email protected] drbd.d]# drbdadm create-md data

initializing activity log

NOT initializing bitmap

Writing meta data...

New drbd meta data block successfully created.

[[email protected] drbd.d]# ssh test-backup 'drbdadm create-md data'

NOT initializing bitmap

initializing activity log

Writing meta data...

New drbd meta data block successfully created.

[[email protected] drbd.d]# drbdadm up data

[[email protected] drbd.d]# ssh test-backup 'drbdadm up data'

[[email protected] drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49build by [email protected], 2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----

   ns:0 nr:0 dw:0 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984

[[email protected] drbd.d]# ssh test-backup 'cat /proc/drbd'

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by [email protected],2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----

   ns:0 nr:0 dw:0 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984

[root@test-master drbd.d]# drbdadm -- --overwrite-data-of-peer primary data   #(run on the primary only)

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by [email protected], 2016-01-12 13:27:11

 0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

   ns:339968 nr:0 dw:0 dr:340647 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:660016

         [=====>..............] sync'ed: 34.3% (660016/999984)K

         finish: 0:00:15 speed: 42,496 (42,496) K/sec

[[email protected] drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by [email protected],2016-01-12 13:27:11

 0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

   ns:630784 nr:0 dw:0 dr:631463 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:369200

         [===========>........]sync'ed: 63.3% (369200/999984)K

         finish:0:00:09 speed: 39,424 (39,424) K/sec

[[email protected] drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by [email protected],2016-01-12 13:27:11

 0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

   ns:942080 nr:0 dw:0 dr:942759 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:57904

         [=================>..]sync'ed: 94.3% (57904/999984)K

         finish:0:00:01 speed: 39,196 (39,252) K/sec

[[email protected] drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by [email protected],2016-01-12 13:27:11

 0:cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

   ns:999983 nr:0 dw:0 dr:1000662 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[[email protected] drbd.d]# ssh test-backup 'cat /proc/drbd'

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by [email protected], 2016-01-1213:27:11

 0:cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

   ns:0 nr:999983 dw:999983 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[[email protected] drbd.d]# mkdir /drbd

[[email protected] drbd.d]# ssh test-backup 'mkdir /drbd'

[root@test-master drbd.d]# mkfs.ext4 -b 4096 /dev/drbd0   #(run on the primary only; do not format the meta data partition)

Writing superblocks and filesystem accounting information: done

[[email protected] drbd.d]# tune2fs -c -1 /dev/drbd0

tune2fs 1.41.12 (17-May-2010)

Setting maximal mount count to -1

[[email protected] drbd.d]# mount /dev/drbd0 /drbd

[[email protected] drbd.d]# cd /drbd

[[email protected] drbd]# for i in `seq 1 10`; do touch test$i; done

[[email protected] drbd]# ls

lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9

[[email protected] drbd]# cd

[[email protected] ~]# umount /dev/drbd0

[[email protected] ~]# drbdadm secondary data

[[email protected] ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49build by [email protected], 2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

   ns:1032538 nr:0 dw:32554 dr:1001751 al:19 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

 

test-backup

[[email protected] ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by [email protected],2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

   ns:0 nr:1032538 dw:1032538 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[[email protected] ~]# drbdadm primary data

[[email protected] ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by [email protected],2016-01-12 13:27:11

 0:cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

   ns:0 nr:1032538 dw:1032538 dr:679 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[[email protected] ~]# mount /dev/drbd0 /drbd

[[email protected] ~]# ls /drbd

lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9
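The role switch above is verified by reading /proc/drbd by eye. The fields that matter are cs (connection state), ro (local/peer roles) and ds (local/peer disk states). The following is a minimal sketch for pulling them out of the text; drbd_field is a hypothetical helper name, and the sample line is copied from the transcript.

```shell
#!/bin/bash
# Print the value of field $1 (for example cs, ro or ds) from /proc/drbd text on stdin.
drbd_field() {
    awk -v key="$1" '
        {
            for (i = 1; i <= NF; i++) {
                n = split($i, p, ":")
                # Tokens look like "ro:Primary/Secondary" or "0:cs:Connected".
                if (n >= 2 && p[n - 1] == key) { print p[n]; exit }
            }
        }'
}

# Demo with a status line from this article.
sample=' 0:cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----'
echo "cs=$(printf '%s\n' "$sample" | drbd_field cs)"
echo "ro=$(printf '%s\n' "$sample" | drbd_field ro)"
echo "ds=$(printf '%s\n' "$sample" | drbd_field ds)"
# On a live node: drbd_field ds < /proc/drbd
```

A wrapper script could, for instance, refuse to mount /dev/drbd0 unless ro starts with Primary and ds starts with UpToDate.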

 

 

3. Debug heartbeat+drbd

[[email protected] ~]# ssh test-backup 'umount /drbd'

[[email protected] ~]# ssh test-backup 'drbdadm secondary data'

[[email protected] ~]# service drbd stop

Stopping all DRBD resources: .

[[email protected] ~]# ssh test-backup 'service drbd stop'

Stopping all DRBD resources: .

[[email protected] ~]# service heartbeat status

heartbeat is stopped. No process

[[email protected] ~]# ssh test-backup 'service heartbeat status'

heartbeat is stopped. No process

[[email protected] ~]# ll /etc/ha.d/resource.d/{Filesystem,drbddisk}

-rwxr-xr-x. 1 root root 3162 Jan 12  2016 /etc/ha.d/resource.d/drbddisk

-rwxr-xr-x. 1 root root 1903 Dec  2  2013 /etc/ha.d/resource.d/Filesystem

[root@test-master ~]# vim /etc/ha.d/haresources   #(each entry on this line amounts to running a script with arguments, e.g. #/etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start|stop, #/etc/ha.d/resource.d/drbddisk data start|stop, #/etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start|stop; heartbeat controls the resources in the order in which they are configured, so if heartbeat misbehaves, check the logs and run these commands individually to troubleshoot)

test-master     IPaddr::10.96.20.8/24/eth0      drbddisk::data  Filesystem::/dev/drbd0::/drbd::ext4
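To make the mapping from a haresources line to individual script calls concrete, here is a small bash sketch that prints the commands a line corresponds to. expand_haresources is a hypothetical helper, not part of heartbeat; it only prints commands and runs nothing. It assumes resources are acquired left to right and released in the reverse order.

```shell
#!/bin/bash
# Print the resource-script commands implied by one haresources line.
# $1 = the haresources line, $2 = start or stop.
expand_haresources() {
    local line=$1 action=$2 dir=/etc/ha.d/resource.d
    local -a fields cmds
    local i res
    read -r -a fields <<< "$line"
    # fields[0] is the preferred node name; the rest are resources.
    for ((i = 1; i < ${#fields[@]}; i++)); do
        res=${fields[i]}
        # "Script::arg1::arg2" becomes "Script arg1 arg2".
        cmds+=("$dir/${res//::/ } $action")
    done
    if [ "$action" = stop ]; then
        # Assumption: resources are released in reverse order.
        for ((i = ${#cmds[@]} - 1; i >= 0; i--)); do
            printf '%s\n' "${cmds[i]}"
        done
    else
        printf '%s\n' "${cmds[@]}"
    fi
}

# Demo with the resources used in this article.
expand_haresources 'test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4' start
```

Running the printed commands one at a time is the manual troubleshooting approach the comment on the haresources file describes.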

[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/

haresources                                                                                               100% 5996     5.9KB/s   00:00 

[root@test-master ~]# service drbd start   #(run on the primary node)

Starting DRBD resources: [

    create res: data

  prepare disk: data

   adjust disk: data

    adjust net: data

]

..........

***************************************************************

 DRBD's startup script waits for the peer node(s) to appear.

 - If this node was already a degraded cluster before the

   reboot, the timeout is 0 seconds. [degr-wfc-timeout]

 - If the peer was available before the reboot, the timeout

   is 0 seconds. [wfc-timeout]

   (These values are for resource 'data'; 0 sec -> wait forever)

 To abort waiting enter 'yes' [  23]:

[root@test-backup ~]# service drbd start   #(run on the backup node)

Starting DRBD resources: [

    create res: data

  prepare disk: data

   adjust disk: data

    adjust net: data

]

.

[[email protected] ~]# drbdadm role data

Secondary/Secondary

[[email protected] ~]# ssh test-backup 'drbdadm role data'

Secondary/Secondary

[[email protected] ~]# drbdadm -- --overwrite-data-of-peer primary data

[[email protected] ~]# drbdadm role data

Primary/Secondary

[[email protected] ~]# service heartbeat start

Starting High-Availability services: INFO:  Resource is stopped

Done.

[root@test-master ~]# ssh test-backup 'service heartbeat start'

Starting High-Availability services: 2016/08/09_03:08:11 INFO:  Resource is stopped

Done.

[[email protected] ~]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[[email protected] ~]# drbdadm role data

Primary/Secondary

[[email protected] ~]# df -h

Filesystem      Size Used Avail Use% Mounted on

/dev/sda2        18G 6.3G   11G  38% /

tmpfs           112M     0 112M   0% /dev/shm

/dev/sda1       283M  83M  185M  31% /boot

/dev/sr0        3.6G 3.6G     0 100% /mnt/cdrom

/dev/drbd0      946M 1.3M  896M   1% /drbd

[[email protected] ~]# ls /drbd

lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9

 

[[email protected] ~]# service heartbeat stop

Stopping High-Availability services: Done.

[root@test-master ~]# ssh test-backup 'ip addr | grep 10.96.20'

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# ssh test-backup 'df -h'

Filesystem      Size Used Avail Use% Mounted on

/dev/sda2        18G 3.9G   13G  24% /

tmpfs           112M    0  112M   0% /dev/shm

/dev/sda1       283M  83M  185M  31% /boot

/dev/sr0        3.6G 3.6G     0 100% /mnt/cdrom

/dev/drbd0      946M 1.3M  896M   1% /drbd

[[email protected] ~]# ssh test-backup 'ls /drbd'

lost+found

test1

test10

test2

test3

test4

test5

test6

test7

test8

test9

 

[[email protected] ~]# drbdadm role data  

Secondary/Primary

[root@test-master ~]# service heartbeat start   #(after a node recovers, first make sure drbd is healthy and in a consistent state, then start the heartbeat service)

Starting High-Availability services:INFO:  Resource is stopped

Done.

[[email protected] ~]# drbdadm role data

Primary/Secondary

[[email protected] ~]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[[email protected] ~]# df -h

Filesystem      Size Used Avail Use% Mounted on

/dev/sda2        18G 6.3G   11G  38% /

tmpfs           112M     0 112M   0% /dev/shm

/dev/sda1       283M  83M  185M  31% /boot

/dev/sr0        3.6G 3.6G     0 100% /mnt/cdrom

/dev/drbd0      946M 1.3M  896M   1% /drbd

[[email protected] ~]# ls /drbd

lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9

 

 

4. Install and configure MySQL on the two masters and the slave

MySQL-master-active:

[[email protected] ~]# drbdadm role data

Primary/Secondary

[[email protected] ~]# groupadd -g 3306 mysql

[[email protected] ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[[email protected] ~]# id mysql

uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)

[root@test-master ~]# mkdir /drbd/data   #(on the masters, create the DB data directory under the drbd mount point; drbd replicates only the MySQL data, while the program files live under /usr/local/)

[root@test-master ~]# chown -R mysql.mysql /drbd/data

[root@test-master ~]# rz   #(upload the mysql binary tarball)

[[email protected] ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[[email protected] ~]# cd /usr/local

[root@test-master local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'

[root@test-master local]# cd mysql

[root@test-master mysql]# chown -R root.mysql ./

[root@test-master mysql]# scripts/mysql_install_db --user=mysql --datadir=/drbd/data   #(initialize only on the master node that is currently serving clients, i.e. the drbd primary)

Installing MySQL system tables...

160810 19:46:23 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 3908 ...

OK

…….

[[email protected] mysql]# cp support-files/my-large.cnf /etc/my.cnf

[[email protected] mysql]# vim /etc/my.cnf   #(add or confirm the following settings in the [mysqld] section)

[mysqld]

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[[email protected] mysql]# egrep -v "#|^$" /etc/my.cnf

[client]

port           =3306

socket                =/tmp/mysql.sock

[mysqld]

port           =3306

socket                =/tmp/mysql.sock

skip-external-locking

key_buffer_size = 256M

max_allowed_packet = 1M

table_open_cache = 256

sort_buffer_size = 1M

read_buffer_size = 1M

read_rnd_buffer_size = 4M

myisam_sort_buffer_size = 64M

thread_cache_size = 8

query_cache_size= 16M

thread_concurrency = 8

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[mysqldump]

quick

max_allowed_packet = 16M

[mysql]

no-auto-rehash

[myisamchk]

key_buffer_size = 128M

sort_buffer_size = 128M

read_buffer = 2M

write_buffer = 2M

[mysqlhotcopy]

interactive-timeout

[[email protected] mysql]# scp /etc/my.cnf root@test-backup:/etc/

my.cnf                                                                                                     100% 4787     4.7KB/s   00:00   

[[email protected] mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[[email protected] mysql]# chkconfig --add mysqld

[[email protected] mysql]# chkconfig mysqld off

[[email protected] mysql]# chkconfig --list mysqld

mysqld            0:off 1:off 2:off 3:off 4:off 5:off 6:off

[[email protected] mysql]# service mysqld start

Starting MySQL.....                                        [  OK  ]

[[email protected] mysql]# /usr/local/mysql/bin/mysql

……

mysql> GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY 'redhat';

Query OK, 0 rows affected (0.28 sec)

mysql> GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'%' IDENTIFIED BY 'repluser';

Query OK, 0 rows affected (0.17 sec)

mysql> FLUSH PRIVILEGES;

Query OK, 0 rows affected (0.04 sec)

mysql> select User,Host,Password from mysql.user;

+----------+-------------+-------------------------------------------+

| User     | Host        | Password                                  |

+----------+-------------+-------------------------------------------+

| root     | localhost   |                                           |

| root     | test-master |                                           |

| root     | 127.0.0.1   |                                           |

| root     | ::1         |                                           |

|          | localhost   |                                           |

|          | test-master |                                           |

| root     | %           | *84BB5DF4823DA319BBF86C99624479A198E6EEE9 |

| repluser | %           | *89A63F9688240669B54B5C2649EEFB795850597E |

+----------+-------------+-------------------------------------------+

8 rows in set (0.23 sec)

mysql> create database webgame;

Query OK, 1 row affected (0.10 sec)

 

mysql> show databases;

+--------------------+

| Database           |

+--------------------+

| information_schema |

| mysql              |

| performance_schema |

| test               |

| webgame            |

+--------------------+

5 rows in set (0.04 sec)

mysql> \q

Bye

[[email protected] mysql]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[[email protected] mysql]# df -h | grep drbd0

/dev/drbd0      946M  31M  866M   4% /drbd

[[email protected] ~]# vim /etc/ha.d/haresources

test-master     IPaddr::10.96.20.8/24/eth0      drbddisk::data  Filesystem::/dev/drbd0::/drbd::ext4     mysqld

[[email protected] ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/
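In a heartbeat `haresources` line the first field is the preferred node and the remaining fields are resources, which are started from left to right on takeover and stopped in the reverse order. The sketch below prints both orders for the line used here; the function name `haresources_order` is hypothetical and the parsing is deliberately simple (whitespace-separated fields, one resource group per line).

```shell
#!/bin/bash
# Print the resources of one haresources line in start order (left to right)
# or stop order (right to left). The first field is the preferred node name.
haresources_order() {
    local mode="$1" r rev=""
    # Word-split the line on spaces/tabs; $1 then holds the node name.
    set -f            # avoid accidental glob expansion while splitting
    set -- $2
    set +f
    shift             # drop the node name, keep only the resources
    if [ "$mode" = "start" ]; then
        for r in "$@"; do echo "$r"; done
    else
        for r in "$@"; do rev="$r
$rev"; done
        printf '%s' "$rev"
    fi
}

# Example with the line from /etc/ha.d/haresources in this setup:
# haresources_order start 'test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 mysqld'
```

The start order matters for this stack: the VIP comes up, DRBD is promoted, the filesystem is mounted on `/drbd`, and only then is `mysqld` started on top of its data directory.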

 

 

MySQL-master-inactive

[[email protected] ~]# drbdadm role data

Secondary/Primary

[[email protected] ~]# groupadd -g 3306 mysql

[[email protected] ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[[email protected] ~]# id mysql

uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)

[[email protected] ~]# rz

[[email protected] ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[[email protected] ~]# cd /usr/local

[[email protected] local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'

[[email protected] local]# cd mysql

[[email protected] mysql]# chown -R root.mysql ./

[[email protected] mysql]# vim /etc/my.cnf   #(this file was copied over from the active master; confirm that it contains the following settings)

[mysqld]

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[[email protected] mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[[email protected] mysql]# chkconfig --add mysqld

[[email protected] mysql]# chkconfig mysqld off

[[email protected] mysql]# chkconfig --list mysqld

mysqld            0:off 1:off 2:off 3:off 4:off 5:off 6:off
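On both masters `mysqld` is taken out of the boot sequence because it is listed as a resource in `haresources`: heartbeat, not init, is expected to start it on whichever node currently holds the DRBD primary. A small check for this, assuming the one-line `chkconfig --list` format shown above; the function name `boot_start_disabled` is hypothetical.

```shell
#!/bin/bash
# Succeeds when the `chkconfig --list <service>` line passed as $1
# shows the service disabled in every runlevel.
boot_start_disabled() {
    case "$1" in
        *:on*) return 1 ;;   # at least one runlevel is enabled
        *)     return 0 ;;
    esac
}

# Typical use on a master node:
# boot_start_disabled "$(chkconfig --list mysqld)" || echo "mysqld must not start at boot" >&2
```

On the slave the opposite is wanted, since no cluster manager starts `mysqld` there.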

 

 

mysql-slave

[[email protected] ~]# mkdir /mydata/data -pv

mkdir: created directory `/mydata'

mkdir: created directory `/mydata/data'

[[email protected] ~]# groupadd -g 3306 mysql

[[email protected] ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[[email protected] ~]# id mysql

uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)

[[email protected] ~]# rz

[[email protected] ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[[email protected] ~]# cd /usr/local

[[email protected] local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'

[[email protected] local]# cd mysql

[[email protected] mysql]# chown -R root.mysql ./

[[email protected] mysql]# chown -R mysql.mysql /mydata/data

[[email protected] mysql]# cp support-files/my-large.cnf /etc/my.cnf

cp: overwrite `/etc/my.cnf'? y

[[email protected] mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[[email protected] mysql]# chkconfig --add mysqld

[[email protected] mysql]# chkconfig --list mysqld

mysqld            0:off 1:off 2:on 3:on 4:on 5:on 6:off

[[email protected] mysql]# vim /etc/my.cnf

[mysqld]

datadir=/mydata/data

innodb_file_per_table=1

relay-log=relay-log

relay-log-index=relay-log.index

server-id=11   

read_only=1

skip_slave_start=1

[[email protected] mysql]# egrep -v "#|^$" /etc/my.cnf

[client]

port           =3306

socket                =/tmp/mysql.sock

[mysqld]

port           =3306

socket                =/tmp/mysql.sock

skip-external-locking

key_buffer_size = 256M

max_allowed_packet = 1M

table_open_cache = 256

sort_buffer_size = 1M

read_buffer_size = 1M

read_rnd_buffer_size = 4M

myisam_sort_buffer_size = 64M

thread_cache_size = 8

query_cache_size= 16M

thread_concurrency = 8

datadir=/mydata/data

innodb_file_per_table=1

relay-log=relay-log

relay-log-index=relay-log.index

server-id=11

read_only=1

skip_slave_start=1

[mysqldump]

quick

max_allowed_packet = 16M

[mysql]

no-auto-rehash

[myisamchk]

key_buffer_size = 128M

sort_buffer_size = 128M

read_buffer = 2M

write_buffer = 2M

[mysqlhotcopy]

interactive-timeout

[[email protected] mysql]# scripts/mysql_install_db --user=mysql --datadir=/mydata/data

Installing MySQL system tables...

160810 22:18:18 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.

160810 22:18:18 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 46873 ...

OK

Filling help tables...

160810 22:18:19 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.

160810 22:18:19 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 46880 ...

OK

……

[[email protected] mysql]# service mysqld start

Starting MySQL..                                           [  OK  ]

[[email protected] ~]# mysql

mysql> CHANGE MASTER TO MASTER_USER='repluser',MASTER_PASSWORD='repluser',MASTER_HOST='10.96.20.8',MASTER_LOG_FILE='mysql-bin.000003',MASTER_LOG_POS=330;

Query OK, 0 rows affected (0.04 sec)

 

mysql> start slave;

Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G

……
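The `CHANGE MASTER TO` statement above needs the current binlog file and position of the active master. The transcript does not show where `mysql-bin.000003` and `330` came from; the sketch below assumes they are read with `SHOW MASTER STATUS` on the active master and builds the statement from one line of that output. The function name and the three variables are hypothetical; the VIP and the replication account are the ones used in this setup.

```shell
#!/bin/bash
MASTER_VIP="10.96.20.8"     # VIP managed by heartbeat
REPL_USER="repluser"
REPL_PASSWORD="repluser"

# $1: one line of `mysql -N -e 'SHOW MASTER STATUS'` output
#     (whitespace-separated: file, position, ...)
build_change_master() {
    local file pos
    file=$(printf '%s\n' "$1" | awk '{print $1}')
    pos=$(printf '%s\n' "$1" | awk '{print $2}')
    # Refuse to build a statement from malformed input.
    case "$pos" in
        ''|*[!0-9]*) echo "cannot parse binlog position" >&2; return 1 ;;
    esac
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$MASTER_VIP" "$REPL_USER" "$REPL_PASSWORD" "$file" "$pos"
}
```

Pointing `MASTER_HOST` at the VIP rather than at either master's own address is what lets the slave keep replicating after a switchover.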

 

 

Testing is done in two steps:

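First, verify that the two master nodes work together. Get DRBD into a healthy state and start its service, but do not start heartbeat yet. Start mysqld by hand and create a new database on master-active; then stop mysqld, unmount /drbd, and demote DRBD on the active node to secondary. Promote DRBD on the inactive node to primary, mount /dev/drbd0 on /drbd, start mysqld, and check on master-inactive that the new database is there.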
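The manual switchover in this first step can be sketched as a dry run that only prints the commands for each side, in order. The resource name `data`, the device `/dev/drbd0` and the mount point `/drbd` are the ones used in this setup; the function name `switchover_plan` is hypothetical, and nothing is executed by the function itself.

```shell
#!/bin/bash
# Print the commands for a manual switchover, in order; nothing is executed.
# $1: "demote"  -> plan for the node that is currently DRBD primary
#     "promote" -> plan for the node that is currently DRBD secondary
switchover_plan() {
    case "$1" in
        demote)
            echo "service mysqld stop"
            echo "umount /drbd"              # the device must be unused before demotion
            echo "drbdadm secondary data"
            ;;
        promote)
            echo "drbdadm primary data"
            echo "mount /dev/drbd0 /drbd"
            echo "service mysqld start"
            ;;
        *)
            echo "usage: switchover_plan demote|promote" >&2
            return 1
            ;;
    esac
}
```

Run the `demote` plan to completion on the old primary before starting the `promote` plan on the other node, so that the two nodes are never primary at the same time.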

Second, verify that master-slave replication continues after the masters switch over. As shown below, it does:

[[email protected] ~]# tail -f /var/log/ha-log   #(simulate a failure of the active node and watch the takeover on the inactive node)

Aug 10 22:40:38 test-backup heartbeat: [7738]: info: Local status now set to: 'up'

Aug 10 22:40:39 test-backup heartbeat: [7738]: info: Link test-master:eth0 up.

Aug 10 22:40:39 test-backup heartbeat: [7738]: info: Status update for node test-master: status active

Aug 10 22:40:39 test-backup heartbeat: [7738]: info: Comm_now_up(): updating status to active

Aug 10 22:40:39 test-backup heartbeat: [7738]: info: Local status now set to: 'active'

harc(default)[7747]:         2016/08/10_22:40:39 info: Running /etc/ha.d//rc.d/status status

Aug 10 22:40:50 test-backup heartbeat: [7738]: info: local resource transition completed.

Aug 10 22:40:50 test-backup heartbeat: [7738]: info: Initial resource acquisition complete (T_RESOURCES(us))

Aug 10 22:40:50 test-backup heartbeat: [7766]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test-backup] to acquire.

Aug 10 22:40:50 test-backup heartbeat: [7738]: info: remote resource transition completed.

Aug 10 23:10:16 test-backup heartbeat: [7738]: info: Received shutdown notice from 'test-master'.

Aug 10 23:10:16 test-backup heartbeat: [7738]: info: Resources being acquired from test-master.

Aug 10 23:10:16 test-backup heartbeat: [7879]: info: acquire local HA resources (standby).

Aug 10 23:10:16 test-backup heartbeat: [7879]: info: local HA resource acquisition completed (standby).

Aug 10 23:10:16 test-backup heartbeat: [7738]: info: Standby resource acquisition done [all].

Aug 10 23:10:16 test-backup heartbeat: [7880]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test-backup] to acquire.

harc(default)[7905]:         2016/08/10_23:10:16 info: Running /etc/ha.d//rc.d/status status

mach_down(default)[7922]:    2016/08/10_23:10:16 info: Taking over resource group IPaddr::10.96.20.8/24/eth0

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Acquiring resource group: test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 mysqld

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.96.20.8)[7977]:          2016/08/10_23:10:16 INFO:  Resource is stopped

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start

IPaddr(IPaddr_10.96.20.8)[8102]:  2016/08/10_23:10:16 INFO: Adding inet address 10.96.20.8/24 with broadcast address 10.96.20.255 to device eth0

IPaddr(IPaddr_10.96.20.8)[8102]:  2016/08/10_23:10:16 INFO: Bringing device eth0 up

IPaddr(IPaddr_10.96.20.8)[8102]:  2016/08/10_23:10:16 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.96.20.8 eth0 10.96.20.8 auto not_used not_used

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.96.20.8)[8076]:          2016/08/10_23:10:16 INFO:  Success

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/drbddisk data start

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[8231]:   2016/08/10_23:10:17 INFO:  Resource is stopped

ResourceManager(default)[7949]: 2016/08/10_23:10:17 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start

Filesystem(Filesystem_/dev/drbd0)[8314]:     2016/08/10_23:10:17 INFO: Running start for /dev/drbd0 on /drbd

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[8306]:   2016/08/10_23:10:17 INFO:  Success

ResourceManager(default)[7949]: 2016/08/10_23:10:18 info: Running /etc/init.d/mysqld  start

mach_down(default)[7922]:    2016/08/10_23:10:31 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired

Aug 10 23:10:32 test-backup heartbeat: [7738]: info: mach_down takeover complete.

mach_down(default)[7922]:    2016/08/10_23:10:33 info: mach_down takeover complete for node test-master.

^C

[[email protected] ~]# ip addr

……

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

   link/ether 00:0c:29:15:e6:bb brd ff:ff:ff:ff:ff:ff

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

   inet6 fe80::20c:29ff:fe15:e6bb/64 scope link

      valid_lft forever preferred_lft forever

……

[[email protected] ~]# df -h

Filesystem      Size Used Avail Use% Mounted on

/dev/sda2        18G 4.7G   12G  29% /

tmpfs           112M     0 112M   0% /dev/shm

/dev/sda1       283M  83M  185M  31% /boot

/dev/sr0        3.6G 3.6G     0 100% /mnt/cdrom

/dev/drbd0      946M  31M  866M   4% /drbd

[[email protected] ~]# service mysqld status

MySQL running (8772)                                     [  OK  ]

 

[[email protected] ~]# mysql   #(on the slave, check whether master-slave replication is still healthy)

Welcome to the MySQL monitor.  Commands end with ; or \g.

……

mysql> show slave status\G

*************************** 1. row ***************************

               Slave_IO_State: Waiting for master to send event

                  Master_Host: 10.96.20.8

                  Master_User: repluser

                  Master_Port: 3306

                Connect_Retry: 60

              Master_Log_File: mysql-bin.000005

          Read_Master_Log_Pos: 198

               Relay_Log_File: relay-log.000004

                Relay_Log_Pos: 344

        Relay_Master_Log_File: mysql-bin.000005

             Slave_IO_Running: Yes

            Slave_SQL_Running: Yes

              Replicate_Do_DB:

          Replicate_Ignore_DB:

           Replicate_Do_Table:

……

mysql> show databases;

+--------------------+

| Database           |

+--------------------+

| information_schema |

| mysql              |

| performance_schema |

| test               |

| webgame1           |

| webgame2           |

| webgame3           |

+--------------------+

7 rows in set (0.00 sec)
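The manual check of `show slave status\G` above comes down to two fields: `Slave_IO_Running` and `Slave_SQL_Running` must both be `Yes`. A minimal sketch that automates this by reading the `\G` output on standard input; the function name `replication_ok` is hypothetical.

```shell
#!/bin/bash
# Read `SHOW SLAVE STATUS\G` text on stdin.
# Exit status 0 only if both replication threads report "Yes".
replication_ok() {
    awk -F': ' '
        { gsub(/[ \t\r]+$/, "", $2) }                # drop trailing whitespace
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Typical use on the slave after a switchover:
# mysql -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication healthy"
```

During the takeover window the I/O thread may briefly report `Connecting` while the VIP moves; with `Connect_Retry: 60` as shown above, the slave retries on its own, so a single failed check right after failover is not necessarily a fault.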

 

 

Common architectures for MySQL master-slave replication:

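1. One master, one slave

[Figure: one master, one slave]

Note: the HA software (keepalived or heartbeat) only needs to handle VIP failover;

this scheme is simple to deploy and easy to maintain;

after the master fails, traffic can switch over to the slave automatically;

both reads and writes depend on the master, so it is under heavy load and prone to lock contention and deadlocks;

the slave can also serve reads, but that has to be implemented in application code;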

 

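2. One master, multiple slaves

[Figure: one master, multiple slaves]

Note: the HA software (keepalived or heartbeat) only needs to handle VIP failover;

after the master fails, traffic can switch to slave1 automatically, but slave2 may then be unable to resynchronize with slave1 automatically; the remedy is to use the semi-sync mechanism;

read/write splitting is supported (the master handles writes, the slaves handle reads), but it has to be implemented in application code;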

 

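3. Dual master

[Figure: dual master]

Note: HA software: keepalived+LVS, or MMM;

once the two masters replicate to each other they can be load-balanced, and the failure of either one does not affect the service;

dual master has a serious drawback: it raises the probability of data inconsistency;

dual master brings little performance gain; it is a complex architecture without much benefit and is not recommended;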

 

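4. Dual master, multiple slaves:

[Figure: dual master, multiple slaves]

Note: HA software: MMM or keepalived;

if one master goes down, the service is not affected;

writing to both masters is possible, but it raises the probability of data inconsistency;

so write to only one master at any given time;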

 

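5. Cascading replication

[Figure: cascading replication]

Note: the HA software (keepalived or heartbeat) only needs to handle VIP failover;

when master fails, traffic switches to master2, which keeps replicating to slave{1,2};

slave{1,2} support read/write splitting, but it has to be implemented in application code;

the slaves replicate through a cascade, so there may be lag; if master2 fails, replication to the slaves stops;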

 

6. Dual master with DRBD

[Figure: dual master with DRBD]

Note: while it acts as the standby node, passive-server is not visible to clients

 

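7.

[Figure: architecture diagram]

-------------------------------------------------------------------------

[Figure: architecture diagram]

Note: the HA software (heartbeat) handles VIP failover and also manages the DRBD and mysqld services;

when master fails, service switches automatically to backup, and slave{1,2} can still replicate from backup;

slave{1,2} support read/write splitting, but it has to be implemented in application code;

this scheme also supports the semi-sync mechanism;

backup is accessible only once it has been promoted to master; under normal conditions only one of master and backup serves clients;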

 

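8. HA based on SAN storage, commonly used with Oracle and SQL Server

[Figure: architecture diagram]

-----------------------------------------------------------------------

[Figure: architecture diagram]

Note: HA software: Red Hat Cluster Suite;

the service depends on the SAN storage;

Backup is accessible only after Master has failed and Backup has successfully taken over;

slave{1,2} support read/write splitting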

 

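9.

[Figure: architecture diagram]

Note: flexible deployment and high resource utilization;

the master handles writes, the slaves handle reads;

the service depends on DNS, and support for long-lived connections is poor;

a master failure affects the slaves;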

 

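10.

[Figure: architecture diagram]

Note: usable software: mysql-proxy, amoeba;

read/write splitting is transparent to front-end applications, with health checks on the back end;

the open-source options are currently not stable;

a custom-developed DB proxy is needed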

 

11. HA for distributed database clusters

[Figure: HA for distributed database clusters]

Note: DAL = data access layer

 

12.

[Figure: architecture diagram]

Note: HA based on Galera;

Galera is a cluster system that implements multi-master synchronous replication on top of MySQL InnoDB. Features: true multi-master; read and write to any node; synchronous replication; no slave lag, no integrity issues; no master-slave failover, no VIP; multi-threaded slave; automatic node provisioning;

 

13. The official MySQL Cluster HA solution

[Figure: MySQL Cluster architecture]

 

 

 

Note:

Criteria for choosing a MySQL HA architecture:

Solution                         Availability   Safety   Write performance
MySQL replication                98%--99.9+%    No       Fair
master-master with MMM manager   99%            No       Fair
heartbeat/SAN                    99.5%--99.9%   Yes      Excellent
heartbeat/drbd                   99.9%          Yes      Good
NDB cluster                      99.999%        Yes      Excellent

Note: NDB cluster: very high skill requirements (specific NDB knowledge, strong MySQL skills, and strong sysadmin skills)
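To make the availability figures in the table concrete, each percentage can be converted into the downtime it allows per year. A small sketch, assuming a 365-day year; the function name `downtime_hours_per_year` is hypothetical.

```shell
#!/bin/bash
# Convert an availability percentage into the allowed downtime per year, in hours.
# Assumes a 365-day year (8760 hours).
downtime_hours_per_year() {
    awk -v a="$1" 'BEGIN { printf "%.2f\n", (100 - a) / 100 * 365 * 24 }'
}

# downtime_hours_per_year 99.9    # heartbeat/drbd row: 8.76 hours per year
# downtime_hours_per_year 99      # MMM row: 87.60 hours per year
```

Seen this way, the step from 99% to 99.9% cuts the yearly downtime budget from roughly three and a half days to under nine hours.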

 

 

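Current problems with MySQL:

Single-node performance (QPS for reads and writes, response time, data volume, IOPS, bottlenecks in read and write operations);

Master-slave data consistency (asynchronous replication, semi-sync replication, ordering plus completeness);

Automated scale-out (data migration; scaling out by a defined rule (hash modulo, range, date, combinations of these, with horizontal and vertical splitting); capacity estimation and early warning (per-table capacity estimates based on business assessment; buffer pool capacity and hit rate; disk capacity); automated full plus incremental scale-out (promoting a slave to a new master; automatic or manual; notifying the proxy layer when scale-out completes so that it stays transparent to the front end));

Master as a single point of failure (master/standby strategy (the standby only replicates data and serves no online queries); data backfill (pull binlog files from the master to fill in missing data); failover (when the master goes down, switch to a new master while preserving data consistency as far as the business allows; notify the proxy layer of the new master so that the switch stays transparent to applications));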
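The hash-modulo rule mentioned under automated scale-out can be illustrated in a few lines. This is a sketch only: the function names are hypothetical, the shard count of 4 in the examples is arbitrary, and `cksum` is used merely as a convenient, widely available hash for string keys.

```shell
#!/bin/bash
# Hash-modulo routing: map a key to one of N shards.

# Numeric keys (e.g. a user id): plain modulo.
# Assumes a decimal id without leading zeros.
shard_for_id() {
    local id="$1" shards="$2"
    echo $(( id % shards ))
}

# String keys: hash with the POSIX cksum CRC first, then take the modulo.
shard_for_key() {
    local key="$1" shards="$2" crc
    crc=$(printf '%s' "$key" | cksum | awk '{print $1}')
    echo $(( crc % shards ))
}

# shard_for_id 10007 4    # prints 3
```

The weakness of plain modulo is also why data migration appears in the same list: changing the shard count changes the result for most keys, so existing rows have to be moved.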

 

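Distributed databases:

1. Product positioning (preserve database properties as far as possible while scaling data volume; low-latency online access; support data operations with a certain degree of relational complexity);

2. Design principles (implement the MySQL client communication protocol; keep the logical distribution of data transparent to applications; automatic detection / human decision / automatic handling; support single-node transactions);

3. Design targets (store data at the hundred-billion scale; response time below 10 ms; fully transparent to upper-layer applications);

[Figure: distributed database architecture]

Distributed database proxy layer (implements the MySQL client protocol; read/write splitting; load balancing, e.g. weighted round-robin across slaves; merging of query results; data sharding rules; concurrency control; SQL whitelist management; single-node transaction support (amoeba does not support transactions); server model);

Monitoring (liveness; master-slave lag; capacity (tables, disk); traffic (requests); hit rate (buffer pool); collection and reporting of key metrics);

Web monitoring and alerting (an interface friendly to ops and DBAs; can trigger cluster management operations (manual scale-out, switching to a new master); alerts on abnormal monitoring data (email, SMS; the channel depends on severity));

Metadata service (stores the data sharding rules (configuration center); election service; implements the Fast Paxos protocol; atomic broadcast protocol for data; data notification service; lock service; application location service);

Failover service (when the master goes down, promote the standby or a slave to new master (check whether ssh is reachable, fetch binlogs to backfill data), preserving data consistency as far as possible; policy for selecting the new master; once the new master is chosen, notify the front-end proxy layer);

Data migration service (scale out based on monitoring data and threshold metrics; full plus incremental; automatic cleanup of redundant data; automatic or manual migration)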
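The weighted round-robin across slaves that the proxy layer is expected to provide can be sketched as follows. This is the simplest form (each slave gets a contiguous run of slots proportional to its weight), not the smoothed variant some load balancers use; the function name `pick_slave`, the host names, and the weights are hypothetical, and all weights are assumed to be positive integers.

```shell
#!/bin/bash
# Pick a slave for the n-th read request by weighted round-robin.
# $1: request counter (0, 1, 2, ...); remaining args: "host:weight" pairs.
pick_slave() {
    local n="$1" total=0 pair w slot
    shift
    # Sum of all weights = length of one full rotation.
    for pair in "$@"; do
        total=$(( total + ${pair##*:} ))
    done
    slot=$(( n % total ))
    # Walk the pairs until the slot falls inside a host's share.
    for pair in "$@"; do
        w=${pair##*:}
        if [ "$slot" -lt "$w" ]; then
            echo "${pair%%:*}"
            return 0
        fi
        slot=$(( slot - w ))
    done
}

# With slave1 weighted 2 and slave2 weighted 1, requests 0..5 go to:
# slave1 slave1 slave2 slave1 slave1 slave2
```

In a real proxy the counter would live in the proxy process and slaves failing the health check would be dropped from the list before the pick.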

 

 

 



This article is reposted from chaijowin's 51CTO blog; original link: http://blog.51cto.com/jowin/1837146. To repost it, please contact the original author.