MySQL binlog group commit and XA (two-phase commit), part 1

This post draws on several reliable articles found online:

http://www.linuxidc.com/Linux/2015-11/124942.htm

http://blog.csdn.net/woqutechteam/article/details/51178803

http://blog.itpub.net/15480802/viewspace-1411356/

http://blog.csdn.net/sofia1217/article/details/53968214

http://dinglin.iteye.com/blog/907123

Most of the content below is taken from the first article.

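1. XA-2PC (two-phase commit)
XA is a specification for distributed transactions published by the X/Open group (the name is usually expanded as "eXtended Architecture"). It mainly defines the interface between the (global) Transaction Manager (TM) and the (local) Resource Manager (RM). To implement distributed transactions, XA splits transaction commit into two phases, hence 2PC (two-phase commit).
1.1 Prepare phase:
In the first phase, the transaction manager sends a prepare ("ready to commit") request to every database server involved. On receiving it, each database performs its data modifications and logging, then only marks the transaction as "ready to commit" and returns the result to the transaction manager.
1.2 Commit phase:
The transaction manager enters the second phase once it has the responses. If any database reported an error in the first phase, or the transaction manager failed to get a response from one of them, the transaction is considered failed and is rolled back on all databases. In this description, a database server that never receives the second-phase commit confirmation also rolls back its "ready to commit" transaction. If every database prepared successfully in the first phase, the transaction manager sends a "confirm commit" request to the database servers; each one changes the transaction state from "ready to commit" to "committed" and returns an acknowledgement.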
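The two phases can be sketched in a few lines of Python. This is a minimal in-memory illustration, not how any real transaction manager is written: `ResourceManager`, `two_phase_commit` and the `can_prepare` flag are hypothetical names introduced here, and network failures and timeouts are not modelled.

```python
from dataclasses import dataclass


@dataclass
class ResourceManager:
    """Hypothetical RM standing in for one database server."""
    name: str
    can_prepare: bool = True   # assumption: lets us simulate a failed prepare
    state: str = "active"

    def prepare(self) -> bool:
        # Phase 1: do the work, log it, and report whether commit is possible.
        if self.can_prepare:
            self.state = "prepared"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(rms):
    """Hypothetical TM logic: commit only if every RM prepared successfully."""
    votes = [rm.prepare() for rm in rms]   # phase 1: ask every RM
    decision = all(votes)
    for rm in rms:                         # phase 2: broadcast the decision
        if decision:
            rm.commit()
        else:
            rm.rollback()
    return decision
```

With two healthy RMs the call returns `True` and both end up `committed`; if any one of them fails to prepare, all of them end up `rolled_back`.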
 
2. XA implementation in MySQL
Support for XA transactions is available for the InnoDB storage engine. The MySQL XA implementation is based on the X/Open CAE document Distributed Transaction Processing: The XA Specification.

Currently, among the MySQL Connectors, MySQL Connector/J 5.0.0 and higher supports XA directly, by means of a class interface that handles the XA SQL statement interface for you.

XA supports distributed transactions, that is, the ability to permit multiple separate transactional resources to participate in a global transaction. Transactional resources often are RDBMSs but may be other kinds of resources.

A global transaction involves several actions that are transactional in themselves, but that all must either complete successfully as a group, or all be rolled back as a group. In essence, this extends ACID properties “up a level” so that multiple ACID transactions can be executed in concert as components of a global operation that also has ACID properties. (However, for a distributed transaction, you must use the SERIALIZABLE isolation level to achieve ACID properties. It is enough to use REPEATABLE READ for a nondistributed transaction, but not for a distributed transaction.)

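The key point, according to the documentation quoted above: when you use MySQL XA to implement distributed transactions, you must use the SERIALIZABLE isolation level.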

The MySQL implementation of XA enables a MySQL server to act as a Resource Manager that handles XA transactions within a global transaction. A client program that connects to the MySQL server acts as the Transaction Manager.

The process for executing a global transaction uses two-phase commit (2PC). This takes place after the actions performed by the branches of the global transaction have been executed.

  1. In the first phase, all branches are prepared. That is, they are told by the TM to get ready to commit. Typically, this means each RM that manages a branch records the actions for the branch in stable storage. The branches indicate whether they are able to do this, and these results are used for the second phase.

  2. In the second phase, the TM tells the RMs whether to commit or roll back. If all branches indicated when they were prepared that they will be able to commit, all branches are told to commit. If any branch indicated when it was prepared that it will not be able to commit, all branches are told to roll back.

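Phase one is the prepare phase: the TM sends a prepare instruction to each RM, the RM does the work and then reports success or failure back to the TM.

Phase two is the commit-or-rollback phase: if the TM received a success message from every RM, it tells the RMs to commit; otherwise it tells them to roll back.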
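From the client (TM) side, each branch is driven with MySQL's XA SQL statements: `XA START`, `XA END`, `XA PREPARE`, then `XA COMMIT` or `XA ROLLBACK`. The sketch below only builds the statement strings in the order a client would send them; the helper names, the xid value and the `UPDATE` statement in the usage note are made up for illustration, and no database connection is involved.

```python
def quote_xid(xid):
    """Quote an xid as a SQL string literal (single quotes doubled)."""
    return "'" + xid.replace("'", "''") + "'"


def xa_phase_one(xid, work):
    """Statements that run a branch's work and leave it prepared."""
    q = quote_xid(xid)
    return [f"XA START {q}", *work, f"XA END {q}", f"XA PREPARE {q}"]


def xa_phase_two(xid, all_prepared):
    """Final statement for a branch once the TM knows every vote."""
    q = quote_xid(xid)
    return f"XA COMMIT {q}" if all_prepared else f"XA ROLLBACK {q}"
```

For example, `xa_phase_one("xid1", ["UPDATE t SET c = 1 WHERE id = 1"])` yields the four statements for one branch. The TM would send the result of `xa_phase_two` only after every branch has answered its `XA PREPARE`.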

XA transaction support is limited to the InnoDB storage engine. (Only InnoDB supports XA distributed transactions.)

For "external XA" a MySQL server acts as a Resource Manager and client programs act as Transaction Managers. For "Internal XA", storage engines within a MySQL server act as RMs, and the server itself acts as a TM. Internal XA support is limited by the capabilities of individual storage engines.  Internal XA is required for handling XA transactions that involve more than one storage engine. The implementation of internal XA requires that a storage engine support two-phase commit at the table handler level, and currently this is true only for InnoDB.

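The MySQL XA implementation is divided into external XA and internal XA. External XA is distributed transactions in the usual sense. Internal XA runs inside a single MySQL server: the server layer acts as the TM (transaction coordinator) and the storage engines in that server act as the RMs. In practice the engine involved is InnoDB, since the other engines do not support XA.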

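3. An additional role of internal XA

XA splits transaction commit into two phases, and this design also solves the consistency problem between the binlog and the redo log. This is a further role of MySQL internal XA.

To support replication for non-transactional engines, MySQL introduced the binlog at the server layer. It records modifications made in every engine, so replication works for all engines. According to Ding Qi of Taobao, MySQL dropped redo-based replication in 4.x in favor of binlog-based replication.

Introducing the binlog creates a consistency problem between the binlog and the redo log: committing a transaction must write both, so how are the two kept in agreement? Which log is authoritative for a commit? How do you decide whether a transaction committed? How does crash recovery proceed?
MySQL solves this with two-phase commit (the two-phase commit of internal XA):
Phase one: InnoDB prepare. Acquire prepare_commit_mutex, write/sync the redo log, and set the rollback segment to the Prepared state; nothing is done to the binlog.
Phase two has two steps: 1> write/sync the binlog; 2> InnoDB commit (release prepare_commit_mutex after writing the COMMIT mark).
Whether the binlog was written is what marks a transaction as committed; the InnoDB commit mark is not. This follows from how crash recovery works:
1> On crash recovery, scan the last binlog file and extract the xids in it.
2> InnoDB keeps a list of transactions in the Prepare state. Each of their xids is compared with the xids recorded in the binlog: if the xid is in the binlog, the transaction is committed; otherwise it is rolled back.
This keeps the transaction state in InnoDB and in the binlog consistent. If the crash happens while the InnoDB commit mark is being written, recovery writes the commit mark again.
If the crash happens in the prepare phase, the transaction is rolled back; if it happens during the binlog write/sync, it is also rolled back. This commit implementation is the one used before MySQL 5.6.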
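The recovery rule reduces to a set membership test. The Python sketch below is only an illustration of that rule with plain integers as xids; `recover` is a hypothetical helper, not a MySQL function.

```python
def recover(prepared_xids, binlog_xids):
    """Split InnoDB's prepared transactions into (to_commit, to_rollback).

    prepared_xids: xids InnoDB found in the Prepare state after the crash.
    binlog_xids:   xids extracted from the last binlog file.
    """
    in_binlog = set(binlog_xids)
    to_commit = [x for x in prepared_xids if x in in_binlog]
    to_rollback = [x for x in prepared_xids if x not in in_binlog]
    return to_commit, to_rollback
```

For example, with prepared xids `[101, 102, 103]` and binlog xids `[100, 101, 103]`, transactions 101 and 103 are committed and 102 is rolled back, because 102 never reached the binlog.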
 
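4. binlog group commit
The two-phase commit described above is the implementation used before 5.6, and it has a serious flaw. With sync_binlog=1, the write/sync of the binlog in phase two is an obvious bottleneck, and it runs while holding a global lock (prepare_commit_mutex, one lock shared by prepare and commit), so performance drops sharply. The fix is binlog group commit in MySQL 5.6.
4.1 binlog group commit in MySQL 5.6:

Binlog group commit is split into three stages:

1> flush stage: write each thread's binlog from its cache to the file;

2> sync stage: fsync the binlog (if required; this is the most important step, since the binlogs of many threads reach disk in one combined write);

3> commit stage: commit each thread's transaction in the storage engine (no redo log write is needed here, it was already written in the prepare phase). Only one thread works in each stage at any time. (Splitting the work into three stages, each handled by a single thread on behalf of the group, is a typical concurrency optimization.)

The advantage of this design is that the three stages can run concurrently, which improves throughput. Note that the prepare phase is unchanged: it still does a write/sync of the redo log.
(Also: the multi-threaded slave (MTS) replication in 5.7 builds on binlog group commit. Each group commit is tagged with a seqno, and the slave can then apply the transaction groups in seqno order, the same way the master committed them.)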
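To make the saving concrete, here is a single-threaded Python sketch of one pass through the three stages. It is a toy model under stated assumptions: `BinlogGroupCommit` and its fields are hypothetical, the "first entrant becomes leader" rule is how the queueing is commonly described rather than something taken from the MySQL source, and the pipelining of stages across groups is not modelled.

```python
class BinlogGroupCommit:
    """Toy model: many queued transactions share one fsync."""

    def __init__(self):
        self.flush_queue = []   # prepared transactions waiting to be flushed
        self.binlog_file = []   # stands in for the binlog file contents
        self.fsync_calls = 0    # how many times the binlog was synced
        self.committed = []     # transactions committed in the engine

    def enqueue(self, xid):
        """Queue a transaction; returns True if it became the leader."""
        self.flush_queue.append(xid)
        return len(self.flush_queue) == 1

    def run_leader(self):
        """The leader carries the whole group through all three stages."""
        group, self.flush_queue = self.flush_queue, []
        self.binlog_file.extend(group)   # flush stage: caches -> binlog file
        if group:
            self.fsync_calls += 1        # sync stage: one fsync for the group
        self.committed.extend(group)     # commit stage: engine commit, in order
        return group
```

If three transactions are queued before the leader runs, all three are written and committed with `fsync_calls == 1`, where committing them one by one would have cost three syncs.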
 
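4.2 binlog group commit in MySQL 5.7:

Taobao optimized binlog group commit further. Their description of the idea follows:

From the XA recovery logic we know that it is enough to guarantee that the redo log of InnoDB prepare has been written and synced before the binlog is written. So we made a small change to the logic of the first stage of group commit, roughly as follows:

 Step1. InnoDB prepare; record the current LSN in thd.
 Step2. Enter the flush stage of group commit; the leader collects the queue and computes the largest LSN in it.
 Step3. Write/fsync the InnoDB redo log up to that LSN. (This step is the redo log group write, because all redo log records at or below the LSN are written to ib_logfile[0|1] in one go.)
 Step4. Write the binlog and carry on with the rest (sync binlog, InnoDB commit, etc).

In other words, the write/sync of the redo log is deferred from the prepare phase into the flush stage of binlog group commit, before the binlog is written.

Deferring the redo log write explicitly gives the redo log a group write of its own, and it reduces contention on the redo log's log_sys->mutex.

So the redo log records belonging to a binlog group commit are also group written, and both the binlog and the redo log are optimized.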
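Steps 2 and 3 amount to "take the maximum LSN in the queue and sync once up to it". The Python sketch below illustrates only that arithmetic; `RedoLog`, `write_and_sync_up_to` and `flush_stage` are hypothetical names, and LSNs are plain integers.

```python
class RedoLog:
    """Toy redo log that remembers how far it has been synced."""

    def __init__(self):
        self.flushed_lsn = 0
        self.fsync_calls = 0

    def write_and_sync_up_to(self, lsn):
        # Everything at or below lsn becomes durable with a single sync.
        if lsn > self.flushed_lsn:
            self.flushed_lsn = lsn
            self.fsync_calls += 1


def flush_stage(redo, queued_lsns):
    """Leader's redo step: one group write covering the whole queue."""
    if queued_lsns:
        redo.write_and_sync_up_to(max(queued_lsns))
```

With queued LSNs `[120, 95, 140]` the leader syncs once, up to 140, which covers all three transactions. A later group whose largest LSN is already durable costs no additional sync in this model.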

Official MySQL adopted Taobao's optimization in the 5.7.6 code. The corresponding release note reads:

When using InnoDB with binary logging enabled, concurrent transactions written in the InnoDB redo log are now grouped together before synchronizing to disk when innodb_flush_log_at_trx_commit is set to 1, which reduces the amount of synchronization operations. This can lead to improved performance.

5. The XA parameter innodb_support_xa

Command-Line Format     --innodb_support_xa
System Variable Name    innodb_support_xa
Variable Scope          Global, Session
Dynamic Variable        Yes
Permitted Values        Type: boolean
                        Default: TRUE

Enables InnoDB support for two-phase commit(2PC) in XA transactions, causing an extra disk flush for transaction preparation. This setting is the default. The XA mechanism is used internally and is essential for any server that has its binary log turned on and is accepting changes to its data from more than one thread. If you turn it off, transactions can be written to the binary log in a different order from the one in which the live database is committing them. This can produce different data when the binary log is replayed in disaster recovery or on a replication slave. Do not turn it off on a replication master server unless you have an unusual setup where only one thread is able to change data.

For a server that is accepting data changes from only one thread, it is safe and recommended to turn off this option to improve performance for InnoDB tables. For example, you can turn it off on replication slaves where only the replication SQL thread is changing data.

You can also turn off this option if you do not need it for safe binary logging or replication, and you also do not use an external XA transaction manager.

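innodb_support_xa defaults to true, which enables XA. It costs one extra disk flush (the redo log flush in the prepare phase), but it should stay enabled rather than be turned off: with it off, the order of writes to the binlog can differ from the actual commit order, which leads to wrong data in crash recovery and in slave replication. If log-bin is enabled and more than one thread modifies the database, innodb_support_xa must be enabled.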
