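At work I have long been using Oracle's middleware, Oracle GoldenGate. How does it guarantee that messages stay ordered and are not lost?
First, let's look at the logical architecture of Oracle GoldenGate:
The diagram involves two stages:
The official description of the trail file is as follows: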
To support the continuous extraction and replication of database changes, Oracle GoldenGate stores records of the captured changes temporarily on disk in a series of files called a trail. A trail can exist on the source system, an intermediary system, the target system, or any combination of those systems, depending on how you configure Oracle GoldenGate. On the local system it is known as an extract trail (or local trail). On a remote system it is known as a remote trail.
By using a trail for storage, Oracle GoldenGate supports data accuracy and fault tolerance (see Section 1.2.6, "Overview of Checkpoints"). The use of a trail also allows extraction and replication activities to occur independently of each other. With these processes separated, you have more choices for how data is processed and delivered. For example, instead of extracting and replicating changes continuously, you could extract changes continuously but store them in the trail for replication to the target later, whenever the target application needs them.
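In other words, a trail stores the change data captured from the database. Oracle GoldenGate uses the trail as storage to ensure data accuracy and fault tolerance. The trail also lets the Extract and Replicat processes run independently of each other, which makes it play a role similar to a message broker.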
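The decoupling described above can be pictured with a tiny in-memory model. This is only a sketch: the `Trail` class, its methods, and the sample records are hypothetical names made up for illustration and say nothing about how GoldenGate implements trails on disk.

```python
class Trail:
    """Toy append-only trail: the writer appends, each reader keeps its own offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # Records are only ever added at the end, so order is preserved.
        self._records.append(record)
        return len(self._records)  # next write position

    def read_from(self, offset):
        # Reading never removes data, so a reader can start late or re-read.
        return self._records[offset:]


trail = Trail()

# "Extract" side: captures changes continuously.
for change in ["insert id=1", "update id=1", "delete id=1"]:
    trail.append(change)

# "Replicat" side: runs later, at its own pace, from its own offset.
replicat_offset = 0
applied = []
for record in trail.read_from(replicat_offset):
    applied.append(record)
    replicat_offset += 1

print(applied, replicat_offset)
```

Because the writer and the reader share nothing except the trail and a position, either side can stop and restart without the other noticing.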
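Below is the checkpoint example given in the official documentation (I originally wanted to use real checkpoint information from my project, but dropped the idea to avoid unnecessary trouble):
Note that this checkpoint information comes from an Oracle RAC environment.
Command to view the Extract process checkpoint information: INFO EXTRACT JC108XT, SHOWCH
The Extract process checkpoint information is as follows: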
EXTRACT    JC108XT   Last Started 2011-01-01 14:15   Status ABENDED
Checkpoint Lag       00:00:00 (updated 00:00:01 ago)
Log Read Checkpoint  File /orarac/oradata/racq/redo01.log
                     2011-01-01 14:16:45  Thread 1, Seqno 47, RBA 68748800
Log Read Checkpoint  File /orarac/oradata/racq/redo04.log
                     2011-01-01 14:16:19  Thread 2, Seqno 24, RBA 65657408

Current Checkpoint Detail:

Read Checkpoint #1
  Oracle RAC Redo Log
  Startup Checkpoint (starting position in data source):
    Thread #: 1
    Sequence #: 47
    RBA: 68548112
    Timestamp: 2011-01-01 13:37:51.000000
    SCN: 0.8439720
    Redo File: /orarac/oradata/racq/redo01.log
  Recovery Checkpoint (position of oldest unprocessed transaction in data source):
    Thread #: 1
    Sequence #: 47
    RBA: 68748304
    Timestamp: 2011-01-01 14:16:45.000000
    SCN: 0.8440969
    Redo File: /orarac/oradata/racq/redo01.log
  Current Checkpoint (position of last record read in the data source):
    Thread #: 1
    Sequence #: 47
    RBA: 68748800
    Timestamp: 2011-01-01 14:16:45.000000
    SCN: 0.8440969
    Redo File: /orarac/oradata/racq/redo01.log

Read Checkpoint #2
  Oracle RAC Redo Log
  Startup Checkpoint(starting position in data source):
    Sequence #: 24
    RBA: 60607504
    Timestamp: 2011-01-01 13:37:50.000000
    SCN: 0.8439719
    Redo File: /orarac/oradata/racq/redo04.log
  Recovery Checkpoint (position of oldest unprocessed transaction in data source):
    Thread #: 2
    Sequence #: 24
    RBA: 65657408
    Timestamp: 2011-01-01 14:16:19.000000
    SCN: 0.8440613
    Redo File: /orarac/oradata/racq/redo04.log
  Current Checkpoint (position of last record read in the data source):
    Thread #: 2
    Sequence #: 24
    RBA: 65657408
    Timestamp: 2011-01-01 14:16:19.000000
    SCN: 0.8440613
    Redo File: /orarac/oradata/racq/redo04.log

Write Checkpoint #1
  GGS Log Trail
  Current Checkpoint (current write position):
    Sequence #: 2
    RBA: 2142224
    Timestamp: 2011-01-01 14:16:50.567638
    Extract Trail: ./dirdat/eh

Header:
  Version = 2
  Record Source = A
  Type = 6
  # Input Checkpoints = 2
  # Output Checkpoints = 1

File Information:
  Block Size = 2048
  Max Blocks = 100
  Record Length = 2048
  Current Offset = 0

Configuration:
  Data Source = 3
  Transaction Integrity = 1
  Task Type = 0

Status:
  Start Time = 2011-01-01 14:15:14
  Last Update Time = 2011-01-01 14:16:50
  Stop Status = A
  Last Result = 400
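1. Extract places its read checkpoints in the data source. If the data source is Oracle, the checkpoints refer to positions in the Oracle redo logs.
2. Startup checkpoint: the first checkpoint made in the data source when the process started.
   Thread #: the number of the thread that made the checkpoint; shown only in Oracle RAC mode
   Sequence #: the sequence number of the transaction log in which the checkpoint was made
   RBA: short for relative byte address; the relative byte address of the record at which the checkpoint was made
   Timestamp: the timestamp of the record at which the checkpoint was made
   SCN: short for system change number; the system change number of the record at which the checkpoint was made
   Redo File: the path name of the transaction log containing the record at which the checkpoint was made
3. Recovery checkpoint: the position in the data source of the oldest transaction that Extract has not yet processed.
4. Current checkpoint: the position of the last record that Extract read in the data source (note: at this point the record has not necessarily been written out successfully). It should match the Log Read Checkpoint information.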
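To make the difference between the recovery checkpoint and the current checkpoint concrete, here is a minimal sketch under simplifying assumptions. The log is modelled as a list of hypothetical `(position, txn_id, op)` tuples, and the function `checkpoints` is an illustrative name; this is not how GoldenGate itself computes its checkpoints.

```python
def checkpoints(log):
    """Return (recovery, current) positions for a toy transaction log.

    log: list of (position, txn_id, op), op in {"begin", "change", "commit"}.
    """
    open_txns = {}  # txn_id -> position of the transaction's first record
    current = None
    for position, txn_id, op in log:
        current = position  # last record read so far
        if op == "begin":
            open_txns[txn_id] = position
        elif op == "commit":
            open_txns.pop(txn_id, None)
    # Oldest transaction still open; if none is open, nothing earlier is needed.
    recovery = min(open_txns.values()) if open_txns else current
    return recovery, current


log = [
    (100, "A", "begin"),
    (120, "B", "begin"),
    (150, "A", "change"),
    (180, "A", "commit"),
    (200, "B", "change"),  # B is still open here
]
print(checkpoints(log))
```

In this model the recovery position can never be ahead of the current position, which matches the sample output for thread 1, where the recovery RBA 68748304 is below the current RBA 68748800. Restarting from the recovery position means no open transaction is lost, at the cost of re-reading some records.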
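Extract places its write checkpoint in the trail. The current checkpoint here is the position in the trail to which Extract is currently writing.
   Sequence #: the sequence number of the trail file to which the checkpoint was written
   RBA: the relative byte address, within the trail file, of the record at which the checkpoint was made
   Timestamp: the timestamp of the record at which the checkpoint was made
   Extract Trail: the relative path name of the trail
   Trail Type: the type of the trail; a trail that resides on something like an NFS service is considered local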
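When monitoring these fields it can help to pull them out of the SHOWCH text programmatically. The following is a rough sketch: `parse_fields` is a hypothetical helper, and the one-field-per-line layout of `SAMPLE` is an assumption about how the console prints the block, with values copied from the example output above.

```python
import re

# Values copied from the sample output; the line layout is an assumption.
SAMPLE = """\
Write Checkpoint #1
  GGS Log Trail
  Current Checkpoint (current write position):
    Sequence #: 2
    RBA: 2142224
    Timestamp: 2011-01-01 14:16:50.567638
    Extract Trail: ./dirdat/eh
"""


def parse_fields(text):
    """Collect 'Name: value' pairs; headings without a value are skipped."""
    fields = {}
    for line in text.splitlines():
        # Key: letters, spaces and '#', up to the first colon; value: the rest.
        match = re.match(r"\s*([A-Za-z #]+?):\s+(\S.*)$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


fields = parse_fields(SAMPLE)
# A write position is naturally ordered as (trail sequence, byte offset).
write_position = (int(fields["Sequence #"]), int(fields["RBA"]))
print(write_position, fields["Extract Trail"])
```

Treating the position as a `(Sequence #, RBA)` tuple lets two samples be compared directly, since Python compares tuples element by element.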
Command to view the Replicat process checkpoint information: INFO REPLICAT JC108RP, SHOWCH
The Replicat process checkpoint information is as follows:
REPLICAT   JC108RP   Last Started 2011-01-12 13:10   Status RUNNING
Checkpoint Lag       00:00:00 (updated 111:46:54 ago)
Log Read Checkpoint  File ./dirdat/eh000000
                     First Record  RBA 3702915

Current Checkpoint Detail:

Read Checkpoint #1
  GGS Log Trail
  Startup Checkpoint(starting position in data source):
    Sequence #: 0
    RBA: 3702915
    Timestamp: Not Available
    Extract Trail: ./dirdat/eh
  Current Checkpoint (position of last record read in the data source):
    Sequence #: 0
    RBA: 3702915
    Timestamp: Not Available
    Extract Trail: ./dirdat/eh

Header:
  Version = 2
  Record Source = A
  Type = 1
  # Input Checkpoints = 1
  # Output Checkpoints = 0

File Information:
  Block Size = 2048
  Max Blocks = 100
  Record Length = 2048
  Current Offset = 0

Configuration:
  Data Source = 0
  Transaction Integrity = -1
  Task Type = 0

Status:
  Start Time = 2011-01-12 13:10:13
  Last Update Time = 2011-01-12 21:23:31
  Stop Status = A
  Last Result = 400
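1. Startup Checkpoint: the first checkpoint made in the trail when the process started.
   Sequence #: the sequence number of the trail file in which the checkpoint was made
   RBA: the relative byte address, within the trail file, of the record at which the checkpoint was made
   Timestamp: the timestamp of the record at which the checkpoint was made
   Extract Trail: the relative path of the trail
2. Current Checkpoint: the position of the last record that Replicat read from the trail.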
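The sketch below illustrates why a persisted read position gives "no loss, in order" across restarts. Everything here is a toy model: `trail_file_name`, `run_replicat`, the dict-based checkpoint, and the simulated crash are hypothetical, and the six-digit suffix is only inferred from the sample output, where trail `./dirdat/eh` with Sequence # 0 is read as `./dirdat/eh000000`.

```python
def trail_file_name(prefix, sequence, width=6):
    # Assumption from the sample output: prefix + zero-padded sequence number.
    return f"{prefix}{sequence:0{width}d}"


def run_replicat(trail, checkpoint, target, crash_after=None):
    """Apply records starting at checkpoint['pos'], advancing it after each one."""
    applied = 0
    while checkpoint["pos"] < len(trail):
        if crash_after is not None and applied == crash_after:
            raise RuntimeError("simulated abend")
        target.append(trail[checkpoint["pos"]])  # apply to the target
        checkpoint["pos"] += 1                   # then record the progress
        applied += 1


trail = ["r0", "r1", "r2", "r3", "r4"]
checkpoint = {"pos": 0}  # stands in for a durable checkpoint
target = []

try:
    run_replicat(trail, checkpoint, target, crash_after=2)
except RuntimeError:
    pass  # process died after two records; the checkpoint survives

run_replicat(trail, checkpoint, target)  # restart resumes at position 2
print(trail_file_name("./dirdat/eh", 0), target)
```

In this toy the apply step and the checkpoint update cannot be separated by a crash. If they could, the restarted reader would re-apply the last record, which is the at-least-once situation discussed next.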
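Oracle GoldenGate's log records use a snapshot format. To see why that matters, suppose one field of a record is incremented and GoldenGate delivered only the delta. Under at-least-once semantics the same delta may be delivered more than once, so the final value would end up wrong. With snapshot data, the consumer simply keeps overwriting the row by primary key, which makes the operation idempotent.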
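A small sketch of that argument, with made-up values: a balance of 100 is increased by 10, and the resulting message happens to be delivered twice. The function names and tables are hypothetical and exist only to contrast the two styles.

```python
def apply_delta(table, key, delta):
    # Incremental message: "add this amount to the current value".
    table[key] = table.get(key, 0) + delta


def apply_snapshot(table, key, value):
    # Snapshot message: "the row now has this value"; overwrite by primary key.
    table[key] = value


# The same logical change (100 -> 110) delivered twice.
delta_table = {"id1": 100}
for message in [10, 10]:
    apply_delta(delta_table, "id1", message)

snapshot_table = {"id1": 100}
for message in [110, 110]:
    apply_snapshot(snapshot_table, "id1", message)

print(delta_table["id1"], snapshot_table["id1"])
```

The delta version ends at 120 because the duplicate is counted twice, while the snapshot version stays at the correct 110 no matter how many times the message is replayed.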
References:
Oracle GoldenGate documentation library: https://docs.oracle.com/goldengate/1212/gg-winux/GWUAD/wu_about_gg.htm#GWUAD117
Oracle's official explanation of checkpoint terminology: https://docs.oracle.com/goldengate/1212/gg-winux/GWUAD/wu_ogg_checkpts.htm#GWUAD965