ClickHouse-010: Incremental synchronization of MySQL data to ClickHouse

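1. Demo environment

node01 hosts the MySQL database service (MySQL 5.7.22).
node02 hosts the ClickHouse service, proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm, and the synchronization program.
Note: the synchronization program is written in Python and currently supports Python only up to version 2.7.5.
Server IPs: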

node01   172.16.0.246
node02   172.16.0.197

ClickHouse server and client versions:

[root@node01 ~]# clickhouse-server -V
ClickHouse server version 20.8.3.18.
[root@node02 ~]# clickhouse-client -V
ClickHouse client version 20.8.3.18.

Operating system and Python versions:

[root@node02 ~]# cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 
[root@node02 ~]#  python -V
Python 2.7.5
[root@node01 ~]# cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 
[root@node01 ~]# python -V
Python 2.7.5

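Why proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm is installed on node02:
It mainly lets ClickHouse be accessed over the MySQL protocol.

2. Server installation

2.1 Installing standalone ClickHouse on node02

The standalone ClickHouse installation on node02 is not demonstrated here.
For the installation steps, see: https://blog.51cto.com/wujianwei/2949877
A quick check after the standalone ClickHouse installation: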

[root@node02 soft]# clickhouse-client -h 127.0.0.1 -m -q "show databases;"
_temporary_and_external_tables
default
system

The default database is empty, much like the test database in MySQL. The system database, as its name suggests, holds ClickHouse's system tables.

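Additional notes on installing ClickHouse. The official documentation makes a few recommendations:

  1. Disable transparent huge pages
  2. Adjust the memory overcommit setting
  3. Disable CPU power-saving mode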
    echo 'performance' | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    echo 0 > /proc/sys/vm/overcommit_memory
    echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
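
To confirm that the settings took effect, the kernel files can be read back. The transparent_hugepage file lists every mode and marks the active one in square brackets, for example `always madvise [never]`. The following is a minimal Python sketch for extracting the active mode; the function names are illustrative and the code is written for Python 3, not for the Python 2.7 environment of the sync program.

```python
import re


def active_bracketed_option(text):
    """Return the option enclosed in [brackets], or None if there is none."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_thp_mode(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Read the kernel file and return the active transparent huge page mode."""
    with open(path) as handle:
        return active_bracketed_option(handle.read())


# Parsing a sample line; on the tuned server read_thp_mode() should give "never".
print(active_bracketed_option("always madvise [never]"))
```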

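2.2 Installing MySQL on node01

Deploy MySQL yourself; it is not covered here.
To synchronize data from MySQL, the binlog format must be ROW (binlog_format=row) and the row image must be full (binlog_row_image=full).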
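
The two requirements above can be checked before starting the sync. The sketch below assumes the variable values have already been fetched (for example with `SHOW VARIABLES LIKE 'binlog%'`) into a plain dictionary. The function name and return shape are illustrative only, and the code targets Python 3.

```python
# Settings required on the MySQL source, as stated in the text above.
REQUIRED = {"binlog_format": "ROW", "binlog_row_image": "FULL"}


def binlog_problems(variables):
    """Return (name, actual, expected) for every setting that does not match."""
    problems = []
    for name in sorted(REQUIRED):
        expected = REQUIRED[name]
        actual = str(variables.get(name, "")).upper()
        if actual != expected:
            problems.append((name, actual, expected))
    return problems


# Hypothetical values, as they might come back from SHOW VARIABLES.
print(binlog_problems({"binlog_format": "MIXED", "binlog_row_image": "FULL"}))
```

An empty list means both settings match; anything else lists what has to be changed on the source before the sync can work.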

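2.3 Installing the synchronization program on node02

Install the packages the synchronization program depends on. The program can run on the ClickHouse server or on a separate server.
The program is started with PyPy, which speeds up data synchronization,
so the PyPy packages and their dependencies have to be installed as well.
The installation commands are as follows: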

yum -y install pypy-libs pypy pypy-devel
wget https://bootstrap.pypa.io/get-pip.py
Note: with the file downloaded above, running pypy get-pip.py reports that a Python 3 environment is required. For Python 2.7, download the following version instead:
wget https://bootstrap.pypa.io/pip/2.7/get-pip.py
pypy get-pip.py

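Next, install the modules below. Pasting the commands in directly produced a pile of errors, and since I am not very familiar with Python, I could not proceed at first: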

/usr/lib64/pypy-5.0.1/bin/pip install MySQL-python
/usr/lib64/pypy-5.0.1/bin/pip install mysql-replication
/usr/lib64/pypy-5.0.1/bin/pip install clickhouse-driver==0.0.20
/usr/lib64/pypy-5.0.1/bin/pip install redis

2.4 Installation steps that worked

After several attempts, installing the Python modules in the following order succeeded:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install MySQL-python
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting MySQL-python
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/a5/e9/51b544da85a36a68debe7a7091f068d802fc515a3a202652828c73453cad/MySQL-python-1.2.5.zip (108 kB)
     |████████████████████████████████| 108 kB 2.6 MB/s 
Building wheels for collected packages: MySQL-python
  Building wheel for MySQL-python (setup.py) ... done
  Created wheel for MySQL-python: filename=MySQL_python-1.2.5-pp27-pypy_41-linux_x86_64.whl size=49118 sha256=ff86f8fba2433c5d623d1bf2158b8d9f8ab346b8f09dcfa9acfc074130e07bcb
  Stored in directory: /root/.cache/pip/wheels/21/03/d9/41cbcc2b332380d24663723922354ab876fffe2224b259a834
Successfully built MySQL-python
Installing collected packages: MySQL-python
Successfully installed MySQL-python-1.2.5
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install mysql-replication==0.24 
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting mysql-replication==0.24
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/ec/5e/36b87b6068210f1fbd606768e0c2541727a229ac0ebf557d65fd31bc79e9/mysql-replication-0.24.tar.gz (33 kB)
Collecting pymysql>=0.6
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/2b/c4/3c3e7e598b1b490a2525068c22f397fda13f48623b7bd54fb209cd0ab774/PyMySQL-1.0.0.tar.gz (45 kB)
     |████████████████████████████████| 45 kB 37.2 MB/s 
    ERROR: Command errored out with exit status 1:
     command: /usr/bin/pypy -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-XgbqzK/pymysql/setup.py'"'"'; __file__='"'"'/tmp/pip-install-XgbqzK/pymysql/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-fnr4Dg
         cwd: /tmp/pip-install-XgbqzK/pymysql/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-XgbqzK/pymysql/setup.py", line 6, in <module>
        with open("./README.rst", encoding="utf-8") as f:
    TypeError: __init__() got an unexpected keyword argument 'encoding'
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

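The command above fails. The output shows that mysql-replication requires pymysql>=0.6 and that pip selected PyMySQL 1.0.0, whose setup.py calls open() with an encoding argument that Python 2.7 (and therefore this PyPy) does not accept. Installing an older pymysql release that still satisfies >=0.6 first avoids the problem. The following command succeeded: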

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install pymysql==0.6
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting pymysql==0.6
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/0c/3b/17407490b878d2abbc0c544ff71491e08932d1d44225b84a103eae317b7c/PyMySQL-0.6.tar.gz (52 kB)
     |████████████████████████████████| 52 kB 748 kB/s 
Building wheels for collected packages: pymysql
  Building wheel for pymysql (setup.py) ... done
  Created wheel for pymysql: filename=PyMySQL-0.6-py2-none-any.whl size=60771 sha256=6947b8d7c9e24e3d13982b4871f06c97923231b2223d8e2442f5ccce41fb4548
  Stored in directory: /root/.cache/pip/wheels/e0/b8/37/bbe7db22c5f90fb4dc04e9766ca49aa05a5f76acd0956e62ce
Successfully built pymysql
Installing collected packages: pymysql
Successfully installed pymysql-0.6

Then rerun the mysql-replication installation:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install mysql-replication==0.24 
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting mysql-replication==0.24
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/ec/5e/36b87b6068210f1fbd606768e0c2541727a229ac0ebf557d65fd31bc79e9/mysql-replication-0.24.tar.gz (33 kB)
Requirement already satisfied: pymysql>=0.6 in /usr/lib64/pypy-5.0.1/site-packages (from mysql-replication==0.24) (0.6)
Building wheels for collected packages: mysql-replication
  Building wheel for mysql-replication (setup.py) ... done
  Created wheel for mysql-replication: filename=mysql_replication-0.24-py2-none-any.whl size=42153 sha256=8c9ba52edb99fc8c17c07b329f0a1b00ac535cd87b4562ec90fbc6cc1f367512
  Stored in directory: /root/.cache/pip/wheels/11/a5/cd/912029dfb7e8a159dc3d439416f6cf8bccc65cc010cf124fff
Successfully built mysql-replication
Installing collected packages: mysql-replication
Successfully installed mysql-replication-0.24

Then install clickhouse-driver:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install clickhouse-driver==0.0.20 
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting clickhouse-driver==0.0.20
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/9e/a4/bc945ee53254b6f38fd9c7ee6e97a5834c116a68220d1910bf0850c7bc64/clickhouse-driver-0.0.20.tar.gz (36 kB)
Collecting pytz
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/70/94/784178ca5dd892a98f113cdd923372024dc04b8d40abe77ca76b5fb90ca6/pytz-2021.1-py2.py3-none-any.whl (510 kB)
     |████████████████████████████████| 510 kB 19.9 MB/s 
Collecting enum34
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/6f/2c/a9386903ece2ea85e9807e0e062174dc26fdce8b05f216d00491be29fad5/enum34-1.1.10-py2-none-any.whl (11 kB)
Collecting ipaddress
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/c2/f8/49697181b1651d8347d24c095ce46c7346c37335ddc7d255833e7cde674d/ipaddress-1.0.23-py2.py3-none-any.whl (18 kB)
Building wheels for collected packages: clickhouse-driver
  Building wheel for clickhouse-driver (setup.py) ... done
  Created wheel for clickhouse-driver: filename=clickhouse_driver-0.0.20-py2-none-any.whl size=50313 sha256=828b07473b373d9b9ef0538e76b192cd2af592951d41175acfd1cc5b68206ed5
  Stored in directory: /root/.cache/pip/wheels/ef/f1/f0/1926c46953bd8f9d65f1176efc995c223006504c4fbfe37a73
Successfully built clickhouse-driver
Installing collected packages: pytz, enum34, ipaddress, clickhouse-driver
Successfully installed clickhouse-driver-0.0.20 enum34-1.1.10 ipaddress-1.0.23 pytz-2021.1
[root@node02 soft]#

Then install the redis module:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install redis
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting redis
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/a7/7c/24fb0511df653cf1a5d938d8f5d19802a88cef255706fdda242ff97e91b7/redis-3.5.3-py2.py3-none-any.whl (72 kB)
     |████████████████████████████████| 72 kB 3.0 MB/s 
Installing collected packages: redis
Successfully installed redis-3.5.3
[root@node02 soft]#

Note: the redis module is installed because the synchronized binlog position can be stored in Redis; the program also supports storing it in a file.
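
For the file option, the position file used later in this article (repl_pos.log) has a `[log_position]` section with `filename` and `position` keys. The sketch below shows one way such a file could be written and read back. The function names and the write-then-rename step are assumptions for illustration, not the sync program's actual code, and the example targets Python 3.

```python
import configparser
import os
import tempfile


def save_position(path, filename, position):
    """Write the binlog file name and offset in the [log_position] layout."""
    parser = configparser.ConfigParser()
    parser["log_position"] = {"filename": filename, "position": str(position)}
    # Write to a temporary file and rename it, so a crash cannot leave a
    # half-written position file behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        parser.write(handle)
    os.replace(tmp_path, path)


def load_position(path):
    """Return (binlog file name, offset) from the position file."""
    parser = configparser.ConfigParser()
    if not parser.read(path):
        raise IOError("cannot read " + path)
    section = parser["log_position"]
    return section["filename"], int(section["position"])


# Round trip in a scratch directory, using the values shown in this article.
demo_path = os.path.join(tempfile.mkdtemp(), "repl_pos.log")
save_position(demo_path, "mysql-bin.000111", 360752645)
print(load_position(demo_path))
```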

List the installed modules:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip  list|egrep -i "MySQL-python|mysql-replication|clickhouse-driver|redis"
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
clickhouse-driver 0.0.20
MySQL-python      1.2.5
mysql-replication 0.24
redis             3.5.3

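2.5 Installing ProxySQL on node02

ProxySQL installation (mainly so that ClickHouse can be accessed over the MySQL protocol):

Download ProxySQL from https://github.com/sysown/proxysql/releases and choose a package with clickhouse in its name; other packages do not support ClickHouse.
P.S. ClickHouse server 20.8.3.18 already speaks the MySQL protocol natively. However, it imposes strict format requirements when synchronizing MySQL data, and for now it is hard to configure it to synchronize an existing MySQL database into ClickHouse.

ProxySQL installation and configuration: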

[root@node02 soft]# rpm -ivh proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm 
Preparing...                          ################################# [100%]
Updating / installing...
   1:proxysql-2.0.13-1                warning: group proxysql does not exist - using root
warning: group proxysql does not exist - using root
################################# [100%]
Created symlink from /etc/systemd/system/multi-user.target.wants/proxysql.service to /etc/systemd/system/proxysql.service.

Start it (it must be started this way, otherwise ClickHouse is not supported):

proxysql --clickhouse-server
[root@node02 soft]# proxysql --clickhouse-server 
2021-07-15 12:54:28 [INFO] Using config file /etc/proxysql.cnf
2021-07-15 12:54:28 [INFO] Using OpenSSL version: OpenSSL 1.1.1d  10 Sep 2019
2021-07-15 12:54:28 [INFO] No SSL keys/certificates found in datadir (/var/lib/proxysql). Generating new keys/certificates.
[root@node02 soft]#
[root@node02 soft]# ss -lntup|grep proxysql
tcp    LISTEN     0      128       *:6090                  *:*                   users:(("proxysql",pid=20648,fd=28))
tcp    LISTEN     0      128       *:6032                  *:*                   users:(("proxysql",pid=20648,fd=27))
tcp    LISTEN     0      1024      *:6033                  *:*                   users:(("proxysql",pid=20648,fd=26))
tcp    LISTEN     0      1024      *:6033                  *:*                   users:(("proxysql",pid=20648,fd=25))
tcp    LISTEN     0      1024      *:6033                  *:*                   users:(("proxysql",pid=20648,fd=24))
tcp    LISTEN     0      1024      *:6033                  *:*                   users:(("proxysql",pid=20648,fd=23))

Log in to the ProxySQL admin interface (port 6032):

[root@node02 soft]# mysql -uadmin -padmin -h127.0.0.1 -P6032 -e "show databases;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----+---------------+-------------------------------------+
| seq | name          | file                                |
+-----+---------------+-------------------------------------+
| 0   | main          |                                     |
| 2   | disk          | /var/lib/proxysql/proxysql.db       |
| 3   | stats         |                                     |
| 4   | monitor       |                                     |
| 5   | stats_history | /var/lib/proxysql/proxysql_stats.db |
+-----+---------------+-------------------------------------+

Log in to ProxySQL and create the clicku account, which is used to log in to the backend ClickHouse service:

mysql -uadmin -padmin -h127.0.0.1 -P6032
admin@node02 12:57:  [(none)]> select * from clickhouse_users;
Empty set (0.00 sec)

admin@node02 12:57:  [(none)]> 

INSERT INTO clickhouse_users VALUES ('clicku','clickp',1,100);
LOAD CLICKHOUSE USERS TO RUNTIME;
SAVE CLICKHOUSE USERS TO DISK;

admin@node02 12:57:  [(none)]> select * from clickhouse_users;
+----------+----------+--------+-----------------+
| username | password | active | max_connections |
+----------+----------+--------+-----------------+
| clicku   | clickp   | 1      | 100             |
+----------+----------+--------+-----------------+
1 row in set (0.00 sec)

Connect to ClickHouse through ProxySQL (port 6090):

[root@node02 soft]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090 -e "show databases;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------------------+
| name                           |
+--------------------------------+
| _temporary_and_external_tables |
| default                        |
| system                         |
+--------------------------------+

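3. Synchronizing MySQL data on node01 to ClickHouse on node02

Synchronizing data from MySQL to ClickHouse

3.1 Case 1: MySQL has a database test001 containing a table tb1; synchronize this table to ClickHouse

3.1.1 Log in to MySQL on node01 and create the test database and table to synchronize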

root@node01 13:05:  [(none)]> create database test001;
Query OK, 1 row affected (0.00 sec)

root@node01 13:05:  [(none)]> 
root@node01 13:05:  [(none)]> use test001;
Database changed
root@node01 13:05:  [test001]> CREATE TABLE `tb1` (   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,   `pay_money` decimal(20,2) NOT NULL DEFAULT '0.00',   `pay_day` date NOT NULL,   `pay_time` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',   PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Query OK, 0 rows affected (0.02 sec)
Create the account used to replicate from node01:
 GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT ON *.* TO 'click_rep'@'172.16.0.197' identified by 'jwts996';flush privileges;
root@node01 13:05:  [test001]> GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT ON *.* TO 'click_rep'@'172.16.0.197' identified by 'jwts996';flush privileges;
Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

root@node01 13:09:  [test001]>

3.1.2 Log in to ClickHouse on node02 and create the database and table matching MySQL

1. Create the database in ClickHouse:

[root@node02 soft]# clickhouse-client -h 127.0.0.1 -m -q "show databases;"
_temporary_and_external_tables
default
system

node02 :) create database test001;

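2. Create the table. ClickHouse's CREATE TABLE syntax and column types differ completely from MySQL's. With only a few columns you can write the definition by hand; with many columns that becomes painful, so you can use ClickHouse's built-in mysql() table function to create the table from the MySQL data. The replication account must be granted before creating the table, because the synchronization program also pulls data by acting as a replica.
Log in to ClickHouse and create the table: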

CREATE TABLE tb1
ENGINE = MergeTree
PARTITION BY toYYYYMM(pay_time)
ORDER BY pay_time AS
SELECT *
FROM mysql('172.16.0.246:3306', 'test001', 'tb1', 'click_rep', 'jwts996');

Notes on the ClickHouse table structure:

[root@node02 soft]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090 -e " show create table test001.tb1;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| statement                                                                                                                                                                                                                |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CREATE TABLE test001.tb1
(
    `id` UInt32,
    `pay_money` String,
    `pay_day` Date,
    `pay_time` DateTime
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(pay_time)
ORDER BY pay_time
SETTINGS index_granularity = 8192

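 The MergeTree engine is used here. MergeTree is the most capable engine family in ClickHouse: it handles very large data volumes and supports indexes, partitioning, and updates and deletes. PARTITION BY toYYYYMM(pay_time) partitions the data by pay_time at monthly granularity.
 ORDER BY pay_time stores the rows sorted by pay_time, and this sort key also serves as the index. If the MySQL table already contains data, the CREATE TABLE ... AS SELECT command above copies that data into ClickHouse as well.
 index_granularity = 8192 is the granularity of the index, that is, the number of rows per index mark. Unless the data volume reaches around ten billion rows, it usually does not need to be changed.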
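
The partition key and the granularity can be made concrete with a little arithmetic. The sketch below mimics what toYYYYMM computes (year * 100 + month) and estimates how many index marks a given row count needs. It assumes a fixed granularity and a single data part, and the helper names are illustrative (Python 3).

```python
import datetime
import math


def to_yyyymm(value):
    """Mimic ClickHouse toYYYYMM: year * 100 + month."""
    return value.year * 100 + value.month


def index_marks(row_count, index_granularity=8192):
    """Estimate the index marks needed for row_count rows in one data part."""
    return int(math.ceil(row_count / float(index_granularity)))


# pay_time of the demo rows in tb1: they all land in partition 201906.
print(to_yyyymm(datetime.datetime(2019, 6, 29, 14, 0)))  # 201906
# One million rows need 123 marks at the default granularity.
print(index_marks(1000000))  # 123
```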

3.1.3 Running the synchronization program

[root@node02 sync]# pypy mysql-clickhouse-replication.py --help 
Traceback (most recent call last):
  File "mysql-clickhouse-replication.py", line 10, in <module>
    import MySQLdb
  File "/usr/lib64/pypy-5.0.1/site-packages/MySQLdb/__init__.py", line 19, in <module>
    import _mysql
ImportError: unable to load extension module '/usr/lib64/pypy-5.0.1/site-packages/_mysql.pypy-41.so': libmysqlclient.so.20: cannot open shared object file: No such file or directory
[root@node02 sync]#

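Fix: the MySQLdb extension cannot find libmysqlclient.so.20, so symlink it from the MySQL installation directory into /usr/lib64: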

[root@node02 sync]# ln -sv /usr/local/mysql-5.7.22-linux-glibc2.12-x86_64/lib/libmysqlclient.so.20 /usr/lib64/
‘/usr/lib64/libmysqlclient.so.20’ -> ‘/usr/local/mysql-5.7.22-linux-glibc2.12-x86_64/lib/libmysqlclient.so.20’
[root@node02 sync]# pypy mysql-clickhouse-replication.py --help 
usage: Data Replication to clikhouse [-h] [-c CONF] [-d] [-l]

mysql data is copied to clikhouse

optional arguments:
  -h, --help            show this help message and exit
  -c CONF, --conf CONF  Data synchronization information file
  -d, --debug           Display SQL information
  -l, --logtoredis      log position to redis ,default file
By dengyayun @2019

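At this point the synchronization program is installed.

3.1.4 Writing the synchronization program's configuration file

With the table structure created, set up the program's configuration file, metainfo.conf.
Its contents are as follows: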

[root@node02 sync]# cat metainfo.conf
# Source server to synchronize from
[master_server]
host='172.16.0.246'
port=3306
user='click_rep'
passwd='jwts996'
server_id=172160246

# Redis settings, used to store the binlog position
[redis_server]
host='127.0.0.1'
port=6379
passwd='xx'
log_pos_prefix='log_pos_'
## This demo does not use Redis to store the binlog file and position
# Record log_position to a file
[log_position]
file='./repl_pos.log'
## This demo records the binlog file and position in repl_pos.log
#[root@node02 soft]# cat sync/repl_pos.log 
#[log_position]
#filename = mysql-bin.000111
#position = 360752645
###################################

# ClickHouse server settings; synchronized data is written here
[clickhouse_server]
host=127.0.0.1
port=9000
passwd=''
user='default'
# Column name case: 1 = upper case, 0 = lower case
column_lower_upper=0

# Databases to synchronize
[only_schemas]
schemas='test001'

# Tables to synchronize
[only_tables]
tables='tb1'

# Skip DML statements for the listed tables (update and/or delete)
[skip_dmls_sing]
skip_delete_tb_name = ''
skip_update_tb_name = ''

# Skip DML statements for all tables (update and/or delete)
[skip_dmls_all]
#skip_type = 'delete'
#skip_type = 'delete,update'
skip_type = ''

[bulk_insert_nums]
# Number of rows per commit; 20,000 is recommended when running under PyPy
insert_nums=20000
# Synchronize every N seconds; a negative value disables this
#interval=60
interval=1

# Alert e-mail settings
[failure_alarm]
mail_host= 'smtp.xx.com'
mail_port= 25
mail_user= 'xx'
mail_pass= 'xxx'
mail_send_from = 'xxx'
# Alert recipients
alarm_mail = 'yymysql@gmail.com'

# Log file path
[repl_log]
log_dir="/tmp/relication_mysql_clickhouse.log"
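
Values in metainfo.conf are wrapped in quotes, and the table list is a single comma-separated string. How the sync program parses the file is not shown here; the sketch below is one plausible way to read it with Python 3's configparser, stripping the quotes and splitting the lists. The sample text is a shortened excerpt of the file above, and all helper names are illustrative.

```python
import configparser

SAMPLE = """
[master_server]
host='172.16.0.246'
port=3306

[only_schemas]
schemas='test001'

[only_tables]
tables='tb1,t_call_log1'

[bulk_insert_nums]
insert_nums=20000
interval=1
"""


def unquote(value):
    """Remove surrounding whitespace and single or double quotes."""
    return value.strip().strip("'\"")


def split_list(value):
    """Turn 'a,b' into ['a', 'b'], ignoring empty items."""
    return [item.strip() for item in unquote(value).split(",") if item.strip()]


def load_settings(text):
    """Parse the excerpt into plain Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "host": unquote(parser["master_server"]["host"]),
        "port": parser.getint("master_server", "port"),
        "schemas": split_list(parser["only_schemas"]["schemas"]),
        "tables": split_list(parser["only_tables"]["tables"]),
        "insert_nums": parser.getint("bulk_insert_nums", "insert_nums"),
        "interval": parser.getint("bulk_insert_nums", "interval"),
    }


settings = load_settings(SAMPLE)
print(settings["host"], settings["tables"])
```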

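3.1.5 Starting the synchronization program

By default the binlog position is recorded to a file, so there is no need to specify how it is stored. Start the program: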

[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 
13:26:55 INFO     开始同步数据时间 2021-07-15 13:26:55
13:26:55 INFO     同步binlog pos点从文件读取
13:26:55 INFO     从服务器 172.16.0.246:3306 同步数据
13:26:55 INFO     读取binlog: mysql-bin.000111:360750299
13:26:55 INFO     同步到clickhouse server 127.0.0.1:9000
13:26:55 INFO     同步到clickhouse的数据库: ['test001']
13:26:55 INFO     同步到clickhouse的表: ['tb1']

13:27:59 INFO     INSERT 数据插入SQL: INSERT INTO test001.tb1 VALUES, [{u'id': 1, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 2, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 3, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 4, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 5, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 6, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 7, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 8, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 9, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 10, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 11, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}] 
13:28:31 INFO     INSERT 数据插入SQL: INSERT INTO test001.tb1 VALUES, [{u'id': 12, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 13, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 14, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 15, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 16, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 17, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 18, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 19, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 20, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 21, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}]
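
The batches in the log above are governed by insert_nums and interval in metainfo.conf. The program's internal logic is not shown in this article, so the following is only a sketch of one batching rule consistent with the config comments: flush when the buffer reaches insert_nums rows or when interval seconds have passed, with a negative interval disabling the timer. The class and its methods are hypothetical (Python 3).

```python
import time


class RowBuffer(object):
    """Collect rows and decide when a batch should be written out."""

    def __init__(self, insert_nums, interval, clock=time.time):
        self.insert_nums = insert_nums
        self.interval = interval
        self.clock = clock
        self.rows = []
        self.last_flush = clock()

    def add(self, row):
        self.rows.append(row)

    def should_flush(self):
        if not self.rows:
            return False
        if len(self.rows) >= self.insert_nums:
            return True
        # A negative interval disables time-based flushing.
        if self.interval < 0:
            return False
        return self.clock() - self.last_flush >= self.interval

    def flush(self):
        """Hand back the buffered rows and restart the timer."""
        batch, self.rows = self.rows, []
        self.last_flush = self.clock()
        return batch


buffer = RowBuffer(insert_nums=20000, interval=1)
buffer.add({"id": 1, "pay_money": "66.22"})
print(buffer.should_flush())
```

The injectable clock exists only to make the timer easy to exercise without waiting.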

3.1.6 Verifying the synchronization result

[root@node01 soft]# mysql -e "select * from test001.tb1 where 1=1 order by id limit 20;"
+----+-----------+------------+---------------------+
| id | pay_money | pay_day    | pay_time            |
+----+-----------+------------+---------------------+
|  1 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
|  2 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
|  3 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
+----+-----------+------------+---------------------+

[root@node02 sync]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.tb1 where 1=1 order by id limit 20;"
1   66.22   2019-06-29  2019-06-29 14:00:00
2   66.22   2019-06-29  2019-06-29 14:00:00
3   66.22   2019-06-29  2019-06-29 14:00:00
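
Comparing the two outputs by eye works for three rows but not for larger tables. The sketch below compares two row sets by primary key, assuming the rows have already been fetched from both sides as tuples of strings with the id in the first position. The function name and return shape are illustrative (Python 3).

```python
def diff_by_id(source_rows, target_rows, key=0):
    """Return (missing, extra, changed) id lists, comparing target to source."""
    source = dict((row[key], row) for row in source_rows)
    target = dict((row[key], row) for row in target_rows)
    missing = sorted(k for k in source if k not in target)
    extra = sorted(k for k in target if k not in source)
    changed = sorted(k for k in source if k in target and source[k] != target[k])
    return missing, extra, changed


# Rows as shown in the MySQL and ClickHouse outputs above.
mysql_rows = [
    ("1", "66.22", "2019-06-29", "2019-06-29 14:00:00"),
    ("2", "66.22", "2019-06-29", "2019-06-29 14:00:00"),
    ("3", "66.22", "2019-06-29", "2019-06-29 14:00:00"),
]
clickhouse_rows = list(mysql_rows)
print(diff_by_id(mysql_rows, clickhouse_rows))  # ([], [], [])
```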

3.2 Adding another MySQL table to synchronize to ClickHouse

Add another table to the test001 database on node01:

CREATE TABLE `t_call_log1` (
  `id` bigint(20) NOT NULL COMMENT '记录标识',
  `user_id` bigint(20) NOT NULL COMMENT '用户标识',
  `customer_id` bigint(20) DEFAULT NULL COMMENT '客户标识',
  `city_id` bigint(20) DEFAULT NULL COMMENT '城市标识',
  `phone` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL COMMENT '对方电话',
  `name` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT '对方名称',
  `is_recorded` bit(1) NOT NULL COMMENT '是否录音',
  `file_size` bigint(20) DEFAULT NULL COMMENT '文件大小(字节)',
  `file_name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT '文件名称',
  `created_time` datetime NOT NULL COMMENT '建立时间',
  `modified_time` datetime DEFAULT NULL COMMENT '修改时间',
  `call_type` tinyint(4) DEFAULT '1' COMMENT '呼叫方式(1,手机 2,呼叫中心)',
  `call_id` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT '智齿id',
  `status_id` tinyint(4) DEFAULT '-1' COMMENT '当前客户状态1.未授信;2.已授信;3.已成单;4.全退租',
  `contact_id` bigint(20) DEFAULT '0' COMMENT '联系人id',
  PRIMARY KEY (`id`),
  KEY `index_phone` (`phone`),
  KEY `fk_clog_user_id` (`user_id`) USING BTREE,
  KEY `index_customer_id` (`customer_id`),
  KEY `index_call_id` (`call_id`) USING BTREE,
  KEY `idx_created_time` (`created_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='电话记录表'

Create the matching table t_call_log1 in ClickHouse on node02 as well:

[root@node02 data]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090

CREATE TABLE t_call_log1
ENGINE = MergeTree
PARTITION BY toYYYYMM(created_time)
ORDER BY created_time AS
SELECT *
FROM mysql('172.16.0.246:3306', 'test001', 't_call_log1', 'click_rep', 'jwts996');

Or as follows:
[root@node02 proxysql]#  mysql -u clicku -pclickp -h 127.0.0.1 -P6090
clicku@node02 12:56:  [test001]> CREATE TABLE t_call_log1 ENGINE = MergeTree PARTITION BY toYYYYMM(created_time) ORDER BY created_time AS SELECT * FROM mysql('172.16.0.246:3306', 'test001', 't_call_log1', 'click_rep', 'jwts996'); 
Query OK, 0 rows affected (0.01 sec)

clicku@node02 16:14:  [(none)]> show create test001.t_call_log1\G
*************************** 1. row ***************************
statement: CREATE TABLE test001.t_call_log1
(
    `id` Int64,
    `user_id` Int64,
    `customer_id` Nullable(Int64),
    `city_id` Nullable(Int64),
    `phone` String,
    `name` Nullable(String),
    `is_recorded` String,
    `file_size` Nullable(Int64),
    `file_name` Nullable(String),
    `created_time` DateTime,
    `modified_time` Nullable(DateTime),
    `call_type` Nullable(Int8),
    `call_id` String,
    `status_id` Nullable(Int8),
    `contact_id` Nullable(Int64)
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(created_time)
ORDER BY created_time
SETTINGS index_granularity = 8192
1 row in set (0.00 sec)
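
The two SHOW CREATE outputs in this article show how the mysql() table function mapped the MySQL column types on this setup (ClickHouse 20.8.3.18). The sketch below records only those observed mappings as a lookup table. It is not a complete or authoritative mapping, other versions may behave differently, and the function name is illustrative (Python 3).

```python
import re

# MySQL base type -> ClickHouse type, as observed for tb1 and t_call_log1.
OBSERVED = {
    "int unsigned": "UInt32",
    "bigint": "Int64",
    "tinyint": "Int8",
    "decimal": "String",
    "varchar": "String",
    "bit": "String",
    "date": "Date",
    "datetime": "DateTime",
}


def clickhouse_type(mysql_type, nullable):
    """Look up the observed ClickHouse type; raises KeyError for unseen types."""
    # Drop display width or precision: "int(10) unsigned" -> "int unsigned".
    base = re.sub(r"\(\d+(,\d+)?\)", "", mysql_type.lower()).strip()
    mapped = OBSERVED[base]
    return "Nullable(%s)" % mapped if nullable else mapped


print(clickhouse_type("int(10) unsigned", False))  # UInt32
print(clickhouse_type("bigint(20)", True))         # Nullable(Int64)
print(clickhouse_type("decimal(20,2)", False))     # String
```

Note that decimal and bit columns arrive as String, which is worth keeping in mind when querying pay_money or is_recorded on the ClickHouse side.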

3.2.1 Verifying inserts, updates, and deletes on test001.t_call_log1 in MySQL on node01

Add the new table to the configuration file:

[root@tidb04 ~]# egrep "t_call_log1|test001" /data/soft/sync/metainfo.conf
schemas='test001'
tables='tb1,t_call_log1'

Start the synchronization program:

[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 
16:19:02 INFO     开始同步数据时间 2021-07-18 16:19:02
16:19:02 INFO     同步binlog pos点从文件读取
16:19:02 INFO     从服务器 172.16.0.246:3306 同步数据
16:19:02 INFO     读取binlog: mysql-bin.000111:360767728
16:19:02 INFO     同步到clickhouse server 127.0.0.1:9000
16:19:02 INFO     同步到clickhouse的数据库: ['test001']
16:19:02 INFO     同步到clickhouse的表: ['tb1', 't_call_log1']

Insert data into test001.t_call_log1 in MySQL:

insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(1,001,1,0001,18535001234,'小花',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(2,001,1,0001,18535001234,'张婉',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(3,001,1,0001,18535001234,'李四',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(4,001,1,0001,18535001234,'王五',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(5,001,1,0001,18535001234,'赵六',0,null,null,now(),now(),1,1,1,0);

Check the synchronization log:

[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 
16:19:02 INFO     开始同步数据时间 2021-07-18 16:19:02
16:19:02 INFO     同步binlog pos点从文件读取
16:19:02 INFO     从服务器 172.16.0.246:3306 同步数据
16:19:02 INFO     读取binlog: mysql-bin.000111:360767728
16:19:02 INFO     同步到clickhouse server 127.0.0.1:9000
16:19:02 INFO     同步到clickhouse的数据库: ['test001']
16:19:02 INFO     同步到clickhouse的表: ['tb1', 't_call_log1']

16:19:47 INFO     INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 1, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u5c0f\u82b1', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'modified_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 
16:20:46 INFO     INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 2, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u5f20\u5a49', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 20, 46), u'modified_time': datetime.datetime(2021, 7, 18, 16, 20, 46), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 
16:21:33 INFO     INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 3, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u674e\u56db', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 
16:21:33 INFO     INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 4, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u738b\u4e94', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 
16:21:34 INFO     INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 5, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u8d75\u516d', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 34), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 34), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}]

Verify the data in the ClickHouse table:

[root@node02 proxysql]#  clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1;"
5   1   1   1   18535001234 赵六  0   \N  \N  2021-07-18 16:21:34 2021-07-18 16:21:34 1   1   1   0
1   1   1   1   18535001234 小花  0   \N  \N  2021-07-18 16:19:47 2021-07-18 16:19:47 1   1   1   0
2   1   1   1   18535001234 张婉  0   \N  \N  2021-07-18 16:20:46 2021-07-18 16:20:46 1   1   1   0
3   1   1   1   18535001234 李四  0   \N  \N  2021-07-18 16:21:33 2021-07-18 16:21:33 1   1   1   0
4   1   1   1   18535001234 王五  0   \N  \N  2021-07-18 16:21:33 2021-07-18 16:21:33 1   1   1   0

Update a row in the MySQL table:

root@node01 16:23:  [test001]> update t_call_log1 set name='百万' where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

root@node01 16:24:  [test001]> select * from t_call_log1;
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
| id | user_id | customer_id | city_id | phone       | name   | is_recorded | file_size | file_name | created_time        | modified_time       | call_type | call_id | status_id | contact_id |
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
|  1 |       1 |           1 |       1 | 18535001234 | 百万   |             |      NULL | NULL      | 2021-07-18 16:19:47 | 2021-07-18 16:19:47 |         1 | 1       |         1 |          0 |
|  2 |       1 |           1 |       1 | 18535001234 | 张婉   |             |      NULL | NULL      | 2021-07-18 16:20:46 | 2021-07-18 16:20:46 |         1 | 1       |         1 |          0 |
|  3 |       1 |           1 |       1 | 18535001234 | 李四   |             |      NULL | NULL      | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 |         1 | 1       |         1 |          0 |
|  4 |       1 |           1 |       1 | 18535001234 | 王五   |             |      NULL | NULL      | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 |         1 | 1       |         1 |          0 |
|  5 |       1 |           1 |       1 | 18535001234 | 赵六   |             |      NULL | NULL      | 2021-07-18 16:21:34 | 2021-07-18 16:21:34 |         1 | 1       |         1 |          0 |
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
5 rows in set (0.00 sec)

The sync log is as follows:

16:24:12 INFO     INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 1, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u767e\u4e07', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'modified_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}]

Verify in ClickHouse:

[root@node02 proxysql]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1 where name='百万';"
1   1   1   1   18535001234 百万  0   \N  \N  2021-07-18 16:19:47 2021-07-18 16:19:47 1   1   1   0
[root@node02 proxysql]#
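
The log above records the UPDATE as an INSERT of the full new row. How the program removes the old version is not shown in this log. One possibility, assumed here purely for illustration, is to delete the old key and then insert the new row image. The sketch assumes rows shaped like the `before_values` / `after_values` dictionaries that the mysql-replication package yields for update events; `split_update` is a hypothetical name.

```python
def split_update(update_row, pk="id"):
    """Turn one update row into (key_to_delete, row_to_insert).

    Assumption: update_row carries the old and new row images under
    'before_values' and 'after_values'.
    """
    before = update_row["before_values"]
    after = update_row["after_values"]
    # The old primary key identifies the row to remove in ClickHouse;
    # the new image is what gets inserted afterwards.
    return before[pk], dict(after)


# Shortened sample event for the update of id=1 shown above.
update_row = {
    "before_values": {"id": 1, "name": u"\u5c0f\u82b1"},
    "after_values": {"id": 1, "name": u"\u767e\u4e07"},
}
old_key, new_row = split_update(update_row)
print(old_key)
```

Taking the key from the old image rather than the new one matters when an UPDATE changes the primary key itself.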

Delete rows from the MySQL table:

root@node01 16:26:  [test001]> delete from t_call_log1  where id in(4,5);
Query OK, 2 rows affected (0.00 sec)

root@node01 16:27:  [test001]> select * from t_call_log1;
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
| id | user_id | customer_id | city_id | phone       | name   | is_recorded | file_size | file_name | created_time        | modified_time       | call_type | call_id | status_id | contact_id |
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
|  1 |       1 |           1 |       1 | 18535001234 | 百万   |             |      NULL | NULL      | 2021-07-18 16:19:47 | 2021-07-18 16:19:47 |         1 | 1       |         1 |          0 |
|  2 |       1 |           1 |       1 | 18535001234 | 张婉   |             |      NULL | NULL      | 2021-07-18 16:20:46 | 2021-07-18 16:20:46 |         1 | 1       |         1 |          0 |
|  3 |       1 |           1 |       1 | 18535001234 | 李四   |             |      NULL | NULL      | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 |         1 | 1       |         1 |          0 |
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
3 rows in set (0.00 sec)

The sync log is as follows:

16:27:18 INFO     DELETE 数据删除SQL: alter table test001.t_call_log1 delete where id in (4) 
16:27:18 INFO     DELETE 数据删除SQL: alter table test001.t_call_log1 delete where id in (5)
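
The log shows one `alter table ... delete` statement per deleted row. A minimal sketch of building that statement, which can also batch several keys into one statement (the helper `build_delete` is hypothetical, and batching is a suggestion rather than something the sync program is known to do):

```python
def build_delete(schema, table, pk, keys):
    """Build an ALTER TABLE ... DELETE statement like the ones in the log."""
    # int() raises ValueError for non-numeric strings, so arbitrary text
    # cannot be spliced into the statement.
    key_list = ", ".join(str(int(k)) for k in keys)
    return "alter table {0}.{1} delete where {2} in ({3})".format(
        schema, table, pk, key_list
    )


single = build_delete("test001", "t_call_log1", "id", [4])
batched = build_delete("test001", "t_call_log1", "id", [4, 5])
print(single)
print(batched)
```

In ClickHouse, `ALTER TABLE ... DELETE` is a mutation that runs asynchronously by default, so a query issued immediately afterwards may still return the old rows for a short time.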

Verify in ClickHouse:

[root@node02 proxysql]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1 ;"
1   1   1   1   18535001234 百万  0   \N  \N  2021-07-18 16:19:47 2021-07-18 16:19:47 1   1   1   0
2   1   1   1   18535001234 张婉  0   \N  \N  2021-07-18 16:20:46 2021-07-18 16:20:46 1   1   1   0
3   1   1   1   18535001234 李四  0   \N  \N  2021-07-18 16:21:33 2021-07-18 16:21:33 1   1   1   0

The results are consistent: ClickHouse returns the same three rows as MySQL.
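
The comparison here was done by eye. For larger tables, a check along the following lines could be scripted. This is only a sketch: `diff_by_key` is a hypothetical helper, the rows are assumed to have already been fetched from both sides as tuples, and only the `id` and `name` columns are compared.

```python
def diff_by_key(mysql_rows, ch_rows, key_index=0):
    """Compare two row sets by primary key, ignoring row order.

    Returns (missing_in_ch, extra_in_ch, changed) as sorted key lists.
    """
    mysql_map = dict((r[key_index], r) for r in mysql_rows)
    ch_map = dict((r[key_index], r) for r in ch_rows)
    missing = sorted(set(mysql_map) - set(ch_map))
    extra = sorted(set(ch_map) - set(mysql_map))
    changed = sorted(
        k for k in set(mysql_map) & set(ch_map) if mysql_map[k] != ch_map[k]
    )
    return missing, extra, changed


# (id, name) pairs taken from the final query results above.
mysql_rows = [(1, u"\u767e\u4e07"), (2, u"\u5f20\u5a49"), (3, u"\u674e\u56db")]
ch_rows = [(3, u"\u674e\u56db"), (1, u"\u767e\u4e07"), (2, u"\u5f20\u5a49")]
result = diff_by_key(mysql_rows, ch_rows)
print(result)
```

Comparing by key avoids depending on row order, which ClickHouse does not guarantee without an ORDER BY clause. A full-row comparison would also need to normalise values such as `is_recorded`, which is displayed as blank by the mysql client and as 0 by ClickHouse in the output above.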

References:
http://www.javashuo.com/article/p-mwthqvpm-be.html
Special thanks to my senior colleague Deng Yayun (邓亚运) for providing this production solution.
