node01 runs the MySQL database service (MySQL 5.7.22).
node02 runs the ClickHouse service, the proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm package, and the data synchronization program.
Note: the synchronization program is written in Python and currently only supports up to Python 2.7.5.
Server IPs:
node01 172.16.0.246
node02 172.16.0.197
ClickHouse server and client versions:
[root@node01 ~]# clickhouse-server -V
ClickHouse server version 20.8.3.18.
[root@node02 ~]# clickhouse-client -V
ClickHouse client version 20.8.3.18.
OS and Python versions:
[root@node02 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[root@node02 ~]# python -V
Python 2.7.5
[root@node01 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[root@node01 ~]# python -V
Python 2.7.5
Notes on installing proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm on node02:
It is mainly used so that ClickHouse can be accessed over the MySQL protocol.
The standalone ClickHouse installation on node02 is not demonstrated again here.
For installation details see: https://blog.51cto.com/wujianwei/2949877
A quick look at the freshly installed standalone ClickHouse service:
[root@node02 soft]# clickhouse-client -h 127.0.0.1 -m -q "show databases;"
_temporary_and_external_tables
default
system
The default database is empty, just like the test database in MySQL; the system database is self-explanatory.
A supplement to the ClickHouse installation: the official documentation makes a few recommendations:
echo 'performance' | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
echo 0 > /proc/sys/vm/overcommit_memory
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
Deploy MySQL yourself; it is not covered here.
To synchronize data from MySQL, the binlog format must be ROW, and binlog_row_image must be FULL.
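Before starting the sync, it is worth double-checking these settings on node01 (a quick sanity check, not part of the original steps):

SHOW GLOBAL VARIABLES LIKE 'log_bin';          -- expected: ON
SHOW GLOBAL VARIABLES LIKE 'binlog_format';    -- expected: ROW
SHOW GLOBAL VARIABLES LIKE 'binlog_row_image'; -- expected: FULL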
Install the packages the synchronization program depends on. The program can run on the ClickHouse server or on a separate server.
The synchronization program is run with PyPy, which speeds up data synchronization.
So the PyPy packages and their dependencies need to be installed as well.
The installation commands are as follows:
yum -y install pypy-libs pypy pypy-devel
wget https://bootstrap.pypa.io/get-pip.py
Note: after downloading this file, running pypy get-pip.py complains that a Python 3 environment is required; for a Python 2.7 environment, download the version below instead:
wget https://bootstrap.pypa.io/pip/2.7/get-pip.py
pypy get-pip.py
The plan was then to run the install commands below, but pasting them in directly produced a pile of errors, and not being very familiar with Python made it hard to get further:
/usr/lib64/pypy-5.0.1/bin/pip install MySQL-python
/usr/lib64/pypy-5.0.1/bin/pip install mysql-replication
/usr/lib64/pypy-5.0.1/bin/pip install clickhouse-driver==0.0.20
/usr/lib64/pypy-5.0.1/bin/pip install redis
After a few tries, installing the Python modules in the following order works:
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install MySQL-python DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality. Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/ Collecting MySQL-python Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/a5/e9/51b544da85a36a68debe7a7091f068d802fc515a3a202652828c73453cad/MySQL-python-1.2.5.zip (108 kB) |████████████████████████████████| 108 kB 2.6 MB/s Building wheels for collected packages: MySQL-python Building wheel for MySQL-python (setup.py) ... done Created wheel for MySQL-python: filename=MySQL_python-1.2.5-pp27-pypy_41-linux_x86_64.whl size=49118 sha256=ff86f8fba2433c5d623d1bf2158b8d9f8ab346b8f09dcfa9acfc074130e07bcb Stored in directory: /root/.cache/pip/wheels/21/03/d9/41cbcc2b332380d24663723922354ab876fffe2224b259a834 Successfully built MySQL-python Installing collected packages: MySQL-python Successfully installed MySQL-python-1.2.5
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install mysql-replication==0.24 DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality. Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/ Collecting mysql-replication==0.24 Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/ec/5e/36b87b6068210f1fbd606768e0c2541727a229ac0ebf557d65fd31bc79e9/mysql-replication-0.24.tar.gz (33 kB) Collecting pymysql>=0.6 Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/2b/c4/3c3e7e598b1b490a2525068c22f397fda13f48623b7bd54fb209cd0ab774/PyMySQL-1.0.0.tar.gz (45 kB) |████████████████████████████████| 45 kB 37.2 MB/s ERROR: Command errored out with exit status 1: command: /usr/bin/pypy -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-XgbqzK/pymysql/setup.py'"'"'; __file__='"'"'/tmp/pip-install-XgbqzK/pymysql/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-fnr4Dg cwd: /tmp/pip-install-XgbqzK/pymysql/ Complete output (5 lines): Traceback (most recent call last): File "<string>", line 1, in <module> File "/tmp/pip-install-XgbqzK/pymysql/setup.py", line 6, in <module> with open("./README.rst", encoding="utf-8") as f: TypeError: __init__() got an unexpected keyword argument 'encoding' ---------------------------------------- ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
The command above fails. Reading the error closely, pip is collecting pymysql>=0.6 and pulls the latest PyMySQL (1.0.0), whose setup.py no longer works under Python 2.7, so the build errors out. Pinning an explicit version installs successfully:
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install pymysql==0.6 DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality. Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/ Collecting pymysql==0.6 Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/0c/3b/17407490b878d2abbc0c544ff71491e08932d1d44225b84a103eae317b7c/PyMySQL-0.6.tar.gz (52 kB) |████████████████████████████████| 52 kB 748 kB/s Building wheels for collected packages: pymysql Building wheel for pymysql (setup.py) ... done Created wheel for pymysql: filename=PyMySQL-0.6-py2-none-any.whl size=60771 sha256=6947b8d7c9e24e3d13982b4871f06c97923231b2223d8e2442f5ccce41fb4548 Stored in directory: /root/.cache/pip/wheels/e0/b8/37/bbe7db22c5f90fb4dc04e9766ca49aa05a5f76acd0956e62ce Successfully built pymysql Installing collected packages: pymysql Successfully installed pymysql-0.6
Next, run the following install command:
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install mysql-replication==0.24 DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality. Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/ Collecting mysql-replication==0.24 Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/ec/5e/36b87b6068210f1fbd606768e0c2541727a229ac0ebf557d65fd31bc79e9/mysql-replication-0.24.tar.gz (33 kB) Requirement already satisfied: pymysql>=0.6 in /usr/lib64/pypy-5.0.1/site-packages (from mysql-replication==0.24) (0.6) Building wheels for collected packages: mysql-replication Building wheel for mysql-replication (setup.py) ... done Created wheel for mysql-replication: filename=mysql_replication-0.24-py2-none-any.whl size=42153 sha256=8c9ba52edb99fc8c17c07b329f0a1b00ac535cd87b4562ec90fbc6cc1f367512 Stored in directory: /root/.cache/pip/wheels/11/a5/cd/912029dfb7e8a159dc3d439416f6cf8bccc65cc010cf124fff Successfully built mysql-replication Installing collected packages: mysql-replication Successfully installed mysql-replication-0.24
Next, run the following install command:
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install clickhouse-driver==0.0.20 DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality. Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/ Collecting clickhouse-driver==0.0.20 Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/9e/a4/bc945ee53254b6f38fd9c7ee6e97a5834c116a68220d1910bf0850c7bc64/clickhouse-driver-0.0.20.tar.gz (36 kB) Collecting pytz Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/70/94/784178ca5dd892a98f113cdd923372024dc04b8d40abe77ca76b5fb90ca6/pytz-2021.1-py2.py3-none-any.whl (510 kB) |████████████████████████████████| 510 kB 19.9 MB/s Collecting enum34 Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/6f/2c/a9386903ece2ea85e9807e0e062174dc26fdce8b05f216d00491be29fad5/enum34-1.1.10-py2-none-any.whl (11 kB) Collecting ipaddress Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/c2/f8/49697181b1651d8347d24c095ce46c7346c37335ddc7d255833e7cde674d/ipaddress-1.0.23-py2.py3-none-any.whl (18 kB) Building wheels for collected packages: clickhouse-driver Building wheel for clickhouse-driver (setup.py) ... done Created wheel for clickhouse-driver: filename=clickhouse_driver-0.0.20-py2-none-any.whl size=50313 sha256=828b07473b373d9b9ef0538e76b192cd2af592951d41175acfd1cc5b68206ed5 Stored in directory: /root/.cache/pip/wheels/ef/f1/f0/1926c46953bd8f9d65f1176efc995c223006504c4fbfe37a73 Successfully built clickhouse-driver Installing collected packages: pytz, enum34, ipaddress, clickhouse-driver Successfully installed clickhouse-driver-0.0.20 enum34-1.1.10 ipaddress-1.0.23 pytz-2021.1 [root@node02 soft]#
Next, run the following install command:
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install redis DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality. Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/ Collecting redis Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/a7/7c/24fb0511df653cf1a5d938d8f5d19802a88cef255706fdda242ff97e91b7/redis-3.5.3-py2.py3-none-any.whl (72 kB) |████████████████████████████████| 72 kB 3.0 MB/s Installing collected packages: redis Successfully installed redis-3.5.3 [root@node02 soft]#
Note: the redis module is installed because the replicated binlog position can be stored in Redis; the program also supports storing it in a file.
Check the installed modules:
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip list|egrep -i "MySQL-python|mysql-replication|clickhouse-driver|redis" DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality. clickhouse-driver 0.0.20 MySQL-python 1.2.5 mysql-replication 0.24 redis 3.5.3
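To show how these modules fit together, here is a minimal, hypothetical sketch of the sync mechanism (this is not the actual mysql-clickhouse-replication.py code): mysql-replication connects as if it were a MySQL replica and streams row events from the binlog, and clickhouse-driver writes the rows into ClickHouse. The host, credentials and table below are the ones used later in this walkthrough.

from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import WriteRowsEvent
from clickhouse_driver import Client

# Connect to node01 as a fake replica; the account needs REPLICATION SLAVE,
# REPLICATION CLIENT and SELECT, like the click_rep account created later.
stream = BinLogStreamReader(
    connection_settings={"host": "172.16.0.246", "port": 3306,
                         "user": "click_rep", "passwd": "jwts996"},
    server_id=172160246,            # must not clash with any real replica
    only_schemas=["test001"],       # same filters as metainfo.conf
    only_tables=["tb1"],
    only_events=[WriteRowsEvent],   # inserts only, to keep the sketch short
    resume_stream=True,
    blocking=True)

ch = Client(host="127.0.0.1", port=9000, user="default", password="")

for event in stream:                # one event per binlog row event
    # clickhouse-driver accepts rows as a list of dicts keyed by column name
    rows = [row["values"] for row in event.rows]
    ch.execute("INSERT INTO test001.tb1 VALUES", rows)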
ProxySQL installation (mainly so that ClickHouse can be reached over the MySQL protocol):
Download ProxySQL from https://github.com/sysown/proxysql/releases and choose the package with clickhouse in its name, otherwise ClickHouse support is not built in.
PS: ClickHouse server 20.8.3.18 already supports the MySQL protocol natively, but when synchronizing MySQL data it has strict format requirements and does not yet work well for syncing an existing MySQL database into ClickHouse.
ProxySQL installation and configuration:
[root@node02 soft]# rpm -ivh proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm Preparing... ################################# [100%] Updating / installing... 1:proxysql-2.0.13-1 warning: group proxysql does not exist - using root warning: group proxysql does not exist - using root ################################# [100%] Created symlink from /etc/systemd/system/multi-user.target.wants/proxysql.service to /etc/systemd/system/proxysql.service.
Start it (it must be started this way, otherwise ClickHouse support is not enabled):
proxysql --clickhouse-server [root@node02 soft]# proxysql --clickhouse-server 2021-07-15 12:54:28 [INFO] Using config file /etc/proxysql.cnf 2021-07-15 12:54:28 [INFO] Using OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019 2021-07-15 12:54:28 [INFO] No SSL keys/certificates found in datadir (/var/lib/proxysql). Generating new keys/certificates. [root@node02 soft]#
[root@node02 soft]# ss -lntup|grep proxysql tcp LISTEN 0 128 *:6090 *:* users:(("proxysql",pid=20648,fd=28)) tcp LISTEN 0 128 *:6032 *:* users:(("proxysql",pid=20648,fd=27)) tcp LISTEN 0 1024 *:6033 *:* users:(("proxysql",pid=20648,fd=26)) tcp LISTEN 0 1024 *:6033 *:* users:(("proxysql",pid=20648,fd=25)) tcp LISTEN 0 1024 *:6033 *:* users:(("proxysql",pid=20648,fd=24)) tcp LISTEN 0 1024 *:6033 *:* users:(("proxysql",pid=20648,fd=23))
Log in to the ProxySQL admin interface:
[root@node02 soft]# mysql -uadmin -padmin -h127.0.0.1 -P6032 -e "show databases;" mysql: [Warning] Using a password on the command line interface can be insecure. +-----+---------------+-------------------------------------+ | seq | name | file | +-----+---------------+-------------------------------------+ | 0 | main | | | 2 | disk | /var/lib/proxysql/proxysql.db | | 3 | stats | | | 4 | monitor | | | 5 | stats_history | /var/lib/proxysql/proxysql_stats.db | +-----+---------------+-------------------------------------+
Log in to ProxySQL and create the clicku account; this account is used to log in to the backend ClickHouse service:
mysql -uadmin -padmin -h127.0.0.1 -P6032 admin@node02 12:57: [(none)]> select * from clickhouse_users; Empty set (0.00 sec) admin@node02 12:57: [(none)]> INSERT INTO clickhouse_users VALUES ('clicku','clickp',1,100); LOAD CLICKHOUSE USERS TO RUNTIME; SAVE CLICKHOUSE USERS TO DISK; admin@node02 12:57: [(none)]> select * from clickhouse_users; +----------+----------+--------+-----------------+ | username | password | active | max_connections | +----------+----------+--------+-----------------+ | clicku | clickp | 1 | 100 | +----------+----------+--------+-----------------+ 1 row in set (0.00 sec)
Connect to ClickHouse through ProxySQL:
[root@node02 soft]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090 -e "show databases;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------------------+
| name                           |
+--------------------------------+
| _temporary_and_external_tables |
| default                        |
| system                         |
Synchronizing data from MySQL to ClickHouse
#### 3.1.1 Log in to MySQL on node01 and create the test database and table to be synchronized
root@node01 13:05: [(none)]> create database test001;
Query OK, 1 row affected (0.00 sec)
root@node01 13:05: [(none)]>
root@node01 13:05: [(none)]> use test001;
Database changed
root@node01 13:05: [test001]> CREATE TABLE `tb1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `pay_money` decimal(20,2) NOT NULL DEFAULT '0.00',
  `pay_day` date NOT NULL,
  `pay_time` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Query OK, 0 rows affected (0.02 sec)

**Create the account used to replicate from node01:**

GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT ON *.* TO 'click_rep'@'172.16.0.197' identified by 'jwts996';flush privileges;

root@node01 13:05: [test001]> GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT ON *.* TO 'click_rep'@'172.16.0.197' identified by 'jwts996';flush privileges;
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
root@node01 13:09: [test001]>
1. Create the database in ClickHouse:
[root@node02 soft]# clickhouse-client -h 127.0.0.1 -m -q "show databases;"
_temporary_and_external_tables
default
system
node02 :) create database test001;
2. Create the table. ClickHouse DDL syntax and column types are completely different from MySQL: with only a few columns you can write the DDL by hand, but with many columns that gets painful, so you can use ClickHouse's built-in ability to pull data from MySQL to create the table for you. The grants above must be in place before creating the table this way, because the sync program likewise pulls data by acting as a replica.
Log in to ClickHouse and create the table:
CREATE TABLE tb1 ENGINE = MergeTree PARTITION BY toYYYYMM(pay_time) ORDER BY pay_time AS SELECT * FROM mysql('172.16.0.246:3306', 'test001', 'tb1', 'click_rep', 'jwts996');
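The CREATE TABLE ... AS SELECT above creates the ClickHouse table and copies any rows that already exist in MySQL. If the table already exists and only the data needs to be (re)loaded, the same mysql() table function can be used by itself (a sketch using the same account as above):

INSERT INTO test001.tb1
SELECT * FROM mysql('172.16.0.246:3306', 'test001', 'tb1', 'click_rep', 'jwts996');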
Notes on the ClickHouse table structure:
[root@node02 soft]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090 -e " show create table test001.tb1;"
mysql: [Warning] Using a password on the command line interface can be insecure.
CREATE TABLE test001.tb1 ( `id` UInt32, `pay_money` String, `pay_day` Date, `pay_time` DateTime ) ENGINE = MergeTree PARTITION BY toYYYYMM(pay_time) ORDER BY pay_time SETTINGS index_granularity = 8192

The table uses the MergeTree engine, ClickHouse's most capable engine: it handles huge data volumes and supports indexes, partitions, and updates/deletes. PARTITION BY toYYYYMM(pay_time) partitions the data by pay_time at month granularity. ORDER BY pay_time stores the data sorted by pay_time, which also acts as the index. If the MySQL table already contains data when the CREATE TABLE command above runs, that data is copied into ClickHouse as well. index_granularity = 8192 is the index granularity; unless the data volume reaches tens of billions of rows, there is usually no need to change it.
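To see the monthly partitions that PARTITION BY toYYYYMM(pay_time) produces, you can query the system.parts table (an optional check, not in the original walkthrough):

SELECT partition, name, rows
FROM system.parts
WHERE database = 'test001' AND table = 'tb1' AND active;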
The first attempt to run the sync program mysql-clickhouse-replication.py fails because the MySQL client library cannot be found:

[root@node02 sync]# pypy mysql-clickhouse-replication.py --help
Traceback (most recent call last):
  File "mysql-clickhouse-replication.py", line 10, in <module>
    import MySQLdb
  File "/usr/lib64/pypy-5.0.1/site-packages/MySQLdb/__init__.py", line 19, in <module>
    import _mysql
ImportError: unable to load extension module '/usr/lib64/pypy-5.0.1/site-packages/_mysql.pypy-41.so': libmysqlclient.so.20: cannot open shared object file: No such file or directory
[root@node02 sync]#
Fix:
[root@node02 sync]# ln -sv /usr/local/mysql-5.7.22-linux-glibc2.12-x86_64/lib/libmysqlclient.so.20 /usr/lib64/
‘/usr/lib64/libmysqlclient.so.20’ -> ‘/usr/local/mysql-5.7.22-linux-glibc2.12-x86_64/lib/libmysqlclient.so.20’
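To confirm the PyPy extension module can now resolve the library (an optional check, not in the original steps), inspect its shared-library dependencies:

ldd /usr/lib64/pypy-5.0.1/site-packages/_mysql.pypy-41.so | grep libmysqlclient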
[root@node02 sync]# pypy mysql-clickhouse-replication.py --help
usage: Data Replication to clikhouse [-h] [-c CONF] [-d] [-l]

mysql data is copied to clikhouse

optional arguments:
  -h, --help            show this help message and exit
  -c CONF, --conf CONF  Data synchronization information file
  -d, --debug           Display SQL information
  -l, --logtoredis      log position to redis ,default file

By dengyayun @2019
At this point the sync program installation is complete.
With the table structures created, now configure the sync program's configuration file metainfo.conf.
The configuration file contents are as follows:
[root@node02 sync]# cat metainfo.conf
# Source server to synchronize from
[master_server]
host='172.16.0.246'
port=3306
user='click_rep'
passwd='jwts996'
server_id=172160246

# Redis connection info, used to store the binlog position
[redis_server]
host='127.0.0.1'
port=6379
passwd='xx'
log_pos_prefix='log_pos_'
**## This demo does not use Redis to store the binlog file and position**

# Record the log position in a file
[log_position]
file='./repl_pos.log'
**## This demo records the binlog file and position in the file repl_pos.log**
#[root@node02 soft]# cat sync/repl_pos.log
#[log_position]
#filename = mysql-bin.000111
#position = 360752645
###################################

**# ClickHouse server info; the synchronized data is written here**
[clickhouse_server]
host=127.0.0.1
port=9000
passwd=''
user='default'

# Column name case: 1 = upper case, 0 = lower case
column_lower_upper=0

**# Databases to synchronize**
[only_schemas]
schemas='test001'

**# Tables to synchronize**
[only_tables]
tables='tb1'

# Skip DML statements (update, delete optional) for specific tables
[skip_dmls_sing]
skip_delete_tb_name = ''
skip_update_tb_name = ''

# Skip DML statements (update, delete optional) for all tables
[skip_dmls_all]
#skip_type = 'delete'
#skip_type = 'delete,update'
skip_type = ''

[bulk_insert_nums]
**# How many records per commit; with pypy, committing every 20k records is recommended**
insert_nums=20000
**# Synchronization interval in seconds; a negative value disables it**
#interval=60
interval=1

# Alert e-mail settings
[failure_alarm]
mail_host= 'smtp.xx.com'
mail_port= 25
mail_user= 'xx'
mail_pass= 'xxx'
mail_send_from = 'xxx'
# Alert recipient
alarm_mail = 'yymysql@gmail.com'

**# Log file path**
[repl_log]
log_dir="/tmp/relication_mysql_clickhouse.log"
The default is to record the binlog position in a file, so there is no need to specify the position-storage method explicitly. Start the sync program:
[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 13:26:55 INFO 开始同步数据时间 2021-07-15 13:26:55 13:26:55 INFO 同步binlog pos点从文件读取 13:26:55 INFO 从服务器 172.16.0.246:3306 同步数据 13:26:55 INFO 读取binlog: mysql-bin.000111:360750299 13:26:55 INFO 同步到clickhouse server 127.0.0.1:9000 13:26:55 INFO 同步到clickhouse的数据库: ['test001'] 13:26:55 INFO 同步到clickhouse的表: ['tb1'] 13:27:59 INFO INSERT 数据插入SQL: INSERT INTO test001.tb1 VALUES, [{u'id': 1, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 2, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 3, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 4, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 5, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 6, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 7, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 8, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 9, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 10, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 11, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}] 13:28:31 INFO INSERT 数据插入SQL: INSERT INTO test001.tb1 VALUES, [{u'id': 12, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 13, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 14, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 15, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 16, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 17, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 18, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 19, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 20, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 21, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}]
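With --debug the program runs in the foreground and prints every SQL statement it applies, which is handy for a first test. For long-running use it could be started in the background instead, for example (a sketch; adjust the log path as needed):

nohup pypy mysql-clickhouse-replication.py --conf metainfo.conf >> /tmp/mysql_clickhouse_sync.out 2>&1 &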
[root@node01 soft]# mysql -e "select * from test001.tb1 where 1=1 order by id limit 20;"
+----+-----------+------------+---------------------+
| id | pay_money | pay_day    | pay_time            |
+----+-----------+------------+---------------------+
|  1 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
|  2 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
|  3 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
+----+-----------+------------+---------------------+
[root@node02 sync]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.tb1 where 1=1 order by id limit 20;"
1 66.22 2019-06-29 2019-06-29 14:00:00
2 66.22 2019-06-29 2019-06-29 14:00:00
3 66.22 2019-06-29 2019-06-29 14:00:00
Add another table to the test001 database on the node01 MySQL server:
CREATE TABLE `t_call_log1` ( `id` bigint(20) NOT NULL COMMENT '记录标识', `user_id` bigint(20) NOT NULL COMMENT '用户标识', `customer_id` bigint(20) DEFAULT NULL COMMENT '客户标识', `city_id` bigint(20) DEFAULT NULL COMMENT '城市标识', `phone` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL COMMENT '对方电话', `name` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT '对方名称', `is_recorded` bit(1) NOT NULL COMMENT '是否录音', `file_size` bigint(20) DEFAULT NULL COMMENT '文件大小(字节)', `file_name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT '文件名称', `created_time` datetime NOT NULL COMMENT '建立时间', `modified_time` datetime DEFAULT NULL COMMENT '修改时间', `call_type` tinyint(4) DEFAULT '1' COMMENT '呼叫方式(1,手机 2,呼叫中心)', `call_id` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT '智齿id', `status_id` tinyint(4) DEFAULT '-1' COMMENT '当前客户状态1.未授信;2.已授信;3.已成单;4.全退租', `contact_id` bigint(20) DEFAULT '0' COMMENT '联系人id', PRIMARY KEY (`id`), KEY `index_phone` (`phone`), KEY `fk_clog_user_id` (`user_id`) USING BTREE, KEY `index_customer_id` (`customer_id`), KEY `index_call_id` (`call_id`) USING BTREE, KEY `idx_created_time` (`created_time`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='电话记录表'
Create the corresponding table t_call_log1 in the ClickHouse service on node02 as well:
[root@node02 data]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090
CREATE TABLE t_call_log1 ENGINE = MergeTree PARTITION BY toYYYYMM(created_time) ORDER BY created_time AS SELECT * FROM mysql('172.16.0.246:3306', 'test001', 't_call_log1', 'click_rep', 'jwts996');

Or as follows:

[root@node02 proxysql]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090
clicku@node02 12:56: [test001]> CREATE TABLE t_call_log1 ENGINE = MergeTree PARTITION BY toYYYYMM(created_time) ORDER BY created_time AS SELECT * FROM mysql('172.16.0.246:3306', 'test001', 't_call_log1', 'click_rep', 'jwts996');
Query OK, 0 rows affected (0.01 sec)

clicku@node02 16:14: [(none)]> show create test001.t_call_log1\G
*************************** 1. row ***************************
statement: CREATE TABLE test001.t_call_log1 ( `id` Int64, `user_id` Int64, `customer_id` Nullable(Int64), `city_id` Nullable(Int64), `phone` String, `name` Nullable(String), `is_recorded` String, `file_size` Nullable(Int64), `file_name` Nullable(String), `created_time` DateTime, `modified_time` Nullable(DateTime), `call_type` Nullable(Int8), `call_id` String, `status_id` Nullable(Int8), `contact_id` Nullable(Int64) ) ENGINE = MergeTree PARTITION BY toYYYYMM(created_time) ORDER BY created_time SETTINGS index_granularity = 8192
1 row in set (0.00 sec)
Add the new table to the configuration file:
[root@tidb04 ~]# egrep "t_call_log1|test001" /data/soft/sync/metainfo.conf
schemas='test001'
tables='tb1,t_call_log1'
Start the sync program:
[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 16:19:02 INFO 开始同步数据时间 2021-07-18 16:19:02 16:19:02 INFO 同步binlog pos点从文件读取 16:19:02 INFO 从服务器 172.16.0.246:3306 同步数据 16:19:02 INFO 读取binlog: mysql-bin.000111:360767728 16:19:02 INFO 同步到clickhouse server 127.0.0.1:9000 16:19:02 INFO 同步到clickhouse的数据库: ['test001'] 16:19:02 INFO 同步到clickhouse的表: ['tb1', 't_call_log1']
Insert data into the test001.t_call_log1 table in MySQL:
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(1,001,1,0001,18535001234,'小花',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(2,001,1,0001,18535001234,'张婉',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(3,001,1,0001,18535001234,'李四',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(4,001,1,0001,18535001234,'王五',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(5,001,1,0001,18535001234,'赵六',0,null,null,now(),now(),1,1,1,0);
Check the sync log:
[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 16:19:02 INFO 开始同步数据时间 2021-07-18 16:19:02 16:19:02 INFO 同步binlog pos点从文件读取 16:19:02 INFO 从服务器 172.16.0.246:3306 同步数据 16:19:02 INFO 读取binlog: mysql-bin.000111:360767728 16:19:02 INFO 同步到clickhouse server 127.0.0.1:9000 16:19:02 INFO 同步到clickhouse的数据库: ['test001'] 16:19:02 INFO 同步到clickhouse的表: ['tb1', 't_call_log1'] 16:19:47 INFO INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 1, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u5c0f\u82b1', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'modified_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 16:20:46 INFO INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 2, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u5f20\u5a49', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 20, 46), u'modified_time': datetime.datetime(2021, 7, 18, 16, 20, 46), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 16:21:33 INFO INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 3, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u674e\u56db', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 16:21:33 INFO INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 4, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u738b\u4e94', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 16:21:34 INFO INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 5, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u8d75\u516d', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 34), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 34), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}]
Verify the data in the ClickHouse table:
[root@node02 proxysql]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1;"
5 1 1 1 18535001234 赵六 0 \N \N 2021-07-18 16:21:34 2021-07-18 16:21:34 1 1 1 0
1 1 1 1 18535001234 小花 0 \N \N 2021-07-18 16:19:47 2021-07-18 16:19:47 1 1 1 0
2 1 1 1 18535001234 张婉 0 \N \N 2021-07-18 16:20:46 2021-07-18 16:20:46 1 1 1 0
3 1 1 1 18535001234 李四 0 \N \N 2021-07-18 16:21:33 2021-07-18 16:21:33 1 1 1 0
4 1 1 1 18535001234 王五 0 \N \N 2021-07-18 16:21:33 2021-07-18 16:21:33 1 1 1 0
UPDATE a row in the MySQL table:
root@node01 16:23: [test001]> update t_call_log1 set name='百万' where id=1; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 root@node01 16:24: [test001]> select * from t_call_log1; +----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+ | id | user_id | customer_id | city_id | phone | name | is_recorded | file_size | file_name | created_time | modified_time | call_type | call_id | status_id | contact_id | +----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+ | 1 | 1 | 1 | 1 | 18535001234 | 百万 | | NULL | NULL | 2021-07-18 16:19:47 | 2021-07-18 16:19:47 | 1 | 1 | 1 | 0 | | 2 | 1 | 1 | 1 | 18535001234 | 张婉 | | NULL | NULL | 2021-07-18 16:20:46 | 2021-07-18 16:20:46 | 1 | 1 | 1 | 0 | | 3 | 1 | 1 | 1 | 18535001234 | 李四 | | NULL | NULL | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 | 1 | 1 | 1 | 0 | | 4 | 1 | 1 | 1 | 18535001234 | 王五 | | NULL | NULL | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 | 1 | 1 | 1 | 0 | | 5 | 1 | 1 | 1 | 18535001234 | 赵六 | | NULL | NULL | 2021-07-18 16:21:34 | 2021-07-18 16:21:34 | 1 | 1 | 1 | 0 | +----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+ 5 rows in set (0.00 sec)
The sync log shows:
16:24:12 INFO INSERT 数据插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 1, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u767e\u4e07', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'modified_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}]
Verify in the ClickHouse database:
[root@node02 proxysql]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1 where name='百万';"
1 1 1 1 18535001234 百万 0 \N \N 2021-07-18 16:19:47 2021-07-18 16:19:47 1 1 1 0
[root@node02 proxysql]#
DELETE rows from the MySQL table:
root@node01 16:26: [test001]> delete from t_call_log1 where id in(4,5); Query OK, 2 rows affected (0.00 sec) root@node01 16:27: [test001]> select * from t_call_log1; +----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+ | id | user_id | customer_id | city_id | phone | name | is_recorded | file_size | file_name | created_time | modified_time | call_type | call_id | status_id | contact_id | +----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+ | 1 | 1 | 1 | 1 | 18535001234 | 百万 | | NULL | NULL | 2021-07-18 16:19:47 | 2021-07-18 16:19:47 | 1 | 1 | 1 | 0 | | 2 | 1 | 1 | 1 | 18535001234 | 张婉 | | NULL | NULL | 2021-07-18 16:20:46 | 2021-07-18 16:20:46 | 1 | 1 | 1 | 0 | | 3 | 1 | 1 | 1 | 18535001234 | 李四 | | NULL | NULL | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 | 1 | 1 | 1 | 0 | +----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+ 3 rows in set (0.00 sec)
The sync log shows:
16:27:18 INFO DELETE 数据删除SQL: alter table test001.t_call_log1 delete where id in (4)
16:27:18 INFO DELETE 数据删除SQL: alter table test001.t_call_log1 delete where id in (5)
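On the ClickHouse side the DELETE is translated into an ALTER TABLE ... DELETE mutation, which runs asynchronously. If deleted rows seem to linger, you can check whether the mutations have finished (an extra check, not in the original walkthrough):

SELECT table, command, is_done
FROM system.mutations
WHERE database = 'test001' AND NOT is_done;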
Verify in the ClickHouse database:
[root@node02 proxysql]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1 ;"
1 1 1 1 18535001234 百万 0 \N \N 2021-07-18 16:19:47 2021-07-18 16:19:47 1 1 1 0
2 1 1 1 18535001234 张婉 0 \N \N 2021-07-18 16:20:46 2021-07-18 16:20:46 1 1 1 0
3 1 1 1 18535001234 李四 0 \N \N 2021-07-18 16:21:33 2021-07-18 16:21:33 1 1 1 0
The results are consistent.
References:
http://www.javashuo.com/article/p-mwthqvpm-be.html
Special thanks to Deng Yayun for sharing this production solution.