postgresql集群的搭建

时间 2019-12-09

原文原文链接

postgresql是一款很强大的数据库，具体有多强大呢，请谷歌。。。
网上的相关资料不是不少，参考了大神们的文档，结合本身趟了的坑，写了pg集群的安装文档。可能有一些参数没有配置好，但愿大神们指出，谢谢。参照本文档，基本能实现pg的高可用，稍后会写出测试文档。javascript

架构图

部署详情

ip	host	系统	vip	部署
192.168.123.180	master	CentOS 6.8	192.168.123.183	postgresql、pgpool
192.168.123.181	slave1	CentOS 6.8	192.168.123.183	postgresql、pgpool
192.168.123.182	slave2	CentOS 6.8	192.168.123.183	postgresql

postgresql的安装

一、安装

一、Rpm包下载地址:
https://yum.postgresql.org/rpmchart.php
下载下面的包：
postgresql95-libs-9.5.8-1PGDG.rhel6.x86_64.rpm
postgresql95-contrib-9.5.8-1PGDG.rhel6.x86_64.rpm
postgresql95-9.5.8-1PGDG.rhel6.x86_64.rpm
postgresql95-server-9.5.8-1PGDG.rhel6.x86_64.rpmphp

二、安装：css

yum install -y postgresql95-libs-9.5.8-1PGDG.rhel6.x86_64.rpm yum install -y postgresql95-contrib-9.5.8-1PGDG.rhel6.x86_64.rpm yum install -y postgresql95-9.5.8-1PGDG.rhel6.x86_64.rpm yum install -y postgresql95-server-9.5.8-1PGDG.rhel6.x86_64.rpm

三、初始化db：
service postgresql-9.5 initdb
四、设置为开机启动：
chkconfig postgresql-9.5 on
五、安装的目录：
/var/lib/pgsql/9.5前端

配置白名单与流复制

一、 master的配置java

PostgreSQL流复制默认是异步的。在主服务器上提交事务和从服务器上变化可见之间有一个小的延迟，这个延迟远小于基于文件日志传送，一般1秒能完成。若是主服务器忽然崩溃，可能会有少许数据丢失。
同步复制必须等主服务器和从服务器都写完WAL后才能提交事务。这样在必定程度上会增长事务的响应时间。
配置同步复制仅须要一个额外的配置步骤： synchronous_standby_names必须设置为一个非空值。synchronous_commit也必须设置为on。
这里部署的是异步的流复制node

1.二、配置白名单:
vim /var/lib/pgsql/9.5/data/pg_hba.conf
在配置文件最后加上：sql

host   all             all              192.168.123.180/32      trust
host   all             all              192.168.123.181/32      trust
host   all             all              192.168.123.182/32      trust
host   replication      replica          192.168.123.181/32      trust
host   replication      replica          192.168.123.182/32      trust

保存退出
我这里因为是内网，所有用的trust，若是要用密码，请修改成passwd数据库

1.三、修改配置文件:
vim /var/lib/pgsql/9.5/data/postgresql.confvim

data_directory = '/app/pgsql/data' #自定义data目录 listen_addresses = '*' #监听全部ip archive_mode = on #容许归档 archive_command = 'cp %p /app/pgsql/pg_archive/%f' #使用命令归档 wal_level = hot_standby #选择热备 max_wal_senders = 16 #最多多少个流复制连接 wal_keep_segments = 256 #流复制保留最多的xlog数 wal_sender_timeout = 60s #流复制主机发送数据超时时间 max_connections = 99 #从库的max_connections必须大于主库的

1.四、建立data目录，赋权并修改启动文件后端

mkdir -p /app/pgsql/data && chown postgres:postgres /app/pgsql/data
mkdir -p /app/pgsql/pg_archive && chown postgres:postgres /app/pgsql/pg_archive

cd /app/pgsql && chmod 700 data cd /app/pgsql && chmod 700 pg_archive

1.五、修改启动文件

vim /etc/init.d/postgresql-9.5 PGDATA=/app/pgsql/data

1.六、把原data目录下的文件copy到新的data的目录下,并改变属组：

cp -r /var/lib/pgsql/9.5/data/* /app/pgsql/data/ cd /app/pgsql && chown -R postgres:postgres data

1.七、启动
service postgresql-9.5 start
之后修改配置文件就在/app/pgsql/data下面修改

1.八、建立一个数据库用户进行主从同步。建立用户replica，并赋予登陆和复制的权限
登录到数据库里：

su postgres
psql
CREATE ROLE replica login replication encrypted password 'replica';

防火墙开放5432端口
在从库上：
psql -h 192.168.123.180 -U postgres

登录主库成功

二、从库的配置
2.一、自定义data目录:

mkdir -p /app/pgsql/data/&&chmod 700 /app/pgsql/data/
chown -R  postgres:postgres pgsql

mkdir /app/pgsql/pg_archive
cd /app/pgsql chmod 700 pg_archive && chown postgres:postgres pg_archive/

2.二、切换用户:
su – postgres
2.三、备份数据：
pg_basebackup -h 192.168.123.180 -U replica -D /app/pgsql/data -X stream –P
2.四、修改启动文件:

vim /etc/init.d/postgresql-9.5 PGDATA=/app/pgsql/data

2.五、配置recovery.conf

su postgres
cp /usr/pgsql-9.5/share/recovery.conf.sample /app/pgsql/data/recovery.conf

vim /app/pgsql/data/recovery.conf

standby_mode = on        \#该节点为从 primary_conninfo = 'host=192.168.123.180 port=5432 user=replica password=replica' \#主服务器的ip、user recovery_target_timeline = 'latest' trigger_file = '/tmp/trigger_file0'

2.六、配置postgresql.conf
vim /app/pgsql/data/postgresql.conf

max_connections = 999     #大于主的链接数 max_standby_streaming_delay = 30s wal_receiver_status_interval = 10s hot_standby_feedback = on

2.七、重启服务：
service postgresql-9.5 restart

验证流复制

3.一、在master上登录psql
查看状态:
select client_addr,sync_state from pg_stat_replication;

3.二、建立test库
Create database test;

3.三、slave上登录psql
查看库
playboy => \l

发现已同步。

pgpool的安装

pgpool-II是PostgreSQL服务器之间一种有效的中间件和PostgreSQL数据库客户端。它提供了如下功能。
链接池
pgpool-II保存到PostgreSQL服务器的链接，当一个相同新链接(如用户名、数据库、协议版本)进来时，重用他们。它减小了链接开销，提升了系统的总体吞吐量。
复制
pgpool-II能够管理多个PostgreSQL服务器。使用复制功能可使2个或更多的物理磁盘上建立一个实时备份，这样服务不会因服务器的磁盘故障而中断。
负载平衡
若是数据库是复制的，在任何服务器上执行一个SELECT查询会返回相同的结果。pgpool-II复制特性的优点在于减小每一个PostgreSQL服务器上的负载，由于它可使用分布在多个服务器之间进行SELECT查询，从而提升系统的总体吞吐量。最好是查询和PostgreSQL服务器数量成必定比例，多用户同时执行多查询达到负载均衡最好的效果。
限制链接数
PostgreSQL的最大并发链接数有必定限制的，当超过限制的链接数后，链接会被拒绝。然而，设置增长最大链接数又会增长资源消耗，影响系统性能。pgpool-II也有最大链接数限制，但超过的链接进来时是进行当即排队，而不是返回一个错误。
pgpool-II交互PostgreSQL的后端和前端协议时，起着继电器的做用。所以，数据库应用程序(前端)认为pgpool-II是真实的PostgreSQL服务器，服务器(后端)认为pgpool-II是它的客户端之一。由于pgpool-II在服务器和客户端是透明的，因此pgpool-II可使用现有的数据库应用程序而作到几乎不修改它们。

版本：3.6
机器：192.168.123.180
192.168.123.181
4.一、免密码登录
安装以前先配置密钥使master和slave1这两台虚拟机的postgres用户能免密链接

先修改postgres的密码，在root用户下
passwd postgres
新密码123456

Master到slave1的免密码登录：
在master上切换至postgres用户，生成密钥
su postgres ssh-keygen -t rsa
而后全输入回车

切换到postgres用户：
su postgres ssh-copy-id -i /var/lib/pgsql/.ssh/id_rsa 192.168.123.181
而后ssh 192.168.123.181 成功，实现master到slave1的免密码登录

Slave1到master的免密码登录：
流程同上
4.二、安装
Rpm包下载地址:
http://www.pgpool.net/yum/rpms/3.6/redhat/rhel-6-x86_64/pgpool-II-pg95-3.6.0-1pgdg.rhel6.x86_64.rpm

安装:
yum install pgpool-II-pg95-3.6.0-1pgdg.rhel6.x86_64.rpm

4.三、配置
Master的配置:

配置白名单：
要和pg_hba.conf登录权限一致，这里因为是私有云，都用trust
vim /etc/pgpool-II/pool_hba.conf
在最后改为:

配置pcp管理工具密码:
pg_md5 pwd
密码加密

vim /etc/pgpool-II/pcp.conf
把刚才加密的密码粘贴到文件里

配置系统命令权限:
Root用户下:
chmod u+s /sbin/ifconfig &&chmod u+s /usr/sbin

配置中间件配置文件:
vim /etc/pgpool-II/pgpool.conf

# CONNECTIONS listen_addresses = '*' port = 9999 pcp_listen_addresses = '*' pcp_port = 9898 # - Backend Connection Settings - backend_hostname0 = 'master' backend_port0 = 5432 backend_weight0 = 1 backend_data_directory0 = ' /app/pgsql/data' backend_flag0 = 'ALLOW_TO_FAILOVER' backend_hostname1 = 'slave' backend_port1 = 5432 backend_weight1 = 1 backend_data_directory1 = ' /app/pgsql/data'' backend_flag1 = 'ALLOW_TO_FAILOVER' # - Authentication - enable_pool_hba = off pool_passwd = 'pool_passwd' # FILE LOCATIONS pid_file_name = '/opt/pgpool/pgpool.pid' replication_mode = off load_balance_mode = on master_slave_mode = on master_slave_sub_mode = 'stream' sr_check_period = 5 sr_check_user = 'repuser' sr_check_password = 'repuser' sr_check_database = 'postgres' #------------------------------------------------------------------------------ # HEALTH CHECK 健康检查 #------------------------------------------------------------------------------ health_check_period = 10 # Health check period # Disabled (0) by default health_check_timeout = 20 # Health check timeout # 0 means no timeout health_check_user = ' repuser ' # Health check user health_check_password = ' repuser ' #数据库密码 # Password for health check user health_check_database = 'postgres' #必须设置，不然primary数据库down了，pgpool不知道，不能及时切换。从库流复制还在链接数据，报链接失败。 #只有下次使用pgpool登陆时，发现链接不上，而后报错，这时候，才知道挂了，pgpool进行切换。 #主备切换的命令行配置 #------------------------------------------------------------------------------ # FAILOVER AND FAILBACK #------------------------------------------------------------------------------ failover_command = ' /opt/pgpool/failover_stream.sh %d %H /tmp/trigger_file0 ' #------------------------------------------------------------------------------ # WATCHDOG #------------------------------------------------------------------------------ # - Enabling - use_watchdog = on # - Watchdog communication Settings - wd_hostname = 'master' # Host name or IP address of this watchdog # (change requires restart) wd_port = 9000 # port number for watchdog service # (change requires restart) # - Virtual IP control Setting - delegate_IP = ' 192.168.123.183 ' # delegate IP address # If this is empty, virtual IP never bring up. # (change requires restart) if_cmd_path = '/sbin' # path to the directory where if_up/down_cmd # (change requires restart) if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0' # startup delegate IP command # (change requires restart) if_down_cmd = 'ifconfig eth0:0 down' # shutdown delegate IP command # (change requires restart) # -- heartbeat mode -- wd_heartbeat_port = 9694 # Port number for receiving heartbeat signal # (change requires restart) wd_heartbeat_keepalive = 2 # Interval time of sending heartbeat signal (sec) # (change requires restart) wd_heartbeat_deadtime = 30 # Deadtime interval for heartbeat signal (sec) # (change requires restart) heartbeat_destination0 = 'slave1' # Host name or IP address of destination 0 # for sending heartbeat signal. # (change requires restart) heartbeat_destination_port0 = 9694 # Port number of destination 0 for sending # heartbeat signal. Usually this is the # same as wd_heartbeat_port. # (change requires restart) heartbeat_device0 = 'eth0' # Name of NIC device (such like 'eth0') # used for sending/receiving heartbeat # signal to/from destination 0. # This works only when this is not empty # and pgpool has root privilege. # (change requires restart) # - Other pgpool Connection Settings - other_pgpool_hostname0 = 'slave' #对端 # Host name or IP address to connect to for # (change requires restart) other_pgpool_port0 = 9999 # Port number for othet pgpool 0 # (change requires restart) other_wd_port0 = 9000 # Port number for othet watchdog 0 # (change requires restart)

4.四、Slave1的配置：

配置白名单：
要和pg_hba.conf登录权限一致，这里因为是私有云，都用trust
vim /etc/pgpool-II/pool_hba.conf
在最后改为:

配置pcp管理工具密码:
pg_md5 pwd
密码加密

vim /etc/pgpool-II/pcp.conf
把刚才加密的密码粘贴到文件里

配置系统命令权限:
Root用户下:
chmod u+s /sbin/ifconfig &&chmod u+s /usr/sbin

配置中间件配置文件:
vim /etc/pgpool-II/pgpool.conf

# CONNECTIONS listen_addresses = '*' port = 9999 pcp_listen_addresses = '*' pcp_port = 9898 # - Backend Connection Settings - backend_hostname0 = 'master' backend_port0 = 5432 backend_weight0 = 1 backend_data_directory0 = ' /app/pgsql/data' backend_flag0 = 'ALLOW_TO_FAILOVER' backend_hostname1 = 'slave1' backend_port1 = 5432 backend_weight1 = 1 backend_data_directory1 = '/app/pgsql/data ' backend_flag1 = 'ALLOW_TO_FAILOVER' # - Authentication - enable_pool_hba = off pool_passwd = 'pool_passwd' # FILE LOCATIONS pid_file_name = '/opt/pgpool/pgpool.pid' replication_mode = off load_balance_mode = on master_slave_mode = on master_slave_sub_mode = 'stream' sr_check_period = 5 sr_check_user = 'repuser' sr_check_password = 'repuser' sr_check_database = 'postgres' #------------------------------------------------------------------------------ # HEALTH CHECK 健康检查 #------------------------------------------------------------------------------ health_check_period = 10 # Health check period # Disabled (0) by default health_check_timeout = 20 # Health check timeout # 0 means no timeout health_check_user = ' repuser ' # Health check user health_check_password = ' repuser ' #数据库密码 # Password for health check user health_check_database = 'postgres' #必须设置，不然primary数据库down了，pgpool不知道，不能及时切换。从库流复制还在链接数据，报链接失败。 #只有下次使用pgpool登陆时，发现链接不上，而后报错，这时候，才知道挂了，pgpool进行切换。 #主备切换的命令行配置 #------------------------------------------------------------------------------ # FAILOVER AND FAILBACK #------------------------------------------------------------------------------ failover_command = '/opt/pgpool/failover_stream.sh %d %H /tmp/trigger_file0 ' #------------------------------------------------------------------------------ # WATCHDOG #------------------------------------------------------------------------------ # - Enabling - use_watchdog = on # - Watchdog communication Settings - wd_hostname = 'slave1' #本端 # Host name or IP address of this watchdog # (change requires restart) wd_port = 9000 # port number for watchdog service # (change requires restart) # - Virtual IP control Setting - delegate_IP = '192.168.123.183' # delegate IP address # If this is empty, virtual IP never bring up. # (change requires restart) if_cmd_path = '/sbin' # path to the directory where if_up/down_cmd exists # (change requires restart) if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0' # startup delegate IP command # (change requires restart) if_down_cmd = 'ifconfig eth0:0 down' # shutdown delegate IP command # (change requires restart) # -- heartbeat mode -- wd_heartbeat_port = 9694 # Port number for receiving heartbeat signal # (change requires restart) wd_heartbeat_keepalive = 2 # Interval time of sending heartbeat signal (sec) # (change requires restart) wd_heartbeat_deadtime = 30 # Deadtime interval for heartbeat signal (sec) # (change requires restart) heartbeat_destination0 = 'master' #对端 # Host name or IP address of destination 0 # for sending heartbeat signal. # (change requires restart) heartbeat_destination_port0 = 9694 # Port number of destination 0 for sending # heartbeat signal. Usually this is the # same as wd_heartbeat_port. # (change requires restart) heartbeat_device0 = 'eth0' # Name of NIC device (such like 'eth0') # used for sending/receiving heartbeat # signal to/from destination 0. # This works only when this is not empty # and pgpool has root privilege. # (change requires restart) # - Other pgpool Connection Settings - other_pgpool_hostname0 = 'master' #对端 # Host name or IP address to connect to for # (change requires restart) other_pgpool_port0 = 9999 # Port number for othet pgpool 0 # (change requires restart) other_wd_port0 = 9000 # Port number for othet watchdog 0 # (change requires restart)

下面在master和slave1这两台机器上
mkdir /opt/pgpool &&chown postgres:postgres /opt/pgpool

master上：

cd /opt/pgpool vim failover_stream.sh

# Failover command for streaming replication. # This script assumes that DB node 0 is primary, and 1 is standby. # # If standby goes down, do nothing. If primary goes down, create a # trigger file so that standby takes over primary node. # # Arguments: $1: failed node id. $2: new master hostname. $3: path to # trigger file. failed_node=192.168.123.180 new_master=192.168.123.181 trigger_file=$3 # Do nothing if standby goes down. if [ $failed_node = 1 ]; then exit 0; fi # Create the trigger file. /usr/bin/ssh -T $new_master /bin/touch $trigger_file exit 0;

保存退出,并赋权
chown postgres:postgres failover_stream.sh &&chmod 777 failover_stream.sh

slave1上：

cd /opt/pgpool vim failover_stream.sh

#! /bin/sh # Failover command for streaming replication. # This script assumes that DB node 0 is primary, and 1 is standby. # # If standby goes down, do nothing. If primary goes down, create a # trigger file so that standby takes over primary node. # # Arguments: $1: failed node id. $2: new master hostname. $3: path to # trigger file. failed_node=192.168.123.181 new_master=192.168.123.180 trigger_file=$3 # Do nothing if standby goes down. if [ $failed_node = 1 ]; then exit 0; fi # Create the trigger file. /usr/bin/ssh -T $new_master /bin/touch $trigger_file exit 0;

保存退出并赋权:
chown postgres:postgres failover_stream.sh &&chmod 777 failover_stream.sh

在master和salve1上建立日志文件:

mkdir /var/log/pgpool chown -R postgres.postgres /var/log/pgpool mkdir /var/run/pgpool chown -R postgres.postgres /var/run/pgpool

配置host：
在master、slave一、slave2的/etc/hosts配置ip映射

192.168.123.180 master 192.168.123.181 slave1 192.168.123.182 slave2

在master上建立repusr帐号:

su postgres
psql
create user repuser with password 'repuser';

防火墙开放：5432 9999 9898 9000 9694
4.五、启动pgpool
先在master上启动pgpool：
pgpool -n -d -D > /var/log/pgpool/pgpool.log 2>&1 &

再在slave1上启动pgpool:
pgpool -n -d -D > /var/log/pgpool/pgpool.log 2>&1 &
Master上的ifconfig会多一个虚拟ip出来

快速中止pgpool的指令:
pgpool -m fast stop
链接虚拟ip进入数据库:

Su postgres psql -h 192.168.123.183 -p 9999

查看集群节点状态：
show pool_nodes;

写在安装后

写文档才发现，边安装边趟坑边记录，还要注意格式是多么愉（xin）快（ku）的事，本着开源精神，贴出来给要用到pg集群的小伙伴们参考，本文侵删