Kafka Cluster Deployment

Date: 2020-10-25  Tags: web, vim, centos, bash, server, app, socket, distributed, svg, big data

1. Introduction

Kafka is a distributed message queue based on the publish/subscribe model. It is
mainly used for real-time processing of big data.


2. Cluster Plan

Host	centos7-1	centos7-2	centos7-3	centos7-4
kafka	√	√	√	√
zookeeper	√	√	√	√

3. Cluster Deployment

Extract the installation package
Edit the configuration file

vim config/server.properties

# Globally unique ID of this broker; no two brokers may share the same value
broker.id=0

# Enable topic deletion
delete.topic.enable=true

# Number of threads used for disk I/O
num.io.threads=8

# Size of the socket send buffer, in bytes
socket.send.buffer.bytes=102400

# Size of the socket receive buffer, in bytes
socket.receive.buffer.bytes=102400

# Maximum size of a request that the socket server will accept, in bytes
socket.request.max.bytes=104857600

# Directory where Kafka stores its logs. These are not only runtime logs: the message data itself is kept here, in subdirectories named <topic>-<partition>
log.dirs=/data/kafka/data/kafka-logs

# Default number of partitions per topic
num.partitions=1

# Number of threads per data directory used for log recovery at startup and flushing at shutdown
num.recovery.threads.per.data.dir=1

# Maximum time a segment file is kept before it is deleted, in hours (168 hours = 7 days)
log.retention.hours=168

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

# Addresses of the ZooKeeper cluster to connect to
zookeeper.connect=centos7-1:2181,centos7-2:2181,centos7-3:2181

# Timeout for connecting to ZooKeeper, in milliseconds
zookeeper.connection.timeout.ms=18000

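# Delay in milliseconds before the group coordinator performs the initial consumer rebalance. A value of 0 is convenient for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.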
group.initial.rebalance.delay.ms=0
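Before copying the file to the other hosts, it can help to review only the settings that are actually in effect, without comments and blank lines. The following is a minimal sketch: the function name show_effective and the throwaway demo file are illustrative and not part of Kafka.

```shell
#!/usr/bin/env bash
# Print only the effective lines of a properties file:
# drop blank lines and lines whose first non-space character is '#'.
show_effective() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Demo on a temporary file so no real configuration is touched.
demo_props="$(mktemp)"
printf '# a comment\n\nbroker.id=0\n  # indented comment\nnum.partitions=1\n' > "$demo_props"
show_effective "$demo_props"
```

On a real broker the same function would be pointed at the edited server.properties instead of the demo file.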

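The nodes of a Kafka cluster are not divided into server and client roles. They recognize one another as a cluster because they are all managed by the same ZooKeeper and each registers in ZooKeeper under a unique broker.id. So on the other three servers you only need to change broker.id, and the cluster setup is complete.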
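The per-host change to broker.id can be sketched as a small helper. This is only an illustration under stated assumptions: the function name set_broker_id is made up for this example, GNU sed (as shipped with CentOS) is assumed for the -i option, and the demo runs on a temporary file rather than a real server.properties.

```shell
#!/usr/bin/env bash
# Rewrite the broker.id line of a properties file in place.
# Usage: set_broker_id <file> <id>   (hypothetical helper, not part of Kafka)
set_broker_id() {
  local file="$1" id="$2"
  # GNU sed assumed: -i edits the file in place.
  sed -i "s/^broker\.id=.*/broker.id=${id}/" "$file"
}

# Demo on a temporary file so no real configuration is touched.
demo_conf="$(mktemp)"
printf 'broker.id=0\nnum.partitions=1\n' > "$demo_conf"
set_broker_id "$demo_conf" 2
grep '^broker\.id=' "$demo_conf"
```

On each of the other three servers the helper would be run against the copied server.properties with a different ID (for example 1, 2 and 3), after which the broker can be started with bin/kafka-server-start.sh -daemon followed by the path to that file.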