前面的章节介绍了hive的知识,本节博主将分享日志采集框架Flume的相关知识。在一个完整的大数据处理系统中,除了hdfs+mapreduce+hive组成分析系统的核心以外,还须要数据采集、结果数据导出、任务调度等不可或缺的辅助系统,而这些辅助工具在hadoop生态体系中都有便捷的开源框架,如图所示:java
1、Flume介绍node
(1)概述linux
Flume是一个分布式、可靠、和高可用的海量日志采集、聚合和传输的系统。
Flume能够采集文件,socket数据包等各类形式源数据,又能够将采集到的数据输出到HDFS、hbase、hive、kafka等众多外部存储系统中
通常的采集需求,经过对flume的简单配置便可实现
Flume针对特殊场景也具有良好的自定义扩展能力,所以,flume能够适用于大部分的平常数据采集场景web
(2)运行机制sql
a、Flume分布式系统中最核心的角色是agent,flume采集系统就是由一个个agent所链接起来造成
b、每个agent至关于一个数据传递员(Source 到 Channel 到 Sink之间传递数据的形式是Event事件;Event事件是一个数据流单元),内部有三个组件:shell
(3)Flume采集系统结构图apache
a、简单结构(单个agent采集数据)json
b、复杂结构(多级agent之间串联)centos
2、Flume实战案例api
(1) Flume的安装部署
一、Flume的安装很是简单,只须要解压便可,固然,前提是已有hadoop环境 下载Flume软件包,官网地址:http://flume.apache.org/ 上传安装包到数据源所在节点上 Alt+p put apache-flume-1.7.0-bin.tar.gz 而后解压 tar -zxvf apache-flume-1.7.0-bin.tar.gz -C apps/ 而后进入flume的目录,修改conf下的flume-env.sh,在里面配置JAVA_HOME 二、根据数据采集的需求配置采集方案,描述在配置文件中(文件名可任意自定义) 三、指定采集方案配置文件,在相应的节点上启动flume agent
先用一个最简单的例子来测试一下程序环境是否正常
a、先在flume的conf目录下新建一个文件
vi netcat-logger.conf
# 定义这个agent中各组件的名字 a1.sources = r1 a1.sinks = k1 a1.channels = c1 # 描述和配置source组件:r1 a1.sources.r1.type = netcat a1.sources.r1.bind = localhost a1.sources.r1.port = 44444 # 描述和配置sink组件:k1 a1.sinks.k1.type = logger # 描述和配置channel组件,此处使用是内存缓存的方式 a1.channels.c1.type = memory a1.channels.c1.capacity = 1000 a1.channels.c1.transactionCapacity = 100 # 描述和配置source channel sink之间的链接关系 a1.sources.r1.channels = c1 a1.sinks.k1.channel = c1
b、启动agent去采集数据
bin/flume-ng agent -c conf -f conf/netcat-logger.conf -n a1 -Dflume.root.logger=INFO,console
-c conf 指定flume自身的配置文件所在目录 -f conf/netcat-logger.con 指定咱们所描述的采集方案 -n a1 指定咱们这个agent的名字
执行效果:
b、测试
先要往agent采集监听的端口上发送数据,让agent有数据可采
随便在一个能跟agent节点联网的机器上
#先检查telnet是否安装 rpm -qa | grep telnet #检查yum列表中可用telnet版本 yum list | grep telnet #安装 yum install -y telnet-server.x86_64 yum install -y telnet.x86_64
telnet anget-hostname port (telnet localhost 44444)
[hadoop@centos-aaron-h1 ~]$ talnet localhost 44444 -bash: talnet: command not found [hadoop@centos-aaron-h1 ~]$ telnet localhost 44444 Trying ::1... telnet: connect to address ::1: Connection refused Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. 121 OK sdf OK sd OK fsd OK f OK sd OK f OK sdhello OK hellow
(2) Flume采集文件夹里的文件
#建立在flume下conf文件夹中spooldir-logger.conf文件 vi spooldir-logger.conf #新增如下内容 # Name the components on this agent a1.sources = r1 a1.sinks = k1 a1.channels = c1 # Describe/configure the source #监听目录,spoolDir指定目录, fileHeader要不要给文件夹前坠名 a1.sources.r1.type = spooldir a1.sources.r1.spoolDir = /home/hadoop/flumespool a1.sources.r1.fileHeader = true # Describe the sink a1.sinks.k1.type = logger # Use a channel which buffers events in memory a1.channels.c1.type = memory a1.channels.c1.capacity = 1000 a1.channels.c1.transactionCapacity = 100 # Bind the source and sink to the channel a1.sources.r1.channels = c1 a1.sinks.k1.channel = c1
#建立flume监控的文件夹 mkdir /home/hadoop/flumespool
#启动flume服务 bin/flume-ng agent -c ./conf -f ./conf/spooldir-logger.conf -n a1 -Dflume.root.logger=INFO,console
测试: 往/home/hadoop/flumeSpool放文件(mv ././xxxFile /home/hadoop/flumeSpool),可是不要在里面生成文件
执行效果:
[hadoop@centos-aaron-h1 apache-flume-1.6.0-bin]$ bin/flume-ng agent -c ./conf -f ./conf/spooldir-logger.conf -n a1 -Dflume.root.logger=INFO,console Info: Including Hadoop libraries found via (/home/hadoop/apps/hadoop-2.9.1/bin/hadoop) for HDFS access Info: Excluding /home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-api-1.7.25.jar from classpath Info: Excluding /home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar from classpath Info: Including Hive libraries found via (/home/hadoop/apps/apache-hive-1.2.2-bin) for Hive access + exec /usr/local/jdk1.7.0_45/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/home/hadoop/apps/apache-flume-1.6.0-bin/conf:/home/hadoop/apps/apache-flume-1.6.0-bin/lib/*:/home/hadoop/apps/hadoop-2.9.1/etc/hadoop:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/activation-1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/asm-3.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/avro-1.7.7.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-cli-1.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-codec-1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-collections-3.2.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-compress-1.4.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-configuration-1.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-digester-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-io-2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-lang-2.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-lang3-3.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-logging-1.1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-math3-3.1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-net-3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/curator-client-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/curator-framework-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/gson-2.2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/guava-11.0.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/hadoop-annotations-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/hadoop-auth-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/hamcrest-core-1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/httpclient-4.5.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/httpcore-4.4.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jcip-annotations-1.0-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jersey-core-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jersey-json-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jersey-server-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jets3t-0.9.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jettison-1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jetty-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jetty-sslengine-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jetty-util-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jsch-0.1.54.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/json-smart-1.3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jsp-api-2.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jsr305-3.0.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/junit-4.11.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/log4j-1.2.17.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/mockito-all-1.8.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/netty-3.6.2.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/nimbus-jose-jwt-4.41.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/paranamer-2.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/servlet-api-2.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/snappy-java-1.0.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/stax2-api-3.1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/stax-api-1.0-2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/woodstox-core-5.0.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/xmlenc-0.52.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/xz-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/zookeeper-3.4.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/hadoop-common-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/hadoop-common-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/hadoop-nfs-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/jdiff:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/sources:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/templates:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/asm-3.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-io-2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/guava-11.0.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/hadoop-hdfs-client-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/htrace-core4-4.1.0-incubating.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-annotations-2.7.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-core-2.7.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-databind-2.7.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/okhttp-2.7.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/okio-1.6.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-client-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-client-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-native-client-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-native-client-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-nfs-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-rbf-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-rbf-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/jdiff:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/sources:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/templates:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/webapps:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/activation-1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/aopalliance-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/apacheds-i18n-2.0.0-M15.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/api-asn1-api-1.0.0-M20.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/api-util-1.0.0-M20.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/asm-3.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/avro-1.7.7.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-beanutils-1.7.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-cli-1.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-codec-1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-collections-3.2.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-configuration-1.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-digester-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-io-2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-lang-2.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-lang3-3.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-math3-3.1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-net-3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/curator-client-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/curator-framework-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/curator-recipes-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/ehcache-3.3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/fst-2.50.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/geronimo-jcache_1.0_spec-1.0-alpha-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/gson-2.2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/guava-11.0.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/guice-3.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/HikariCP-java7-2.4.12.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/htrace-core4-4.1.0-incubating.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/httpclient-4.5.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/httpcore-4.4.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/java-util-1.9.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/javax.inject-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/java-xmlbuilder-0.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jcip-annotations-1.0-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-client-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-core-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-json-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-server-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jets3t-0.9.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jettison-1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jetty-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jetty-sslengine-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jsch-0.1.54.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/json-io-2.5.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/json-smart-1.3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jsp-api-2.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/log4j-1.2.17.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/metrics-core-3.0.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/mssql-jdbc-6.2.1.jre7.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/nimbus-jose-jwt-4.41.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/paranamer-2.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/servlet-api-2.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/snappy-java-1.0.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/stax2-api-3.1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/woodstox-core-5.0.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/xmlenc-0.52.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/xz-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-api-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-client-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-common-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-registry-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-common-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-router-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-tests-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/sources:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/test:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/timelineservice:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/avro-1.7.7.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/hadoop-annotations-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/junit-4.11.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/snappy-java-1.0.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/xz-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/jdiff:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib-examples:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/sources:/home/hadoop/apps/hadoop-2.9.1/contrib/capacity-scheduler/*.jar:/home/hadoop/apps/apache-hive-1.2.2-bin/lib/*' -Djava.library.path=:/home/hadoop/apps/hadoop-2.9.1/lib/native org.apache.flume.node.Application -f ./conf/spooldir-logger.conf -n a1 2019-02-13 07:40:44,890 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting 2019-02-13 07:40:44,897 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:./conf/spooldir-logger.conf 2019-02-13 07:40:44,901 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:931)] Added sinks: k1 Agent: a1 2019-02-13 07:40:44,902 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:k1 2019-02-13 07:40:44,902 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:k1 2019-02-13 07:40:44,912 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:141)] Post-validation flume configuration contains configuration for agents: [a1] 2019-02-13 07:40:44,913 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:145)] Creating channels 2019-02-13 07:40:44,921 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:42)] Creating instance of channel c1 type memory 2019-02-13 07:40:44,927 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:200)] Created channel c1 2019-02-13 07:40:44,931 (conf-file-poller-0) [INFO - org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:41)] Creating instance of source r1, type spooldir 2019-02-13 07:40:44,939 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: k1, type: logger 2019-02-13 07:40:44,943 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:114)] Channel c1 connected to [r1, k1] 2019-02-13 07:40:44,950 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:Spool Directory source r1: { spoolDir: /home/hadoop/flumespool } }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@70edf123 counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} } 2019-02-13 07:40:44,957 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel c1 2019-02-13 07:40:44,988 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean. 2019-02-13 07:40:44,988 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: CHANNEL, name: c1 started 2019-02-13 07:40:44,989 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink k1 2019-02-13 07:40:44,990 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:184)] Starting Source r1 2019-02-13 07:40:44,990 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.source.SpoolDirectorySource.start(SpoolDirectorySource.java:78)] SpoolDirectorySource source starting with directory: /home/hadoop/flumespool 2019-02-13 07:40:45,084 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean. 2019-02-13 07:40:45,084 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SOURCE, name: r1 started 2019-02-13 07:43:51,716 (pool-3-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:258)] Last read took us just up to a file boundary. Rolling to the next file, if there is one. 2019-02-13 07:43:51,716 (pool-3-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:348)] Preparing to move file /home/hadoop/flumespool/xxx4.txt to /home/hadoop/flumespool/xxx4.txt.COMPLETED 2019-02-13 07:43:55,040 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 68 65 6C 6C 6F 77 20 20 68 65 6C 6C 6F 20 0D hellow hello . } 2019-02-13 07:43:55,040 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 68 65 6C 6C 6F 77 78 20 68 65 6C 6C 6F 77 62 62 hellowx hellowbb } 2019-02-13 07:43:55,041 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 73 68 65 6C 6C 20 62 61 6E 61 6E 65 72 20 76 69 shell bananer vi } 2019-02-13 07:43:55,041 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 62 62 73 0D bbs. } 2019-02-13 07:43:55,041 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 74 74 78 20 74 74 74 74 20 79 79 79 79 20 68 68 ttx tttt yyyy hh } 2019-02-13 07:43:55,042 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 73 68 61 6D 65 20 6A 69 6D 65 20 61 70 70 6C 65 shame jime apple } 2019-02-13 07:43:55,042 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 63 68 65 6C 6C 79 chelly }
注意:
a、因为博主的linux系统中已经安装了jdk1.7,而Flume在1.8之后就只支持jdk1.8或者更新的版本,下载Flume1.7过程实在太慢,因此博主用Flume1.6的版原本作演示。在生产环境中建议小伙伴们根据本身所在项目的须要选择适合本身项目的版本。
b、上面介绍的Flume的收集方案配置文件都bind的是localhost主机,因此在其它机器上执行telnet是没法链接上的,正确的作法是将bind主机到Flume安装机器的主机名或ip。【此处道理同启动yarn集群,必须在配置的yarn resourcemanager的主机上执行启动脚本start-yarn.sh同样的道理】
c、以上配置文件博主只展现了最简单的配置,更多精细化的配置见官网说明,须要强调的是Flume的官网文档是很是值得确定的,相较于其余开源软件算是很是完善的。
最后寄语,以上是博主本次文章的所有内容,若是你们以为博主的文章还不错,请点赞;若是您对博主其它服务器大数据技术或者博主本人感兴趣,请关注博主博客,而且欢迎随时跟博主沟通交流。