1.Avro Source
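Overview: Listens on an Avro port and receives event streams from external Avro clients. Multiple Flume agents can be chained together over Avro to form a tiered topology.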
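Property Name Default Description
channels –
type – Must be set to avro
bind – Hostname or IP address to listen on
port – Port to listen on
threads – Maximum number of worker threads
selector.type
selector.*
interceptors – Space-separated list of interceptors
interceptors.*
compression-type none Can be “none” or “deflate”. The compression type must match that of the Avro client or Avro sink sending to this source.
ssl false Set to true to enable SSL. “keystore” and “keystore-password” must then also be specified.
keystore – Path to the Java keystore file. Required for SSL.
keystore-password – Password for the Java keystore. Required for SSL.
keystore-type JKS Type of the Java keystore. Can be “JKS” or “PKCS12”.
exclude-protocols SSLv3 Space-separated list of SSL/TLS protocols to exclude. SSLv3 is excluded by default.
ipFilter false Set to true to enable Netty IP filtering
ipFilter.rules – Comma-separated list of IP filter rules
Example of ipFilter.rules: ipFilter.rules=allow:ip:127.*,allow:name:localhost,deny:ip:*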
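A minimal sketch of an agent configuration using the Avro source may help tie the properties together. The agent, source, and channel names (a1, r1, c1), the port, and the memory channel are illustrative assumptions, not values from the text above.

```properties
# Hypothetical agent "a1" with one source "r1" and one channel "c1"
a1.sources = r1
a1.channels = c1
# A memory channel is assumed here only so the fragment is complete
a1.channels.c1.type = memory

a1.sources.r1.type = avro
a1.sources.r1.channels = c1
# Listen on all interfaces; port 4141 is an arbitrary choice
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141

# Optional Netty IP filter, reusing the rule string from the example above
a1.sources.r1.ipFilter = true
a1.sources.r1.ipFilter.rules = allow:ip:127.*,allow:name:localhost,deny:ip:*
```

In this rule string, rules are evaluated in order, so the final deny:ip:* acts as a catch-all after the two allow rules.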
2.Thrift Source
Overview: Listens on a Thrift port and receives event streams from external Thrift clients.
Property Name Default Description
channels –
type – Must be set to thrift
bind – Hostname or IP address to listen on
port – Port to listen on
threads – Maximum number of worker threads
selector.type
selector.*
interceptors – Space-separated list of interceptors
interceptors.*
ssl false Set to true to enable SSL. “keystore” and “keystore-password” must then also be specified.
keystore – This is the path to a Java keystore file. Required for SSL.
keystore-password – The password for the Java keystore. Required for SSL.
keystore-type JKS The type of the Java keystore. This can be “JKS” or “PKCS12”.
exclude-protocols SSLv3 Space-separated list of SSL/TLS protocols to exclude. SSLv3 will always be excluded in addition to the protocols specified.
kerberos false Set to true to enable Kerberos authentication. In Kerberos mode, agent-principal and agent-keytab are required for successful authentication. In secure mode, the Thrift source accepts connections only from Thrift clients that have Kerberos enabled and have successfully authenticated to the Kerberos KDC.
agent-principal – The Kerberos principal used by the Thrift source to authenticate to the Kerberos KDC.
agent-keytab – The keytab location used by the Thrift source, in combination with agent-principal, to authenticate to the Kerberos KDC.
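As a rough sketch, a Thrift source with SSL enabled could be configured along these lines. The names a1, r1, and c1, the port, the keystore path, and the password are all placeholders chosen for illustration; channel c1 is assumed to be defined elsewhere in the same file.

```properties
# Hypothetical agent "a1", source "r1", channel "c1"
a1.sources = r1
a1.channels = c1

a1.sources.r1.type = thrift
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
# Port 4141 is an arbitrary choice
a1.sources.r1.port = 4141

# Optional SSL; the path and password below are placeholders
a1.sources.r1.ssl = true
a1.sources.r1.keystore = /path/to/keystore.jks
a1.sources.r1.keystore-password = changeit
a1.sources.r1.keystore-type = JKS
```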
3.Exec Source
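Overview: Runs a given Unix command that is expected to continuously produce data on standard output (stderr is simply discarded unless the logStdErr property is set to true). If the process exits for any reason, the source produces no further data unless restart is enabled.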
Property Name Default Description
channels –
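type – Must be set to exec
command – The Unix command to execute
shell – The shell invocation used to run the command, e.g. /bin/sh -c.
restartThrottle 10000 Time in milliseconds to wait before attempting a restart
restart false Whether to restart the command if it dies
logStdErr false Whether the command's stderr is logged
batchSize 20 Number of records sent to the channel at a time
batchTimeout 3000 Time in milliseconds to wait before buffered data is sent if the buffer has not yet filled
selector.type replicating replicating or multiplexing
selector.* Depends on the selector.type value
interceptors – Space-separated list of interceptors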
interceptors.*
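Warning: ExecSource and other asynchronous sources cannot guarantee delivery, so data may be lost. For example, if the process fails while you are tailing a file, then after recovery ExecSource has no way of knowing how far it had read and simply resumes tailing from the current position. For stronger reliability, use the Spooling Directory Source instead.
Tip: When using tail, pass -F rather than -f; -F also handles file rotation.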
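The following sketch shows how the exec source properties might be combined. The agent, source, and channel names (a1, r1, r2, c1) and the file paths are assumptions for illustration; channel c1 is assumed to be defined elsewhere.

```properties
# Hypothetical agent "a1" with two exec sources feeding channel "c1"
a1.sources = r1 r2
a1.channels = c1

# r1: tail a log file, using -F so that file rotation is handled
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /var/log/secure
# Restart the command if it dies, waiting 10 seconds between attempts
a1.sources.r1.restart = true
a1.sources.r1.restartThrottle = 10000
a1.sources.r1.logStdErr = true

# r2: a command that relies on shell features (wildcards, loops),
# so it is run through an explicit shell invocation
a1.sources.r2.type = exec
a1.sources.r2.channels = c1
a1.sources.r2.shell = /bin/sh -c
a1.sources.r2.command = for f in /path/to/logs/*.txt; do cat "$f"; done
```

Even with restart enabled, the warning above still applies: a restarted tail begins from the current position, so lines written while the command was down are not recovered.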
4.JMS Source
Overview: The JMS source reads messages from a JMS destination such as a queue or topic. It has only been tested with ActiveMQ.
Property Name Default Description
channels –
type – Must be set to jms
initialContextFactory – Initial context factory, e.g. org.apache.activemq.jndi.ActiveMQInitialContextFactory
connectionFactory – The JNDI name the connection factory should appear as
providerURL – The JMS provider URL
destinationName – Destination name
destinationType – Destination type (queue or topic)
messageSelector – Message selector to use when creating the consumer
userName – Username for the destination/provider
passwordFile – File containing the password for the destination/provider
batchSize 100 Number of messages to consume in one batch
converter.type DEFAULT Class to use to convert messages to flume events. See below.
converter.* – Converter properties.
converter.charset UTF-8 Default converter only. Charset to use when converting JMS TextMessages to byte arrays.
Converter: the default converter can convert BytesMessage, TextMessage, and ObjectMessage into Flume events.
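A possible configuration for reading from an ActiveMQ queue is sketched below. The names a1, r1, and c1, the broker URL, the connection factory name, the queue name, the user name, and the password file path are illustrative assumptions; channel c1 is assumed to be defined elsewhere.

```properties
# Hypothetical agent "a1", source "r1", channel "c1"
a1.sources = r1
a1.channels = c1

a1.sources.r1.type = jms
a1.sources.r1.channels = c1
a1.sources.r1.initialContextFactory = org.apache.activemq.jndi.ActiveMQInitialContextFactory
# Placeholder JNDI name, broker URL, and destination
a1.sources.r1.connectionFactory = GenericConnectionFactory
a1.sources.r1.providerURL = tcp://mqserver:61616
a1.sources.r1.destinationName = BUSINESS_DATA
a1.sources.r1.destinationType = QUEUE

# Optional credentials; the password is read from a file, not stored inline
a1.sources.r1.userName = flume
a1.sources.r1.passwordFile = /path/to/jms-password.txt
a1.sources.r1.batchSize = 100
```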
5.Spooling Directory Source
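Overview: Watches a given directory for files and ingests them as they appear. This source is reliable and will not lose data. Files placed in the directory must be immutable and uniquely named.
Flume reports an error and stops processing if either of the following occurs:
1.A file is written to after it has been placed in the spooling directory.
2.A file name is reused at a later time.
To avoid these problems, add a unique identifier such as a timestamp to each file name when moving it into the spooling directory.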
Property Name Default Description
channels –
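type – Must be set to spooldir.
spoolDir – Directory from which files are read
fileSuffix .COMPLETED Suffix appended to a file once it has been fully read
deletePolicy never When to delete completed files: never or immediate
fileHeader false Whether to add a header storing the absolute path of the file
fileHeaderKey file Header key to use when adding the absolute path of the file to the event header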
basenameHeader false Whether to add a header storing the basename of the file.
basenameHeaderKey basename Header Key to use when appending basename of file to event header.
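ignorePattern ^$ Regular expression specifying which files to skip
trackerDir .flumespool Directory in which processing metadata is stored. If this is not an absolute path, it is interpreted relative to spoolDir.
consumeOrder oldest Order in which files are consumed: oldest, youngest, or random. For oldest and youngest, files are compared by last-modified time; in case of a tie, the file with the smallest lexicographical name is consumed first.
maxBackoff 4000 Maximum time in milliseconds to wait before retrying a write to the channel when the channel is full
batchSize 100 Number of records sent to the channel per batch
inputCharset UTF-8 Character set used when the input file is treated as text.
decodeErrorPolicy FAIL What to do when the file contains a character that cannot be decoded. FAIL: Throw an exception and fail to parse the file. REPLACE: Replace the unparseable character with the “replacement character” char, typically Unicode U+FFFD. IGNORE: Drop the unparseable character sequence.
deserializer LINE Deserializer used to parse the file into events. By default, each line becomes one event. A custom class must implement EventDeserializer.Builder.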
deserializer.* Varies per event deserializer.
bufferMaxLines – (Obsolete) This option is now ignored.
bufferMaxLineLength 5000 (Deprecated) Maximum length of a line in the commit buffer. Use deserializer.maxLineLength instead.
selector.type replicating replicating or multiplexing
selector.* Depends on the selector.type value
interceptors – Space-separated list of interceptors
interceptors.*
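To illustrate how these properties fit together, here is a hedged sketch of a spooling directory source. The names a1, r1, and c1, the directory path, the ignore pattern, and the line-length value are assumptions for illustration; channel c1 is assumed to be defined elsewhere.

```properties
# Hypothetical agent "a1", source "r1", channel "c1"
a1.sources = r1
a1.channels = c1

a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
# Placeholder directory; files dropped here must be complete and uniquely named
a1.sources.r1.spoolDir = /var/log/apache/flumeSpool
# Record each file's absolute path in the "file" event header
a1.sources.r1.fileHeader = true
a1.sources.r1.fileHeaderKey = file
# Keep finished files, renamed with the .COMPLETED suffix
a1.sources.r1.deletePolicy = never
a1.sources.r1.fileSuffix = .COMPLETED
# Skip files still carrying a temporary extension (illustrative pattern)
a1.sources.r1.ignorePattern = ^.*[.]tmp$
# Preferred replacement for the deprecated bufferMaxLineLength
a1.sources.r1.deserializer.maxLineLength = 4096
```

The character class [.] is used in the pattern instead of a backslash escape because backslashes have special meaning in Java properties files.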