InfluxDB is a time-series database developed by InfluxData (www.influxdata.com) for storing data and ordering it by time. In the TIG (Telegraf + InfluxDB + Grafana) stack it acts as the middle layer: it stores the raw data and indexes it by time so that it can serve time-series queries. When building a TIG stack, InfluxDB is the piece to set up first.
About InfluxDB:
Uses the TSM (Time Structured Merge) storage engine, which allows high ingest rates and data compression;
Written in Go, with no external dependencies;
Simple, high-performance HTTP APIs for writes and queries;
Plugins for other ingestion protocols such as Graphite, collectd and OpenTSDB;
High availability can be built with Relay: https://docs.influxdata.com/influxdb/v1.0/high_availability/relay/;
An extended SQL-like language that makes it easy to query aggregated data;
Tag support, which makes queries more efficient and faster;
Retention policies automatically expire stale data;
Continuous queries automatically pre-compute aggregates so that frequent queries stay cheap (see the sketch after this list);
A built-in web admin UI.
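As a small illustration of the retention-policy and continuous-query features above, here is a minimal InfluxQL sketch; the database, measurement and field names (mydb, cpu, load, cpu_5m) are only placeholders:

# keep raw data for 7 days; anything older is expired automatically
create retention policy "one_week" on "mydb" duration 7d replication 1 default
# pre-compute a 5-minute mean so that frequent dashboard queries stay cheap
create continuous query "cq_cpu_5m" on "mydb" begin
  select mean(load) into "cpu_5m" from "cpu" group by time(5m)
end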
Download and install:
GitHub (build from source): https://github.com/influxdata/influxdb
Official downloads:
CentOS (RPM): wget https://dl.influxdata.com/influxdb/releases/influxdb-1.0.0.x86_64.rpm && sudo yum localinstall influxdb-1.0.0.x86_64.rpm
Binary tarball: wget https://dl.influxdata.com/influxdb/releases/influxdb-1.0.0_linux_amd64.tar.gz && tar xvfz influxdb-1.0.0_linux_amd64.tar.gz
Docker: docker pull influxdb (see the run sketch after this list)
Installation guide: https://docs.influxdata.com/influxdb/v0.9/introduction/installation/
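If you go the Docker route, a minimal run sketch could look like the following; the container name and host volume path are assumptions, adjust them to your environment:

# expose the HTTP API (8086) and admin UI (8083) and persist data on the host
docker run -d --name influxdb \
  -p 8086:8086 -p 8083:8083 \
  -v /var/lib/influxdb:/var/lib/influxdb \
  influxdb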
Configuration:
#cat /etc/influxdb/influxdb.conf
reporting-disabled = false
[registration]
[meta]
  dir = "/var/lib/influxdb/meta"
  hostname = "10.0.0.2"   # this hostname must be this machine, otherwise the data API cannot be reached
  bind-address = ":8088"
  retention-autocreate = true
  election-timeout = "1s"
  heartbeat-timeout = "1s"
  leader-lease-timeout = "500ms"
  commit-timeout = "50ms"
  cluster-tracing = false
[data]
  dir = "/var/lib/influxdb/data"
  max-wal-size = 104857600   # Maximum size the WAL can reach before a flush. Defaults to 100MB.
  wal-flush-interval = "10m"   # Maximum time data can sit in WAL before a flush.
  wal-partition-flush-delay = "2s"   # The delay time between each WAL partition being flushed.
  wal-dir = "/var/lib/influxdb/wal"
  wal-logging-enabled = true
[hinted-handoff]
  enabled = true
  dir = "/var/lib/influxdb/hh"
  max-size = 1073741824
  max-age = "168h"
  retry-rate-limit = 0
  retry-interval = "1s"
  retry-max-interval = "1m"
  purge-interval = "1h"
[admin]
  enabled = true
  bind-address = ":8083"
  https-enabled = false
  https-certificate = "/etc/ssl/influxdb.pem"
[http]
  enabled = true
  bind-address = ":8086"
  auth-enabled = false
  log-enabled = true
  write-tracing = false
  pprof-enabled = false
  https-enabled = false
  https-certificate = "/etc/ssl/influxdb.pem"
[opentsdb]
  enabled = false
[collectd]
  enabled = false
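Rather than typing this file out by hand, the influxd binary can print its built-in defaults, which make a good starting point (a small sketch; the paths are the package defaults):

# dump the built-in default configuration and edit from there
influxd config > /tmp/influxdb.conf.default
# start influxd against a specific configuration file
influxd -config /etc/influxdb/influxdb.conf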
Note:
The influxdb service listens on three ports: 8086 is the default data port, carrying database operations and the HTTP API; 8083 serves the web admin UI, a friendly visual interface for queries and data management; 8088 is used for metadata management. Also be aware that influxdb runs as the influxdb user by default and keeps its data under /var/lib/influxdb/, which matters in production.
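A quick sanity check of the three listeners and the data-directory ownership could look like this (plain ss and ls, nothing influxdb-specific):

# confirm the three ports are listening
ss -lntp | grep -E ':(8083|8086|8088)'
# confirm the data directories belong to the influxdb user
ls -ld /var/lib/influxdb /var/lib/influxdb/meta /var/lib/influxdb/data /var/lib/influxdb/wal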
Startup:
Like telegraf, influxdb can be managed with init.d or systemd.
After starting it, check that the ports above are listening and read the logs to make sure the service came up cleanly; a sketch follows below.
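On a systemd host that usually amounts to something like the following (the service name comes from the packages above; journalctl is just one way to read the logs):

# start the service and enable it at boot
sudo systemctl start influxdb
sudo systemctl enable influxdb
# or, on init.d hosts
sudo service influxdb start
# verify it is running and inspect recent log output
systemctl status influxdb
journalctl -u influxdb -n 50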
Usage:
If the heart of using telegraf is its configuration, then the heart of influxdb is its SQL-like query language. influxdb supports three ways of working with it out of the box:
Log into the influxdb shell and work there:
Create a database: create database mydb
Create a user: create user "bigdata" with password 'bigdata' with all privileges
List databases: show databases
Insert data: insert bigdata,host=server001,regin=HC load=88
Switch database: use mydb
List the measurements (roughly the tables) in the database: show measurements
Query: select * from cpu limit 2
Query from one hour ago until now:
#select load from cpu where time > now() - 1h
Query from the epoch up to 1000 days from now:
#select load from cpu where time < now() + 1000d
Query a time range:
#select load from cpu where time > '2016-08-18' and time < '2016-09-19'
Query a small window, e.g. the 6 minutes after September 18, 2016 21:24:00:
#select load from cpu where time > '2016-09-18T21:24:00Z' + 6m
Query every measurement with a regular expression:
#select * from /.*/ limit 1
#select * from /^docker/ limit 3
#select * from /.*mem.*/ limit 3
Regex matching on a tag (=~ / !~):
#select * from cpu where "host" !~ /.*HC.*/ limit 4
#SELECT * FROM "h2o_feet" WHERE ("location" =~ /.*y.*/ OR "location" =~ /.*m.*/) AND "water_level" > 0 LIMIT 4
Grouping: group by time() must be used together with an aggregate function:
#select count(type) from events group by time(10s)
#select count(type) from events group by time(10s),type
Alias a queried field:
#select count(type) as number_of_types from events group by time(10m)
#select count(type) from events where time > now() - 3h group by time(1h)
Using fill():
#select count(type) from events where time > now() - 3h group by time(1h) fill(0)   (fill(-1) and fill(null) work the same way)
Aggregate across measurements:
select count(type) from user_events merge admin_events group by time(10m)
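The same statements can also be run non-interactively through the influx command-line client, which is handy for scripting (a sketch; -execute, -database and -format are standard flags of the 1.x client):

# run a single statement without entering the shell
influx -execute 'create database mydb'
influx -database 'mydb' -execute 'select * from cpu limit 2'
# output can be switched between column, csv and json
influx -database 'mydb' -format 'json' -execute 'show measurements'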
Working with the data through the HTTP API:
Create a database:
curl -G "http://localhost:8086/query" --data-urlencode "q=create database mydb"
Insert data:
curl -XPOST 'http://localhost:8086/write?db=mydb' -d 'biaoge,name=xxbandy,xingqu=coding age=2'
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257'
Write the line-protocol points to a file and insert them through the API:
#cat cpu_data.txt
cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257
#curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @cpu_data.txt
Query data (--data-urlencode "epoch=s" selects the timestamp precision, "chunk_size=20000" sets the query chunk size):
#curl -G http://localhost:8086/query?pretty=true --data-urlencode "db=mydb" --data-urlencode "q=select * from biaoge where xingqu='coding'"
Aggregation:
#curl -G http://localhost:8086/query?pretty=true --data-urlencode "db=mydb" --data-urlencode "q=select mean(load) from cpu"
#curl -G http://localhost:8086/query?pretty=true --data-urlencode "db=mydb" --data-urlencode "q=select load from cpu"
The individual load values here are 42, 78 and 15.4; mean(load) over them returns 45.13.
curl -G http://localhost:8086/query?pretty=true --data-urlencode "db=mydb" --data-urlencode "q=select mean(load) from cpu where host='server01'"
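Putting the epoch and chunking options mentioned above into one concrete query could look like this (a sketch; chunked=true is what makes chunk_size take effect):

# second-precision timestamps, streamed back in chunks of 20000 points
curl -G 'http://localhost:8086/query?pretty=true' \
  --data-urlencode "db=mydb" \
  --data-urlencode "epoch=s" \
  --data-urlencode "chunked=true" \
  --data-urlencode "chunk_size=20000" \
  --data-urlencode "q=select * from cpu_load_short"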
Working through the web UI that influxdb ships with:
This has only scratched the surface of influxdb. To aggregate data and present it nicely in grafana later on, you will need to get comfortable with influxdb's query syntax (in practice, SQL-style idioms: aggregate functions, subqueries and so on); one sketch follows below.
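For instance, a typical Grafana-style panel query that combines a subquery with an aggregate could look like this; the measurement and field names (cpu, usage_idle) are made up, and subqueries only exist in later 1.x releases of InfluxDB:

select mean("usage") from (
  select 100 - "usage_idle" as "usage" from "cpu"
) where time > now() - 1h group by time(1m) fill(null)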
Note: this is original work; please contact the author before reposting.