Prometheus is an open-source system monitoring and alerting tool that tracks real-time data such as a project's traffic, memory usage, and load.
It collects monitoring data either directly or via short-lived jobs through an intermediary, stores everything it collects locally, and evaluates predefined rules to derive new time series or send alerts. The collected data can be visualized through other APIs.
In short, the main steps are:
Collect and store the data. If the metrics are collected by a third-party system and are not already Prometheus time series, you need an exporter that exposes those metrics as Prometheus time series. Many official and third-party exporters already exist, and some software exposes data in the Prometheus format natively. Let's walk through a deployment with a concrete example.
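What an exporter actually produces is just plain text in the Prometheus exposition format: one sample per line, `name{labels} value`. A minimal sketch of rendering that format by hand (the metric names and values below are hypothetical illustrations, not anything a real exporter ships):

```python
def render_metric(name, value, labels=None):
    """Render one sample in the Prometheus text exposition format.

    Produces `name{k="v",...} value`; labels are sorted so the output
    is deterministic.
    """
    if labels:
        label_str = ",".join('%s="%s"' % (k, v) for k, v in sorted(labels.items()))
        return "%s{%s} %s" % (name, label_str, value)
    return "%s %s" % (name, value)

# A tiny /metrics-style payload built from hypothetical application metrics.
lines = [
    "# HELP app_requests_total Total requests handled.",
    "# TYPE app_requests_total counter",
    render_metric("app_requests_total", 1027, {"method": "get", "code": "200"}),
    render_metric("app_memory_bytes", 52428800),
]
print("\n".join(lines))
```

Real exporters typically use a client library (e.g. prometheus_client) rather than formatting strings themselves, but the wire format is exactly this.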
There are many ways to install Prometheus; here I use the precompiled binary. Download it here, extract it, then run `./prometheus` in a terminal to start the Prometheus server.
Open the extracted prometheus directory and you will find a prometheus.yml file, which configures the monitoring targets and other settings. The default prometheus.yml looks like this:
```yaml
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']
```
This sets the monitoring target to localhost:9090, i.e. Prometheus itself. Open localhost:9090 in a browser to access Prometheus's visualization interface; localhost:9090/metrics exposes all of the monitoring data. One of the metrics, `prometheus_target_interval_length_seconds`, records the actual interval between scrapes. Enter it on the Prometheus home page and press Enter to see a series of values at different quantiles, ranging from 0.01 to 0.99. A quantile indicates what fraction of the data falls at or below that value. If you only care about the 0.99 quantile, query `prometheus_target_interval_length_seconds{quantile="0.99"}`. Queries also support functions; for example, `count(prometheus_target_interval_length_seconds)` returns the number of such series.
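To make the quantile labels concrete: the q-quantile is the value below which a fraction q of the observations fall. A small sketch using the nearest-rank method on synthetic scrape intervals (the data below is made up around the configured 5s target, not real Prometheus output):

```python
import math

def quantile(values, q):
    """Nearest-rank quantile: the ceil(q*n)-th smallest value, 0 < q <= 1."""
    s = sorted(values)
    rank = max(1, math.ceil(q * len(s)))
    return s[rank - 1]

# Simulated scrape intervals (seconds) around the configured 5s target.
intervals = [4.98, 4.99, 5.00, 5.00, 5.01, 5.01, 5.02, 5.03, 5.05, 5.20]
print(quantile(intervals, 0.5))   # the median interval
print(quantile(intervals, 0.99))  # the worst 1% tail, the value quantile="0.99" reports
```

So a high 0.99 quantile with a normal median tells you most scrapes are on time but the tail is slow.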
If you want that count to be queryable as a metric of its own, create a prometheus.rules file, define the rule there, and then reference the rules file in prometheus.yml:
```
# prometheus.rules
test:prometheus_target_interval_length_seconds:count = count(prometheus_target_interval_length_seconds)
```

```yaml
# prometheus.yml
# my global config
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "prometheus.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']
```
After that, you can query `test:prometheus_target_interval_length_seconds:count` directly. This rule is trivial, but the same approach works for any frequently used, more complex expression: define it as a rule and query the precomputed result.
Next, let's monitor MySQL. Modify prometheus.yml and append the following at the end of the file:
```yaml
  - job_name: 'mysql'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9104']
        labels:
          instance: db1
```
Restart the Prometheus server:
```shell
$ ./prometheus -config.file=prometheus.yml
```
Open localhost:9090 again and go to the Status -> Targets page. You will see the two configured targets: Prometheus itself, whose State is UP, and mysql, whose State is DOWN, because we have not yet set up the service that monitors MySQL.
Download and extract the mysql exporter from here, or simply use Docker:
```shell
$ docker pull prom/mysqld-exporter
```
mysqld_exporter needs to connect to MySQL, which requires MySQL privileges, so first create a user for it and grant the privileges it needs:
```sql
CREATE USER 'mysqlexporter'@'localhost' IDENTIFIED BY 'mysqlexporter';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'mysqlexporter'@'localhost' WITH MAX_USER_CONNECTIONS 3;
```
Then run the exporter in Docker; the DATA_SOURCE_NAME environment variable tells the exporter how to connect to the database.
```shell
$ docker run -d \
    -p 9104:9104 \
    -e DATA_SOURCE_NAME="mysqlexporter:mysqlexporter@(localhost:3306)/data_store" \
    prom/mysqld-exporter
```
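The DATA_SOURCE_NAME above follows the Go MySQL driver's DSN format, `user:password@(host:port)/dbname`. A small sketch of assembling it (the credentials match the user created above; the database name `data_store` is just this article's example):

```python
def build_dsn(user, password, host, port, dbname):
    """Assemble a Go-MySQL-driver style DSN: user:password@(host:port)/dbname."""
    return "%s:%s@(%s:%d)/%s" % (user, password, host, port, dbname)

dsn = build_dsn("mysqlexporter", "mysqlexporter", "localhost", 3306, "data_store")
print(dsn)  # mysqlexporter:mysqlexporter@(localhost:3306)/data_store
```

Note that because the exporter runs inside a container here, `localhost` refers to the container itself; on a real deployment you would point the host portion at wherever MySQL is reachable from the container.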
Now refresh localhost:9090/targets and you will see the mysql target's State switch to UP: MySQL is being monitored successfully.
In short, Prometheus is well suited to monitoring any project whose data can be modeled as time series.