A simple approach to monitoring business metrics

Date: 2019-12-05

1. Monitor the status of key business endpoints (whether they return 200, response time, and so on)

Prometheus already provides a ready-made exporter for this. See:

https://github.com/prometheus/blackbox_exporter

For details on usage and what the results look like, there is also this blog post:

https://medium.com/the-telegraph-engineering/how-prometheus-and-the-blackbox-exporter-makes-monitoring-microservice-endpoints-easy-and-free-of-a986078912ee
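
To make concrete what such a probe measures, here is a minimal Python sketch that requests a URL once and records the two signals named above: the HTTP status and the response time. It is an illustration only and not how blackbox_exporter is implemented. The names `probe_http` and `is_healthy` and the 5-second default timeout are assumptions made for this example.

```python
import time
import urllib.error
import urllib.request


def probe_http(url, timeout=5.0):
    """Request url once and return (status_code, elapsed_seconds).

    status_code is None when no HTTP response was received at all
    (connection refused, DNS failure, timeout).
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # include the body transfer in the measured time
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        status = None  # no HTTP response at all
    return status, time.monotonic() - start


def is_healthy(status):
    """The check from the text: did the endpoint return 200?"""
    return status == 200
```

In a real deployment the exporter runs these checks on every scrape and exposes the results as metrics, so a hand-written loop like this is useful mainly for understanding or for one-off checks.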

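2. Monitor specific business metrics

Taking Tomcat as an example, create a web application named metrics, add an index.jsp file to it, and write every metric that needs to be exposed into that file.

For example: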

[root@master metrics]# cat index.jsp 
# HELP helloworld_ordernumber Number of Order.
# TYPE helloworld_ordernumber gauge
helloworld_ordernumber 10
# HELP helloworld_orderamount Amount of Order.
# TYPE helloworld_orderamount gauge
helloworld_orderamount 100

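Notes:

The whole file is plain text; no html or body tags are needed.
Each custom metric is preceded by a HELP line and a TYPE line. The gauge type means the value can go up or down, unlike the counter type, which only accumulates.
In practice, the metric values can later be obtained by calling an application API or querying a database; here they are hard-coded for simplicity.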
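
The last note can be sketched in Python as follows: the values come from a data source and are rendered in the same text format as the hard-coded file. This is a sketch under stated assumptions, not the author's implementation. `fetch_order_stats`, `HELP_TEXT`, and `render_metrics` are hypothetical names, and the data source is a stub that returns the same values as the example file.

```python
def fetch_order_stats():
    """Hypothetical data source; replace with an API call or a DB query.

    The values mirror the hard-coded ones in the example index.jsp.
    """
    return {"ordernumber": 10, "orderamount": 100}


# Metric suffix -> HELP text, as used in the example file.
HELP_TEXT = {
    "ordernumber": "Number of Order.",
    "orderamount": "Amount of Order.",
}


def render_metrics(stats, prefix="helloworld"):
    """Render each value as a gauge in the Prometheus text format."""
    lines = []
    for key, value in stats.items():
        name = f"{prefix}_{key}"
        lines.append(f"# HELP {name} {HELP_TEXT[key]}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics(fetch_order_stats()), end="")
```

Keeping the data lookup separate from the rendering means the stub can later be swapped for a real query without touching the output format.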

Modify the Prometheus configuration file

Add the target to be monitored

[root@master prometheus-2.7.1.linux-amd64]# cat prometheus.yml 
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9100']
    - targets: ['localhost:8080']

Start Prometheus, then open http://192.168.56.108:9090/targets to check that the targets are being scraped.

The metrics can then be queried and viewed.
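
If a target does not show up as expected, it can help to check the metrics page by hand. The sketch below parses only the simple subset of the text format used in index.jsp (one `name value` pair per line, no labels or timestamps). `parse_simple_metrics` is a hypothetical helper written for this example, not part of Prometheus.

```python
def parse_simple_metrics(text):
    """Parse label-free 'name value' samples into a dict of floats.

    Comment lines (HELP/TYPE) and blank lines are skipped.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments and empty lines carry no sample
        parts = line.split()
        if len(parts) < 2:
            raise ValueError(f"malformed sample line: {raw!r}")
        samples[parts[0]] = float(parts[1])
    return samples


if __name__ == "__main__":
    page = (
        "# HELP helloworld_ordernumber Number of Order.\n"
        "# TYPE helloworld_ordernumber gauge\n"
        "helloworld_ordernumber 10\n"
    )
    print(parse_simple_metrics(page))
```

The page text could be fetched from the Tomcat target with any HTTP client; a parse error here usually means stray HTML or whitespace has crept into the JSP output.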

3. Monitoring in an OpenShift container cloud environment

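The index.jsp file (or a similar metrics endpoint) is bundled with the business application, so it runs in the same Pod as the application.
To monitor the business metrics exposed by each microservice, Prometheus needs to be deployed inside the OpenShift cluster.
If Prometheus is deployed outside the cluster, the services to be monitored must be exposed through a route.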

4. Monitoring categories and methods

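Business monitoring is really about collecting business metrics, for example values stored in Redis or a database. If there are multiple application instances, it is enough to query any one of them.
To monitor whether each individual instance is working properly, use the readiness probe and liveness probe provided by OpenShift and let Kubernetes take care of it.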