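1. Introduction to NRPE
NRPE is a Nagios add-on that executes plugins on remote Linux/Unix hosts. With the NRPE daemon and the Nagios plugins installed on a remote server, that server can report its local status, such as CPU load, memory usage, and disk usage, to the Nagios monitoring platform. In this article the Nagios monitoring host is called the Nagios server, and the remote monitored host is called the Nagios client.
Nagios can monitor remote hosts in several ways, including SNMP, NRPE, SSH, and NSCA. This article covers monitoring a remote Linux host through NRPE.
Note: Nagios plugins can also be executed on a remote Linux/UNIX host over SSH; the check_by_ssh plugin does exactly that. SSH is more secure than NRPE, but it imposes a higher CPU load on both the monitoring host and the monitored host. That overhead becomes a problem when hundreds or thousands of hosts are monitored, which is the main reason many Nagios administrators choose NRPE.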
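NRPE consists of two parts:
the check_nrpe plugin, which resides on the local monitoring host;
the NRPE daemon, which runs on the remote (Linux/UNIX) host, that is, the monitored host.
A check through NRPE proceeds as follows:
Nagios runs the check_nrpe plugin and tells it which service to check;
the check_nrpe plugin connects to the nrpe daemon on the monitored host over SSL;
the nrpe daemon runs the corresponding Nagios plugin to check the service or resource;
the NRPE daemon returns the result to the check_nrpe plugin, which passes it on to the Nagios process for further handling.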
2. Installing nagios-plugins
[root@kk ~]# useradd nagios
[root@kk ~]# cd /home/softwares/
[root@kk softwares]# wget http://nagios-plugins.org/download/nagios-plugins-2.1.2.tar.gz
[root@kk softwares]# tar -xzf nagios-plugins-2.1.2.tar.gz
[root@kk softwares]# cd nagios-plugins-2.1.2
[root@kk nagios-plugins-2.1.2]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@kk nagios-plugins-2.1.2]# make
[root@kk nagios-plugins-2.1.2]# make install
[root@kk nagios-plugins-2.1.2]# chown nagios:nagios /usr/local/nagios
[root@kk nagios-plugins-2.1.2]# chown -R nagios:nagios /usr/local/nagios/libexec
3. Installing NRPE
[root@kk nagios-plugins-2.1.2]#cd ..
[root@kk softwares]#tar zxf nrpe-3.0.1.tar.gz
[root@kk softwares]#cd nrpe-3.0.1
[root@kk nrpe-3.0.1]#yum -y install openssl openssl-devel
[root@kk nrpe-3.0.1]#./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@kk nrpe-3.0.1]#make all
[root@kk nrpe-3.0.1]#make install-plugin
[root@kk nrpe-3.0.1]#make install-daemon
[root@kk nrpe-3.0.1]#make install-daemon-config
[root@kk nrpe-3.0.1]#make install-config
# iptables -I RH-Firewall-1-INPUT -p tcp -m tcp --dport 5666 -j ACCEPT
# service iptables save
[root@kk nrpe-3.0.1]#vim /usr/local/nagios/etc/nrpe.cfg
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_users
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_load
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_sda1
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_total_procs
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_zombie_procs
View the resulting configuration:
[root@kk ~]#grep -v '^#' /usr/local/nagios/etc/nrpe.cfg |sed '/^$/d'
log_facility=daemon
debug=0
pid_file=/usr/local/nagios/var/nrpe.pid
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=192.183.3.145,172.16.56.131
dont_blame_nrpe=0
allow_bash_command_substitution=0
command_timeout=60
connection_timeout=300
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 200 -c 300
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
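Each command[name] entry above ties a name that check_nrpe sends with -c to the command line the daemon runs locally. The following is a minimal sketch of that lookup, not the daemon's actual implementation: lookup_command is a hypothetical helper, and the temporary file stands in for nrpe.cfg.

```shell
#!/bin/sh
# Sketch only: resolve a command name against "command[name]=command line"
# entries, the same shape as the entries in nrpe.cfg.
# lookup_command is a hypothetical helper, not part of NRPE.
lookup_command() {
    # $1 = config file, $2 = command name
    name="$2"
    while IFS= read -r line; do
        case "$line" in
            "command[$name]="*)
                # Print everything after the first "="
                printf '%s\n' "${line#*=}"
                return 0
                ;;
        esac
    done < "$1"
    return 1
}

# Assumed sample data: a temporary stand-in for nrpe.cfg
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
server_port=5666
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
EOF

lookup_command "$cfg" check_load
lookup_command "$cfg" check_mem || echo "check_mem is not defined"
```

A name without a matching entry cannot be run, which is why every service defined on the monitoring host needs a matching command[...] line on the monitored host.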
[root@kk nrpe-3.0.1]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@kk nrpe-3.0.1]#netstat -tulpn | grep nrpe
[root@kk nrpe-3.0.1]#vi /etc/init.d/nrped
#!/bin/bash
# chkconfig: 2345 88 12
# description: NRPE DAEMON
NRPE=/usr/local/nagios/bin/nrpe
NRPECONF=/usr/local/nagios/etc/nrpe.cfg
case "$1" in
    start)
        echo -n "Starting NRPE daemon..."
        "$NRPE" -c "$NRPECONF" -d
        echo " done."
        ;;
    stop)
        echo -n "Stopping NRPE daemon..."
        pkill -u nagios nrpe
        echo " done."
        ;;
    restart)
        "$0" stop
        sleep 2
        "$0" start
        ;;
    *)
        echo "Usage: $0 start|stop|restart"
        ;;
esac
exit 0
Enable start at boot:
[root@kk nrpe-3.0.1]#chmod +x /etc/init.d/nrped
[root@kk nrpe-3.0.1]#chkconfig --add nrped
[root@kk nrpe-3.0.1]#chkconfig nrped on
[root@kk nrpe-3.0.1]#service nrped start
Starting NRPE daemon... done.
[root@monitors ~]# yum -y install openssl openssl-devel
Otherwise, compiling NRPE fails because the SSL headers and libraries cannot be found.
[root@monitors ~]# cd /home/nagios/
[root@monitors nagios]# wget http://prdownloads.sourceforge.net/sourceforge/nagios/nrpe-3.0.1.tar.gz
--2017-01-17 23:36:36-- http://prdownloads.sourceforge.net/sourceforge/nagios/nrpe-3.0.1.tar.gz
[root@monitors nagios]# tar xzvf nrpe-3.0.1.tar.gz
[root@monitors nagios]# cd nrpe-3.0.1
[root@monitors nrpe-3.0.1]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@monitors nrpe-3.0.1]# make all
[root@monitors nrpe-3.0.1]# make install-plugin
[root@monitors nrpe-3.0.1]# ll /usr/local/nagios/libexec/check_nrpe
-rwxrwxr-x 1 nagios nagios 125293 1月 17 23:47 /usr/local/nagios/libexec/check_nrpe
[root@monitors libexec]# ./check_nrpe -h
NRPE Plugin for Nagios
Copyright (c) 1999-2008 Ethan Galstad (nagios@nagios.org)
Version: 3.0.1
Last Modified: 09-08-2016
License: GPL v2 with exemptions (-l for more info)
SSL/TLS Available: OpenSSL 0.9.6 or higher required
Usage: check_nrpe -H <host> [-2] [-4] [-6] [-n] [-u] [-V] [-l] [-d <dhopt>]
[-P <size>] [-S <ssl version>] [-L <cipherlist>] [-C <clientcert>]
[-K <key>] [-A <ca-certificate>] [-s <logopts>] [-b <bindaddr>]
[-f <cfg-file>] [-p <port>] [-t <interval>:<state>]
[-c <command>] [-a <arglist...>]
Options:
<host> = The address of the host running the NRPE daemon
-2 = Only use Version 2 packets, not Version 3
-4 = bind to ipv4 only
-6 = bind to ipv6 only
-n = Do no use SSL
-u = (DEPRECATED) Make timeouts return UNKNOWN instead of CRITICAL
-V = Show version
-l = Show license
<dhopt> = Anonymous Diffie Hellman use:
0 = Don't use Anonymous Diffie Hellman
(This will be the default in a future release.)
1 = Allow Anonymous Diffie Hellman (default)
2 = Force Anonymous Diffie Hellman
<size> = Specify non-default payload size for NSClient++
<ssl ver> = The SSL/TLS version to use. Can be any one of: SSLv2 (only),
SSLv2+ (or above), SSLv3 (only), SSLv3+ (or above),
TLSv1 (only), TLSv1+ (or above DEFAULT), TLSv1.1 (only),
TLSv1.1+ (or above), TLSv1.2 (only), TLSv1.2+ (or above)
<cipherlist> = The list of SSL ciphers to use (currently defaults
to "ALL:!MD5:@STRENGTH". WILL change in a future release.)
<clientcert> = The client certificate to use for PKI
<key> = The private key to use with the client certificate
<ca-cert> = The CA certificate to use for PKI
<logopts> = SSL Logging Options
<bindaddr> = bind to local address
<cfg-file> = configuration file to use
[port] = The port on which the daemon is running (default=5666)
[command] = The name of the command that the remote daemon should run
[arglist] = Optional arguments that should be passed to the command,
separated by a space. If provided, this must be the last
option supplied on the command line.
NEW TIMEOUT SYNTAX
-t <interval>:<state>
<interval> = Number of seconds before connection times out (default=10)
<state> = Check state to exit with in the event of a timeout (default=CRITICAL)
Timeout state must be a valid state name (case-insensitive) or integer:
(OK, WARNING, CRITICAL, UNKNOWN) or integer (0-3)
Note:
This plugin requires that you have the NRPE daemon running on the remote host.
You must also have configured the daemon to associate a specific plugin command
with the [command] option you are specifying here. Upon receipt of the
[command] argument, the NRPE daemon will run the appropriate plugin command and
send the plugin output and return code back to *this* plugin. This allows you
to execute plugins on remote hosts and 'fake' the results to make Nagios think
the plugin is being run locally.
[root@monitors libexec]# ./check_nrpe -H 192.183.3.145 -p 5666
NRPE v3.0.1
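check_nrpe reports the outcome of a check through its exit code as well as its output. As the help text above notes, the states OK, WARNING, CRITICAL, and UNKNOWN correspond to the integers 0 to 3. The sketch below shows one way to turn an exit code into a state name. state_name and run_check are hypothetical helpers, and the sh -c commands only stand in for real plugin calls.

```shell
#!/bin/sh
# Map a plugin exit code to its Nagios state name.
# state_name is a hypothetical helper for illustration.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "UNKNOWN (unexpected exit code $1)" ;;
    esac
}

# Run any command, then print "<STATE>: <output>" and return its exit code.
# run_check is a hypothetical wrapper, not part of Nagios or NRPE.
run_check() {
    output=$("$@") && code=0 || code=$?
    printf '%s: %s\n' "$(state_name "$code")" "$output"
    return "$code"
}

# Stand-ins for real checks (assumed sample output and exit codes)
run_check sh -c 'echo "load average: 0.10"; exit 0' || true
run_check sh -c 'echo "disk nearly full"; exit 1' || true
```

On the monitoring host the same wrapper could be pointed at the real plugin, for example run_check /usr/local/nagios/libexec/check_nrpe -H 192.183.3.145 -c check_load.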
[root@monitors libexec]# cd /usr/local/nagios/etc/objects/
[root@monitors objects]# vim commands.cfg
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H "$HOSTADDRESS$" -c "$ARG1$"
}
[root@monitors objects]# vi hosts.cfg
# Define a host for the remote machine
define host{
host_name kk
alias master-server
use linux-server
address 192.183.3.145
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_period 24x7
notification_interval 30
notification_options d,r
contact_groups admins
}
###############################################################################
###############################################################################
#
# HOST GROUP DEFINITION
#
###############################################################################
###############################################################################
# Define an optional hostgroup for Linux machines
define hostgroup{
hostgroup_name remote-linux-servers ; The name of the hostgroup
alias remoteLinux Servers ; Long name of the group
members * ; Comma separated list of hosts that belong to this group
}
[root@monitors objects]# vim prilinuxserver.cfg
#PRIVATE SERVICE DEFINITIONS
#The following service will monitor the CPU load on the remote host.
# The "check_load" argument that is passed to the check_nrpe command
# definition tells the NRPE daemon to run the "check_load" command
# as defined in the nrpe.cfg file
define service{
use local-service
host_name kk
service_description CPU Load
check_command check_nrpe!check_load
contact_groups admins
}
#The following service will monitor the number of currently logged in users on the remote host
define service{
use local-service
host_name kk
service_description Current Users
check_command check_nrpe!check_users
contact_groups admins
}
#The following service will monitor the free drive space on /dev/sda1 on the remote host.
define service{
use local-service
host_name kk
service_description /dev/sda1 Free Space
check_command check_nrpe!check_sda1
contact_groups admins
}
#The following service will monitor the total number of processes on the remote host.
define service{
use local-service
host_name kk
service_description Total Processes
check_command check_nrpe!check_total_procs
contact_groups admins
}
#The following service will monitor the number of zombie processes on the remote host.
define service{
use local-service
host_name kk
service_description Zombie Processes
check_command check_nrpe!check_zombie_procs
contact_groups admins
}
# monitoring the swap usage on the remote host
define service{
use local-service
host_name kk
service_description Swap Usage
check_command check_nrpe!check_swap
contact_groups admins
}
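Note: the commands referenced by the service definitions on the monitoring host (the Nagios server) must match the command names defined in nrpe.cfg on the monitored host.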
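One way to check that the names on both sides agree is to compare the check_nrpe!<name> references in the service file with the command[<name>] entries in nrpe.cfg. The sketch below assumes a copy of the monitored host's nrpe.cfg is available on the monitoring host. The helper functions are hypothetical, and the two temporary files are sample stand-ins for the real configuration files.

```shell
#!/bin/sh
# Names referenced as "check_command check_nrpe!<name>" in a Nagios object file
service_commands() {
    sed -n 's/^[[:space:]]*check_command[[:space:]]*check_nrpe!\([^[:space:]]*\).*/\1/p' "$1" | sort -u
}

# Names defined as "command[<name>]=" in an nrpe.cfg-style file
nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1" | sort -u
}

# Print names used by services but not defined in nrpe.cfg
# $1 = service definitions file, $2 = copy of nrpe.cfg
missing_commands() {
    svc=$(mktemp)
    def=$(mktemp)
    service_commands "$1" > "$svc"
    nrpe_commands "$2" > "$def"
    comm -23 "$svc" "$def"
    rm -f "$svc" "$def"
}

# Assumed sample data standing in for prilinuxserver.cfg and nrpe.cfg
svc_file=$(mktemp)
nrpe_file=$(mktemp)
cat > "$svc_file" <<'EOF'
define service{
    host_name               kk
    check_command           check_nrpe!check_load
}
define service{
    host_name               kk
    check_command           check_nrpe!check_mem
}
EOF
cat > "$nrpe_file" <<'EOF'
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
EOF

missing_commands "$svc_file" "$nrpe_file"
```

In this sample only check_mem is reported, because it is referenced by a service but has no command[check_mem] entry.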
[root@monitors objects]# vim /usr/local/nagios/etc/nagios.cfg
Add the following lines:
#definitions for monitoring the remote(linux/unix)host
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
#definitions for monitoring the remote(linux/unix)host private services
cfg_file=/usr/local/nagios/etc/objects/prilinuxserver.cfg
If hosts.cfg is already listed there, skip that line.
[root@monitors objects]# service nagios configtest
or
[root@monitors objects]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.2.0
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-01-2016
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Warning: Duplicate definition found for service 'Swap Usage' on host 'kk' (config file '/usr/local/nagios/etc/objects/publinuxserver.cfg', starting on line 75)
Warning: Duplicate definition found for service 'Total Processes' on host 'kk' (config file '/usr/local/nagios/etc/objects/publinuxserver.cfg', starting on line 51)
Warning: Duplicate definition found for service 'Current Users' on host 'kk' (config file '/usr/local/nagios/etc/objects/publinuxserver.cfg', starting on line 38)
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 20 services.
Checked 2 hosts.
Checked 2 host groups.
Checked 0 service groups.
Checked 2 contacts.
Checked 1 contact groups.
Checked 26 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 2 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
Object precache file created:
/usr/local/nagios/var/objects.precache
[root@monitors objects]# service nagios restart
Running configuration check...
Stopping nagios: done.
Starting nagios: done.
Add a new command definition to nrpe.cfg on the remote host;
add a new service definition to the Nagios configuration on the monitoring host.
[root@kk libexec]#/usr/local/nagios/libexec/check_swap -w 20% -c 10%
SWAP OK - 59% free (2251 MB out of 3823 MB) |swap=2251MB;764;382;0;3823
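The part of the plugin output after the | is performance data in the form label=value;warn;crit;min;max. Here the warning and critical values 764 and 382 are roughly 20% and 10% of the 3823 MB total, matching the -w 20% -c 10% thresholds. A minimal sketch of splitting that field follows. parse_perfdata is a hypothetical helper and handles a single item only.

```shell
#!/bin/sh
# Split one perfdata item of the form label=value;warn;crit;min;max.
# parse_perfdata is a hypothetical helper for illustration.
parse_perfdata() {
    label=${1%%=*}      # text before the first "="
    rest=${1#*=}        # text after the first "="
    old_ifs=$IFS
    IFS=';'
    set -- $rest        # split the remaining fields on ";"
    IFS=$old_ifs
    printf 'label=%s value=%s warn=%s crit=%s min=%s max=%s\n' \
        "$label" "$1" "$2" "$3" "$4" "$5"
}

line='SWAP OK - 59% free (2251 MB out of 3823 MB) |swap=2251MB;764;382;0;3823'
perf=${line#*'|'}       # keep only the part after the "|"
parse_perfdata "$perf"
```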
[root@kk libexec]#vi /usr/local/nagios/etc/nrpe.cfg
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
[root@kk libexec]#service nrped restart
Stopping NRPE daemon... done.
Starting NRPE daemon... done.
[root@monitors ~]# vim /usr/local/nagios/etc/objects/prilinuxserver.cfg
define service{
use generic-service
host_name remotehost
service_description Swap Usage
check_command check_nrpe!check_swap
}
Verify the configuration:
[root@monitors ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Restart nagios:
[root@monitors ~]# service nagios restart
Running configuration check...
Stopping nagios: done.
Starting nagios: done.