Monitoring System: Nagios (1)

Nagios

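1.1 Introduction to Nagios

Nagios is a monitoring system that watches system status and network information. It can monitor specified local or remote hosts and services, and it sends notifications when something goes wrong. [1]

Nagios runs on Linux/Unix platforms and provides an optional browser-based web interface that lets system administrators view network status, system problems, logs, and so on.

Nagios provides the following monitoring features:

1. Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.);

2. Monitoring of host resources (processor load, disk usage, etc.);

3. A simple plugin design that lets users easily develop their own service checks;

4. Parallelized service checks;

5. The ability to define a network hierarchy with "parent" host definitions that express the relationships between hosts, which lets Nagios detect and distinguish between hosts that are down and hosts that are unreachable;

6. Alerts sent to contacts when service or host problems occur and when they are resolved (via email, SMS, or a user-defined method);

7. The ability to define handlers that run when a service or host fails, so that problems can be dealt with proactively;

8. Automatic log rotation;

9. Support for redundant monitoring of hosts;

10. An optional web interface for viewing current network status, notification and problem history, log files, etc.; [1]

11. Monitoring information can be viewed from a mobile phone;

12. Custom event handlers can be specified. [2]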

 

System installation

1.2 Environment setup

Synchronize the time:

crontab -e

*/5 * * * * /usr/sbin/ntpdate pool.ntp.org >/dev/null 2>&1

Stop the firewall:

/etc/init.d/iptables stop

Disable SELinux (getenforce should report Disabled):

[root@olwang-2 etc]# getenforce

Disabled

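1.3 Server-side installation:

Create the user and group:

# useradd -s /sbin/nologin nagios
# mkdir /usr/local/nagios
# chown -R nagios.nagios /usr/local/nagios

Add the nagios and apache users to the nagcmd group so that both have the required permissions (the apache user is created by the httpd package, so run the usermod for apache after installing httpd below):

# groupadd nagcmd

# usermod -a -G nagcmd nagios

# usermod -a -G nagcmd apache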

 

Install LAMP:

yum -y install httpd mysql-server perl-DBI perl-DBD-MySQL php php-devel php-mysql php-snmp php-pdo php-gd lm_sensors net-snmp net-snmp-libs net-snmp-utils net-snmp-devel

Install the dependency libraries:

yum install gcc glibc glibc-common -y

yum install gd gd-devel -y

yum install mysql-server -y

yum install httpd php php-gd -y

 

Install Nagios:

tar xf nagios.tar.gz

cd nagios

./configure --with-command-group=nagcmd

make all

make install

make install-init

make install-commandmode

make install-config

make install-webconf

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Once the steps above are complete, Nagios should be visible in the web interface.


1.3.1 Install plugins

1.3.1.1 Install the nagios-plugins package

tar xf nagios-plugins-1.4.16.tar.gz

cd nagios-plugins-1.4.16

./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-perl-modules --with-mysql

make

make install

cd ..

Check the plugins:

ls /usr/local/nagios/libexec/ | wc -l

59

1.3.1.2 Install the NRPE plugin

NRPE is a client-side plugin. Since the server itself is also monitored, install it on this machine as well.

tar xf nrpe-2.12.tar.gz

cd nrpe-2.12

./configure

make all

make install-plugin

make install-daemon

make install-daemon-config

cd ..

ls /usr/local/nagios/libexec/check_nrpe

ls /usr/local/nagios/libexec/ | wc -l

60

1.3.1.3 Install the other plugins (Perl modules)

tar xf Class-Accessor-0.31.tar.gz

cd Class-Accessor-0.31

perl Makefile.PL

make

make install

cd ..

#

tar xf Config-Tiny-2.12.tar.gz

cd Config-Tiny-2.12

perl Makefile.PL

make

make install

cd ..

 

###

tar xf Math-Calc-Units-1.07.tar.gz

cd Math-Calc-Units-1.07

perl Makefile.PL

make

make install

cd ..

#

tar xf Nagios-Plugin-0.34.tar.gz

cd Nagios-Plugin-0.34

perl Makefile.PL

make

make install

cd ..

#################

tar xf Params-Validate-0.91.tar.gz

cd Params-Validate-0.91

perl Makefile.PL

make

make install

cd ..

####

tar xf Regexp-Common-2010010201.tar.gz

cd Regexp-Common-2010010201

perl Makefile.PL

make

make install
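
The six Perl modules above are all built with the same four steps, so the sequence can be scripted. The following is a minimal sketch, assuming the tarballs sit in the current directory and are built in the same order as above; the function names `build_perl_module` and `build_all_modules` and the `DRY_RUN` switch are illustrative and not part of Nagios.

```shell
#!/bin/sh
# Build and install one Perl module from its .tar.gz tarball.
# With DRY_RUN=1 the commands are printed instead of executed.
build_perl_module() {
    tarball=$1
    dir=${tarball%.tar.gz}      # Config-Tiny-2.12.tar.gz -> Config-Tiny-2.12
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "tar xf $tarball && cd $dir && perl Makefile.PL && make && make install"
        return 0
    fi
    tar xf "$tarball" || return 1
    # The subshell keeps the caller's working directory unchanged,
    # so no "cd .." is needed between modules.
    ( cd "$dir" && perl Makefile.PL && make && make install )
}

# Build every module listed in this section, stopping at the first failure.
build_all_modules() {
    for t in Class-Accessor-0.31.tar.gz Config-Tiny-2.12.tar.gz \
             Math-Calc-Units-1.07.tar.gz Nagios-Plugin-0.34.tar.gz \
             Params-Validate-0.91.tar.gz Regexp-Common-2010010201.tar.gz
    do
        build_perl_module "$t" || { echo "build failed: $t" >&2; return 1; }
    done
}
```

Running `DRY_RUN=1 build_all_modules` first prints the commands without executing them, which is a cheap way to check the tarball names before the real run.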

1.3.2 Configure and start the Nagios service

chkconfig nagios on

/etc/init.d/nagios start

echo "/etc/init.d/nagios start" >> /etc/rc.local

Verify the configuration file:

[root@olwang-2 nrpe-2.12]# /etc/init.d/nagios checkconfig

Running configuration check... OK.

1.4 Client-side installation:

1.4.1 Environment preparation

Synchronize the time:

crontab -e

*/5 * * * * /usr/sbin/ntpdate pool.ntp.org >/dev/null 2>&1

Stop the firewall:

/etc/init.d/iptables stop

Disable SELinux (getenforce should report Disabled):

[root@olwang-2 etc]# getenforce

Disabled

Create the user:

useradd nagios -M -s /sbin/nologin

Install the dependency libraries:

yum install perl-devel perl-CPAN openssl-devel -y

1.4.2 Install plugins

1.4.2.1 Install nagios-plugins

tar xf nagios-plugins-1.4.16.tar.gz

cd nagios-plugins-1.4.16

./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-perl-modules --with-mysql

make

make install

cd ..

Check the plugins:

[root@olwang-2 ~]# ls /usr/local/nagios/libexec/ | wc -l

62

1.4.2.2 Install NRPE

tar xf nrpe-2.12.tar.gz

cd nrpe-2.12

./configure

make all

make install-daemon

make install-daemon-config

make install-plugin

cd ..

1.4.2.3 Install Class-Accessor

tar xf Class-Accessor-0.31.tar.gz

cd Class-Accessor-0.31

perl Makefile.PL

make

make install

cd ..

#

 

 

tar xf Config-Tiny-2.12.tar.gz

cd Config-Tiny-2.12

perl Makefile.PL

make

make install

cd ..

 

###

tar xf Math-Calc-Units-1.07.tar.gz

cd Math-Calc-Units-1.07

perl Makefile.PL

make

make install

cd ..

#

tar xf Nagios-Plugin-0.34.tar.gz

cd Nagios-Plugin-0.34

perl Makefile.PL

make

make install

cd ..

#################

tar xf Params-Validate-0.91.tar.gz

cd Params-Validate-0.91

perl Makefile.PL

make

make install

cd ..

####

tar xf Regexp-Common-2010010201.tar.gz

cd Regexp-Common-2010010201

perl Makefile.PL

make

make install

 

1.4.3 Generate the password for logging in to the web interface

htpasswd -bc /usr/local/nagios/etc/htpasswd.users username password

 

 

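1.4.4 Client-side NRPE startup script:

chmod +x /etc/init.d/nrpe

[root@olwang-2 etc]# cat /etc/init.d/nrpe

#!/bin/sh

# Print usage when the argument is missing or not recognized
Usage(){
    echo "pls input (start|stop|restart)"
}

case $1 in
    start)
        /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
        ;;
    stop)
        # Match the full path so that this script, also named nrpe, is not killed
        pkill -f /usr/local/nagios/bin/nrpe
        ;;
    restart)
        pkill -f /usr/local/nagios/bin/nrpe
        # Give the old daemon a moment to release its port
        sleep 1
        /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
        ;;
    *)
        Usage
        ;;
esac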

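1.5 Server-side configuration files

Only the most important files in the directory are listed below, not all of them.

[root@olwang ngios]# tree /usr/local/nagios/etc/

|--cgi.cfg          user permission settings for the web interface
|--htpasswd.users   stores the user names and passwords
|--nagios.cfg       main configuration file
|--nrpe.cfg         defines the commands of the NRPE module and the server IPs allowed to connect
|--objects          object definition directory
|   |-- commands.cfg  command definitions
|   |-- contacts.cfg  contact (email) settings
|   |-- hosts.cfg     client host information
|   |-- services
|   |-- templates.cfg templates

/usr/local/nagios/etc/nagios.cfg

Add the following lines: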

cfg_file=/usr/local/nagios/etc/objects/hosts.cfg

cfg_file=/usr/local/nagios/etc/objects/services.cfg

cfg_dir=/usr/local/nagios/etc/objects/services 
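
Appending these lines by hand is easy to repeat by accident. A minimal sketch of an idempotent append follows; the helper names `add_cfg_line` and `add_nagios_cfg_entries` are illustrative assumptions, and the code relies only on standard `grep` options.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
add_cfg_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Add the three entries from this section to the given nagios.cfg.
add_nagios_cfg_entries() {
    cfg=$1
    add_cfg_line "$cfg" 'cfg_file=/usr/local/nagios/etc/objects/hosts.cfg'
    add_cfg_line "$cfg" 'cfg_file=/usr/local/nagios/etc/objects/services.cfg'
    add_cfg_line "$cfg" 'cfg_dir=/usr/local/nagios/etc/objects/services'
}
```

Calling `add_nagios_cfg_entries /usr/local/nagios/etc/nagios.cfg` twice leaves each entry in the file only once.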


Comment out the line that monitors the local machine (in a default install this is the cfg_file entry pointing to objects/localhost.cfg).

Edit the configuration file cgi.cfg

This change fixes the problem of services not being displayed in the Nagios web interface.

[root@olwang-2 etc]# sed -i 's#nagiosadmin#oldboy#g' cgi.cfg

[root@olwang-2 etc]# grep oldboy cgi.cfg

authorized_for_system_information=oldboy

authorized_for_configuration_information=oldboy

authorized_for_system_commands=oldboy

authorized_for_all_services=oldboy

authorized_for_all_hosts=oldboy

authorized_for_all_service_commands=oldboy

authorized_for_all_host_commands=oldboy

 

Create the following files and set their ownership:

cd objects/

head -51 localhost.cfg >hosts.cfg

chown nagios.nagios hosts.cfg

touch services.cfg

chown nagios.nagios services.cfg

mkdir services

chown nagios.nagios services

 

 

Configuration file hosts.cfg

# Define a host for the local machine

define host{
        use                     linux-server
        host_name               olwang-1
        alias                   nagios-client-2
        address                 192.168.5.130

        max_check_attempts      3
        normal_check_interval   2
        process_perf_data       1
        action_url              /nagios/pnp/index.php?host=$HOSTNAME$
        }

define host{
        use                     linux-server
        host_name               olwang
        alias                   nagios-server
        address                 192.168.5.129

        max_check_attempts      3
        normal_check_interval   2
        process_perf_data       1
        action_url              /nagios/pnp/index.php?host=$HOSTNAME$
        }

define host{
        use                     linux-server
        host_name               olwang-2
        alias                   nagios-client-2
        address                 192.168.5.131

        max_check_attempts      3
        normal_check_interval   2
        process_perf_data       1
        action_url              /nagios/pnp/index.php?host=$HOSTNAME$
        }

#
# HOST GROUP DEFINITION

define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         olwang,olwang-1,olwang-2
        }
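
The three host blocks above differ only in name, alias, and address, so further hosts can be generated rather than copied. A minimal sketch, assuming the same `linux-server` template and directives used above; the function name `gen_host` is hypothetical, and the host in the usage note is a made-up example.

```shell
#!/bin/sh
# Print a host definition in the same layout as the blocks in hosts.cfg.
# Arguments: host_name, alias, address.
gen_host() {
    # \$ keeps the Nagios macro $HOSTNAME$ literal inside the here-document.
    cat <<EOF
define host{
        use                     linux-server
        host_name               $1
        alias                   $2
        address                 $3
        max_check_attempts      3
        normal_check_interval   2
        process_perf_data       1
        action_url              /nagios/pnp/index.php?host=\$HOSTNAME\$
        }
EOF
}
```

For example, `gen_host olwang-3 nagios-client-3 192.168.5.132 >> hosts.cfg` would append a fourth host; the new name would also have to be added to the `members` line of the host group.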

 

 

 

Add the monitoring command definitions

Configuration file commands.cfg

#'check_nrpe'
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$" -c $ARG1$ -t 30
        }

#'check_mem'
define command{
        command_name    check_mem
        command_line    $USER1$/check_mem -w $ARG1$ -c $ARG2$
        }

#'check_iostat'
define command{
        command_name    check_iostat
        command_line    $USER1$/check_iostat -w $ARG1$ -c $ARG2$
        }

#'check_weburl'
define command{
        command_name    check_weburl
        command_line    $USER1$/check_http $ARG1$ -w 10 -c 30
        }
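
The `check_nrpe` command defined above takes the name of a remote NRPE command as `$ARG1$`, which a service definition supplies after the `!` separator. The article does not show its services.cfg, so the following is only a sketch: `gen_nrpe_service` is a hypothetical helper, and the `generic-service` template is assumed to come from the stock templates.cfg.

```shell
#!/bin/sh
# Print a service definition that runs a remote NRPE command via check_nrpe.
# Arguments: host_name, service_description, NRPE command name.
gen_nrpe_service() {
    cat <<EOF
define service{
        use                     generic-service
        host_name               $1
        service_description     $2
        check_command           check_nrpe!$3
        }
EOF
}
```

For example, `gen_nrpe_service olwang-1 Load check_load >> services.cfg` would add a load check; the last argument has to match a `command[...]` entry in the client's nrpe.cfg.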

 

1.6 Client-side configuration changes

Configuration file nrpe.cfg

Set the IP address of the server:

 77 # NOTE: This option is ignored if NRPE is running under either inetd or xinetd
 78
 79 allowed_hosts=192.168.5.129

Comment out lines 199-203 and add lines 205-209 (for monitoring host performance):

199 #command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
200 #command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
201 #command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
202 #command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
203 #command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
204
205 command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
206 command[check_disk]=/usr/local/nagios/libexec/check_disk -w 15% -c 7% -p /
207 command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
208 command[check_iostat]=/usr/local/nagios/libexec/check_iostat -w 6 -c 10
209 command[check_mem]=/usr/local/nagios/libexec/check_memory.pl -w 10% -c 3%

Start the client daemon:

/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
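
Once the daemon is running, it is worth confirming from the server that NRPE answers before any services are added. A minimal sketch, assuming `check_nrpe` was installed on the server under the path used earlier; `probe_nrpe` and the `CHECK_NRPE` variable are illustrative names.

```shell
#!/bin/sh
# Path to the check_nrpe plugin; override CHECK_NRPE if it lives elsewhere.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# Run each named NRPE command against a client and label its output.
# Arguments: client address, then one or more NRPE command names.
probe_nrpe() {
    host=$1
    shift
    for cmd in "$@"; do
        printf '%s: ' "$cmd"
        "$CHECK_NRPE" -H "$host" -c "$cmd"
    done
}
```

Run on the server, `probe_nrpe 192.168.5.131 check_load check_disk check_swap` should print one plugin result per command; a connection or handshake error instead usually points at `allowed_hosts` or at the daemon not running.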

1.7 Errors

Error log:

[1477273522] SERVICE NOTIFICATION: nagiosadmin;olwang-2;Disk Partition;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273552] SERVICE NOTIFICATION: nagiosadmin;olwang-1;Disk Iostat;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273602] SERVICE NOTIFICATION: nagiosadmin;olwang-2;Iostat;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273652] SERVICE NOTIFICATION: nagiosadmin;olwang-1;Disk Partition;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273702] SERVICE NOTIFICATION: nagiosadmin;olwang-2;Load;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273752] SERVICE NOTIFICATION: nagiosadmin;olwang-1;MEM Usage;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273802] SERVICE NOTIFICATION: nagiosadmin;olwang-2;MEM Usage;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477273852] SERVICE ALERT: olwang-1;Ping;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 1.95 ms
[1477273952] SERVICE NOTIFICATION: nagiosadmin;olwang-1;Swap Usage;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477274002] SERVICE NOTIFICATION: nagiosadmin;olwang-2;Swap Usage;UNKNOWN;notify-service-by-email;Invalid host name -c
[1477274052] SERVICE NOTIFICATION: nagiosadmin;olwang-1;Current Load;UNKNOWN;notify-service-by-email;Invalid host name -c

Solution:

This is a host name resolution problem.

Edit /etc/hosts so that the monitored host names resolve.
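
To see which monitored names still lack an entry, the hosts file can be scanned before it is edited. A minimal sketch; `missing_hosts` is a hypothetical helper, and the names in the usage note are the ones defined in hosts.cfg above.

```shell
#!/bin/sh
# Print every given host name that has no entry in the hosts file.
# Arguments: path to a hosts file, then the names to look for.
missing_hosts() {
    file=$1
    shift
    for name in "$@"; do
        # Skip comment lines; fields after the address are host names and aliases.
        awk -v n="$name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$file" || echo "$name"
    done
}
```

For example, `missing_hosts /etc/hosts olwang olwang-1 olwang-2` prints the names that still need a line such as `192.168.5.131 olwang-2`.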

Problem 2

[root@olwang-2 ~]# /usr/local/nagios/libexec/check_memory

-bash: /usr/local/nagios/libexec/check_memory: /usr/bin/perl^M: bad interpreter: No such file or directory


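Summary of the problem:

Perl scripts on *nix systems sometimes fail with the following error:
/usr/bin/perl^M: bad interpreter: No such file or directory
The most common cause is that the script was edited on Windows.
Windows uses \r\n as the line ending, whereas Unix uses only \n. To fix the problem, simply remove the \r.

Solution: strip the trailing \r with sed (here the affected script is check_memory):

 sed -i 's/\r$//' /usr/local/nagios/libexec/check_memory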
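
Other plugins copied over from Windows may have the same problem. A minimal sketch that detects and fixes a carriage return on the interpreter line, assuming GNU sed (for `-i` and `\r`); `has_crlf_shebang`, `fix_crlf`, and `fix_crlf_in_dir` are illustrative names.

```shell
#!/bin/sh
# Succeed if the first line of the file is a "#!" line ending in a carriage return.
has_crlf_shebang() {
    cr=$(printf '\r')
    head -n 1 "$1" | grep -q "^#!.*${cr}\$"
}

# Strip the trailing carriage return from every line, in place (GNU sed).
fix_crlf() {
    sed -i 's/\r$//' "$1"
}

# Fix every script in a directory whose interpreter line ends in \r.
fix_crlf_in_dir() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        if has_crlf_shebang "$f"; then
            echo "fixing $f"
            fix_crlf "$f"
        fi
    done
}
```

Running `fix_crlf_in_dir /usr/local/nagios/libexec` touches only scripts whose first line is affected, so compiled plugins are left alone.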

 

 

Problem 3

[Screenshot: error message]

Solution:

       First check whether openssl and openssl-devel are installed.

       If they are, check the client configuration file /usr/local/nagios/etc/nrpe.cfg next.

        [Screenshot: nrpe.cfg settings]

Once both of the points above are resolved, this error goes away.