Monitoring Servers with Nagios
Test server information
server : 192.168.122.1
vm1 : 192.168.122.20
vm2 : 192.168.122.30
Installing Nagios
1. Add the rpmforge yum repository so that Nagios can be installed from RPM packages
[root@jook ~]# wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el5.rf.x86_64.rpm
[root@jook ~]# rpm -Uvh rpmforge-release-0.5.2-2.el5.rf.x86_64.rpm
2. Install the nagios packages on the monitoring server
[root@jook ~]# yum install nagios nagios-devel nagios-plugins-nrpe nagios-plugins
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: data.nicehosting.co.kr
* extras: data.nicehosting.co.kr
* rpmforge: ftp-stud.fht-esslingen.de
* updates: data.nicehosting.co.kr
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package nagios.x86_64 0:3.2.3-3.el5.rf set to be updated
--> Processing Dependency: php for package: nagios
--> Processing Dependency: httpd for package: nagios
---> Package nagios-devel.x86_64 0:3.2.3-3.el5.rf set to be updated
---> Package nagios-plugins.x86_64 0:1.4.15-2.el5.rf set to be updated
--> Processing Dependency: fping for package: nagios-plugins
--> Processing Dependency: perl(Net::SNMP) for package: nagios-plugins
---> Package nagios-plugins-nrpe.x86_64 0:2.12-1.el5.rf set to be updated
--> Running transaction check
---> Package fping.x86_64 0:2.4-1.b2.3.el5.rf set to be updated
---> Package httpd.x86_64 0:2.2.3-53.el5.centos.3 set to be updated
---> Package perl-Net-SNMP.noarch 0:5.2.0-1.2.el5.rf set to be updated
--> Processing Dependency: perl(Digest::HMAC) for package: perl-Net-SNMP
--> Processing Dependency: perl(Crypt::DES) for package: perl-Net-SNMP
--> Processing Dependency: perl(Digest::SHA1) for package: perl-Net-SNMP
---> Package php.x86_64 0:5.1.6-27.el5_7.5 set to be updated
--> Processing Dependency: php-cli = 5.1.6-27.el5_7.5 for package: php
--> Processing Dependency: php-common = 5.1.6-27.el5_7.5 for package: php
--> Running transaction check
---> Package perl-Crypt-DES.x86_64 0:2.05-3.2.el5.rf set to be updated
---> Package perl-Digest-HMAC.noarch 0:1.01-15 set to be updated
---> Package perl-Digest-SHA1.x86_64 0:2.11-1.2.1 set to be updated
---> Package php-cli.x86_64 0:5.1.6-27.el5_7.5 set to be updated
---> Package php-common.x86_64 0:5.1.6-27.el5_7.5 set to be updated
--> Finished Dependency Resolution
Dependencies Resolved
==================================================================================================================================================
Package Arch Version Repository Size
==================================================================================================================================================
Installing:
nagios x86_64 3.2.3-3.el5.rf rpmforge 3.8 M
nagios-devel x86_64 3.2.3-3.el5.rf rpmforge 42 k
nagios-plugins x86_64 1.4.15-2.el5.rf rpmforge 1.9 M
nagios-plugins-nrpe x86_64 2.12-1.el5.rf rpmforge 20 k
Installing for dependencies:
fping x86_64 2.4-1.b2.3.el5.rf rpmforge 52 k
httpd x86_64 2.2.3-53.el5.centos.3 updates 1.2 M
perl-Crypt-DES x86_64 2.05-3.2.el5.rf rpmforge 37 k
perl-Digest-HMAC noarch 1.01-15 base 12 k
perl-Digest-SHA1 x86_64 2.11-1.2.1 base 49 k
perl-Net-SNMP noarch 5.2.0-1.2.el5.rf rpmforge 96 k
php x86_64 5.1.6-27.el5_7.5 updates 2.3 M
php-cli x86_64 5.1.6-27.el5_7.5 updates 2.2 M
php-common x86_64 5.1.6-27.el5_7.5 updates 153 k
Transaction Summary
==================================================================================================================================================
Install 13 Package(s)
Upgrade 0 Package(s)
Total download size: 12 M
Is this ok [y/N]:
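3. Add the Nagios settings to the Apache configuration file (httpd.conf)
Either add the following to httpd.conf, or copy the /etc/httpd/conf.d/nagios.conf file that yum generated during installation into the Apache conf directory and include it from httpd.conf.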
# Add for nagios
ScriptAlias /nagios/cgi-bin "/usr/lib64/nagios/cgi"
<Directory "/usr/lib64/nagios/cgi">
# SSLRequireSSL
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
# Order deny,allow
# Deny from all
# Allow from 127.0.0.1
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /etc/nagios/htpasswd.users
Require valid-user
</Directory>
Alias /nagios "/usr/share/nagios"
<Directory "/usr/share/nagios">
# SSLRequireSSL
Options None
AllowOverride None
Order allow,deny
Allow from all
# Order deny,allow
# Deny from all
# Allow from 127.0.0.1
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /etc/nagios/htpasswd.users
Require valid-user
</Directory>
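4. Set up Apache authentication
Create the Apache authentication file in the /etc/nagios directory. The -c option creates the file, so it is used only for the first user.
[root@jook ~]# /home/apache/bin/htpasswd -c /etc/nagios/htpasswd.users nagiosadmin
[root@jook ~]# /home/apache/bin/htpasswd /etc/nagios/htpasswd.users guest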
5. Edit the Nagios configuration files
The default settings can be used as they are. With the stock localhost.cfg, only localhost is monitored by default.
/etc/nagios/cgi.cfg
/etc/nagios/nagios.cfg
/etc/nagios/objects/timeperiods.cfg
/etc/nagios/objects/contacts.cfg
/etc/nagios/objects/templates.cfg
/etc/nagios/objects/localhost.cfg
6. Check the Nagios configuration files for errors
[root@jook nagios]# nagios -v nagios.cfg
Nagios Core 3.2.3
Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-03-2010
License: GPL
Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Read object config files okay...
Running pre-flight check on configuration data...
Checking services...
Checked 8 services.
Checking hosts...
Checked 1 hosts.
Checking host groups...
Checked 1 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 1 contacts.
Checking contact groups...
Checked 1 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 24 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
[root@jook nagios]#
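The configuration check and the daemon restart can be chained so that a restart happens only when the check passes. The following is a minimal sketch and not part of the original procedure: the function name check_and_restart and the NAGIOS_BIN / NAGIOS_INIT variables are hypothetical names introduced here, with defaults taken from the commands used in this guide. It assumes that nagios -v exits with a non-zero status when it finds errors.

```shell
# Hypothetical helper: restart Nagios only if the pre-flight check passes.
# NAGIOS_BIN and NAGIOS_INIT are assumed names; adjust them to your installation.
NAGIOS_BIN="${NAGIOS_BIN:-nagios}"
NAGIOS_INIT="${NAGIOS_INIT:-/etc/init.d/nagios}"

check_and_restart() {
    cfg="$1"
    # Run the pre-flight check; its report is discarded, only the exit status is used.
    if "$NAGIOS_BIN" -v "$cfg" >/dev/null; then
        "$NAGIOS_INIT" restart
    else
        echo "configuration check failed for $cfg; not restarting" >&2
        return 1
    fi
}

# Example call on the monitoring server (left commented out on purpose):
# check_and_restart /etc/nagios/nagios.cfg
```

When the check fails, running nagios -v by hand shows the full report with the offending file and line.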
7. Start the nagios daemon, then open the server's IP address in a browser as shown below, or set up a virtual host and connect through a separate URL.
[root@jook ~]# /etc/init.d/nagios start
nagios is stopped
Starting nagios: [ OK ]
[root@jook ~]#
http://<server IP>/nagios/
=======================================================================================================================================================
Adding a client to Nagios for monitoring
1. As on the server, add the rpmforge repository on the client, then install the nagios-nrpe and nagios-plugins packages.
[root@vm1 ~]# wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el6.rf.i686.rpm
[root@vm1 ~]# rpm -Uvh rpmforge-release-0.5.2-2.el6.rf.i686.rpm
[root@vm1 ~]# yum install nagios-nrpe nagios-plugins
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: ftp.daum.net
* extras: ftp.daum.net
* rpmforge: fr2.rpmfind.net
* updates: centos.tt.co.kr
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package nagios-nrpe.i386 0:2.12-1.el5.rf set to be updated
---> Package nagios-plugins.i386 0:1.4.15-2.el5.rf set to be updated
--> Processing Dependency: fping for package: nagios-plugins
--> Processing Dependency: perl(Net::SNMP) for package: nagios-plugins
--> Running transaction check
---> Package fping.i386 0:2.4-1.b2.3.el5.rf set to be updated
---> Package perl-Net-SNMP.noarch 0:5.2.0-1.2.el5.rf set to be updated
--> Processing Dependency: perl(Socket6) >= 0.19 for package: perl-Net-SNMP
--> Processing Dependency: perl(Digest::HMAC) for package: perl-Net-SNMP
--> Processing Dependency: perl(Crypt::DES) for package: perl-Net-SNMP
--> Running transaction check
---> Package perl-Crypt-DES.i386 0:2.05-3.2.el5.rf set to be updated
---> Package perl-Digest-HMAC.noarch 0:1.01-15 set to be updated
---> Package perl-Socket6.i386 0:0.19-3.fc6 set to be updated
--> Finished Dependency Resolution
Dependencies Resolved
===========================================================================================================
Package Arch Version Repository Size
===========================================================================================================
Installing:
nagios-nrpe i386 2.12-1.el5.rf rpmforge 35 k
nagios-plugins i386 1.4.15-2.el5.rf rpmforge 1.6 M
Installing for dependencies:
fping i386 2.4-1.b2.3.el5.rf rpmforge 40 k
perl-Crypt-DES i386 2.05-3.2.el5.rf rpmforge 37 k
perl-Digest-HMAC noarch 1.01-15 base 12 k
perl-Net-SNMP noarch 5.2.0-1.2.el5.rf rpmforge 96 k
perl-Socket6 i386 0.19-3.fc6 base 22 k
Transaction Summary
===========================================================================================================
Install 7 Package(s)
Upgrade 0 Package(s)
Total download size: 1.9 M
Is this ok [y/N]: y
Downloading Packages:
(1/7): perl-Digest-HMAC-1.01-15.noarch.rpm | 12 kB 00:00
(2/7): perl-Socket6-0.19-3.fc6.i386.rpm | 22 kB 00:00
(3/7): nagios-nrpe-2.12-1.el5.rf.i386.rpm | 35 kB 00:00
(4/7): perl-Crypt-DES-2.05-3.2.el5.rf.i386.rpm | 37 kB 00:00
(5/7): fping-2.4-1.b2.3.el5.rf.i386.rpm | 40 kB 00:00
(6/7): perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch.rpm | 96 kB 00:01
(7/7): nagios-plugins-1.4.15-2.el5.rf.i386.rpm | 1.6 MB 00:23
-----------------------------------------------------------------------------------------------------------
Total 66 kB/s | 1.9 MB 00:29
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : fping 1/7
Installing : perl-Socket6 2/7
Installing : perl-Crypt-DES 3/7
Installing : perl-Digest-HMAC 4/7
Installing : perl-Net-SNMP 5/7
Installing : nagios-plugins 6/7
Installing : nagios-nrpe 7/7
Installed:
nagios-nrpe.i386 0:2.12-1.el5.rf nagios-plugins.i386 0:1.4.15-2.el5.rf
Dependency Installed:
fping.i386 0:2.4-1.b2.3.el5.rf perl-Crypt-DES.i386 0:2.05-3.2.el5.rf
perl-Digest-HMAC.noarch 0:1.01-15 perl-Net-SNMP.noarch 0:5.2.0-1.2.el5.rf
perl-Socket6.i386 0:0.19-3.fc6
Complete!
[root@vm1 ~]#
2. Allow the server in the client's nrpe.cfg
The nagios daemon on the server communicates with the nrpe daemon on the client to monitor the client, so the server's IP address must be allowed in nrpe.cfg.
[root@vm1 ~]# vi /etc/nagios/nrpe.cfg
allowed_hosts=127.0.0.1
-> allowed_hosts=192.168.122.1
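The same change can be made without opening an editor. This is a minimal sketch, assuming the file contains a single uncommented allowed_hosts= line as shown above; the function name set_allowed_hosts is a hypothetical name introduced for illustration. A backup copy with a .bak suffix is kept next to the file.

```shell
# Hypothetical helper: point allowed_hosts at the monitoring server.
# $1 = path to nrpe.cfg, $2 = IP address of the Nagios server.
set_allowed_hosts() {
    cfg="$1"
    server_ip="$2"
    # Rewrite the whole allowed_hosts line; the original file is saved as "$cfg.bak".
    sed -i.bak "s/^allowed_hosts=.*/allowed_hosts=$server_ip/" "$cfg"
}

# Example call on the client (left commented out on purpose):
# set_allowed_hosts /etc/nagios/nrpe.cfg 192.168.122.1
```

The nrpe daemon reads nrpe.cfg at startup, so the change takes effect after the restart in step 4.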
3. Register commands in the client's nrpe.cfg
/etc/nagios/nrpe.cfg contains command definitions. The server can ask the client to run any command registered there and retrieve the result.
Any plugin in /usr/lib/nagios/plugins/ can be registered as a command in /etc/nagios/nrpe.cfg (on 64-bit systems the directory is /usr/lib64/nagios/plugins/).
If there is client information that the server should collect, add the corresponding command to the client's /etc/nagios/nrpe.cfg.
(In other words, the server can monitor only what is registered as a command in the client's /etc/nagios/nrpe.cfg.)
command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/xvda1
command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 150 -c 200
4. Start the nrpe daemon and test the commands
[root@vm1 ~]# /etc/init.d/nrpe restart
Shutting down Nagios NRPE daemon (nrpe): [ OK ]
Starting Nagios NRPE daemon (nrpe): [ OK ]
[root@vm1 ~]#
-> Once the server and client can communicate, the check_nrpe command on the server retrieves information from the client, as shown below.
[root@jook nagios]# /usr/lib64/nagios/plugins/check_nrpe -H 192.168.122.20 -c check_load
OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;
[root@jook nagios]# /usr/lib64/nagios/plugins/check_nrpe -H 192.168.122.20 -c check_hda1
DISK OK - free space: /boot 167 MB (92% inode=99%);| /boot=12MB;151;170;0;189
[root@jook nagios]# /usr/lib64/nagios/plugins/check_nrpe -H 192.168.122.20 -c check_load
OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;
[root@jook nagios]# /usr/lib64/nagios/plugins/check_nrpe -H 192.168.122.20 -c check_users
USERS OK - 2 users currently logged in |users=2;5;10;0
[root@jook nagios]# /usr/lib64/nagios/plugins/check_nrpe -H 192.168.122.20 -c check_swap
NRPE: Command 'check_swap' not defined
[root@jook nagios]# /usr/lib64/nagios/plugins/check_nrpe -H 192.168.122.20 -c check_ssh
NRPE: Command 'check_ssh' not defined
-> check_swap and check_ssh cannot be used because they are not registered as commands in the client's /etc/nagios/nrpe.cfg.
To check additional information, register the corresponding command there.
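Registering a missing command such as check_swap amounts to appending one command[...] line to the client's nrpe.cfg and restarting nrpe. The sketch below shows one way to do that without adding the same definition twice. The function name add_nrpe_command is hypothetical, and the check_swap thresholds in the example call (warning below 20% free, critical below 10% free) are illustrative assumptions, not values from the original setup.

```shell
# Hypothetical helper: append an NRPE command definition unless it already exists.
# $1 = path to nrpe.cfg, $2 = command name, $3 = full plugin command line.
add_nrpe_command() {
    cfg="$1"
    name="$2"
    cmdline="$3"
    # Look for an existing, uncommented definition of the same command name.
    if grep -q "^command\[$name]=" "$cfg"; then
        echo "command[$name] is already defined in $cfg" >&2
    else
        printf 'command[%s]=%s\n' "$name" "$cmdline" >> "$cfg"
    fi
}

# Example call on the 32-bit client (left commented out on purpose; thresholds are assumptions):
# add_nrpe_command /etc/nagios/nrpe.cfg check_swap \
#     "/usr/lib/nagios/plugins/check_swap -w 20% -c 10%"
# /etc/init.d/nrpe restart
```

After the restart, the check_nrpe test from the server can be repeated with -c check_swap to confirm that the command is now defined.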
5. Register the check_nrpe command on the server
Add a command definition to commands.cfg on the server so that check_nrpe can be used.
Any of the plugins in /usr/lib64/nagios/plugins/ can be added to commands.cfg in the same way.
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line /usr/lib64/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
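6. Create the client configuration files on the server
Create a configuration file for each client in the /etc/nagios/objects directory. The simplest way is to copy localhost.cfg, the configuration file for localhost.
Leave out the group definitions and define only the host and service sections.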
[root@jook objects]# cat /etc/nagios/objects/vm1.cfg
###############################################################################
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 05-31-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
# example of how you can create configuration entries to monitor
# the local (Linux) machine.
#
###############################################################################
###############################################################################
###############################################################################
#
# HOST DEFINITION
#
###############################################################################
###############################################################################
# Define a host for vm1
define host{
use linux-server ; Name of host template to use
; This host definition will inherit all variables that are defined
; in (or inherited by) the linux-server host template definition.
host_name vm1
alias vm1
address 192.168.122.20
}
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################
# Define a service to "ping" the vm1 machine
define service{
use generic-service ; Name of service template to use
host_name vm1
service_description PING
check_command check-host-alive!192.168.122.20
}
# Define a service to check the disk space of the boot partition
# on vm1 through NRPE. Per the check_hda1 command in the client's
# nrpe.cfg: warning if < 20% free, critical if < 10% free.
define service{
use generic-service ; Name of service template to use
host_name vm1
service_description Boot Partition
check_command check_nrpe!check_hda1
}
# Define a service to check the root partition on vm1 through NRPE.
# check_hda2 must also be registered in the client's nrpe.cfg; it is
# not among the commands listed earlier.
define service{
use generic-service ; Name of service template to use
host_name vm1
service_description Root Partition
check_command check_nrpe!check_hda2
}
# Define a service to check the number of currently logged in
# users on vm1. Per the check_users command in the client's
# nrpe.cfg: warning if > 5 users, critical if > 10 users.
define service{
use generic-service ; Name of service template to use
host_name vm1
service_description Current Users
check_command check_nrpe!check_users
}
# Define a service to check the number of currently running procs
# on vm1. Per the check_total_procs command in the client's
# nrpe.cfg: warning if > 150 processes, critical if > 200 processes.
define service{
use generic-service ; Name of service template to use
host_name vm1
service_description Total Processes
check_command check_nrpe!check_total_procs
}
# Define a service to check the load on vm1.
define service{
use generic-service ; Name of service template to use
host_name vm1
service_description Current Load
check_command check_nrpe!check_load
}
# Define a service to check HTTP on vm1.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
use generic-service ; Name of service template to use
host_name vm1
service_description HTTP
check_command check_http!192.168.122.20
notifications_enabled 0
}
[root@jook objects]#
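-> If the nrpe daemon is not installed on a client, only checks that the server can run from the outside over the network (such as ping, SSH, and HTTP) can be registered as services and monitored.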
[root@jook objects]# cat /etc/nagios/objects/vm2.cfg
###############################################################################
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 05-31-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
# example of how you can create configuration entries to monitor
# the local (Linux) machine.
#
###############################################################################
###############################################################################
###############################################################################
#
# HOST DEFINITION
#
###############################################################################
###############################################################################
# Define a host for vm2
define host{
use linux-server ; Name of host template to use
; This host definition will inherit all variables that are defined
; in (or inherited by) the linux-server host template definition.
host_name vm2
alias vm2
address 192.168.122.30
}
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################
# Define a service to "ping" the vm2 machine
define service{
use generic-service ; Name of service template to use
host_name vm2
service_description PING
check_command check-host-alive!192.168.122.30
}
# Define a service to check that SSH is reachable on vm2.
define service{
use generic-service ; Name of service template to use
host_name vm2
service_description SSH Status
check_command check_ssh!!192.168.122.30
}
# Define a service to check HTTP on vm2.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
use generic-service ; Name of service template to use
host_name vm2
service_description HTTP
check_command check_http!192.168.122.30
notifications_enabled 0
}
Add the new client configuration files to /etc/nagios/nagios.cfg so that they are loaded.
- Add cfg_file entries to /etc/nagios/nagios.cfg
cfg_file=/etc/nagios/objects/vm1.cfg
cfg_file=/etc/nagios/objects/vm2.cfg
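When several clients are added over time, it helps to append each cfg_file entry only if it is not already present. The following is a minimal sketch of that idea; the function name add_cfg_file is a hypothetical name introduced here, and the paths in the example calls are the ones used in this guide.

```shell
# Hypothetical helper: register an object file in the main config exactly once.
# $1 = path to nagios.cfg, $2 = path to the client object file.
add_cfg_file() {
    main_cfg="$1"
    line="cfg_file=$2"
    # -x matches whole lines, -F treats the entry as a fixed string.
    grep -qxF "$line" "$main_cfg" || echo "$line" >> "$main_cfg"
}

# Example calls on the monitoring server (left commented out on purpose):
# add_cfg_file /etc/nagios/nagios.cfg /etc/nagios/objects/vm1.cfg
# add_cfg_file /etc/nagios/nagios.cfg /etc/nagios/objects/vm2.cfg
```

Running nagios -v against nagios.cfg afterwards confirms that the new object files are read without errors.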
7. Restart the nagios daemon on the server
[root@jook objects]# /etc/init.d/nagios stop
nagios (pid 24314) is running...
Stopping nagios: [ OK ]
[root@jook objects]# /etc/init.d/nagios start
nagios is stopped
Starting nagios: [ OK ]
[root@jook objects]#
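8. Check access through the URL
Some menus may show a permission error for the /var/nagios/rw/nagios.cmd file.
When Nagios is installed with yum, nagios.cmd is owned by the group of the RPM-packaged Apache, so the error occurs when a source-built Apache is used instead.
Changing the owning group to daemon, the group the source-built Apache runs as, resolves it.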
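The group change described above can be sketched as follows. The function name set_cmd_group is hypothetical, and the group daemon in the example call is the one named in this guide; use whichever group your Apache actually runs as. If the group reverts after a restart, applying the same change to the /var/nagios/rw directory may be necessary; that is an assumption about this setup, not something the original procedure states.

```shell
# Hypothetical helper: change the owning group of the Nagios command file
# and print the resulting ownership for a quick visual check.
# $1 = path to the command file, $2 = group the web server runs as.
set_cmd_group() {
    cmd_file="$1"
    group="$2"
    chgrp "$group" "$cmd_file" && ls -l "$cmd_file"
}

# Example call on the monitoring server (left commented out on purpose):
# set_cmd_group /var/nagios/rw/nagios.cmd daemon
```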