Monday, 13 March 2017

MySQL Monitoring through Nagios on CentOS 6.8


Nagios is a powerful monitoring system. Here we will learn how to monitor a remote MySQL server with it, using NRPE and the check_mysql_health plugin.


Before we begin, do note that the steps shown here are actually a continuation from my earlier series of tutorials based on Nagios:


Prerequisites :
  • Create a Nagios Monitoring Server on CentOS 6.8
  • Install a MySQL server on CentOS 6.8. It should be installed, configured and located on a remote system on the same network as the Nagios server.

Servers :
  • Nagios server with hostname server10 and IP : 192.168.47.183
  • MySQL server with hostname server22 and IP : 192.168.47.181

Step 1 : Configuring Nagios Plugins and Agents on the Remote MySQL Server (server22 (192.168.47.181))

a) Install Nagios Plugin on MySQL Server 


[root@server22 opt]# wget https://nagios-plugins.org/download/nagios-plugins-2.1.4.tar.gz

Compile and Install Nagios Plugin 


tar -zxvf nagios-plugins-2.1.4.tar.gz

cd nagios-plugins-2.1.4/

./configure --with-nagios-user=nagios --with-nagios-group=nagios

make 

make install 

b) Install MySQL Plugin (check_mysql_health) 

check_mysql_health is a plugin for Nagios that allows you to monitor a MySQL database. Among the list of metrics are time to login, index usage, bufferpool hit rate, query cache hit rate, slow queries, temp tables on disk, table cache hit rate, connected threads, and many more. Requirements are either a DBD::mysql Perl module or a MySQL client package.


wget http://labs.consol.de/wp-content/uploads/2010/10/check_mysql_health-2.2.2.tar.gz



tar -zxvf check_mysql_health-2.2.2.tar.gz

Compile and Install 


cd check_mysql_health-2.2.2/

./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-perl=/usr/bin/perl

make
make install

c) Create the Database User in the MySQL Instance


mysql> grant usage, replication client on *.* to 'nagios'@'%' identified by 'XXXXXX';
mysql> flush privileges;



NRPE is an addon that allows you to execute plugins on remote Linux/Unix hosts. This is useful if you need to monitor local resources/attributes like disk usage, CPU load, memory usage, etc. on a remote host. 
Similar functionality can be accomplished by using the check_by_ssh plugin, although it can impose a higher CPU load on the monitoring machine – especially if you are monitoring hundreds or thousands of hosts.

Create the nagios user and group under which NRPE and the Nagios plugins will be installed:


useradd -m nagios
passwd nagios

Install the Following Dependent Packages


$ yum install gcc glibc glibc-common xinetd 
$ yum install nrpe nagios-plugins-all openssl

Download/Untar/Compile and Install all necessary files for NRPE


$ cd /opt/
$ wget http://downloads.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.14/nrpe-2.14.tar.gz
$ tar -zxvf nrpe-2.14.tar.gz
$ cd nrpe-2.14
$ ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios
$ make all
$ make install-plugin
$ make install-daemon
$ make install-daemon-config

Note: NRPE by default is installed under /usr/local/nagios directory.

Install the NRPE daemon as a service under xinetd.


$  make install-xinetd

Note :

You may face the following error:
“checking for SSL libraries… configure: error: Cannot find ssl libraries”

Try specifying the ssl and lib paths as follows (adjust the library path to your system; on 64-bit CentOS the libraries are typically under /usr/lib64):

$  ./configure --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/i386-linux-gnu/

If the error persists, make sure openssl and its development package (openssl-devel on CentOS) are installed.

Restart xinetd Service 

$ service xinetd restart

We need to fix the permissions as well.

$ chown nagios:nagios /usr/local/nagios
$ chown -R nagios:nagios /usr/local/nagios/libexec

Nagios Client (NRPE) Configuration

Add the IP Address of the Nagios monitoring server to “only_from” directive in /etc/xinetd.d/nrpe file:

$ vi  /etc/xinetd.d/nrpe


# default: on
 # description: NRPE (Nagios Remote Plugin Executor)
 service nrpe
 {
  flags           = REUSE
  socket_type     = stream
  port            = 5666
  wait            = no
  user            = nagios
  group           = nagios
  server          = /usr/local/nagios/bin/nrpe
  server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
  log_on_failure  += USERID
  disable         = no
  only_from       = 127.0.0.1 192.168.47.183
 }  

 Note that allowed_hosts is a comma-delimited list while only_from is space separated.
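Since the same addresses end up in both files in different list formats, one list can be derived from the other. Below is a minimal sketch, assuming a hypothetical helper named to_allowed_hosts (it is not part of NRPE or xinetd):

```shell
# Hypothetical helper (not part of NRPE): turn the space-separated
# addresses used by only_from into the comma-delimited form that
# allowed_hosts in nrpe.cfg expects.
to_allowed_hosts() {
    local IFS
    IFS=','            # "$*" joins the arguments with the first character of IFS
    echo "$*"
}

# Example with the two addresses used in this tutorial.
to_allowed_hosts 127.0.0.1 192.168.47.183
```

The printed value can be pasted after allowed_hosts= in nrpe.cfg.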

Configure nrpe on system startup

$ chkconfig nrpe on

Add the following entry for the NRPE daemon to the /etc/services file:

 $ vi /etc/services 
 
# Add following line.

nrpe            5666/tcp                 NRPE
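If you script this step instead of editing the file by hand, it helps to make the append idempotent so that re-running the setup does not add the entry twice. A small sketch, assuming a hypothetical helper named ensure_line; the demonstration writes to a scratch file rather than /etc/services:

```shell
# Append a line to a file only if that exact line is not already there.
# ensure_line is a hypothetical helper name, not a standard command.
ensure_line() {
    local file line
    file=$1
    line=$2
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file; on the client you would pass
# /etc/services instead (as root).
scratch=$(mktemp)
ensure_line "$scratch" 'nrpe            5666/tcp                 NRPE'
ensure_line "$scratch" 'nrpe            5666/tcp                 NRPE'
grep -c '^nrpe' "$scratch"    # counts matching lines in the scratch file
rm -f "$scratch"
```

The second call is a no-op because grep already finds the exact line.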

Restart the xinetd service:

$ service xinetd restart

Validation and Testing

We need to check if nrpe daemon is running under xinetd:
$ netstat -at | grep nrpe
 
tcp6       0      0 [::]:nrpe               [::]:*                  LISTEN  

$ netstat -anp|grep :5666

Now we need to do the functional testing of NRPE daemon:
$ /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1

Output :
NRPE vnrpe-3.0

Some built-in checks that we can run:

$  /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_users
   
   USERS OK - 3 users currently logged in |users=3;5;10;0

$  /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_load
   
   OK - load average: 0.00, 0.02, 0.00|load1=0.000;15.000;30.000;0; load5=0.020;10.000;25.000;0; load1
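Each check above prints one line in the usual Nagios plugin format: status text, then a | character, then performance data of the form label=value;warn;crit;min. The sketch below splits such a line into its two parts; the function names are hypothetical and only for illustration:

```shell
# Hypothetical helpers: split one line of plugin output at the first "|".
plugin_status() {
    # Text before the first "|", with trailing whitespace removed.
    printf '%s\n' "${1%%|*}" | sed 's/[[:space:]]*$//'
}

plugin_perfdata() {
    # Text after the first "|", or an empty line if there is no perfdata.
    case $1 in
        *'|'*) printf '%s\n' "${1#*|}" ;;
        *)     printf '\n' ;;
    esac
}

sample='USERS OK - 3 users currently logged in |users=3;5;10;0'
plugin_status "$sample"
plugin_perfdata "$sample"
```

In the sample line, users=3;5;10;0 carries the current value (3) followed by the warning (5) and critical (10) thresholds set in nrpe.cfg.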

Customizing NRPE commands

cd /opt/nrpe-2.14/sample-config

cp nrpe.cfg   /usr/local/nagios/etc/
   
vi /usr/local/nagios/etc/nrpe.cfg 

Edit the following variables:

log_facility=daemon
pid_file=/var/run/nrpe/nrpe.pid
server_port=5666
nrpe_user=nrpe
nrpe_group=nrpe
allowed_hosts=127.0.0.1,192.168.47.183
dont_blame_nrpe=0
debug=0
command_timeout=60
connection_timeout=300

command[check_users]=/usr/lib64/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/vda
command[check_zombie_procs]=/usr/lib64/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 150 -c 200
command[check_procs]=/usr/lib64/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$

Add the following 

command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%

On the NRPE client machine (server22), verify that the plugin works:

/usr/local/nagios/libexec/check_swap -w 20% -c 10%
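When a plugin is run by hand like this, its exit code is what Nagios turns into a service state: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The sketch below prints the state for a given exit code; state_name is a hypothetical helper, and the plugin is only invoked if it exists at the path used in this tutorial:

```shell
# Map a plugin exit code to the corresponding Nagios state name.
state_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "out of range" ;;   # not part of the plugin convention
    esac
}

plugin=/usr/local/nagios/libexec/check_swap
if [ -x "$plugin" ]; then
    rc=0
    "$plugin" -w 20% -c 10% || rc=$?
    echo "check_swap exited with $rc ($(state_name "$rc"))"
fi
```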

$ service xinetd restart

cd /usr/local/nagios/libexec/

./check_nrpe -H 127.0.0.1 


[root@server22 libexec]# ./check_nrpe -H 127.0.0.1
NRPE vnrpe-3.0

Firewall Rule for NRPE:

Firewall port that needs to be open for the NRPE daemon on the client machine:

iptables -A INPUT -p tcp -m tcp --dport 5666 -j ACCEPT

Save the iptables rules and restart it.

service iptables save
service iptables restart

Step 2 : Configuring Nagios Plugins, NRPE on Nagios Monitoring Server (server10(192.168.47.183))

a) Install Nagios Plugin on Monitoring Server 


[root@server10 opt]# wget https://nagios-plugins.org/download/nagios-plugins-2.1.4.tar.gz

Compile and Install Nagios Plugin 


tar -zxvf nagios-plugins-2.1.4.tar.gz

cd nagios-plugins-2.1.4/

./configure --with-nagios-user=nagios --with-nagios-group=nagios

make 

make install 

b) Install MySQL Plugin (check_mysql_health) 

check_mysql_health is a plugin for Nagios that allows you to monitor a MySQL database. Among the list of metrics are time to login, index usage, bufferpool hit rate, query cache hit rate, slow queries, temp tables on disk, table cache hit rate, connected threads, and many more. Requirements are either a DBD::mysql Perl module or a MySQL client package.


wget http://labs.consol.de/wp-content/uploads/2010/10/check_mysql_health-2.2.2.tar.gz



tar -zxvf check_mysql_health-2.2.2.tar.gz

Compile and Install 


cd check_mysql_health-2.2.2/

./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-perl=/usr/bin/perl

make
make install



NRPE is an addon that allows you to execute plugins on remote Linux/Unix hosts. This is useful if you need to monitor local resources/attributes like disk usage, CPU load, memory usage, etc. on a remote host. 
Similar functionality can be accomplished by using the check_by_ssh plugin, although it can impose a higher CPU load on the monitoring machine – especially if you are monitoring hundreds or thousands of hosts.

Create the nagios user and group under which NRPE and the Nagios plugins will be installed:


useradd -m nagios
passwd nagios

Install the Following Dependent Packages


$ yum install gcc glibc glibc-common xinetd 
$ yum install nrpe nagios-plugins-all openssl
$ yum install perl-DBD-MySQL

Download/Untar/Compile and Install all necessary files for NRPE


$ cd /opt/
$ wget http://downloads.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.14/nrpe-2.14.tar.gz
$ tar -zxvf nrpe-2.14.tar.gz
$ cd nrpe-2.14
$ ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios
$ make all
$ make install-plugin
$ make install-daemon
$ make install-daemon-config

Note: NRPE by default is installed under /usr/local/nagios directory.

Install the NRPE daemon as a service under xinetd.


$  make install-xinetd

Note :

You may face the following error:
“checking for SSL libraries… configure: error: Cannot find ssl libraries”

Try specifying the ssl and lib paths as follows (adjust the library path to your system; on 64-bit CentOS the libraries are typically under /usr/lib64):

$  ./configure --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/i386-linux-gnu/

If the error persists, make sure openssl and its development package (openssl-devel on CentOS) are installed.

Restart xinetd Service 

$ service xinetd restart

We need to fix the permissions as well.

$ chown nagios:nagios /usr/local/nagios
$ chown -R nagios:nagios /usr/local/nagios/libexec

Configure nrpe on system startup

$ chkconfig nrpe on

Add the following entry for the NRPE daemon to the /etc/services file:

 $ vi /etc/services 
 
# Add following line.

nrpe            5666/tcp                 NRPE

Restart the xinetd service:

$ service xinetd restart

Validation and Testing

We need to check if nrpe daemon is running under xinetd:
$ netstat -at | grep nrpe
 
tcp6       0      0 [::]:nrpe               [::]:*                  LISTEN  

$ netstat -anp|grep :5666

Now we need to do the functional testing of NRPE daemon:
$ /usr/local/nagios/libexec/check_nrpe -H 192.168.47.181

d) Configuring Nagios Server 

Define Generic Contact Template in templates.cfg 

The following generic-contact is already available under /usr/local/nagios/etc/objects/templates.cfg. 

[root@server10 nrpe-2.14]# grep templates /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
# performance data files.  The templates may contain macros, special


Note: generic-contact is available under

      /usr/local/nagios/etc/objects/templates.cfg



e) Define Individual Contacts in contacts.cfg



Once you’ve confirmed that the generic-contact template is defined properly, you can start defining individual contacts for all the people in your organization who should receive notifications from Nagios. Note that defining a contact does not by itself mean that the person will get notifications.
You still have to associate the contact with a service or host definition, as shown in the later sections below. So, feel free to define all possible contacts here. 



Note: Define these contacts in /usr/local/nagios/etc/objects/contacts.cfg

$ vi /usr/local/nagios/etc/objects/contacts.cfg
  

   define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user

        email                           nagios@server10 ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }

define contact{
        contact_name                    dptsource
        use                             generic-contact
        alias                           Dheeraj XXXX (DBA)
        email                           XXXXXX@gmail.com
        pager                           XXXXXXX@pager.com
        service_notification_period     24x7
        }


Define Contact Groups with Multiple Contacts in contacts.cfg

$ vi /usr/local/nagios/etc/objects/contacts.cfg

define contactgroup{
contactgroup_name          dba-admins
alias                      Database Administrators
members                    dptsource
}

f) Attach Contact Groups or Individual Contacts to Service and Host Definitions

Once you’ve defined the individual contacts and contact groups, it is time to start attaching them to a specific host or service definition as shown below.

Note: The following host is defined in
     /usr/local/nagios/etc/objects/clienthost.cfg, a copy of localhost.cfg.
     This can be any host definition file.

grep cfg_file /usr/local/nagios/etc/nagios.cfg

cd /usr/local/nagios/etc/objects/
   
cp localhost.cfg clienthost.cfg

chown nagios:nagios clienthost.cfg

Configure Monitoring Hosts in clienthost.cfg 

$ vi /usr/local/nagios/etc/objects/clienthost.cfg


# Define a host for the client machine

define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               server22
        alias                   server22
        address                 192.168.47.181
        contact_groups          dba-admins
        check_period             24x7
        check_command            check-host-alive
        notification_period       24x7
        }


  ###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################


# Define a service to "ping" the local machine

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       server22
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        contact_groups                  dba-admins
        }


# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       server22
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        contact_groups                  dba-admins
        }



# Define a service to check the number of currently logged in
# users on the client through NRPE.  The thresholds come from the
# check_users command in nrpe.cfg (-w 5 -c 10).

define service{
        use                             generic-service
        host_name                       server22
        service_description             Current Users
        contact_groups                  dba-admins
        check_command                   check_nrpe!check_users
        }

# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       server22
        service_description             Total Processes
        check_command                   check_local_procs!250!400!RSZDT
        contact_groups                  dba-admins
        }



# LOAD

define service{
 use                             generic-service
 host_name                       server22
 service_description             CPU Load
 contact_groups                  dba-admins
 check_command                   check_nrpe!check_load
}


# Define a service to check the swap usage the local machine. 
# Critical if less than 10% of swap is free, warning if less than 20% is free

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       server22
        service_description             Swap Usage
        contact_groups                  dba-admins
        check_command                   check_local_swap!20!10
        }


## SWAP 

define service{
      use                    generic-service
      host_name              server22
      service_description    Swap Usage
      contact_groups         dba-admins
      check_command          check_nrpe!check_swap
      }


#Mysql Parameters 


define service{
        use                             generic-service
        host_name                       prodserver11
        service_description             MySQL connection-time
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver11!3306!nagios!XXXXX!connection-time!
        }

define service{
        use                             generic-service
        host_name                       prodserver22
        service_description             MySQL connection-time
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver22!3306!nagios!XXXXX!connection-time!
        }

define service{
        use                             generic-service
        host_name                       prodserver11
        service_description             MySQL Open Files
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver11!3306!nagios!XXXXX!open-files!
        }

define service{
        use                             generic-service
        host_name                       prodserver22
        service_description             MySQL Open Files
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver22!3306!nagios!XXXXX!open-files!
        }

define service{
        use                             generic-service
        host_name                       prodserver11
        service_description             MySQL UP Time
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver11!3306!nagios!XXXXX!uptime!
        }

define service{
        use                             generic-service
        host_name                       prodserver22
        service_description             MySQL UP Time
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver22!3306!nagios!XXXXX!uptime!
        }

define service{
        use                             generic-service
        host_name                       prodserver11
        service_description             MySQL slave-io-running
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver11!3306!nagios!XXXXX!slave-io-running!
        }

define service{
        use                             generic-service
        host_name                       prodserver22
        service_description             MySQL slave-io-running
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver22!3306!nagios!XXXXX!slave-io-running!
        }

define service{
        use                             generic-service
        host_name                       prodserver11
        service_description             MySQL slave-sql-running
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver11!3306!nagios!XXXXX!slave-sql-running!
        }

define service{
        use                             generic-service
        host_name                       prodserver22
        service_description             MySQL slave-sql-running
        contact_groups                  dba-admins
        check_command                   check_mysql_health!prodserver22!3306!nagios!XXXXX!slave-sql-running!
        }

define service{
        use                             generic-service
        host_name                       prodserver11
        service_description             MySQL slave-lag
        contact_groups                  dba-admins
        check_command                   check_mysql_custom1!prodserver11!3306!nagios!XXXXX!slave-lag!mysql!10!5!
        }

define service{
        use                             generic-service
        host_name                       prodserver22
        service_description             MySQL slave-lag
        contact_groups                  dba-admins
        check_command                   check_mysql_custom1!prodserver22!3306!nagios!XXXXX!slave-lag!mysql!10!5!
        }

define service{
        use                             generic-service
        host_name                       prodserver11
        service_description             MySQL threads_connected
        contact_groups                  dba-admins
        check_command                   check_mysql_custom1!prodserver11!3306!nagios!XXXXX!threads-connected!mysql!10!5!
        }

define service{
        use                             generic-service
        host_name                       prodserver22
        service_description             MySQL threads_connected
        contact_groups                  dba-admins
        check_command                   check_mysql_custom1!prodserver22!3306!nagios!XXXXX!threads-connected!mysql!10!5!
        }

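Note : Here XXXXX is the password of the nagios user defined on the MySQL server. prodserver11 and prodserver22 are example host names; each host_name must match a host object defined in your configuration (server22 in this tutorial).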

g)  Now the NRPE and MySQL command definitions need to be created in the commands.cfg file.

$ vi /usr/local/nagios/etc/objects/commands.cfg

###Client Connection Commands ####

define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}


define command{
command_name check_mysql_health
command_line $USER1$/check_mysql_health --hostname $ARG1$ --port $ARG2$ --username $ARG3$ --password $ARG4$ --mode $ARG5$
}

define command{
command_name check_mysql_custom1 
command_line $USER1$/check_mysql_health --hostname=$ARG1$ --port=$ARG2$ --username=$ARG3$ --password=$ARG4$ --warning=$ARG8$ --critical=$ARG7$ --mode=$ARG5$
} 


Note : $ARGn$ refers to the n-th "!"-separated argument that follows the command name in a service's check_command. The numbering is purely positional and is not database specific.
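To see which value lands in which macro, the sketch below splits a check_command string on "!" and numbers the fields the same way. Nagios does this internally; show_args is a hypothetical helper written only to illustrate the numbering:

```shell
# Illustration only: print the command name and the numbered arguments
# of a check_command value, split on "!".
show_args() {
    local rest n sep
    rest=$1
    n=0
    sep='!'
    echo "command_name=${rest%%"$sep"*}"
    case $rest in
        *"$sep"*) rest=${rest#*"$sep"} ;;
        *)        return 0 ;;             # no arguments at all
    esac
    while [ -n "$rest" ]; do
        n=$((n + 1))
        echo "ARG$n=${rest%%"$sep"*}"
        case $rest in
            *"$sep"*) rest=${rest#*"$sep"} ;;
            *)        rest= ;;
        esac
    done
}

show_args 'check_mysql_custom1!prodserver11!3306!nagios!XXXXX!slave-lag!mysql!10!5!'
```

For the slave-lag service this shows that $ARG7$ is 10 and $ARG8$ is 5, so check_mysql_custom1 runs with --warning=5 and --critical=10. The trailing "!" adds no extra argument.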


h)  We need to add the client configuration file to nagios.cfg

$ vi /usr/local/nagios/etc/nagios.cfg

Add the line below.

cfg_file=/usr/local/nagios/etc/objects/clienthost.cfg

i)   Verify Nagios Configuration Files

The Nagios configuration is now complete, so it is time to verify it by running the command below. 
If everything goes smoothly, the output will look similar to the following.

$ /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg


Nagios Core 4.1.1
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-19-2015
License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
Warning: Duplicate definition found for service 'Swap Usage' on host 'server22' (config file '/usr/local/nagios/etc/objects/clienthost.cfg', starting on line 131)
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
 Checked 19 services.
 Checked 2 hosts.
 Checked 1 host groups.
 Checked 0 service groups.
 Checked 2 contacts.
 Checked 2 contact groups.
 Checked 26 commands.
 Checked 5 time periods.
 Checked 0 host escalations.
 Checked 0 service escalations.
Checking for circular paths...
 Checked 2 hosts
 Checked 0 service dependencies
 Checked 0 host dependencies
 Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
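Note that the run above prints a "Warning:" line for the duplicate Swap Usage service even though the totals report zero warnings, so it is worth scanning for both. A minimal sketch, assuming a hypothetical helper named preflight_summary that reads the verification output on standard input:

```shell
# Hypothetical helper: summarize "nagios -v" output read from stdin.
# Reports the "Total Errors" figure and how many "Warning:" lines appeared.
preflight_summary() {
    local out errors warn_lines
    out=$(cat)
    errors=$(printf '%s\n' "$out" | sed -n 's/^Total Errors:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
    warn_lines=$(printf '%s\n' "$out" | grep -c '^Warning:' || true)
    echo "errors=${errors:-unknown} warning_lines=$warn_lines"
}

# Only runs where Nagios is installed at the path used in this tutorial.
nagios_bin=/usr/local/nagios/bin/nagios
if [ -x "$nagios_bin" ]; then
    "$nagios_bin" -v /usr/local/nagios/etc/nagios.cfg | preflight_summary || true
fi
```

For the output shown above, the summary would report zero errors and one warning line.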

j) Restarting the Nagios Server

$ service nagios reload

$ service httpd reload


Conclusion :

Here we have monitored the following MySQL services: open files, uptime, connection time, slave-io-running and slave-sql-running, along with slave lag and connected threads.

