Monitor ESXi server health with Nagios

Monitoring ESXi server health is key to keeping the virtual infrastructure fully working and the status of the servers under control.

As a monitoring system, Nagios is the solution I use most for the networks I manage. To check VMware ESXi 4.x/5.0 servers, there is a great plugin called check_esxi_hardware.py, written by Claudio Kuenzler and also mentioned in the VMware community.

The information reported by the plugin is the same as that shown in the vSphere Client under Configuration tab –> Health Status.

 

Prerequisites

To use the plugin, the Nagios server requires the following components to be installed:

  • Python
  • The pywbem Python extension (its download is covered in the procedure below).

 

Procedure

On the Nagios server, install Python using the yum command.

# yum install python

Using the wget command, download the pywbem Python extension to the system.

# wget -O pywbem-0.7.0.tar.gz "http://downloads.sourceforge.net/project/pywbem/pywbem/pywbem-0.7/pywbem-0.7.0.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fpywbem%2Ffiles%2F&ts=1332321760&use_mirror=freefr"

Unpack the downloaded file using the tar command.

# tar -vxzf pywbem-0.7.0.tar.gz

Install the pywbem extension by running setup.py from the console.

# cd pywbem-0.7.0
# python setup.py install
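
Before moving on, it can be worth confirming that the interpreter is able to import the freshly installed module. The following is only a minimal sketch: the helper name check_python_module is hypothetical (it is not part of pywbem or of the plugin), and it assumes that python is the same interpreter Nagios will use to run the plugin.

```shell
# Hypothetical helper: succeeds if the given Python interpreter can import a module.
check_python_module() {
    interpreter="$1"   # interpreter to test, e.g. python
    module="$2"        # module name, e.g. pywbem
    if "$interpreter" -c "import $module" 2>/dev/null; then
        echo "$module: OK"
    else
        echo "$module: NOT FOUND"
        return 1
    fi
}

# Usage (assumes pywbem was installed as described above):
# check_python_module python pywbem
```

If the helper reports NOT FOUND, the plugin will not be able to query the ESXi host, so the pywbem installation should be repeated before continuing.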

Download the check_esxi_hardware.py plugin and copy the file to the /usr/lib/nagios/plugins directory.

# wget http://www.claudiokuenzler.com/nagios-plugins/check_esxi_hardware.py
# cp check_esxi_hardware.py /usr/lib/nagios/plugins/

Once the plugin has been copied, change to the plugins directory and make the file check_esxi_hardware.py executable.

# cd /usr/lib/nagios/plugins
# chmod 755 check_esxi_hardware.py

The correct syntax to check the ESXi server is the following:

./check_esxi_hardware.py -H IP_address_esxi -U username -P password -V vendor

The username must exist on the ESXi host and be a member of the root group. Since using the root user is not recommended for security reasons, create a dedicated account with the vSphere Client.

 

Testing the plugin

When the installation is complete, test the plugin to verify that it works correctly. For instance, to check the health status of an HP server, type the following command from the console:

# ./check_esxi_hardware.py -H esxi1 -U username -P password -V hp

If everything works as expected, you should receive a status message reporting the hardware health of the server.
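
Besides the printed message, Nagios decides the state of a check from the exit status of the plugin, following the usual plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). As a small sketch for manual testing, the hypothetical helper nagios_state below translates an exit status into its state name; it is an illustration only and not part of the plugin.

```shell
# Hypothetical helper: map a Nagios plugin exit status to its state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID ($1)" ;;   # outside the plugin convention
    esac
}

# Usage, right after running the plugin by hand:
# ./check_esxi_hardware.py -H esxi1 -U username -P password -V hp
# nagios_state $?
```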

To automate the hardware check, define the corresponding command in Nagios.

define command {
command_name   check_esxi_hardware
command_line   $USER1$/check_esxi_hardware.py -H $HOSTADDRESS$ -U $ARG1$ -P $ARG2$ -V $ARG3$
}
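
A command definition is only used once a service references it. As a rough sketch, the shell helper below prints a service object that passes the user, password and vendor as $ARG1$, $ARG2$ and $ARG3$. The helper name esxi_service_definition, the service description, the output path in the usage line and the generic-service template (taken from the Nagios sample configuration) are all assumptions to adapt to your own setup.

```shell
# Hypothetical helper: print a Nagios service object that uses check_esxi_hardware.
# Arguments: host_name, ESXi user, ESXi password, vendor (e.g. hp).
# Assumes a "generic-service" template exists, as in the Nagios sample config.
esxi_service_definition() {
    cat <<EOF
define service {
    use                  generic-service
    host_name            $1
    service_description  ESXi Hardware
    check_command        check_esxi_hardware!$2!$3!$4
}
EOF
}

# Usage: append to an object file that Nagios loads (the path is an assumption):
# esxi_service_definition esxi1 username password hp >> /etc/nagios/objects/esxi.cfg
```

Because Nagios separates command arguments with the ! character, a password that contains ! would need different handling than this sketch provides.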

The monitoring system is now able to display the hardware health status of the configured servers in the network. For additional configuration options, check the plugin author's website.

 

When the hardware status of ESXi servers is properly monitored, problems are identified sooner, which minimizes the risk of a service interruption.