This is my first plug-in for Nagios. At work we use an EMC Clarion device as a NAS, so we wanted some statistics to see how heavily we are using this hardware (you have to pay for a license for this service).
This is possible in two ways:
- One is through the graphical interface (slow… very slow).
- The second is with the help of Navicli, a component of the EMC software package that lets you run scripted operations on the Clarion devices.
We chose the second option. For this check we used 2 machines, an intermediary machine and the Nagios server (this can also be done with one machine, but for security and resource reasons we chose this infrastructure).
First we set up a one-way ssh trust relation from the Nagios server to the other machine (called LOG-retainer), so that the Nagios machine can ssh to LOG-retainer without a password. A trap I fell into was not doing this operation as the Nagios user. NB: If the .ssh directory is missing from the home directory of the Nagios user, the script will give errors.
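As a rough sketch of that setup (not the exact commands we ran): the helper below only makes sure the .ssh directory exists with private permissions, and the commented lines list the usual OpenSSH steps. The name ensure_ssh_dir, the key type and the key path are assumptions for illustration. Run everything as the user that executes the Nagios checks.

```shell
#!/bin/bash
# Hypothetical helper: create the .ssh directory with private permissions.
ensure_ssh_dir() {
    # $1 = home directory of the account that runs the Nagios checks
    local ssh_dir="$1/.ssh"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
}

# Typical sequence, run as the Nagios user on the Nagios server
# (key type and file name are assumptions):
#   ensure_ssh_dir "$HOME"
#   ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
#   ssh-copy-id root@LOG-retainer
#   ssh -o BatchMode=yes root@LOG-retainer true && echo "trust OK"
```

The last commented command uses BatchMode so that ssh fails instead of asking for a password, which is what will happen when Nagios runs the check unattended.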
After this step is completed it’s time to deploy the first part of our monitoring script on LOG-retainer:
#!/bin/bash
# Remove the result file left over from the previous run.
if [ -e NaviStats.txt ]; then
    rm NaviStats.txt
fi
# Retrieve the latest Analyzer archive from the array, then dump it to CSV.
/opt/Navisphere/bin/naviseccli -User administrator -Password password -Scope 0 -Address 192.168.99.1 analyzer -archiveretrieve -file temp.nar -overwrite y -v
/opt/Navisphere/bin/naviseccli -Address 192.168.99.1 analyzer -archivedump -data temp.nar -out monitor.csv -join -header n -object s -overwrite y
#cat monitor.csv | gawk -F "," '{ printf "%s %s %s %s %s\n", $1,$2,$5,$14,$17 }'
echo "Object Name Poll Time Utilization (%) Total Bandwidth (MB/s) Total Throughput (IO/s)"
# Keep only the most recent sample for each service processor.
NaviStatsA=`cat monitor.csv | gawk -F "," '{ if ($1 == "SP A") printf "%s %s %s %s %s\n", $1,$2,$5,$14,$17 | "tail -n 1" }'`
NaviStatsB=`cat monitor.csv | gawk -F "," '{ if ($1 == "SP B") printf "%s %s %s %s %s\n", $1,$2,$5,$14,$17 | "tail -n 1" }'`
# Pass the values as arguments so that they are never interpreted as a format string.
printf "%s\n%s\n" "$NaviStatsA" "$NaviStatsB" > NaviStats.txt
cat NaviStats.txt
#cat monitor.csv
# Remove the temporary archive.
if [ -e temp.nar ]; then
    rm temp.nar
fi
Copy this into a new file, save it as monitor.sh and set the execute flag on it.
Check that it works correctly by executing sh monitor.sh
The output should look something like this:
Launching create archive
Attempting to retrieve file from array
Retrieve is complete.
Object Name Poll Time Utilization (%) Total Bandwidth (MB/s) Total Throughput (IO/s)
SP A 05/27/2009 16:26:42 9.448161 37.165939 754.397801
SP B 05/27/2009 16:26:42 11.464435 44.153673 678.160153
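One detail worth noting before the second script: the poll time contains a space, so when a line is split on whitespace the date and the time become fields 3 and 4, and the three metrics land in fields 5, 6 and 7. A small sketch of that mapping, where sp_field is a hypothetical helper and the sample lines are copied from the output above (plain awk is enough here):

```shell
#!/bin/bash
# Hypothetical helper: print whitespace-separated field N of the line
# that belongs to the given service processor ("SP A" or "SP B").
sp_field() {
    awk -v sp="$1" -v n="$2" '($1 " " $2) == sp { print $n }'
}

sample='SP A 05/27/2009 16:26:42 9.448161 37.165939 754.397801
SP B 05/27/2009 16:26:42 11.464435 44.153673 678.160153'

printf '%s\n' "$sample" | sp_field "SP A" 4   # poll time: 16:26:42
printf '%s\n' "$sample" | sp_field "SP A" 5   # utilization: 9.448161
printf '%s\n' "$sample" | sp_field "SP B" 7   # total throughput: 678.160153
```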
Now we go to the Nagios machine and copy the second script into the plug-ins directory:
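#!/bin/bash
version=0.1
# Print the usage text when the first argument is a literal "?".
# The pattern is quoted because an unquoted ? matches any single character.
case "$1" in
"?")
    echo -e "\n"
    echo "------------------------------------------------------------------------------------------------------------------"
    echo " Usage: $0 [Service processor] [Query option] [check interval in minutes] [Warning level] [Critical level]"
    echo " -> Service processor = CX4-SPA \ CX4-SPB"
    echo " -> Query option ( Utilization (%)\Total Bandwidth (MB/s)\Total Throughput (IO/s))= Util \ TotBa \ TotTh"
    echo -e "\n"
    echo " Usage Example: $0 CX4-SPB Util 15 50 90"
    echo "------------------------------------------------------------------------------------------------------------------"
    # Exit code 3 is UNKNOWN in the Nagios plug-in convention.
    exit 3
    ;;
esac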
outfile="/tmp/NaviStats.txt"
# Both thresholds are required, and the critical limit must exceed the warning limit.
if [ "${5:-0}" = 0 ] || [ "${4:-0}" = 0 ]; then
    echo " !! Input a warning and/or a critical limit !!"
    echo " type $0 ? for more details."
    exit 3
elif (( $5 <= $4 )); then
    echo " !! Input a critical limit bigger than (not equal to) the warning limit !!"
    exit 3
fi
# Fetch the statistics from LOG-retainer when the cache file is missing or empty.
if [ ! -s "$outfile" ]; then
    ssh root@LOG-retainer /root/mihai/test_get/monitor.sh > "$outfile"
fi
cd /tmp/
# Hour and minute of the cached sample (field 4 is the poll time, HH:MM:SS).
filmin=`grep "SP A" "$outfile" | gawk '{ printf "%s\n", $4 }' | gawk -F ":" '{ printf "%01d", $2 }'`
filh=`grep "SP A" "$outfile" | gawk '{ printf "%s\n", $4 }' | gawk -F ":" '{ printf "%01d", $1 }'`
# Current hour and minute. The 10# prefix forces base 10, so 08 and 09 are not read as octal.
datmin=$(( 10#$(date +%M) ))
dath=$(( 10#$(date +%H) ))
#echo "Comparing the time:: $filh : $filmin versus $dath : $datmin"
# Time calculations to check whether we need a data refresh from the server.
rem=$(( datmin - filmin ))
reh=$(( dath - filh ))
#echo The time difference is $rem minutes and $reh hours \(you selected to check the file on $3 minutes\)
if (( rem < 0 )); then
    rem=$(( rem * -1 ))
fi
if [ "$reh" != 0 ]; then
    # The hour changed: query the service processor for new information.
    ssh root@LOG-retainer /root/mihai/test_get/monitor.sh > "$outfile"
fi
#echo The difference in hours is of $reh
# -gt compares numbers; a bare > inside [ ] would be a redirection to a file named $3.
if [ "$rem" -gt "$3" ]; then
    # More than $3 minutes have passed: query the service processor for new information.
    ssh root@LOG-retainer /root/mihai/test_get/monitor.sh > "$outfile"
fi
#echo The difference in minutes is of $rem
#echo Your choice for $1 is $2
#result=""
# Map the Nagios host name to the label used in the statistics file.
case "$1" in
    CX4-SPA) sp="SP A" ;;
    CX4-SPB) sp="SP B" ;;
    *)
        echo " Error!! Unknown service processor. Type $0 ? for more details."
        exit 3
        ;;
esac
# Pick the requested column, plus the label, maximum and unit for the performance data.
case "$2" in
    Util)
        result=`grep "$sp" "$outfile" | gawk '{ printf "%s\n", $5 }'`
        label="Utilization"
        label1="100"
        label2="%"
        ;;
    TotBa)
        result=`grep "$sp" "$outfile" | gawk '{ printf "%s\n", $6 }'`
        label="TotalBandwidth"
        label1="4096"
        label2="MB"
        ;;
    TotTh)
        result=`grep "$sp" "$outfile" | gawk '{ printf "%s\n", $7 }'`
        label="TotalThroughput"
        label1="1000000"
        label2="IO"
        ;;
    *)
        echo " Error!! Unknown query option. Type $0 ? for more details."
        exit 3
        ;;
esac
#echo Warning $4
#echo Critical $5
# Compare the integer part of the result with the thresholds and exit with the
# status code Nagios expects: 0 = OK, 1 = WARNING, 2 = CRITICAL.
tmp=`echo $result | gawk '{ printf "%01d", $1 }'`
if (( $5 <= tmp )); then
    echo "CRITICAL, $result $label2|$label=$result$label2;$4;$5;0;$label1"
    exit 2
elif (( $4 <= tmp )); then
    echo "WARNING, $result $label2|$label=$result$label2;$4;$5;0;$label1"
    exit 1
else
    echo "OK, $result $label2|$label=$result$label2;$4;$5;0;$label1"
    exit 0
fi
#echo $result
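The hour and minute arithmetic in the script only looks at clock fields. A different technique, not the one used in the script above, is to ask find how old the cache file is. A minimal sketch, assuming a find that supports -mmin (GNU and BSD find do); is_stale is a hypothetical name:

```shell
#!/bin/bash
# Return 0 (stale) when the file is missing, empty, or was last modified
# more than $2 minutes ago; return 1 when it is still fresh.
is_stale() {
    local file="$1" max_age_min="$2"
    [ -s "$file" ] || return 0
    # find prints the path only if the modification time is old enough
    [ -n "$(find "$file" -mmin +"$max_age_min")" ]
}

# Possible use inside the check script (paths as in the script above):
# if is_stale /tmp/NaviStats.txt "$3"; then
#     ssh root@LOG-retainer /root/mihai/test_get/monitor.sh > /tmp/NaviStats.txt
# fi
```

Keep in mind that this measures when the file was written on the Nagios server, not the poll time reported by the array, so the two approaches can disagree if the clocks differ.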
Set the execute flag on the file.
Check that it works correctly by executing sh script_name, for example:
sh check_emc_sp.sh CX4-SPA TotBa 15 50 90
WARNING, 50.172542 MB|TotalBandwidth=50.172542MB;50;90;0;4096
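Nagios decides the state of a service from the exit code of the plug-in (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and it reads everything after the | as performance data in the form label=value[unit];warn;crit;min;max. The following sketch shows that contract in isolation. The names nagios_state and perfdata are hypothetical helpers, and awk is used so that decimal values can be compared directly:

```shell
#!/bin/bash
# Map a measured value to a Nagios exit code: 0 = OK, 1 = WARNING, 2 = CRITICAL.
nagios_state() {
    # $1 = value, $2 = warning threshold, $3 = critical threshold
    awk -v v="$1" -v w="$2" -v c="$3" 'BEGIN {
        if (v + 0 >= c + 0) exit 2
        if (v + 0 >= w + 0) exit 1
        exit 0
    }'
}

# Build the performance data string: label=value[unit];warn;crit;min;max
perfdata() {
    # $1 = label, $2 = value, $3 = unit, $4 = warn, $5 = crit, $6 = max
    printf '%s=%s%s;%s;%s;0;%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

rc=0
nagios_state 50.172542 50 90 || rc=$?
echo "exit code: $rc"                             # exit code: 1
perfdata TotalBandwidth 50.172542 MB 50 90 4096   # TotalBandwidth=50.172542MB;50;90;0;4096
```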
Now we are ready to edit the Nagios configuration.
Create a new file called emc.cfg and put this configuration in it:
###############################################################################
# emc.cfg – SAMPLE CONFIG FILE FOR MONITORING the EMC hosts
# Contains all the hosts, services, and
# host group definitions required to monitor the EMC devices.
#
# Last Modified: 17-04-2009.
#
# NOTES: This config file assumes that you are using the sample configuration
# files that get installed with the Nagios quickstart guide.
#
###############################################################################
###############################################################################
###############################################################################
#
# HOST TEMPLATES
#
###############################################################################
###############################################################################
define host{
use generic-host ; Name of host template to use
name emc-host
check_command check-host-alive
max_check_attempts 10
notification_interval 120
notification_period 24x7
notification_options d,u,r
hostgroups 009-emc-all ; Host groups this switch is associated with
contact_groups admins ; Notifications get sent to the admins by default
register 0
}
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS TEMPLATES
#
###############################################################################
###############################################################################
define service{
use generic-service
name emc-perf
service_description emc-perf
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_interval 120
notification_period 24x7
notification_options c,r
check_command check_tcp!80
register 0
}
###############################################################################
###############################################################################
#
# HOST DEFINITIONS
#
###############################################################################
###############################################################################
define host{
use emc-host
host_name CX4-SPA
alias CX4-480 Service processor A
address IP address
}
define host{
use emc-host
host_name CX4-SPB
alias CX4-480 Service processor B
address IP address
}
###############################################################################
###############################################################################
#
# HOST GROUP DEFINITIONS
#
###############################################################################
###############################################################################
define hostgroup{
hostgroup_name 009-emc-all
alias EMC servers (all)
}
define hostgroup{
hostgroup_name emc-sp-all
alias EMC Service processors (all)
members CX4-SPA,CX4-SPB
}
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################
define service{
use emc-perf ; Inherit values from a template
hostgroup_name emc-sp-all
service_description Service processor Utilization
check_command check_emc_sp!Util!5!80!100
}
define service{
use emc-perf ; Inherit values from a template
hostgroup_name emc-sp-all
service_description Total Bandwidth
check_command check_emc_sp!TotBa!5!1500!5000
}
define service{
use emc-perf ; Inherit values from a template
hostgroup_name emc-sp-all
service_description Total Throughput
check_command check_emc_sp!TotTh!5!2000!5000
}
Modify the commands.cfg and add:
define command{
command_name check_emc_sp
command_line $USER1$/check_emc_sp.sh $HOSTNAME$ $ARG1$ $ARG2$ $ARG3$ $ARG4$
}
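To make the argument mapping explicit: $HOSTNAME$ becomes the first script argument, and the values separated by ! in check_command become $ARG1$ to $ARG4$. The sketch below only imitates that substitution for illustration. expand_check is a hypothetical helper, not part of Nagios, and it leaves out $USER1$ (the plug-in directory):

```shell
#!/bin/bash
# Imitate how Nagios fills in the command_line for one host and one check_command.
expand_check() {
    local host="$1" spec="$2"
    local name="${spec%%!*}"   # command name: text before the first "!"
    local args="${spec#*!}"    # arguments: everything after the first "!"
    # Each "!" separates one $ARGn$ value; on the command line they are space separated.
    echo "$name.sh $host $(printf '%s' "$args" | tr '!' ' ')"
}

expand_check CX4-SPA 'check_emc_sp!Util!5!80!100'
# prints: check_emc_sp.sh CX4-SPA Util 5 80 100
```

So for the utilization service the script receives the host name, Util, a 5 minute refresh interval, a warning level of 80 and a critical level of 100.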
Now we should be able to check the configuration, restart the Nagios service and see the new hosts with their services up and running. From experience, the data is updated very slowly, so you should set a check interval of 15-40 minutes rather than 5 minutes. This is the fault of the Navicli client, not of the script.
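A cautious way to do the check and restart is to restart only when the pre-flight check passes. The sketch below assumes the default source-install paths and an init script at /etc/init.d/nagios. Both are assumptions, so adjust them to your installation. verify_then is a hypothetical helper name.

```shell
#!/bin/bash
# Assumed locations (default source install); override via the environment.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

# Run the given command only if "nagios -v" accepts the configuration.
verify_then() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" > /dev/null; then
        "$@"
    else
        echo "Configuration errors found, not restarting." >&2
        return 1
    fi
}

# Example (emc.cfg must be listed in nagios.cfg with a cfg_file= line,
# otherwise Nagios never reads it):
# verify_then /etc/init.d/nagios restart
```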
The script needs more testing, so this is not the final version.
I presume that the same commands work well (or can be easily modified) for the CX5 too. We don’t have the monitoring license for it, so I could not test that.
If you have comments or improvements, please let me know.