Verifying Alerts
Thanks,
Tim J.
Under AIX we are able to list the processes and determine if alerts are running for a given site.
I have written some utility scripts to help me with doing that as follows:
– list_sites.ksh (lists all my cloverleaf sites)
– turn_on_alerts.ksh (loads monitord default.alrt file)
– turn_off_alerts.ksh (loads monitord off.alrt file)
– ps_md.ksh (lists all the monitord processes, which can be viewed or grepped for "off" or a site name)
Here is my ps_md.ksh I wrote:
#!/usr/bin/ksh
# For each site, show the hcimonitord pid and the alert file it was started with
for site in `list_sites.ksh`; do
    pid_file=$HCIROOT/$site/exec/hcimonitord/pid
    if [ -f "$pid_file" ]; then
        monitord_pid=`cat "$pid_file"`
        # the third word of the command line is the alert file (hcimonitord -cl default.alrt ...)
        cmd_args=`ps -p $monitord_pid -o "%a" | tail -1 | awk '{print $3}'`
        echo "\n Site ( $site ) hcimonitord pid ( $monitord_pid ) command/args ( $cmd_args )"
    fi
done
echo ""
Here is a sample of what is displayed to the xterm display:
NOTE – I turned off the alerts to the first site listed so you can see how the off.alrt argument shows up
Site ( p_bldbnk ) hcimonitord pid ( 590000 ) command/args ( off.alrt )
Site ( p_cs_allergy ) hcimonitord pid ( 872638 ) command/args ( default.alrt )
Site ( p_flat_adt_mock ) hcimonitord pid ( 1114344 ) command/args ( default.alrt )
Site ( p_g2_adt_out ) hcimonitord pid ( 770074 ) command/args ( default.alrt )
Site ( p_g2_adt_out_mock ) hcimonitord pid ( 1503290 ) command/args ( off.alrt )
Site ( p_g_adt_out ) hcimonitord pid ( 671828 ) command/args ( default.alrt )
Site ( p_g_adt_out_mock ) hcimonitord pid ( 1511572 ) command/args ( default.alrt )
Site ( p_golive_bedtrk ) hcimonitord pid ( 831678 ) command/args ( default.alrt )
Site ( p_golive_maxsys ) hcimonitord pid ( 1020144 ) command/args ( default.alrt )
Site ( p_lis_mock ) hcimonitord pid ( 483436 ) command/args ( default.alrt )
Site ( p_sched_out ) hcimonitord pid ( 1343616 ) command/args ( default.alrt )
Site ( p_sched_out2 ) hcimonitord pid ( 782350 ) command/args ( default.alrt )
Site ( p_sched_out_mock ) hcimonitord pid ( 1663160 ) command/args ( default.alrt )
Site ( p_sms_23_adt_mock ) hcimonitord pid ( 598138 ) command/args ( default.alrt )
Site ( prod_cbord ) hcimonitord pid ( 1646624 ) command/args ( default.alrt )
Site ( prod_cbord_mock ) hcimonitord pid ( 1638618 ) command/args ( default.alrt )
Site ( prod_emr ) hcimonitord pid ( 762106 ) command/args ( default.alrt )
Site ( prod_flat_adt ) hcimonitord pid ( 974980 ) command/args ( default.alrt )
Site ( prod_genie ) hcimonitord pid ( 905256 ) command/args ( default.alrt )
Site ( prod_global_adt ) hcimonitord pid ( 876720 ) command/args ( default.alrt )
Site ( prod_lis ) hcimonitord pid ( 401442 ) command/args ( default.alrt )
Site ( prod_mymda ) hcimonitord pid ( 491752 ) command/args ( default.alrt )
Site ( prod_pharm ) hcimonitord pid ( 1384496 ) command/args ( default.alrt )
Site ( prod_sms_21_adt ) hcimonitord pid ( 1232976 ) command/args ( default.alrt )
Site ( prod_sms_22_adt ) hcimonitord pid ( 475176 ) command/args ( default.alrt )
Site ( prod_sms_23_adt ) hcimonitord pid ( 901146 ) command/args ( default.alrt )
Site ( prod_sms_order ) hcimonitord pid ( 1556656 ) command/args ( default.alrt )
Site ( prod_sms_sched ) hcimonitord pid ( 1454188 ) command/args ( default.alrt )
Site ( prod_super_adt ) hcimonitord pid ( 413880 ) command/args ( default.alrt )
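For a long site list it can be handy to print only the sites whose alerts are off. The following is a minimal sketch, not part of the original scripts: it assumes input in the ps_md.ksh output format shown above, and the function name sites_with_alerts_off is made up for illustration.

```shell
# Print the names of sites whose hcimonitord was started with off.alrt.
# Reads ps_md.ksh-style lines on stdin:
#   Site ( NAME ) hcimonitord pid ( PID ) command/args ( FILE )
# (hypothetical helper; the name is not from the original scripts)
sites_with_alerts_off() {
    # $3 is the site name; the next-to-last field is the alert file
    awk '$1 == "Site" && $(NF-1) == "off.alrt" { print $3 }'
}
```

It would be used as: ps_md.ksh | sites_with_alerts_off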
Here is my turn_on_alerts.ksh script I wrote:
#!/usr/bin/ksh
kill_hcinetmonitor.ksh
hcisitectl -k m -s m -A "a=-cl 'default.alrt'"
echo ""
ps -ef | head -1
ps -ef | grep hcimonitord | grep -v grep
Here is my turn_off_alerts.ksh script I wrote:
#!/usr/bin/ksh
kill_hcinetmonitor.ksh
if [[ ! -f $HCISITEDIR/Alerts/off.alrt ]]; then
touch $HCISITEDIR/Alerts/off.alrt
fi
hcisitectl -k m -s m -A "a=-cl 'off.alrt'"
echo ""
ps -ef | head -1
ps -ef | grep hcimonitord | grep -v grep
Here is my list_sites.ksh script I wrote:
#!/usr/bin/ksh
# Begin Module Header ==============================================================================
#
#------
# Name:
#------
#
# list_sites.ksh
#
#---------
# Purpose:
#---------
#
# List all the sites for the current $HCIROOT
#
#--------
# Inputs:
#--------
#
# none
#
#-------
# Notes:
#-------
#
# This script assumes the proper environment is set when it is called
# and all sites are symbolic links at the $HCIROOT level
#
#---------
# History:
#---------
#
# 2001.02.22 Russ Ross
# - wrote initial version.
#
# 2001.07.25 Russ Ross
# - modified to look for $HCIROOT/*/siteInfo files to determine the list of sites,
# previously had looked for symbolic links in the $HCIROOT directory which only
# works if everyone follows MDACC conventions
#
# 2003.11.17 Russ Ross
# - modified to exclude siteProto from the list of sites
#
# End of Module Header =============================================================================
#-----------------------------------------------------------------------------------------
# get a list of all the sites for the current $HCIROOT
# (Note: there is an assumption that all sites are a symbolic link at the $HCIROOT level)
#-----------------------------------------------------------------------------------------
(cd $HCIROOT; ls ./*/siteInfo 2>/dev/null) | awk -F/ '{print $2}' | grep -v siteProto | sort
Another useful but simple script I wrote is called psgrep.ksh, as seen here:
#!/usr/bin/ksh
# show the ps header line followed by every process matching the pattern in $1
echo ""
ps -ef | head -1
ps -ef | grep "$1" | grep -v grep
echo ""
This also lets me run
psgrep.ksh hcimonitord
to get a quick overview of which sites have alerts turned off or on, as seen in the sample output below:
hci 401442 1 0 07:06:10 - 0:02 hcimonitord -cl default.alrt -S prod_lis
hci 413880 1 0 07:18:11 - 0:00 hcimonitord -cl default.alrt -S prod_super_adt
hci 475176 1 0 07:13:11 - 0:00 hcimonitord -cl default.alrt -S prod_sms_22_adt
hci 483436 1 0 Aug 31 - 25:21 hcimonitord -cl default.alrt -S p_lis_mock
hci 491752 1 0 07:09:11 - 0:01 hcimonitord -cl default.alrt -S prod_mymda
hci 590020 1 0 07:26:10 - 0:00 hcimonitord -cl default.alrt -S p_bldbnk
hci 598138 1 1 Aug 31 - 25:30 hcimonitord -cl default.alrt -S p_sms_23_adt_mock
hci 671828 1 0 07:10:10 - 0:00 hcimonitord -cl default.alrt -S p_g_adt_out
hci 762106 1 0 07:02:11 - 0:01 hcimonitord -cl default.alrt -S prod_emr
hci 770074 1 0 07:19:10 - 0:00 hcimonitord -cl default.alrt -S p_g2_adt_out
hci 782350 1 0 Aug 31 - 20:04 hcimonitord -cl default.alrt -S p_sched_out2
hci 831678 1 1 Aug 31 - 20:41 hcimonitord -cl default.alrt -S p_golive_bedtrk
hci 872638 1 0 07:05:10 - 0:01 hcimonitord -cl default.alrt -S p_cs_allergy
hci 876720 1 0 07:04:11 - 0:01 hcimonitord -cl default.alrt -S prod_global_adt
hci 901146 1 0 07:14:10 - 0:00 hcimonitord -cl default.alrt -S prod_sms_23_adt
hci 905256 1 0 07:07:11 - 0:01 hcimonitord -cl default.alrt -S prod_genie
hci 974980 1 0 07:03:10 - 0:01 hcimonitord -cl default.alrt -S prod_flat_adt
hci 1020144 1 0 Aug 31 - 23:16 hcimonitord -cl default.alrt -S p_golive_maxsys
hci 1114344 1 0 Aug 31 - 20:11 hcimonitord -cl default.alrt -S p_flat_adt_mock
hci 1232976 1 0 07:12:10 - 0:00 hcimonitord -cl default.alrt -S prod_sms_21_adt
hci 1343616 1 0 07:17:10 - 0:00 hcimonitord -cl default.alrt -S p_sched_out
hci 1384496 1 0 07:08:10 - 0:01 hcimonitord -cl default.alrt -S prod_pharm
hci 1454188 1 1 07:16:10 - 0:00 hcimonitord -cl default.alrt -S prod_sms_sched
hci 1503290 1 0 Sep 08 - 11:08 hcimonitord -cl off.alrt -S p_g2_adt_out_mock
hci 1511572 1 0 Aug 31 - 23:38 hcimonitord -cl default.alrt -S p_g_adt_out_mock
hci 1556656 1 0 07:15:11 - 0:00 hcimonitord -cl default.alrt -S prod_sms_order
hci 1638618 1 0 Aug 31 - 20:38 hcimonitord -cl default.alrt -S prod_cbord_mock
hci 1646624 1 0 07:00:11 - 0:01 hcimonitord -cl default.alrt -S prod_cbord
hci 1663160 1 0 Aug 31 - 23:13 hcimonitord -cl default.alrt -S p_sched_out_mock
I see the p_g2_adt_out_mock site has the alerts turned off at this time.
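The same listing can be boiled down to one "site alert-file" pair per line. This is only a sketch under a few assumptions: the function name monitord_alert_files is made up, the input is ps -ef output already filtered the way psgrep.ksh does it (so the grep process itself is not in it), and the command appears as the bare word hcimonitord as in the sample above. It looks for the -S and -cl flags wherever they occur instead of counting fields, so a two-word start time such as "Aug 31" does not throw it off, and a monitord started without -cl is shown with "-" in place of the alert file.

```shell
# Read "ps -ef" lines on stdin and print "<site> <alert file>" for each
# hcimonitord process. The -S and -cl values are found by scanning the
# arguments, not by fixed field position; "-" is printed for a value that
# does not appear on the command line.
# (hypothetical helper; the name is not from the original scripts)
monitord_alert_files() {
    awk '
    {
        site = "-"; alrt = "-"; is_monitord = 0
        for (i = 1; i <= NF; i++) {
            if ($i == "hcimonitord") is_monitord = 1
            if (is_monitord && $i == "-S"  && i < NF) site = $(i + 1)
            if (is_monitord && $i == "-cl" && i < NF) alrt = $(i + 1)
        }
        if (is_monitord) print site, alrt
    }'
}
```

It would be used as: ps -ef | grep hcimonitord | grep -v grep | monitord_alert_files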
Russ Ross
RussRoss318@gmail.com
That is extremely helpful, thank you!!!
Greetings Russ and company:
I tried your illustrated command "ps -p monitord_pid -o %a" to get the args, with no luck. We are running AIX5.3 TL8 SP7 and CIS5.6.2.
For example, executing >ps -p 307570 -o %a
yields: COMMAND
hcimonitord -S allina_prod3
Maybe an AIX version “feature”? (we recently went from SP3 to SP7 to migrate to IBM supported version of our disk hardware).
Just for thought here.
BobR
What you described I hope isn’t a SP7 issue.
What you described will also happen when you start the monitord by launching the netmonitor from the IDE instead of doing something like
hcisitectl -k m -s m -A "a=-cl 'default.alrt'"
By default, starting the monitord by launching the netmonitor from the IDE loads the default.alrt file even though it doesn't show up in the args. I'm not crazy about that either, but I've learned to live with it.
When I don’t see any args I’ve been able to assume that the default.alrt file is loaded.
I like to see the args so when this comes to my attention I simply run the turn_on_alerts.ksh script I posted.
I'm still able to run
ps_md.ksh | grep off
to see if any sites have their alerts turned off despite this inconsistency.
Another thing to note: the interfaces in a site will remain running if the monitord is killed without stopping them, but I don't believe the alerts will continue to run.
You could check for this condition if you wanted by looking for any process pid files and not finding
$HCISITEDIR/exec/hcimonitord/pid
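A rough sketch of that check follows. It is not a tested Cloverleaf utility: the function name sites_missing_monitord is made up, and the location of the process pid files (exec/processes/<process>/pid under each site) is an assumption that should be confirmed against your own installation. It only tests whether the pid files exist, not whether the pids in them are still alive.

```shell
# Print each site under the given root that has at least one process pid
# file but no hcimonitord pid file.
# ASSUMPTION: process pid files are at <site>/exec/processes/<process>/pid;
# adjust the glob below if your installation differs.
# (hypothetical helper; the name is not from the original scripts)
sites_missing_monitord() {
    root=$1
    for site_dir in "$root"/*/; do
        # a site is any directory that contains a siteInfo file
        [ -f "${site_dir}siteInfo" ] || continue
        # monitord pid file present: nothing to report for this site
        [ -f "${site_dir}exec/hcimonitord/pid" ] && continue
        for pid_file in "${site_dir}"exec/processes/*/pid; do
            if [ -f "$pid_file" ]; then
                basename "$site_dir"
                break
            fi
        done
    done
}
```

It would be used as: sites_missing_monitord "$HCIROOT"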
Russ Ross
RussRoss318@gmail.com
Russ:
You must be correct in your assumption: our scripts stop/start the monitor and lock managers using the vanilla command syntax
>hcisitectl -K; hcisitectl -S
I am aware of the fuller syntax but our shop has not gone that far at this point.
Just a caveat that unless other folks use that full syntax to start up their monitor daemons, it appears the "%a" option will not yield what alrt file was loaded.
See you at the conference?