Monitor Daemon will not start

Clovertech Forums Read Only Archives Cloverleaf Cloverleaf Monitor Daemon will not start

  • Creator
    Topic
  • #49875
    Robert Hamden
    Participant

      Dev system, 5.5MB and 5.41 on the same workstation.

      As a last resort I removed and reinstalled both environments, without success.

      1. Type showroot to make sure root and site are set correctly

      2. Stop all processes – hcienginestop -p (for all processes)

      3. Run hciprocstatus (to verify all engines are down)

      4. Stop Daemons – hcisitectl -K

      5. Run hcisitectl (to make sure lock manager and monitor daemon are down)

      6. Clear shared memory region – hcimsiutil -R

      7. Change to the database directory – /hci/root3.7.XP//exec/databases

      8. Remove vista.taf file – rm vista.taf

      9. Change directory to exec directory – cd ..

      10. Remove the monitorShmemFile – rm monitorShmemFile

      11. Reinitialize the ICL Database – hcidbinit

    Viewing 4 reply threads
    • Author
      Replies
      • #63959
        Michael Hertel
        Participant

          If you are having a problem with NetMonitor, look in its log.

          $HCISITEDIR/exec/hcimonitord/hcimonitord.log

          You seem to be jumping through a lot of hoops for nothing.

          Most of the steps you are going through are for when processes don’t start, not for NetMonitor.
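A quick way to tell a missing log from an empty one is sketched below. The function name `check_monitor_log` is made up for illustration; the only thing taken from the post above is the log path under the site directory.

```shell
# Hypothetical helper: report the state of the monitor daemon log for a site.
# $1 is the site directory (the value of $HCISITEDIR).
check_monitor_log() {
    log="$1/exec/hcimonitord/hcimonitord.log"
    if [ ! -e "$log" ]; then
        # The daemon never got far enough to create its log.
        echo "missing: $log"
        return 1
    elif [ ! -s "$log" ]; then
        # The file exists but nothing was written to it.
        echo "empty: $log"
        return 2
    fi
    echo "last lines of $log:"
    tail -n 20 "$log"
}
```

Typical use would be `check_monitor_log "$HCISITEDIR"` right after a failed start attempt.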

        • #63960
          Robert Hamden
          Participant

            That is part of the problem: no log files are written.

          • #63961
            Rob Abbott
            Keymaster

              Robert, I suggest you contact support to get this issue resolved.

              Rob Abbott
              Cloverleaf Emeritus

            • #63962
              Russ Ross
              Participant

                Occasionally we have circumstances that keep the monitor daemon from starting.

                The good news is that the interfaces continue to run even if the monitor daemon isn’t.

                However, I’m not sure if the alerts run when the monitor daemon is messed up.

                The first thing I do to get things back on track is see if there is a monitor daemon pid file and what is in it as follows:

                cat $HCISITEDIR/exec/hcimonitord/pid

                then I see if that process is running as follows:

                ps -ef | grep `cat $HCISITEDIR/exec/hcimonitord/pid` |grep -v grep

                If the process is running I stop or kill it if necessary as follows:

                kill -9 `cat $HCISITEDIR/exec/hcimonitord/pid`

                Then I make sure the pid and cmd_port files are gone, or remove them as follows:

                rm -f $HCISITEDIR/exec/hcimonitord/pid

                rm -f $HCISITEDIR/exec/hcimonitord/cmd_port

                Then I can usually start the monitor daemon at that point.
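The check-then-clean sequence above could be wrapped up roughly as follows. This is only a sketch: the function name `clear_stale_monitor_files` is invented here, and unlike the manual steps it deliberately refuses to kill anything. It only removes the two files when the recorded pid is no longer alive.

```shell
# Hypothetical helper: clear leftover monitor daemon files for one site.
# $1 is the site directory (the value of $HCISITEDIR).
clear_stale_monitor_files() {
    dir="$1/exec/hcimonitord"
    if [ -f "$dir/pid" ]; then
        pid=$(cat "$dir/pid")
        # kill -0 sends no signal; it only tests whether the pid can be signalled.
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "monitor daemon still running as pid $pid; stop it first"
            return 1
        fi
    fi
    # Nothing is running under the recorded pid, so the files are stale.
    rm -f "$dir/pid" "$dir/cmd_port"
    echo "removed stale pid and cmd_port files in $dir"
}
```

Run it as the user that owns the daemon, for example `clear_stale_monitor_files "$HCISITEDIR"`. `kill -0` also fails for a live process owned by someone else, which this sketch would mistake for a dead one.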

                If for some reason two monitor daemons are running for the same site, you will continue to get confusing behavior until you kill all monitor daemons running for the current site and restart just one instance.

                We don’t know how it was possible to start two instances of the monitor daemon, but we witnessed it once and it caused the alerts to continue to go off in a very confusing way.

                Russ Ross
                RussRoss318@gmail.com

              • #63963
                Troy Morton
                Participant

                  Russ,

                  You can start two hcimonitord processes for the same site by deleting the pid file from $HCISITEDIR/exec/hcimonitord and then running hcisitectl -s m.

                  I have written a script that will go out and find extra hcimonitord processes running on the server.  

                  It requires a file in $HCIROOT called prodsites that contains a list of the sites to check for extra hcimonitord processes.  I used to have the script automatically kill the extra processes, but decided to go with a safer approach of just echoing the results and then manually taking care of cleanup.

                  Enjoy!

                  Code:


                  function killbadmonitordaemons {

                  echo "Searching for rogue Monitor Daemon processes..."

                  # prodsites holds the site names to check
                  for site in `cat /qdxtest/qdx5.4/integrator/prodsites`
                  do

                   printf '\nChecking site: %s\n' "$site"
                   setroot
                   setsite $site

                   # pid of the monitor daemon as reported by hcisitectl
                   goodpid_sitectl=`hcisitectl | grep hcimonitord | awk '{print $6}'`

                   # pid recorded in $HCISITEDIR/exec/hcimonitord/pid
                   goodpid_pidfile=`cat $HCISITEDIR/exec/hcimonitord/pid`

                   # Only continue when both pids are present and match; otherwise skip this site.
                   if [[ -z "$goodpid_sitectl" || "$goodpid_sitectl" != "$goodpid_pidfile" ]]
                   then

                      printf '** pid file and hcisitectl do not match for %s.\nSkipping %s.\n' "$site" "$site"

                   else

                     # Good Monitor Daemon for this site is running on pid $goodpid_sitectl.
                     # Any other hcimonitord process for the site is reported as a rogue.
                     for pid in `ps -ef | grep "$site " | grep hcimonitord | grep -v $goodpid_sitectl | awk '{print $2}'`
                     do
                        echo "Found rogue Monitor Daemon running on pid $pid"
                        # Left commented out on purpose: report only, clean up by hand.
                        #kill -9 $pid
                        sleep 3

                     done

                   fi

                  done

                  }
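One thing to watch in the filtering step: `grep -v` on the good pid also hides any rogue whose pid merely contains those digits (good pid 1234, rogue pid 12345). A possible alternative, sketched here with an invented function name `find_rogue_pids`, compares the pid column exactly. It keeps the script's own assumption that the site name appears in the `ps -ef` line followed by a space.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the pids of
# hcimonitord processes for a site, excluding the known good pid.
# $1 is the site name, $2 is the good pid.
find_rogue_pids() {
    awk -v site="$1" -v good="$2" '
        # Same match as the grep chain, but the pid (column 2) is compared
        # as a whole field rather than as a substring of the line.
        index($0, "hcimonitord") && index($0, site " ") && $2 != good { print $2 }
    '
}
```

Inside the loop it would be called as `ps -ef | find_rogue_pids "$site" "$goodpid_sitectl"`.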

              • The forum ‘Cloverleaf’ is closed to new topics and replies.