3.8.1 Net Monitor Issue


  • #48489
    Jared Miller
    Participant

      We’re on 3.8.1p rev 4 AIX 5.1

      We have an issue sometimes where the monitor will show all the threads dead on one PC, but on a different PC it will show all the threads up and running.  For instance, Sunday morning our Support Center contacted our Star support folks about a registration issue.  The call eventually made it to our Star/Cloverleaf server admin, and when he saw no issues with Star he checked the Cloverleaf monitor from his laptop.  All the threads showed up dead, but when he called the support center their monitor indicated all threads were up and running.  The support center closed and reopened their monitor, and when that happened the server admin’s monitor changed from dead to an Up status.  It was reported that for 45 minutes nothing was crossing through Cloverleaf until the support center restarted their monitor.

      Has anyone else experienced this?  Can anyone explain why this happened?  It’s my understanding that the monitor won’t have an effect on the message flow unless someone shuts a thread down.  Am I misunderstanding something?

      • #58762
        Brian Goad
        Participant

          Jared,

          I can’t say that we have had it happen in a production environment; however, in our test environment I have had something similar happen. I attribute it to the number of processes running on this box. Now I am sure this will open a can of worms on this thread, as many have opinions on the number of threads and processes a site can have. Unless you are running Windows, a limitation does not exist according to support. However, your hardware can only handle so many processes and threads before some weird things start to happen.

          We are cutting back on the number of threads and processes in our test site until we can get that box upgraded.

          Just my 2 cents,

          Brian

        • #58763
          Jim Kosloskey
          Participant

            Just another .02.

            We utilize multiple sites (even in test) which allows us to manage the environment.

            Jim Kosloskey

            email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

          • #58764
            Brian Goad
            Participant

              Unfortunately for us, our 1965 model HP-UX Test Box has multiple sites and still has problems. We are just thankful to have a test server.

            • #58765
              Tom Patton
              Participant

                Jared,

                We are AIX 5.1 and QDX 5.3 rev3 on a newer box with lots of power and still have the same issue.

                Look at the hcimonitord.log.old and hcimonitord.err.old files in the $HCISITEDIR/exec/hcimonitord directory.  You may see some helpful info in those logs.  For us, we are getting signal 4 panics and are still trying to resolve the problem.  
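For anyone wanting to check those two files quickly, here is a minimal Python sketch. It assumes that panic-related lines contain the word "panic" or "signal"; the actual log format is not shown in this thread, so the pattern is a guess and may need adjusting. The function name `scan_monitor_logs` is made up for illustration.

```python
import os
import re
from pathlib import Path

# Assumption: lines of interest mention "panic" or "signal".
# The real hcimonitord log format is not shown in this thread.
PATTERN = re.compile(r"panic|signal", re.IGNORECASE)


def scan_monitor_logs(site_dir, names=("hcimonitord.log.old", "hcimonitord.err.old")):
    """Return (file name, line number, text) for each matching log line."""
    log_dir = Path(site_dir) / "exec" / "hcimonitord"
    hits = []
    for name in names:
        path = log_dir / name
        if not path.is_file():
            continue  # skip files that are not present
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if PATTERN.search(line):
                    hits.append((name, lineno, line.rstrip()))
    return hits


if __name__ == "__main__":
    site = os.environ.get("HCISITEDIR")
    if site:
        for name, lineno, text in scan_monitor_logs(site):
            print(f"{name}:{lineno}: {text}")
    else:
        print("HCISITEDIR is not set")
```

Run it from a shell where the site environment is already set so that HCISITEDIR points at the site in question.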

                We are under the guidelines for processes and threads on a site so that is not an issue.

                Support has recommended we make some changes to the /etc/security/limits file as described in the install guide.  They also recommended I place the monitor process in debug.  I haven’t done that yet, but I’ve made the changes to the limits file and I’m waiting to see if another panic occurs.
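One way to confirm that limits file changes have actually taken effect is to print the limits a process sees when started as the user that runs the monitor daemon. The sketch below uses Python's standard `resource` module. The attribute names on the left are the ones used in the AIX limits file; the function name `current_limits` is made up for illustration. The thread does not say which values Support recommended, so this only reports what is currently in effect.

```python
import resource

# AIX limits-file attribute name -> name of the matching resource constant.
LIMITS = {
    "fsize": "RLIMIT_FSIZE",
    "core": "RLIMIT_CORE",
    "cpu": "RLIMIT_CPU",
    "data": "RLIMIT_DATA",
    "rss": "RLIMIT_RSS",
    "stack": "RLIMIT_STACK",
    "nofiles": "RLIMIT_NOFILE",
}


def show(value):
    """Render a raw limit value, spelling out 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {attribute: (soft, hard)} for the limits this platform exposes."""
    out = {}
    for attr, const in LIMITS.items():
        rlimit = getattr(resource, const, None)
        if rlimit is None:
            continue  # constant not available on this platform
        soft, hard = resource.getrlimit(rlimit)
        out[attr] = (show(soft), show(hard))
    return out


if __name__ == "__main__":
    for attr, (soft, hard) in current_limits().items():
        print(f"{attr:8s} soft={soft} hard={hard}")
```

Note that `getrlimit` reports the size limits in bytes, which may not match the units used in the limits file, and that changed limits normally apply only to sessions started after the change.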

                Other threads on this topic have focused on the alerts processing and the path defined in the action, but I don’t think that is an issue with us.

                If I find what solves this I will post it.

                Regards,

              • #58766
                James Cobane
                Participant

                  Tom,

                  With respect to the panics you are experiencing, you may want to verify that you have the following parameter set in the /etc/environment file:

                    AIXTHREAD_MNRATIO=1:1

                  Support can give you the specifics, but I know that for AIX 5.1 or greater, this parameter should be set.  We had issues with sporadic panics, and this resolved them.
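A quick way to verify the setting is to read /etc/environment and look for the assignment. This is a minimal Python sketch that assumes simple NAME=value lines; the function name `read_env_setting` is made up for illustration.

```python
from pathlib import Path


def read_env_setting(name, env_file="/etc/environment"):
    """Return the value assigned to `name` in env_file, or None if absent."""
    try:
        text = Path(env_file).read_text(errors="replace")
    except OSError:
        return None  # file missing or unreadable
    value = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == name:
            value = val.strip()  # if set more than once, keep the last one
    return value


if __name__ == "__main__":
    current = read_env_setting("AIXTHREAD_MNRATIO")
    if current == "1:1":
        print("AIXTHREAD_MNRATIO=1:1 is set")
    else:
        print(f"AIXTHREAD_MNRATIO is {current!r}; expected '1:1'")
```

A value in the file should only reach processes started after it was added, so it is also worth confirming that the running Cloverleaf environment actually has the variable.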

                  Hope this helps.

                  Jim Cobane

                  Henry Ford Health

                • #58767
                  Tom Patton
                  Participant

                    James, as always, thank you for your input.  I think we might have talked about this before and that parm was missing on our system.  But I’ve since added it and am still experiencing the panics.

                    Support has been great and has suggested I make the limit file changes and place the monitor daemon in debug.  So far, I’ve made the limit file changes and I have one test site monitor running in debug.

                    I’m scratching my head, but plodding along……
