Running out of semaphores

  • Creator
    Topic
  • #48961
    John Mercogliano
    Participant

      Has anyone experienced problems with running out of semaphores?

      This is the situation.  On our test systems, we are in a very active development phase and have added about 12 new processes and 50 connections to the test system.  This is the error we get:

      [msi :msi :ERR /0:cerner_result_ord_cmd] msiUpdateRegion: Can’t init thread to_ppid semaphore: No space left on device

      We don’t have this problem on our production system, but there are only two differences: we have not yet moved all the new sites to production, and we don’t start and stop the sites/connections multiple times per day.

      We clear this up by killing all engine processes and then deleting any remaining semaphores, which we find with the HP-UX ipcs command.
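The cleanup step described above could be sketched roughly as follows. This is only an illustration, not the poster's actual procedure: the function name leftover_sem_cmds is hypothetical, and it assumes the HP-UX/AIX style `ipcs -s` layout, in which data rows start with `s`, the semaphore ID is the second field and the owner is the fifth. It only prints `ipcrm -s` commands so they can be reviewed before anything is removed.

```shell
# Hypothetical helper: read `ipcs -s` output on stdin and print (not run)
# one `ipcrm -s <id>` command per semaphore set owned by the given user.
# Assumes the HP-UX/AIX column layout: T ID KEY MODE OWNER GROUP.
leftover_sem_cmds() {
    owner="${1:-hci}"   # default owner is the hci user mentioned in the thread
    awk -v owner="$owner" '$1 == "s" && $5 == owner { print "ipcrm -s " $2 }'
}

# Intended use, only after all engine processes have been stopped:
#   ipcs -s | leftover_sem_cmds hci
```

Printing the commands rather than executing them is a deliberate choice here, since removing a semaphore that a running process still uses would break that process.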

      I’m wondering if maybe the hcienginestop/run or hcicmd command is not properly cleaning up the semaphores.

      If anyone has any insights, they would be welcome.  This is now happening to us about every three weeks.

      Thanks,

      John Mercogliano
      Sentara Healthcare
      Hampton Roads, VA

    Viewing 7 reply threads
    • Author
      Replies
      • #60247
        LeeAnne Kardas
        Participant

          This happened in my test environment.  I brought down all processes and sites, rebooted our HP/UX Unix server, then brought all sites and processes back up.  Try rebooting your HP/UX Unix server.

        • #60248
          Russ Ross
          Participant

            Since I have a huge number of sites, threads and message flow, I was concerned about understanding semaphores a bit better.

            Fortunately, I’ve never had a single semaphore issue during my 8+ years here.

            I searched old postings related to your semaphore problem as a start.

            Here is one that might be of interest:

            https://usspvlclovertch2.infor.com/viewtopic.php?t=115&highlight=semaphore

            One of the old posts states:

            Quote:

            The problem surfaces when someone starts the process with a user name other than hci.  This is because the semaphore and the shared memory segment cannot be accessed.  When you see this, you can do an ipcs -a and you will see the semaphore and shared memory segment information.

            Since we do everything as hci this may have kept us out of semaphore trouble.

            Also, a couple of old posts I located indicated this problem is more likely to occur on a Windows platform.

            I’m on AIX and have not located where the limit of semaphores is set but did see a reference to increasing the number of files in the /etc/security/limits file – mine is set to 2048 at this time.

            Here is a quote window showing my /etc/security/limits settings for the hci users (2 of them are for QDX support):

            Quote:

            hci:
                   fsize = 2097151
                   core = 2048
                   cpu = -1
                   data = 786432
                   rss = 196608
                   stack = 196608
                   nofiles = 2048

            hcitest:
                   fsize = 2097151
                   core = 2048
                   cpu = -1
                   data = 786432
                   rss = 196608
                   stack = 196608
                   nofiles = 2048

            hcispt1:
                   fsize = 2097151
                   core = 2048
                   cpu = -1
                   data = 786432
                   rss = 196608
                   stack = 196608
                   nofiles = 2048

            A couple of semaphore-related commands whose output I’ve yet to figure out are:

            ipcs -a

            sar -m 5 3

            I also found posts suggesting to keep process and thread names short:

            Quote:

            We’ve had this a lot, and I think it is because of long thread names. Threads shouldn’t be more than 15 characters, and processes 9. But after you change them, you need to delete the /stats directory, which is under /exec – probably why that fix works, but it just works temporarily.

            Do you have long thread or process names?

            We have exceeded these limits a lot but do not have semaphore problems.

            Other posts also talked about the use of

            hcimsiutil -R

            hcimsiutil -rs

            Put it all together and I’m not really comfortable with my current understanding of semaphores.

            My gut tells me that, with such a huge number of threads, sites and message flow, I should have a high probability of semaphore errors, so I suspect that either running on AIX or using only the hci user ID has contributed to our stability.

            Since security is pushing us to separate logons, I will pay attention to see if any semaphore problems arise at that juncture.

            Russ Ross
            RussRoss318@gmail.com

          • #60249
            Bill Bertera
            Participant

              Does HP/UX use the /etc/system settings? On Solaris, you have to define the amount of available semaphores in that file. Look for stuff like this:

              * Cloverleaf Interface Engine Requirements

              set semsys:seminfo_semmap=2050

              set semsys:seminfo_semaem=16384

              set semsys:seminfo_semmnu=2048

              set pt_cnt = 1024

              set semsys:seminfo_semmsl=8192

            • #60250
              John Mercogliano
              Participant

                On HP-UX we use kmtune to query our kernel settings, and they are all at or above the values QDX recommends for high use. Also, we can clear this up without rebooting, just by doing the cleanup steps I mentioned.  I recorded the semaphore count right after a reboot, before any hci semaphores are created, and after the cleanup steps the counts drop back to roughly that boot-time level. Based on the values reported by glanceplus, we have not hit our semaphore max when we run into this problem; usage is usually between 70 and 75 percent.  I think we did not encounter this too often in the past because we reboot the system every 6 weeks, but the increased number of sites and connections might be causing us to hit it sooner.

                Russ, I understand your confusion.  I’ve also been trying to understand them and I feel I’m only halfway there.

                I have read all those posts and, except for the long names, none of them apply.  We always log on as hci.

                As for the ipcs command:  

                Running ipcs -s | wc -l will give you the current usage for the Semaphore Table (semmni).  You have to subtract three to account for the header lines.

                ipcs -m | wc -l minus 3 gives you Shared Mem Table (shmmni) usage.

                Adding a grep for hci will give you a count just for the hci user.
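The counting described above can also be done without the manual "subtract three" step by counting only the data rows. The sketch below is an assumption-laden illustration: count_sems is a hypothetical name, and it relies on the HP-UX/AIX style `ipcs -s` layout, where each semaphore row starts with `s` and the owner is the fifth field. On other systems (Linux, for example) the columns differ and the awk test would need adjusting.

```shell
# Hypothetical helper: count semaphore sets in `ipcs -s` output read from stdin.
# With no argument it counts every set; with a user name it counts only
# the sets owned by that user.
# Assumes the HP-UX/AIX column layout: T ID KEY MODE OWNER GROUP.
count_sems() {
    owner="${1:-}"
    awk -v owner="$owner" '
        $1 == "s" && (owner == "" || $5 == owner) { n++ }
        END { print n + 0 }   # n + 0 prints 0 when nothing matched
    '
}

# Intended use:
#   ipcs -s | count_sems        # all semaphore sets (compare against semmni)
#   ipcs -s | count_sems hci    # only those owned by hci
```

Matching on the owner field instead of a plain grep for hci avoids also counting users such as hcitest, whose names merely contain that string.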

                I use glance to look at the current usage percentage of my system table values.

                I’m trying to duplicate the problem but I’m not getting anywhere.  I have been starting and stopping the process and monitord in different orders multiple times to see if that does anything, but my counts stay the same.

                One interesting thing I have found so far, if monitord is down, starting and stopping the process will create and destroy the sem files in the exec directory but if monitord is running the sem files hang around.  

                I have also noticed that sometimes not all of the sem files are deleted when all processes and monitord are shut down.  Does anyone know why this might happen and how to reproduce it? I’m wondering if this might be causing our problem.

                Thanks and keep the thoughts coming…

                Merry Christmas

                John Mercogliano
                Sentara Healthcare
                Hampton Roads, VA

              • #60251
                Russ Ross
                Participant

                  John:

                  Here is what I see using

                  ipcs -s

                  IPC status from /dev/mem as of Wed Dec 27 13:30:42 CST 2006
                  T

                  Russ Ross
                  RussRoss318@gmail.com

                • #60252
                  james tey
                  Participant

                    Bill Bertera wrote:

                    Does HP/UX use the /etc/system settings? On Solaris, you have to define the amount of available semaphores in that file. Look for stuff like this:

                    * Cloverleaf Interface Engine Requirements

                    set semsys:seminfo_semmap=2050

                    set semsys:seminfo_semaem=16384

                    set semsys:seminfo_semmnu=2048

                    set pt_cnt = 1024

                    set semsys:seminfo_semmsl=8192

                    Can anyone help verify that this applies to Solaris 10 systems?

                  • #60253
                    David Harrison
                    Participant

                      For Solaris 10 and Cloverleaf 5.6 you just need the following (according to the installation instructions):

                      set semsys:seminfo_semmni = 512

                      I’m guessing the rest is self-tuning.

                    • #60254
                      james tey
                      Participant

                        Thanks David.

                        I’ve hit the exact same error when starting the monitor daemon for the 2nd test site (the monitor daemon for the 1st site started without any errors).

                        I switched sites and tried again, yet the same error occurs.

                        The only change I made after the first test, which showed the error, was adding the extra setting David listed. I rebooted Solaris and got the same thing.

                        What else have I missed?  🙁
