Memory leaks in Cloverleaf 5.8


  • Creator
    Topic
  • #52110
    Mark Rogers
    Participant

      Has anyone experienced memory leaks in Cloverleaf 5.8 that they did not see in Cloverleaf 5.6?  We have moved all 10 of our sites to Cloverleaf 5.8 and are experiencing memory leaks that are causing our filesystem usage percentage to climb to critical levels.  Bouncing all processes in a site, along with the Lock Manager and Monitor Daemon, does help regain the file space, but that cannot be a permanent solution.

    Viewing 16 reply threads
    • Author
      Replies
      • #73101
        James Cobane
        Participant

          Mark,

          There is a known issue with memory and Cloverleaf 5.8.  From what I understand, 5.8 requires/utilizes more memory, primarily due to the re-compilation of Tcl for multi-threading and how it allocates memory.  I believe there is a Rev 2 patch in the pipeline to tune some of the memory utilization, but I don’t believe there is an ETA yet.  We added more memory to our servers and increased the swap space.  We haven’t moved to production yet, but are currently running 5.8 in our test environment.  I would recommend contacting Support for specific answers.

          Hope this helps.

          Jim Cobane

          Henry Ford Health

        • #73102
          Rob Abbott
          Keymaster

            To be clear – this is NOT a memory leak per se.  

            In 5.8 the Tcl interpreter consumes more memory on startup.  We are actively working on a solution for this.

            We aren’t aware of any memory leaks in 5.8 – that is, cases where memory is consumed over time and never released.

            Rob Abbott
            Cloverleaf Emeritus

          • #73103
            Rob Abbott
            Keymaster

              Please note I’m referring to Jim’s comment … Mark please do follow up with Support so that they can help diagnose the issue you are seeing.

              Rob Abbott
              Cloverleaf Emeritus

            • #73104
              Robert Milfajt
              Participant

                Rob, do you know when the rev that addresses this issue will be released?  As far as 5.8 goes, what is the current rev level?

                Thanks,

                Bob

                Robert Milfajt
                Northwestern Medicine
                Chicago, IL

              • #73105
                Tim Pancost
                Participant

                  Here’s what we have found since we went live on 11/21 with one of our production machines, and we have a case open with Support.  When you cycle saved message files, the engine doesn’t release the disk used by the original saved message files.

                  So, let’s say you have a saved message file where the .idx is 10 MB and the .msg is 20 MB.  When you cycle that thread, you’ll end up with a .old.idx that’s 10 MB, a .old.msg that’s 20 MB, a .idx that’s 0 MB, and a .msg that’s 0 MB.  The thing is, you’ll also have two file descriptors (no filenames) left open with sizes 10 MB and 20 MB that are never released back to the operating system.  So, even though you may move the .old.xxx files off to another filesystem for archival, that 30 MB is still seen as being in use.  And when you cycle again, that space is likewise not returned, so the effect is cumulative.

                  We’ve found that the only way to get the engine to close the file descriptors and return the space is to do a stop_save/start_save, or to bounce the process.  Just bouncing the thread does not work.  For outbound threads, you can get around it by modifying your cycle save job to put the thread on hold, stop_save, copy off the files, start_save, and release the thread.  Unfortunately, since you can’t put an inbound thread on hold, this does not work for those threads.  And if you just do the stop_save, you’re going to be missing messages from your saved message files, which kind of defeats the purpose of having them.
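                  The outbound ordering above can be sketched roughly as follows.  This is only an illustration of the sequence, not real hcicmd syntax: the hold_thread/stop_save/start_save/release_thread functions stand in for whatever site-specific Cloverleaf commands your cycle-save job uses, and the thread name and archive path are made-up examples.

```shell
# Sketch of the outbound cycle-save workaround described above.
# The four functions are placeholders for site-specific Cloverleaf
# commands; adt_out and /archive/smat are made-up examples.
THREAD=adt_out
ARCHIVE=/archive/smat

hold_thread()    { echo "hold $THREAD"; }       # placeholder: put thread on hold
stop_save()      { echo "stop_save $THREAD"; }  # placeholder: closes .idx/.msg fds
start_save()     { echo "start_save $THREAD"; } # placeholder: reopens SMAT files
release_thread() { echo "release $THREAD"; }    # placeholder: take thread off hold

hold_thread
stop_save          # the engine releases the descriptors (and disk) here
cp "$THREAD.old.idx" "$THREAD.old.msg" "$ARCHIVE"/ 2>/dev/null || true
start_save
release_thread
```

                  The key point is that the copy happens between stop_save and start_save, while no descriptor is held open on the old files.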

                  This wasn’t really noticed during testing for a couple of reasons.  First, test sites generally have nowhere near the volume of production sites, so the disk just isn’t used up as fast.  Second, I know we tend to bounce test processes much more frequently, so the space would tend to get released more frequently.

                  Just thought folks would like to have this info as they’re thinking about upgrading.  I’ll keep everyone informed of what we hear from Support.

                  HTH,

                  TIM

                  Tim Pancost

                  Senior Technical Analyst

                  Trinity Information Services

                  Tim Pancost
                  Trinity Health

                • #73106
                  Rob Abbott
                  Keymaster

                    Robert Milfajt wrote:

                    Rob, do you know when the rev that addresses this issue will be released?

                    Rob Abbott
                    Cloverleaf Emeritus

                  • #73107
                    Robert Milfajt
                    Participant

                      Rob, can you give an idea of how much additional memory is consumed?  I believe we will upgrade to 5.8 rev 1 and just pay the memory costs.

                      Thanks,

                      Bob

                      Robert Milfajt
                      Northwestern Medicine
                      Chicago, IL

                    • #73108
                      Rob Abbott
                      Keymaster

                        Hi Bob – we have seen approximately 1.4 MB of additional memory usage per protocol thread when comparing 5.8.0 to 5.7rev2.

                        If you are counting threads, also take into account the translation and command threads – so add 2 for each process you have.
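                        As a back-of-the-envelope check of the figures above (the ~1.4 MB per thread comes from Rob’s post; the thread and process counts below are made-up examples for a typical site):

```shell
# Estimate extra memory for 5.8.0 vs 5.7rev2:
# ~1.4 MB per protocol thread, plus 2 extra threads
# (translation + command) per process.
procs=10      # example: 10 processes
threads=40    # example: 40 protocol threads
extra_mb=$(awk -v t="$threads" -v p="$procs" \
  'BEGIN { printf "%.1f", 1.4 * (t + 2 * p) }')
echo "estimated extra memory: ${extra_mb} MB"
```

                        For this example site that works out to about 84 MB across the engine.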

                        Rob Abbott
                        Cloverleaf Emeritus

                      • #73109
                        Michael Lacriola
                        Participant

                          Mark, we are seeing something similar to what you are describing. We have not been able to put our finger on it, and are making sure that our monitors are bounced and our test environments are clean. We do cycle and save all our engine processes and inbound threads on the server every 16 days.

                          We got up to 88% after being on 5.8 for about 2 months. We decided that it seems stable enough (besides having to bounce monitors), so we removed the qdx5.6 directory structure, which got us a lot of disk back.

                        • #73110
                          Steve MacDonald
                          Participant

                            There is no way to tell how much disk you will end up using.  It depends on the number of saved message files in your 5.8 sites and the volume of messages they ‘save’.  The file system will simply continue to grow until the process that created the ‘open’ file is bounced.  

                            On AIX 5.3.0.0:
                            If you do a ps -ef | grep [hci_process] and retrieve the pid, you can then check all of the open files associated with that process.

                            On AIX: lsof -p [pid]

                            Those with the exclusive write lock ‘wW’ are the active .idx and .msg files.  Those where the write lock is released, but the allocated disk space is not, have only the lowercase ‘w’.

                            These are files ‘taking up disk’ but do not have a ‘file name’ when searched for:

                            COMMAND    PID     USER  FD   TYPE  DEVICE  SIZE/OFF  NODE   NAME
                            hciengine  893054  hci   82w  VREG  103,2   1359387   16804  /sites8 (/dev/ie8_fs01)
                            hciengine  893054  hci   83w  VREG  103,2   1154241   16772  /sites8 (/dev/ie8_fs01)
                            hciengine  893054  hci   84w  VREG  103,2   3322464   16774  /sites8 (/dev/ie8_fs01)
                            hciengine  893054  hci   85w  VREG  103,2   5349123   16783  /sites8 (/dev/ie8_fs01)
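                            The leaked space per process can be totaled by summing the SIZE/OFF column for descriptors whose FD field ends in a bare lowercase ‘w’ (no exclusive ‘W’ lock).  A small sketch: the sample below adapts two lines of the output shown here (the ‘wW’ line is a hypothetical active file); in practice you would pipe lsof -p [pid] in instead of the here-string.

```shell
# Sum SIZE/OFF ($7) for descriptors with a bare lowercase 'w' FD,
# i.e. write lock released but disk space still allocated.
# Sample adapted from the lsof output above; the 84wW line is a
# hypothetical active SMAT file and is correctly excluded.
sample='hciengine 893054 hci 82w VREG 103,2 1359387 16804 /sites8 (/dev/ie8_fs01)
hciengine 893054 hci 84wW VREG 103,2 3322464 16774 /sites8 (/dev/ie8_fs01)'
leaked=$(printf '%s\n' "$sample" |
  awk '$4 ~ /^[0-9]+w$/ { sum += $7 } END { print sum + 0 }')
echo "unreleased bytes: $leaked"
```

                            Run periodically per hciengine pid, this gives a rough measure of how much disk each process is holding onto between bounces.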

                          • #73111
                            Dave Thall
                            Participant

                              I am going to put this out there: is there extra overhead with the new protocol configurations listed in 5.8?

                              We also saw some unexplainable binding errors intermittently stopping threads configured only for VPN. We have a fix in for now, but the networking group couldn’t …

                            • #73112
                              Ian Morris
                              Participant

                                Rob Abbott wrote:

                                Robert Milfajt wrote:

                                Rob, do you know when the rev that addresses this issue will be released?

                              • #73113
                                Thomas Fortino
                                Participant

                                  Hello All,

                                  We’re in the initial stages of upgrading from CL 5.3 rev2 on AIX 5.2 to CL 5.8.1.0P on Linux.

                                  Has anyone seen this behavior on Linux?

                                • #73114
                                  Rob Abbott
                                  Keymaster

                                    Looks like we have several issues here in the same thread.  I’ll try to address one by one:

                                    – I cannot give a release date for 5.8.2.  Sorry but with Lawson being public we have strict rules around this.  As always an announcement will be posted to Clovertech when it’s available.  

                                    – We are not aware of any memory leaks in 5.8.  We are aware that 5.8 uses more memory per thread than 5.7 or below.  This will be fixed in 5.8.2.

                                    – There does appear to be a “disk space leak” around SMAT file cycling and possibly log cycling.  We have reproduced this on AIX.  I don’t know if it’s happening on other platforms.  As soon as I have more information around a solution we’ll let you know.   Those of you with open support cases will of course receive details through Tech Support.

                                    – Dave I’m not really sure what you are asking about.  If you have something you can reproduce reliably please open a support case.

                                    Rob Abbott
                                    Cloverleaf Emeritus

                                  • #73115
                                    Ian Morris
                                    Participant

                                      Rob Abbott wrote:

                                      Looks like we have several issues here in the same thread.

                                    • #73116
                                      Robert Milfajt
                                      Participant

                                        Thanks Rob.

                                        Robert Milfajt
                                        Northwestern Medicine
                                        Chicago, IL

                                      • #73117
                                        Rob Abbott
                                        Keymaster

                                          Hi all, I’ve some information regarding upcoming patches:

                                        • 5.8.2 is due by the end of this month and will have several bug fixes including a fix for the increased memory usage.

                                          Rob Abbott
                                          Cloverleaf Emeritus

                                      • The forum ‘Cloverleaf’ is closed to new topics and replies.