Mike Ellert

Forum Replies Created

Viewing 15 replies – 1 through 15 (of 16 total)
in reply to: Cloverleaf 5.8 rev3 on Redhat 5.4 #74720
Mike Ellert
Participant

Hi Jerry.  Are you using the tcpip protocol or pdl-tcpip protocol?  I was using tcpip on 5.8R3 on Redhat 5.3 and it was a train wreck.  I had to switch back to the pdl-tcpip protocol.

in reply to: open file limits on RedHat #73733
Mike Ellert
Participant

Genius, Ron – just freakin’ genius.  Nice idea!

I checked using your python script – I modified it to wait for input so I could double dog check the files were simultaneously open.  Your script was able to open 1065 files at once.
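For anyone following along, a minimal sketch of such a probe (not Ron’s original script) might look like this – it opens temporary files until the OS refuses, or until an optional cap is reached, and reports how many were open at once:

```python
import tempfile

def max_open_files(limit=None):
    """Open temp files until the OS refuses (or `limit` is hit) and
    return how many were simultaneously open."""
    handles = []
    try:
        while limit is None or len(handles) < limit:
            handles.append(tempfile.TemporaryFile())
    except OSError:
        pass  # typically errno 24 (EMFILE): per-process fd limit reached
    finally:
        count = len(handles)
        for f in handles:
            f.close()
    return count

# Example: probe with a small cap so the demo is cheap.  To verify the
# files really are open simultaneously, pause (e.g. with input()) before
# the function returns and inspect the process with lsof.
print("opened", max_open_files(limit=50), "files at once")
```

Run without a `limit` it will stop at the per-process descriptor ceiling (commonly 1024), which matches the behavior described above.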

I’ll forward the findings on to Lawson support.

Thank you.

in reply to: open file limits on RedHat #73731
Mike Ellert
Participant

We have them set in there at 20,000.

in reply to: open file limits on RedHat #73729
Mike Ellert
Participant

No luck.  The process still cannot exceed 1024 open files.

in reply to: open file limits on RedHat #73728
Mike Ellert
Participant

Thanks for the response, Ron.

file-max is set to 100,000.

ulimit -n reports 20,000 for the hci user.

I can’t find any entry for nr_open, but some quick research says that it defaults to 1024*1024.

After our last failure, we rebooted the machine – all adjustments to file-max and the user hard and soft limits were made since the previous boot.  I will push the process this morning to get to the 1024 limit, see if it is still a problem, and report back later.
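For reference, all three of these knobs can be inspected programmatically on Linux.  A small sketch (the /proc paths are the standard kernel entries; the values printed will differ per system):

```python
import os
import resource

# Per-process descriptor limits, i.e. what `ulimit -n` / `ulimit -Hn` report.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"per-process limit: soft={soft} hard={hard}")

# System-wide cap (file-max) and the ceiling on per-process hard limits
# (nr_open, which does default to 1024*1024 = 1048576 on modern kernels).
for knob in ("/proc/sys/fs/file-max", "/proc/sys/fs/nr_open"):
    if os.path.exists(knob):  # Linux-only /proc entries
        with open(knob) as f:
            print(knob, "=", f.read().strip())
```

Note that a process only picks up new soft/hard limits at login/startup, which is why a reboot (or at least restarting the hci processes) is needed after changing limits.conf.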

in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73611
Mike Ellert
Participant

I’m running on Redhat.  This is the exact configuration that was running on CL5.5 (also on Redhat).  This problem has only started since upgrading to 5.8.

in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73607
Mike Ellert
Participant

I am experiencing another difficulty with 5.8.3.0P that you might run into.  Currently, about once a week, one of our processes panics after it starts receiving these types of errors:

[msi :msi :ERR /0:   softlab_in:02/11/2011 08:03:18] msiSectionLock: Can’t lock semaphore for thread softlab_in: Too many open files

[msi :msi :ERR /0:   softlab_in:02/11/2011 08:03:18] msiExportStats: Can’t lock data section for thread softlab_in

Lawson support is working on it and thought it might be the semaphore settings, but increasing them did not help.  We have our open file limits set to 20,000 and, as far as I can tell, at the time this error occurs all of the hci processes have a total of about 1,500 files open.
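Counting open descriptors per process is easy to script on Linux, which is how a total like the 1,500 above can be checked.  A rough sketch, assuming a /proc filesystem (in practice you would filter to the hci user’s processes rather than summing everything visible):

```python
import os

def open_fd_count(pid):
    """Count the file descriptors a process currently has open
    (Linux only: reads /proc/<pid>/fd)."""
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:  # no such process, no /proc, or permission denied
        return 0

# Sum across every process we can see; filter by owner or command
# name (e.g. the hci processes) for a per-user total.
if os.path.isdir("/proc"):
    total = sum(open_fd_count(p) for p in os.listdir("/proc") if p.isdigit())
    print("open descriptors visible:", total)
```

Since the per-process limit here is 20,000 and only ~1,500 descriptors are in use, the “Too many open files” error suggests something other than the raw fd count – e.g. a semaphore or lock-file limit being hit.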

If anyone else has experienced this and has a solution, I’d be happy to hear what you’ve done.

in reply to: Alerts: Protocol Status vs Thread Status #73666
Mike Ellert
Participant

Well, that was embarrassingly simple.  Doh!

Thank you.

in reply to: Alerts: Protocol Status vs Thread Status #73664
Mike Ellert
Participant

I don’t understand the difference either.  Can someone please explain?

in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73605
Mike Ellert
Participant

I have an open case with support as well, and R&D is looking through log files.

I also have another open case regarding SMAT file cycling not releasing file handles.  Part of the problem still exists even after Rev 3 was applied: if a thread or process is stopped and started while the log file is empty, the file handle remains in use.

in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73603
Mike Ellert
Participant

Hi Calvin.  I’ve been live with 5.8 since Jan 18th.  I initially had all my threads set up as TCPIP with MLLP encapsulation but had to switch all the threads back to the mlp_tcp pdl a couple of days later.

With the mlp_tcp pdl, I only have the connection issue you speak of when the host and server are on the same machine – I use client/server pairs for inter-process communication.  For these, I have to make sure the server is up first or the client will never connect.

When the server is another host/application, the mlp_tcp pdl client seems to be able to connect under all conditions.

in reply to: 5.8 mlp #72382
Mike Ellert
Participant

I went live on CL5.8R2 on RHEL 5.3 Tuesday, Jan 18th.  As part of testing, I converted all of my threads to the new built-in encapsulated TCPIP protocol.

I already knew during testing that the threads were sluggish responding to commands such as pstart and pstop.  I use scripts to start and stop processes and threads.  However, we have about 20 outbound threads over VPN, and the release notes state that the new protocol would handle the looping issue on VPN threads.  Sooooooooooo, I went with them.

I was pleased with the go-live through Tuesday, but by Wednesday morning I was starting to get concerned.  I email error logs nightly, and Wednesday morning’s email was over 2MB – much of it “Can’t connects”, which were expected – but there were a huge number of timeouts on threads – lots of them even on threads where I was connecting to another thread on the same server to ‘hop across’ processes.  I had threads that process 18,000 lab result messages/day (I know, not a big number) falling behind, writing partial messages, and causing timeouts.  I had inbound threads not acking the sending systems in a timely manner.

The last straw came in the form of a phone call in the wee hours Thursday: no lab results to our EHR in over four hours.  I logged in and everything was a mess.  I even had queues stacked up in the routes in net monitor – something I’d NEVER seen in over 12 years.  Over the course of a couple hours of looking at traces, I narrowed it down to a pair of connections to one server – somehow, these two threads had everything else ‘locked up’, for lack of a better term.  I killed the listeners on that server and bang – everything caught right up.

I spent the next couple hours converting all of my threads back to the mlp_tcp pdl, and CL5.8 is now running like a charm again.  No timeouts, no backed-up threads – and the threads respond to commands almost instantly.  I will go back to stopping/starting my VPN threads every half an hour to mitigate the looping logging problem – it’s a small price to pay for a system that runs smoothly.

I will say this, though – when I use a client and server pair to hop from one process to another, the server thread has to be up first or the client thread will never connect.  In CL5.5 and prior versions, it didn’t matter what order you brought them up in.

Just something to be aware of.

PS: I’m in no way bashing the product.  CL has been rock solid for as long as I’ve used it and I expect it will continue to be.

in reply to: PDL errors #70327
Mike Ellert
Participant

Hi Chris.  Did you ever get any help on this?  We struggle with the same issue and it ONLY occurs on threads over VPN tunnels.  I’ve never found a solution to the problem.

in reply to: Cloverleaf 5.4.1 on Linux #66934
Mike Ellert
Participant

Hi John.

We moved from HP-UX to Red Hat 4 about 14 months ago.  We’re currently running CL 5.5.

I couldn’t be happier – the transition was very smooth.  I had only one problem – using named ports did not work, so I had to specify the ports by number.

Performance is incredible.  Performing an identical batch of translations was at least 10 times faster on Linux than on HP-UX, and the HP-UX server was no slouch.

It has been, and continues to be, as stable as it was on HP-UX – which is to say it runs forever.  To the point of boredom!

My only concern is running on an Intel-class platform vs the heavy iron.  I fully expect it to fail some day.  But we’ll fail over to the T&D server and replace it – in fact, the Intel servers are so cheap relative to the HP-UX class that we’ll probably replace the servers before they die on us.

Hope this helps.

in reply to: Linux Hardware Recommendations? #59246
Mike Ellert
Participant

Sorry Scott – I knew I was in over my head!  I just talked to the sysadmin for the C/L boxes.  They’re not DL360s.  They’re the 460 blade servers.
