Forum Replies Created
Hi Jerry. Are you using the tcpip protocol or the pdl-tcpip protocol? I was using tcpip on 5.8R3 on Red Hat 5.3 and it was a train wreck. I had to switch back to the pdl-tcpip protocol.
Genius, Ron – just freakin’ genius. Nice idea!
I checked using your Python script – I modified it to wait for input so I could double-check that the files really were simultaneously open. Your script was able to open 1065 files at once.
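For anyone who wants to repeat the check, here is a minimal sketch of that kind of test (my own illustration, not the original script from this thread): it opens temporary files until the OS refuses, reports the count, and then pauses so the held descriptors can be inspected from another shell.

```python
#!/usr/bin/env python3
# Sketch of an open-file-limit test: keep opening temporary files until the
# OS refuses (or a sanity cap is reached), report the count, then pause so
# the held descriptors can be inspected from another shell, e.g.
#   ls /proc/<pid>/fd | wc -l
import tempfile

CAP = 50000          # sanity cap so the loop cannot run away on huge limits
handles = []
try:
    while len(handles) < CAP:
        handles.append(tempfile.TemporaryFile())
    print(f"Reached the sanity cap of {CAP} without hitting a limit.")
except OSError as err:                   # EMFILE: "Too many open files"
    print(f"Opened {len(handles)} files before hitting the limit: {err}")

input("Descriptors are still held open -- press Enter to release them and exit.")
for fh in handles:
    fh.close()
```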
I’ll forward the findings on to Lawson support.
Thank you.
We have them set in there at 20,000.
No luck. The process still cannot exceed 1024 open files.
Thanks for the response Ron
file-max is set to 100,000
ulimit -n reports 20,000 for the hci user.
I can’t find any entry for nr_open, but some quick research says that it defaults to 1024*1024 (1,048,576).
After our last failure, we rebooted the machine – all adjustments to file-max and the user hard and soft limits had been made since the previous boot. I will push the process this morning to get it to the 1024 limit, see whether it is still a problem, and report back later.
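For reference, this is a small sketch of how the limits in question can be checked in one place: the system-wide file-max, the per-process nr_open ceiling, and the soft/hard RLIMIT_NOFILE that ulimit -n reports. Run it as the hci user to see that account's values; the /proc paths are standard Linux locations, and nr_open simply may not exist on older kernels like this one.

```python
#!/usr/bin/env python3
# Check the limits discussed above in one place: the system-wide fs.file-max,
# the per-process fs.nr_open ceiling (absent on older kernels), and this
# process's own RLIMIT_NOFILE, which is what "ulimit -n" reports for the
# shell that launched it.
import resource

def read_proc(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return "not present on this kernel"

print("fs.file-max  :", read_proc("/proc/sys/fs/file-max"))
print("fs.nr_open   :", read_proc("/proc/sys/fs/nr_open"))

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"RLIMIT_NOFILE: soft={soft} hard={hard}")
```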
February 24, 2011 at 3:03 pm in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73611
I’m running on Red Hat. This is the exact configuration that was running on CL5.5 (also on Red Hat). This problem has only started since upgrading to 5.8.
February 18, 2011 at 4:20 pm in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73607
I am experiencing another difficulty with 5.8.3.0P that you might run into. Currently about once a week, one of our processes panics after it starts receiving these types of errors:
[msi :msi :ERR /0: softlab_in:02/11/2011 08:03:18] msiSectionLock: Can’t lock semaphore for thread softlab_in: Too many open files
[msi :msi :ERR /0: softlab_in:02/11/2011 08:03:18] msiExportStats: Can’t lock data section for thread softlab_in
Lawson support is working on it and thought it might be the semaphore settings, but increasing them did not help. We have our open file limits set to 20,000, and as far as I can tell, at the time this error occurs all of the hci processes have a total of about 1,500 files open.
If anyone else has experienced this and has a solution, I’d be happy to hear what you’ve done.
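For what it's worth, this is roughly how open descriptors across the engine processes can be tallied when the error appears (a sketch only: it matches processes by a substring of the command line, "hci" is simply the name used in this thread, and reading another user's /proc/<pid>/fd needs the right permissions).

```python
#!/usr/bin/env python3
# Rough tally of open file descriptors for processes whose command line
# contains a given substring ("hci" is just the name used in this thread).
# Run as that user or as root so /proc/<pid>/fd is readable.
import os
import sys

pattern = sys.argv[1] if len(sys.argv) > 1 else "hci"
total = 0

for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        if pattern not in cmdline:
            continue
        count = len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        continue                    # process exited or fd dir not readable
    total += count
    print(f"{pid:>7}  {count:>5}  {cmdline.strip()[:60]}")

print(f"Total open descriptors for '{pattern}' processes: {total}")
```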
Well that was embarrassingly simple. Doh!
Thank you.
I don’t understand the difference either. Can someone please explain?
February 10, 2011 at 2:21 pm in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73605
I have an open case with support as well and R&D is looking through log files.
I also have another open case regarding the SMAT file cycling not releasing file handles. Part of the problem still exists even after Rev 3 was applied: if a thread or process is stopped and started while the log file is empty, the file handle remains in use.
February 10, 2011 at 2:02 pm in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73603
Hi Calvin. I’ve been live with 5.8 since Jan 18th. I initially had all my threads set up as TCPIP with MLLP encapsulation but had to switch all the threads back to the mlp_tcp pdl a couple days later.
With the mlp_tcp pdl, I only see the connection issue you speak of when the client and server are on the same machine – I use client/server pairs for inter-process communication. For these, I have to make sure the server thread is up first or the client will never connect.
When the server is another host/application, the mlp_tcp pdl client seems to be able to connect under all conditions.
I went live on CL5.8R2 on RHEL 5.3 Tuesday, Jan 18th. As part of testing, I converted all of my threads to the new built-in encapsulated TCPIP protocol.
I already knew from testing that the threads were sluggish in responding to commands such as pstart and pstop. I use scripts to start and stop processes and threads. However, we have about 20 outbound threads over VPN, and the release notes state that the new protocol would handle the looping issue on VPN threads. So, I went with it.
I was pleased with the go-live through Tuesday, but by Wednesday morning I was starting to get concerned. I email error logs nightly, and Wednesday morning’s email was over 2 MB – much of it the expected “Can’t connect” errors, but also a huge number of timeouts on threads, many of them on threads that were simply connecting to another thread on the same server to ‘hop across’ processes. I had threads that process 18,000 lab result messages a day (I know, not a big number) falling behind, writing partial messages and causing timeouts. I had inbound threads not acking the sending systems in a timely manner.
The last straw came in the form of a phone call in the wee hours of Thursday: no lab results to our EHR in over four hours. I logged in and everything was a mess. I even had queues stacked up in the routes in the net monitor – something I’d NEVER seen in over 12 years. Over the course of a couple of hours of looking at traces, I narrowed it down to a pair of connections to one server – somehow, those two threads had everything else ‘locked up’, for lack of a better term. I killed the listeners on that server and bang – everything caught right up.
I spent the next couple of hours converting all of my threads back to the mlp_tcp pdl, and CL5.8 is now running like a charm again. No timeouts, no backed-up threads – and the threads respond to commands almost instantly. I will go back to stopping/starting my VPN threads every half hour to mitigate the looping logging problem – it’s a small price to pay for a system that runs smoothly.
I will say this though – when I use a client and server pair to hop from one process to another, the server thread has to be up first or the client thread will never connect. In CL5.5 and prior versions, it didn’t matter what order you brought them up in.
Just something to be aware of.
PS: I’m in no way bashing the product. CL has been rock solid for as long as I’ve used it and I expect it will continue to be.
Hi Chris. Did you ever get any help on this? We struggle with the same issue and it ONLY occurs on threads over VPN tunnels. I’ve never found a solution to the problem.
Hi John.
We moved from HP-UX to Red Hat 4 about 14 months ago. We’re currently running CL 5.5.
I couldn’t be happier – the transition was very smooth. I had only one problem: using named ports did not work, so I had to specify the ports by number.
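If anyone hits the same thing, a quick way to check whether a symbolic service name resolves to a port on the box is sketched below; the service names used are made up for illustration, so substitute whatever names your threads reference.

```python
#!/usr/bin/env python3
# Check whether symbolic service names resolve to port numbers on this host
# (i.e. whether /etc/services or NSS knows them). The names are illustrative.
import socket

for name in ("hl7-listener", "ssh", "https"):
    try:
        print(f"{name:<14} -> tcp port {socket.getservbyname(name, 'tcp')}")
    except OSError:
        print(f"{name:<14} -> no tcp service entry found")
```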
Performance is incredible. Running an identical batch of translations was at least 10 times faster on Linux than on HP-UX, and the HP-UX server was no slouch.
It has been, and continues to be, as stable as it was on HP-UX – which is to say it runs forever, to the point of boredom!
My only concern is running on an Intel-class platform vs. the heavy iron. I fully expect it to fail some day. But we’ll fail over to the T&D server and replace it – in fact, the Intel servers are so cheap relative to the HP-UX class that we’ll probably replace them before they die on us.
Hope this helps.
Sorry, Scott – I knew I was in over my head! I just talked to the sysadmin for the C/L boxes. They’re not DL360s. They’re the 460 blade servers.