Forum Replies Created
Hi Jerry. Are you using the tcpip protocol or the pdl-tcpip protocol? I was using tcpip on 5.8R3 on Red Hat 5.3 and it was a train wreck. I had to switch back to the pdl-tcpip protocol.
Genius, Ron – just freakin’ genius. Nice idea!
I checked using your Python script – I modified it to wait for input so I could double-check that the files really were simultaneously open. Your script was able to open 1065 files at once.
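For anyone who wants to repeat the check, here is a minimal sketch of that kind of test (my own illustration, not the original script from this thread): it opens temporary files until the OS refuses, reports the count, and then pauses so the held descriptors can be inspected from another shell.

```python
#!/usr/bin/env python3
# Sketch of an open-file-limit test: keep opening temporary files until the
# OS refuses (or a sanity cap is reached), report the count, then pause so
# the held descriptors can be inspected from another shell, e.g.
#   ls /proc/<pid>/fd | wc -l
import tempfile

CAP = 50000          # sanity cap so the loop cannot run away on huge limits
handles = []
try:
    while len(handles) < CAP:
        handles.append(tempfile.TemporaryFile())
    print(f"Reached the sanity cap of {CAP} without hitting a limit.")
except OSError as err:                   # EMFILE: "Too many open files"
    print(f"Opened {len(handles)} files before hitting the limit: {err}")

input("Descriptors are still held open -- press Enter to release them and exit.")
for fh in handles:
    fh.close()
```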
I’ll forward the findings on to Lawson support.
Thank you.
We have them set in there at 20,000.
No luck. The process still cannot exceed 1024 open files.
Thanks for the response Ron
file-max is set to 100,000
ulimit -n reports 20,000 for the hci user.
I can’t find any entry for nr_open, but some quick research says that it defaults to 1024*1024 (1,048,576).
After our last failure, we rebooted the machine – all adjustments to file-max and the user hard and soft limits had been made since the previous boot. I will push the process this morning to get it to the 1024 limit, see whether it is still a problem, and report back later.
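For reference, this is a small sketch of how the limits in question can be checked in one place: the system-wide file-max, the per-process nr_open ceiling, and the soft/hard RLIMIT_NOFILE that ulimit -n reports. Run it as the hci user to see that account's values; the /proc paths are standard Linux locations, and nr_open simply may not exist on older kernels like this one.

```python
#!/usr/bin/env python3
# Check the limits discussed above in one place: the system-wide fs.file-max,
# the per-process fs.nr_open ceiling (absent on older kernels), and this
# process's own RLIMIT_NOFILE, which is what "ulimit -n" reports for the
# shell that launched it.
import resource

def read_proc(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return "not present on this kernel"

print("fs.file-max  :", read_proc("/proc/sys/fs/file-max"))
print("fs.nr_open   :", read_proc("/proc/sys/fs/nr_open"))

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"RLIMIT_NOFILE: soft={soft} hard={hard}")
```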
February 24, 2011 at 3:03 pm in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73611
I’m running on Red Hat. This is the exact configuration that was running on CL5.5 (also on Red Hat). This problem has only started since upgrading to 5.8.
February 18, 2011 at 4:20 pm in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73607
I am experiencing another difficulty with 5.8.3.0P that you might run into. Currently about once a week, one of our processes panics after it starts receiving these types of errors:
[msi :msi :ERR /0: softlab_in:02/11/2011 08:03:18] msiSectionLock: Can’t lock semaphore for thread softlab_in: Too many open files
[msi :msi :ERR /0: softlab_in:02/11/2011 08:03:18] msiExportStats: Can’t lock data section for thread softlab_in
Lawson support is working on it and thought it might be the semaphore settings, but increasing them did not help. We have our open file limits set to 20,000, and as far as I can tell, at the time this error occurs all of the hci processes have a total of about 1,500 files open.
If anyone else has experienced this and has a solution, I’d be happy to hear what you’ve done.
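For what it's worth, this is roughly how open descriptors across the engine processes can be tallied when the error appears (a sketch only: it matches processes by a substring of the command line, "hci" is simply the name used in this thread, and reading another user's /proc/<pid>/fd needs the right permissions).

```python
#!/usr/bin/env python3
# Rough tally of open file descriptors for processes whose command line
# contains a given substring ("hci" is just the name used in this thread).
# Run as that user or as root so /proc/<pid>/fd is readable.
import os
import sys

pattern = sys.argv[1] if len(sys.argv) > 1 else "hci"
total = 0

for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        if pattern not in cmdline:
            continue
        count = len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        continue                    # process exited or fd dir not readable
    total += count
    print(f"{pid:>7}  {count:>5}  {cmdline.strip()[:60]}")

print(f"Total open descriptors for '{pattern}' processes: {total}")
```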
Well that was embarrassingly simple. Doh!
Thank you.
I don’t understand the difference either. Can someone please explain?
February 10, 2011 at 2:21 pm in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73605
I have an open case with support as well and R&D is looking through log files.
I also have another open case regarding the SMAT file cycling not releasing file handles. Part of the problem still exists even after Rev 3 was applied: if a thread or process is stopped and started while the log file is empty, the file handle remains in use.
February 10, 2011 at 2:02 pm in reply to: Opening thread causes throughput issues – CIS 5.8.3.0 P #73603
Hi Calvin. I’ve been live with 5.8 since Jan 18th. I initially had all my threads set up as TCPIP with MLLP encapsulation but had to switch all the threads back to the mlp_tcp pdl a couple days later.
With the mlp_tcp pdl, I only see the connection issue you speak of when the client and server are on the same machine – I use client/server pairs for inter-process communication. For these, I have to make sure the server thread is up first or the client will never connect.
When the server is another host/application, the mlp_tcp pdl client seems to be able to connect under all conditions.
I went live on CL5.8R2 on RHEL 5.3 Tuesday, Jan 18th. As part of testing, I converted all of my threads to the new built-in encapsulated TCPIP protocol.
I already knew from testing that the threads were sluggish in responding to commands such as pstart and pstop. I use scripts to start and stop processes and threads. However, we have about 20 outbound threads over VPN, and the release notes state that the new protocol would handle the looping issue on VPN threads. So, I went with it.
I was pleased with the go-live through Tuesday, but by Wednesday morning I was starting to get concerned. I email error logs nightly, and Wednesday morning’s email was over 2 MB – much of it the expected “Can’t connect” errors, but also a huge number of timeouts on threads, many of them on threads that were simply connecting to another thread on the same server to ‘hop across’ processes. I had threads that process 18,000 lab result messages a day (I know, not a big number) falling behind, writing partial messages and causing timeouts. I had inbound threads not acking the sending systems in a timely manner.
The last straw came in the form of a phone call in the wee hours of Thursday: no lab results to our EHR in over four hours. I logged in and everything was a mess. I even had queues stacked up in the routes in the net monitor – something I’d NEVER seen in over 12 years. Over the course of a couple of hours of looking at traces, I narrowed it down to a pair of connections to one server – somehow, those two threads had everything else ‘locked up’, for lack of a better term. I killed the listeners on that server and bang – everything caught right up.
I spent the next couple of hours converting all of my threads back to the mlp_tcp pdl, and CL5.8 is now running like a charm again. No timeouts, no backed-up threads – and the threads respond to commands almost instantly. I will go back to stopping/starting my VPN threads every half hour to mitigate the looping logging problem – it’s a small price to pay for a system that runs smoothly.
I will say this though – when I use a client and server pair to hop from one process to another, the server thread has to be up first or the client thread will never connect. In CL5.5 and prior versions, it didn’t matter what order you brought them up in.
Just something to be aware of.
PS: I’m in no way bashing the product. CL has been rock solid for as long as I’ve used it and I expect it will continue to be.
Hi Chris. Did you ever get any help on this? We struggle with the same issue and it ONLY occurs on threads over VPN tunnels. I’ve never found a solution to the problem.
Hi John.
We moved from HP-UX to Red Hat 4 about 14 months ago. We’re currently running CL 5.5.
I couldn’t be happier – the transition was very smooth. I had only one problem: using named ports did not work, so I had to specify the ports by number.
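If anyone hits the same thing, a quick way to check whether a symbolic service name resolves to a port on the box is sketched below; the service names used are made up for illustration, so substitute whatever names your threads reference.

```python
#!/usr/bin/env python3
# Check whether symbolic service names resolve to port numbers on this host
# (i.e. whether /etc/services or NSS knows them). The names are illustrative.
import socket

for name in ("hl7-listener", "ssh", "https"):
    try:
        print(f"{name:<14} -> tcp port {socket.getservbyname(name, 'tcp')}")
    except OSError:
        print(f"{name:<14} -> no tcp service entry found")
```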
Performance is incredible. Running an identical batch of translations was at least 10 times faster on Linux than on HP-UX, and the HP-UX server was no slouch.
It has been, and continues to be, as stable as it was on HP-UX – which is to say it runs forever, to the point of boredom!
My only concern is running on an Intel-class platform vs. the heavy iron. I fully expect it to fail some day. But we’ll fail over to the T&D server and replace it – in fact, the Intel servers are so cheap relative to the HP-UX class that we’ll probably replace them before they die on us.
Hope this helps.
Sorry, Scott – I knew I was in over my head! I just talked to the sysadmin for the C/L boxes. They’re not DL360s. They’re the 460 blade servers.