How to speed up processing of inbound MSGS?


  • #49540
    Antonio Chennawi
    Participant

      Hi, I need your help.

      This is a production problem: one process is delayed in processing production messages.

      I have one process that contains one inbound thread using the pdl-tcpip protocol and four outbound threads, also using pdl-tcpip. Routing is raw, with no Xlate. The outbound threads are connected to another process in different sites, and we are using the standard recovery proc33 for ACKs. Currently, the process handles one message per second.

      The sending system changed their interface to send 3.5 messages per second; however, Cloverleaf was not able to keep up. It still processes only one message per second.

      The only way I can get Cloverleaf to process more than one message per second is to change the outbound threads to the file protocol with dev-null. Only then was Cloverleaf able to process 3.8 messages per second. Likewise, if I put the outbound threads on hold, the inbound thread can process 3.8 messages per second.
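The rates quoted above imply a backlog that grows for as long as the sender outpaces the engine. A minimal Python sketch of that arithmetic, using the 3.5 and 1 messages per second figures from this post; the function name and the one-hour window are illustrative assumptions and have nothing to do with Cloverleaf itself:

```python
def backlog_after(arrival_rate, service_rate, seconds):
    """Messages left unprocessed after `seconds`, assuming constant rates."""
    # The backlog only grows when messages arrive faster than they are processed.
    growth = max(arrival_rate - service_rate, 0.0)
    return growth * seconds


# Figures from the post: 3.5 msg/s sent, 1 msg/s processed, over one hour.
print(backlog_after(3.5, 1.0, 3600))  # 9000.0
```

At the 3.8 messages per second seen with the outbound threads held, the same calculation gives no backlog.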

      These are the things that I tried:

      1- Removed waiting for ACK in the outbound threads

      2- Moved the main inbound thread to another process containing one outbound thread, which then feeds the records to the existing process

      3- Modified the translation throttling settings

      None of them worked.

      Then I changed the DATA OPTIONS protocol property to close after write, and that helped speed up the inbound messages. Now Cloverleaf can process 1.8 messages per second.

      My questions are:

      1- What other options can I use in pdl-tcpip or TCP/IP to speed up processing of the inbound messages?

      2- Are there any other options or solutions for my issue?

      We are using QDX 5.2 rev2 with OS AIX 5.2. Thank you

    Viewing 12 reply threads
      • #62391
        John Hamilton
        Participant

          Check to see what the system looks like.

          Disk I/O, CPU usage, all those good things.

          One message per second just seems really bad.

          Do you have any Tcl procs working on them?

          I would think turning on close after write would have made things slower.
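The expectation behind that last remark is that a new TCP connection per message adds connection setup and teardown to every send. A rough Python sketch that compares the two patterns against a local sink on the loopback interface; it does not model pdl-tcpip, ACK handling, or Cloverleaf scheduling, and every name in it is hypothetical:

```python
import socket
import threading
import time


def start_sink():
    """Start a local TCP server that discards whatever it receives."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(16)

    def serve():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:  # listening socket was closed
                return
            with conn:
                while conn.recv(4096):  # read until the client finishes
                    pass

    threading.Thread(target=serve, daemon=True).start()
    return server


def _finish(sock):
    """Signal end of data and wait until the sink closes its side."""
    sock.shutdown(socket.SHUT_WR)
    sock.recv(1)


def time_persistent(addr, messages):
    """Send every message over one connection; return elapsed seconds."""
    start = time.perf_counter()
    with socket.create_connection(addr) as sock:
        for msg in messages:
            sock.sendall(msg)
        _finish(sock)
    return time.perf_counter() - start


def time_close_after_write(addr, messages):
    """Open a new connection for each message; return elapsed seconds."""
    start = time.perf_counter()
    for msg in messages:
        with socket.create_connection(addr) as sock:
            sock.sendall(msg)
            _finish(sock)
    return time.perf_counter() - start


if __name__ == "__main__":
    server = start_sink()
    addr = server.getsockname()
    messages = [b"x" * 512] * 200  # made-up payloads, not real HL7
    print("persistent:       ", time_persistent(addr, messages))
    print("close after write:", time_close_after_write(addr, messages))
    server.close()
```

Timings on loopback only hint at the per-connection overhead; across a real network between sites the difference would depend on latency.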

        • #62392
          Antonio Chennawi
          Participant

            Thank you John for your reply.

            I looked at the system CPU and I didn’t see anything.

            I have some Tcl procs in the inbound thread but none in the outbound threads. The only time Cloverleaf processes more messages is when I change the outbound threads’ protocol to file or put the threads on hold. All other Tcl procs and configuration for the inbound thread stay the same. So I think it is the TCP/IP outbound thread protocol that keeps the process from giving the inbound thread enough time to receive and process messages.

            You are correct, John: “Close after write” slows the outbound threads down, which is what I want, so that the process can give more time to the inbound thread.

          • #62393
            Jim Kosloskey
            Participant

              Antonio,

              Is that all that is in the site – one process; one ib and four ob?

              Also what O/S and what version of CL?

              By the way, we have many such configurations and we do not have such a problem.

              Are your thread arrows showing queueing in the NetMonitor?

              Thanks,

              Jim Kosloskey

              email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

            • #62394
              Antonio Chennawi
              Participant

                Hi Jim,

                I have eight other processes in this site and we are using QDX 5.2 rev 2 on AIX 5.2.

                The arrows are okay; it is just that the inbound thread is not able to process more messages. It only processes more messages if I put the outbound threads on hold or change the outbound thread protocol to file.

              • #62395
                Jim Kosloskey
                Participant

                  Antonio,

                  If you are determining connection arrow constraint based on color change – that is not necessarily reliable unless you have the default thresholds for routing density set.

                  I look at the actual route arrow queue data or the Status display for messages on Queue to determine if I have a Route Pseudo process constraint. Look at ALL the route queues, not just the ones related to the inbound with which you are noticing the issue.

                  That is what it sounds like is happening in your case – but I have a very long distance view of your situation.

                  Do you have any cross process communication going on inside that site?

                  It may be time to consider breaking that site up.

                  Jim Kosloskey


                • #62396
                  Antonio Chennawi
                  Participant

                    I even stopped all processes in that site except the one I am having a problem with, and it is still slow. I am thinking about deleting the process and building a new one. However, I still think there is something else. Any help?

                  • #62397
                    Dennis Pfeifer
                    Participant

                      Try putting the outbound threads in a different process.

                      The outbound threads could be waiting for a reply and, hence, locking the process.

                      Dennis

                    • #62398
                      John Hamilton
                      Participant

                        Inter-process communication is very slow when routing threads in one process to a thread in a second process.

                        In my previous job I did exactly what you are doing and had 4 threads going to 4 other processes and had none of the issues you seem to be having.

                        I did have that as an mlp tcp connection with all the recovery stuff in place.

                        If you shut down the outbound threads one at a time, what do you see?

                        If you have only one outbound up and running, what happens?

                      • #62399
                        Tom Rioux
                        Participant

                          Having discussed this with Antonio (since I used to work at MH), you should know that the thread in question is a multi-server thread. Do you think that would make a difference?

                          Thanks…Tom

                        • #62400
                          John Hamilton
                          Participant

                            Could you eliminate that for testing?

                            Which threads are set up that way?

                          • #62401
                            Jim Kosloskey
                            Participant

                              Tom,

                              Do you mean the intersite thread (outbound from one site to another site) is a multi-server configuration?

                              If that is so, then it is quite possible this could be causing a delay, although I am not fully versed in the pros and cons of the multi-server configuration.

                              Antonio – could you clarify and maybe explain the thinking behind that setup?

                              Thanks,

                              Jim Kosloskey


                            • #62402
                              Antonio Chennawi
                              Participant

                                I would like to thank everyone for their suggestions and ideas. Thank you.

                                What I found is that after Health Quest made their changes in Test and started testing in our Test Cloverleaf environment, Cloverleaf could not keep up with receiving the messages. This was due to high total disk I/O, which left Cloverleaf without enough CPU to accept data. In our test environment, we use two disks to write all of our OS, application (QDX), and SMAT data. This is what caused the I/O to go very high. I then compared the test disk configuration to production and found that in production we have two fast T fiber disks used for the OS and the application, and another four fast T fiber disks used for writing the SMAT files. So we decided to move our changes to production today (the Health Quest changes to speed up message processing and the Cloverleaf changes to minimize the ACK length). Currently, we are monitoring the performance and we will know something by the end of the day.
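One rough way to compare how two disk layouts handle small synchronous writes is to time them directly. The Python sketch below is only an illustration: it does not reproduce how Cloverleaf writes SMAT files, the function name and parameters are assumptions, and the directory you point it at (for example one on the SMAT disks and one on the OS disks) is yours to choose:

```python
import os
import tempfile
import time


def mean_fsync_write_seconds(directory, writes=50, size=2048):
    """Average time to append `size` bytes and fsync, over `writes` rounds."""
    payload = b"x" * size
    fd, path = tempfile.mkstemp(dir=directory)  # scratch file on the target disk
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # force the data to disk before the next write
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return elapsed / writes


if __name__ == "__main__":
    # The system temp directory stands in for the directory under test.
    print(mean_fsync_write_seconds(tempfile.gettempdir()))
```

Running it against each candidate directory while the engine is under load gives a comparable per-write figure for the shared disks and the dedicated ones.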

                              • #62403
                                Antonio Chennawi
                                Participant

                                  FYI,

                                  Yes, the inbound thread is set up as a multi-server, but the outbound threads are not. Also, I tested it with and without multi-server.

                                  Dennis, I even tried to take the waiting for ACK off, and it did not work.

                                  John, I was about to try what you just suggested yesterday until I found the I/O issue.

                                  I am monitoring the system via Spotlight to make sure I have no problems, and I am watching the Health Quest message queuing. I hope that we have fixed it.
