Looking for advice from high volume CloverTechers…

  • Creator
    Topic
  • #55653
    Joshua Myers
    Participant

      Hi all. My company is using Cloverleaf to replace our current data acquisition layer, which is a bunch of Python code that has to be customized for each new customer. While Cloverleaf has numerous advantages (scalability, implementation time, maintenance, logging, troubleshooting, etc.), I’m worried it will be a performance bottleneck in our goal of processing millions of transactions per customer per day.

      Our setup is fairly simple (though I’m open to suggestions here too). One process with one thread reads .csv files from sFTP, runs each through an xlate chosen by file name to create .json objects, then sends them to another thread in its own process that sends the data outbound. I see the file get read in fairly quickly, but then the “pxqd” count builds up and slowly gets dequeued. If I average “Xlated” over “Xlt Time”, things seem pretty fast, but that does not seem to be a true measure of how many messages we are processing per second (or there is a bottleneck between a message being xlated and it actually being sent outbound).

      I’m looking for any advice that other high-volume Cloverleaf users can provide in two areas:

      1) Processing times: I know everyone has different data with varying amounts of Tcl and xlate code, but what settings have you tweaked to maximize performance? I’ve looked at a lot of the process-specific ones (multi-threading, disk-based queuing, disabling all logging), but I’m not sure which are helping and which actually end up slowing things down.

      2) Benchmarking: This may be worthy of a separate thread, but how does everyone benchmark these numbers? Say I drop a file of 10,000 or 100,000 messages: what is the best way to determine the inbound-to-outbound processing time for the entire file? Has anyone figured out a way to automate benchmarking and, even better, send the results out of Cloverleaf?

      Thanks for reading this long post and any advice you can offer!
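For the benchmarking question in 2), one rough way to get an end-to-end figure for a whole file is to compare the earliest inbound timestamp with the latest outbound timestamp. The sketch below only does the arithmetic. It assumes you can already collect those timestamps somehow (for example from your own logging); the function name `summarize_run` and the sample timestamps are made up for illustration and are not part of Cloverleaf.

```python
from datetime import datetime


def summarize_run(inbound_times, outbound_times):
    """Summarize one benchmark run from two lists of datetime stamps.

    inbound_times:  when messages were read in (at least the first one)
    outbound_times: when each message was sent outbound
    """
    if not inbound_times or not outbound_times:
        raise ValueError("need at least one inbound and one outbound timestamp")
    start = min(inbound_times)   # first message read in
    end = max(outbound_times)    # last message sent out
    elapsed = (end - start).total_seconds()
    sent = len(outbound_times)
    return {
        "messages_sent": sent,
        "elapsed_seconds": elapsed,
        # Guard against a zero-length run to avoid dividing by zero.
        "messages_per_second": sent / elapsed if elapsed > 0 else float("inf"),
    }


# Example with made-up timestamps (hypothetical data, not real measurements).
fmt = "%Y-%m-%d %H:%M:%S"
inbound = [datetime.strptime("2024-01-01 10:00:00", fmt)]
outbound = [
    datetime.strptime(t, fmt)
    for t in ("2024-01-01 10:00:05", "2024-01-01 10:00:10")
]
summary = summarize_run(inbound, outbound)
print(summary)
```

Measuring from first read to last send captures queue wait time between threads, which a per-message xlate time would not show.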

    Viewing 4 reply threads
    • Author
      Replies
      • #86022
        Jim Kosloskey
        Participant

          Joshua,

          If you are using the Fileset protocol for the inbound file, there are values you can set to control how often messages are read and how many messages are read each time.

          By adjusting these values you can smooth out, to some extent, the impact a flood of messages can have, allowing the engine to process messages more evenly.
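To get a feel for what those two values do to the arrival rate, here is a back-of-the-envelope calculation. It is only arithmetic under the assumption that the engine reads a fixed number of messages at a fixed interval; `messages_per_read` and `seconds_between_reads` are hypothetical names for illustration, not the actual Fileset setting names, and the numbers are examples.

```python
import math


def read_schedule(total_messages, messages_per_read, seconds_between_reads):
    """Estimate how a throttled reader would spread one file over time."""
    if messages_per_read < 1 or seconds_between_reads <= 0:
        raise ValueError("messages_per_read must be >= 1 and interval > 0")
    reads_needed = math.ceil(total_messages / messages_per_read)
    arrival_rate = messages_per_read / seconds_between_reads  # msgs/second
    # The first read happens at t=0, so the last read starts after
    # (reads_needed - 1) intervals.
    seconds_until_last_read = max(reads_needed - 1, 0) * seconds_between_reads
    return reads_needed, arrival_rate, seconds_until_last_read


# Example: a 50,000-message file read 500 messages at a time, every 5 seconds.
reads, rate, last_read_at = read_schedule(50_000, 500, 5)
print(reads, rate, last_read_at)
```

If the xlate thread can sustain more than the computed arrival rate, the queue should stay shallow; if not, the backlog will still build, only more slowly.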

          At the last place I was employed we did 6-7 million messages per day. Now they were not all coming over one connection. If I understand your architecture you have one connection which can receive high volumes in a short period of time.

          Also if you are using cross-process routing with high volumes, that could be an issue. Have you tried comparing to having everything in one process in this case?

          Or are your volumes spread out throughout the day?

          When exactly is your peak arrival period, what is that peak volume-wise, and how long does it last?

          email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

        • #86023
          Steve Pringle
          Participant

            It would also help to know what hardware platform/OS you’re running on and how it’s configured.

            We process ~12 million messages a day on a dedicated AIX server with 4 CPUs and 32 GB of memory. This configuration is more than sufficient to handle the load.

          • #86024
            Joshua Myers
            Participant

              Jim Kosloskey wrote:

              Joshua,

              Also if you are using cross-process routing with high volumes, that could be an issue. Have you tried comparing to having everything in one process in this case?

              Or are your volumes spread out throughout the day?

              When exactly is your peak arrival period and what is that peak volume wise and how long does it last?

              I have tried putting everything in one process. It helps the transfer time between threads, but then the xlate time slows down drastically.

              Our current setup is file-based, and each file can have 50K or more messages to process. Maximum load will not be daily but will come from backloads when a new customer comes on board. We can smooth this out, but short of chopping files into 10K batches, we will still see times when 50K or more are read into the interface.
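If chopping files into batches before they reach the interface ever becomes the fallback, it can be done outside the engine. The following is a minimal sketch, assuming the .csv files are plain comma-separated text and that each file starts with a header row that should be repeated in every batch (set `has_header=False` if that is not the case). The function name `split_csv` and the sample data are made up for illustration.

```python
import csv
import io


def split_csv(stream, batch_size, has_header=True):
    """Yield lists of rows, each holding at most batch_size data rows.

    When has_header is True, the first row is treated as a header and
    repeated at the top of every batch.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    reader = csv.reader(stream)
    header = next(reader, None) if has_header else None
    prefix = [header] if header is not None else []
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            yield prefix + batch
            batch = []
    if batch:  # leftover rows that did not fill a whole batch
        yield prefix + batch


# Example with a tiny in-memory file and a batch size of 2 (use 10_000 for 10K).
sample = "id,value\n1,a\n2,b\n3,c\n"
batches = list(split_csv(io.StringIO(sample), 2))
print(batches)
```

Each yielded batch can then be written to its own file with `csv.writer` and dropped in the pickup directory at whatever pace the interface handles comfortably.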

            • #86025
              Joshua Myers
              Participant

                Steve Pringle wrote:

                It would also help to know what hardware platform/OS you’re running on and how it’s configured.

                We process ~12 million messages a day on a dedicated AIX server with 4 CPUs and 32 GB of memory.

              • #86026
                Jim Kosloskey
                Participant

                  Joshua,

                  Have you fiddled with the time between reads and number of messages read values in the Fileset Protocol settings to smooth out the arrival rate?

                  email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.
