Cloverleaf and Regular Expression Theory

  • Creator
    Topic
  • #50183
    Michael Lacriola
    Participant

    This one is for everyone looking to improve the throughput of their transactions just by modifying some of the wild card routes that use regular expressions. I do not have answers; I’m just posting the questions to see if any gurus know the answer (Charlie, this one’s for you…)

    Wild Card Route 1: ADT_A0(1|4|5|6|7|8)

    Wild Card Route 2: ADT_A0[14-8]

    Admittedly, they do the same thing. The question is: do they process at the same speed?
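The claim that the two routes match the same triggers can be checked mechanically. The sketch below uses Python's `re` module, which is an assumption made for illustration and not the engine Cloverleaf uses; it also assumes the route must match the whole trigger (hence `fullmatch`). The names `same_matches`, `triggers` and `accepted` are hypothetical.

```python
import re

# The two wildcard routes from the post.
ALTERNATION = re.compile(r"ADT_A0(1|4|5|6|7|8)")
CHAR_CLASS = re.compile(r"ADT_A0[14-8]")


def same_matches(candidates):
    """Return True if both patterns accept exactly the same candidates."""
    return all(
        bool(ALTERNATION.fullmatch(c)) == bool(CHAR_CLASS.fullmatch(c))
        for c in candidates
    )


# Every trigger from ADT_A00 to ADT_A99 (assumed whole-string matching).
triggers = [f"ADT_A{n:02d}" for n in range(100)]

# Triggers accepted by the character-class form: [14-8] is "1" plus the range 4-8.
accepted = [t for t in triggers if CHAR_CLASS.fullmatch(t)]
```

Here `same_matches(triggers)` compares the two forms over all one hundred candidates, and `accepted` lists the triggers that get through.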

    Another possibility is the following pair of wild card routes:

    ADT_A0(1|4|5|6|7|8)

    vs.

    ADT_A0(8|4|5|1|6|7)

    Once again, they do the same thing. Is one faster than the other if your enterprise processes far more A08 transactions than the rest? In the first example, does Cloverleaf check 5 times before it hits the A08, whereas in the second example it hits on the first try?

    I may be splitting hairs here, but, when dealing with hundreds of thousands of transactions, it may mean a lot. Who knows?
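The "5 checks versus first try" intuition corresponds to a simple left-to-right model of alternation, sketched below. This is only a model: the thread does not establish how Cloverleaf's regular expression engine evaluates alternatives, and an engine that compiles the pattern into an automaton would not try the branches one at a time. The function and constant names are hypothetical.

```python
def attempts_until_match(alternatives, value):
    """Count alternatives tried, left to right, until one equals value.

    Returns None when no alternative matches.
    """
    for tried, alternative in enumerate(alternatives, start=1):
        if alternative == value:
            return tried
    return None


ORDER_1 = ["1", "4", "5", "6", "7", "8"]  # ADT_A0(1|4|5|6|7|8)
ORDER_2 = ["8", "4", "5", "1", "6", "7"]  # ADT_A0(8|4|5|1|6|7)

# For an A08 message: five misses and a hit on the sixth try, versus a hit on the first.
a08_first_order = attempts_until_match(ORDER_1, "8")
a08_second_order = attempts_until_match(ORDER_2, "8")
```

Under this model the saving per A08 message is five single-character comparisons, which gives a sense of how small the unit of work in question is.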

    Let’s hear from some of you top dogs…

    BTW, I haven’t received any emails lately from the digest since last Friday. Is something up or did our Firewall guys screw something up?

Viewing 5 reply threads
  • Author
    Replies
    • #65132

      My guess is that any difference in speed between the examples you posted would be insignificant even over a very large sample. I have no proof tho. 😛

      -- Max Drown (Infor)

    • #65133
      Michael Lacriola
      Participant

      Insignificant? If you were able to save .25 second per message, the total over 500,000 messages would be extremely large; I’ll let you do the math. I know there are hospitals that do a million plus transactions. Any small advantage can mean a lot. Even if it were .1 second per message, it would still be worth it.

    • #65134

      Understood. I’m thinking tho that the speed difference, if any, would be much smaller. Like .00001 seconds or smaller.

      Again, just a guess.

      -- Max Drown (Infor)
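Doing the math for the figures quoted in the two posts above: at 500,000 messages, a saving of .25 second per message totals 125,000 seconds (about 34.7 hours), .1 second totals 50,000 seconds (about 13.9 hours), and .00001 second totals about 5 seconds. A minimal sketch of that arithmetic follows; the per-message savings are the posters' guesses, not measurements, and the names are hypothetical.

```python
MESSAGES = 500_000  # message volume quoted in the thread


def total_seconds(per_message_seconds, messages=MESSAGES):
    """Total time saved if every message is faster by per_message_seconds."""
    return per_message_seconds * messages


# Per-message savings guessed in the thread, in seconds.
GUESSES = {"quarter second": 0.25, "tenth of a second": 0.1, "ten microseconds": 0.00001}

for label, saving in GUESSES.items():
    secs = total_seconds(saving)
    print(f"{label}: {secs:,.1f} s total ({secs / 3600:.2f} h)")
```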

    • #65135
      Scott Lee
      Participant

      I agree with Max – although I also have never tested it to find out for sure.   🙄

      My feeling is that if they process in about the same time, I should use whichever syntax is most easily readable at a glance. For me that means I write them like this…

      Code:

      ADT_A(01|02|03|04|06|08)

      IMO, it’s easy to read and understand even for someone relatively new to regular expressions.

      But I know that doesn’t answer the question… Anyone know for sure? I am curious too.

    • #65136
      Jim Kosloskey
      Participant

      If you are on a release of Cloverleaf(R) which deploys a Tcl with the ‘clock milliseconds’ command (I think 5.6 Cloverleaf(R) will do), maybe you can construct a proc to assist in measuring the impact.

      I suspect even without the clock milliseconds command a fairly good Tcl proc for measuring the consumption could be constructed.

      By the way, do not use the clock clicks command – it is unreliable.

      I used just such a technique to evaluate the impact of using GRM (I did not use the clock milliseconds command, as it was not available when I made the evaluation).

      Also, don’t forget your measurements need to be interpreted and published taking into account the size of your processor(s).
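The measuring approach described above could look roughly like the following. It is written in Python with the standard `timeit` module instead of a Tcl proc, purely as an illustration of the method: Python's `re` engine is not the engine Cloverleaf uses for route matching, so the numbers it prints say nothing about Cloverleaf itself. The pattern labels, the `seconds_per_match` helper and the iteration counts are all hypothetical choices.

```python
import re
import timeit

# The three route forms discussed in the thread.
PATTERNS = {
    "alternation, A08 last": r"ADT_A0(1|4|5|6|7|8)",
    "alternation, A08 first": r"ADT_A0(8|4|5|1|6|7)",
    "character class": r"ADT_A0[14-8]",
}


def seconds_per_match(pattern, subject="ADT_A08", number=100_000, repeat=3):
    """Return the best average time, in seconds, for one match of subject."""
    compiled = re.compile(pattern)
    # Each timing is the total for `number` matches; keep the fastest run
    # to reduce noise from other activity on the machine.
    timings = timeit.repeat(
        lambda: compiled.fullmatch(subject), number=number, repeat=repeat
    )
    return min(timings) / number


if __name__ == "__main__":
    for label, pattern in PATTERNS.items():
        nanos = seconds_per_match(pattern) * 1e9
        print(f"{label:24s} {nanos:8.1f} ns per match")
```

Results from a harness like this vary with the processor and its load, so any figures should be reported together with the hardware they were measured on.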

      We do not use regexp-based routing here, so I have no input regarding performance, but I would also agree that, if using regexp, the simplest construct would make for the more maintainable configuration.

      email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

    • #65137
      Brian Beverage
      Participant

      Assuming a regex like those above, where the expression is being compared to the same item in the message (ADT_AXX), an educated guess tells me N*2 comparisons are required to determine a match, where N is the length of the regex with 0 back references and 2 is the length of the string being compared.

      We also assume here that the CPU is executing one instruction per cycle with 0 no-ops and that, on average, a comparison creates 4 CPU instructions, which is not entirely accurate but for the sake of argument keeps the example simple. 🙂

      Now we can say 4(N*2) would be the number of clock cycles to evaluate that expression. Since we can theoretically reduce N by putting the most common value in the first comparison, that would speed up the evaluation. This is all based on what I remember from a Computer Architecture class I had a couple of years ago, so it is by no means an exact formula, just an idea of what takes place. I would always try to make regular expressions as efficient as possible; even tho the increase in speed will be small, it is an increase nonetheless… 😀
