Trouble with automated cycling of SMAT files


  • Creator
    Topic
  • #48579
    David Gordon
    Participant

      CL 5.3 on AIX 5.2 and ongoing trouble with automated cycling of SMAT files.

      We’ve used several different scripts from the QDX folks, and even one I wrote myself, but the core problem seems to be that our threads sometimes just don’t respond to the cycle/save command.  Just to be specific, I am talking about:

      hcicmd -p $Process -c '$Thread save_cycle in'

      hcicmd -p $Process -c '$Thread save_cycle out'

      No matter how we fire off the script (manually, cron job, what have you) some threads just won’t cycle and we end up with 200MB+ SMAT files that you have to search via text editor since the SMAT tool freaks out on anything that large.
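One way to catch a thread that has stopped cycling before its file reaches this size is to scan for anything over a threshold. This is only a minimal sketch: the function name `find_oversized` and the 200 MiB default are illustrative and not part of any Cloverleaf tool, and the directory holding the SMAT files is assumed to be passed in as an argument. It relies on the POSIX `find -size` test, which counts 512-byte blocks.

```shell
#!/bin/sh
# Sketch only: find_oversized is a hypothetical helper, not a Cloverleaf command.
# find_oversized DIR [BLOCKS]
#   Lists regular files under DIR that are larger than BLOCKS 512-byte blocks.
#   The default of 409600 blocks corresponds to 200 MiB.
find_oversized() {
    find "$1" -type f -size +"${2:-409600}" -print
}
```

Run from cron shortly after the cycle job, any output from `find_oversized /path/to/smat/files` would point at the threads that did not cycle.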

      Now, this doesn’t seem to happen if we do each thread one at a time via command line or GUI, so I’m thinking it might be something in the automation.  This seems to happen with certain threads over and over, and they do tend to be high-volume threads as well.

      Has anyone else seen an issue like this?

      Thanks in advance.

    Viewing 7 reply threads
    • Author
      Replies
      • #59036
        James Cobane
        Participant

          David,

          You may want to consider turning on 'Translation Throttling' for the processes that have the threads that are not cycling.  Translation throttling will give the command thread an opportunity to respond to the 'hcicmd' commands that were issued.  Without 'Translation Throttling' turned on for these high-volume threads, the engine is dedicating all its processing to translation, and is not giving any time to the command thread to respond to the 'hcicmd' commands issued.  You’re probably seeing 'No response within timeout -- assuming process is hung' responses to your "hcicmd -p $Process -c '$Thread save_cycle in'".
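If the timeout response is what the script is hitting, the script can at least detect it and try again instead of moving on silently. The following is a minimal sketch, not a tested Cloverleaf utility: the function name `cycle_with_retry`, the retry count, and the pause are all assumptions, and it matches only on the "No response within timeout" text quoted above, since the exact wording and exit status of `hcicmd` may differ between versions.

```shell
#!/bin/sh
# Sketch only. HCICMD can be overridden so the logic can be exercised
# without a running engine; by default it calls the real hcicmd.
HCICMD=${HCICMD:-hcicmd}

# cycle_with_retry PROCESS THREAD DIRECTION [ATTEMPTS] [PAUSE_SECONDS]
#   DIRECTION is "in" or "out", as in the save_cycle commands above.
#   The body runs in a subshell, so its variables stay local and
#   "exit" ends only the function call, not the calling script.
cycle_with_retry() (
    process=$1 thread=$2 direction=$3
    attempts=${4:-3} pause=${5:-10}
    try=1
    while [ "$try" -le "$attempts" ]; do
        output=$("$HCICMD" -p "$process" -c "$thread save_cycle $direction" 2>&1)
        case $output in
            *"No response within timeout"*)
                # Assumed failure signature, taken from the message quoted above.
                echo "attempt $try: no response from $process/$thread ($direction)" >&2
                ;;
            *)
                printf '%s\n' "$output"
                exit 0
                ;;
        esac
        try=$((try + 1))
        [ "$try" -le "$attempts" ] && sleep "$pause"
    done
    exit 1
)
```

A call such as `cycle_with_retry "$Process" "$Thread" in || echo "$Thread did not cycle" >> cycle.log` would leave a record of exactly which threads never answered.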

          Hope this helps.

          Jim Cobane

          Henry Ford Health

        • #59037
          Mike Grieger
          Participant

            Yes!  I have seen this before as well.  I have seen it happen where threads are named similarly.  hcicyclesavemsgs is a perl script that relies primarily on another perl script, hciconnstatus, to operate.  By default, hciconnstatus returns only the first 15 characters of a thread name, so if 2 threads have names that match through the first 15 characters, the first thread that matches could be cycled multiple times.  (If you dump the output of hcicyclesave to a log, it should show you what is cycling or attempting to cycle, and you would see this.)

            Anyway, we’ve modified our hciconnstatus (bin directory) so that our thread names display and use 28 characters, as we try to use descriptive thread names that typically are more than 15 characters long.

            < hciconnstatus snippet >

            #    FACTORY SETTINGS ARE BELOW, COMMENTED OUT per Mike G

            #    set hfmt "%-15.15s %-15.15s %-10.10s %-15.15s %-20.20s" ;# Heading format

            #    set fmt "%-15.15s %-15.15s %-10.10s %-15.15s %-20.20s" ;# Output line format

            #    echo [format $hfmt Process Connection State "Proto Status" When]

            #    echo "

               set hfmt "%-15.15s %-28.28s %-5.5s %-7.7s %-20.20s" ;# Heading format

               set fmt "%-15.15s %-28.28s %-5.5s %-7.7s %-20.20s" ;# Output line format
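The collision described above can be checked for ahead of time. This is a rough sketch that mimics the `%-15.15s` column by keeping only the first N characters of each name and reporting any truncated value shared by more than one thread; the function names and the example thread names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of Cloverleaf.

# truncate_name NAME [WIDTH]
#   Prints what a %-W.Ws column keeps of NAME, without the padding.
truncate_name() {
    printf "%.${2:-15}s" "$1"
}

# find_collisions WIDTH NAME...
#   Prints each truncated name that more than one NAME maps to.
find_collisions() (
    width=$1
    shift
    for name in "$@"; do
        truncate_name "$name" "$width"
        echo
    done | sort | uniq -d
)
```

With the hypothetical names `ADT_from_registration_in` and `ADT_from_registration_out`, `find_collisions 15` prints `ADT_from_regist`, while `find_collisions 28` prints nothing because the full names differ within 28 characters.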

          • #59038
            David Gordon
            Participant

              I had thought about the hciconnstatus short-name issue, so we are using hciconndump in our script instead.

              I’ll try the translation throttling options and see if that improves matters.

              Thanks.

            • #59039
              Anonymous
              Participant

                David,

                Suggestion:

                Assuming you are on a Unix platform.

                Build the script as follows:

                1. Stop the process

                2. run cycle_save

                3. start the process.

                Schedule a cron job as you wish.

                Test.

                Deploy.

                Hope this helps.

                -Reggie-

              • #59040
                David Gordon
                Participant

                  Cycle/Save won’t run if the process or thread is down though…

                • #59041
                  Jim Kosloskey
                  Participant

                    David,

                    Have you tried running these scripts during a low-volume period of the day? (Most connections have a low-volume period.)

                    If there is no low volume period try running after shutting down the source system’s connection on the source system (not on the engine). This, of course, will force a temporary low volume period.

                    If the scripts still don’t work then maybe xlate throttling is not the answer.

                    Just my suggestion – no solution unfortunately.

                    Jim Kosloskey

                    email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

                  • #59042
                    David Gordon
                    Participant

                      Well, I just revamped the script to log the hell out of everything, so hopefully I’ll have more answers after tonight.

                      Thanks again.

                    • #59043
                      David Gordon
                      Participant

                        And now, all of a sudden, it works.  I’m not sure how, but it does.  Same time of day, no changes to the threads, and it just suddenly works.

                        The only thing I can think of is that switching from a Perl script to a KSH script seems to have done something.

                        Regardless of the reason, thanks for the suggestions!
