Cloverleaf downtime with messages left in recovery db


  • #53145
    Steve Pringle
    Participant

      When we perform a downtime for maintenance we occasionally have outbound messages left in a recovery db, due to a remote host being offline.  We can save off the messages, delete them from the recovery db, then initialize the db, but trying to resend the messages in order can be tricky.

      We can’t selectively start threads as most of our threads are on autostart, and if you start one thread (where all other threads are down) any other threads in that process that are set to autostart will start.

      Anyone have any workarounds or thoughts on this issue?

      thanks,

      Steve

    Viewing 2 reply threads
      • #76706
        Elisha Gould
        Participant

          Best bet is to copy your NetConfig and change the { AUTOSTART 1 } to { AUTOSTART 0 } for all of the threads.

          They will then only start with a pstart.

          Once you're done, copy the backup NetConfig back.
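A minimal sketch of that edit, assuming the entries look like the { AUTOSTART 1 } form quoted above (spacing in a real NetConfig may differ, so the pattern tolerates extra or missing blanks). The function name and the backup file name are made up for illustration, and nothing here is a Cloverleaf-supplied tool.

```shell
# Hypothetical helper: read a NetConfig copy and write a version in which
# every "{ AUTOSTART 1 }" entry has become "{ AUTOSTART 0 }".
# The source file is left untouched.
disable_autostart() {
    src=$1   # the saved copy of the original NetConfig
    dst=$2   # where to write the edited NetConfig
    sed 's/{[[:space:]]*AUTOSTART[[:space:]]*1[[:space:]]*}/{ AUTOSTART 0 }/g' "$src" > "$dst"
}

# Possible use during a downtime (file names are assumptions):
#   cp NetConfig NetConfig.autostart_backup
#   disable_autostart NetConfig.autostart_backup NetConfig
#   ... start only the threads you need, resend the saved messages ...
#   cp NetConfig.autostart_backup NetConfig
```

Writing to a separate destination keeps the backup intact, so restoring afterwards is a plain copy.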

        • #76707
          Russ Ross
          Participant

            Welcome to the club of those that have learned autostart sometimes gets in the way.

            I have been exactly where you are once upon a time asking the same questions.

            Here are a couple of the answers I came up with.

            – I got away from using autostart in our production environment.

            – I created a method for handling downtime that has been useful for many situations including the one you described.

            Here is a URL with more detail describing my method for handling downtime:

                https://usspvlclovertch2.infor.com/viewtopic.php?t=1621

            In conjunction with no autostart and HACMP, I created a method to dynamically create a snapshot script for each site that would start each site up the way it is at the time of the snapshot.

            Here is a URL that describes in more detail this method:

                https://usspvlclovertch2.infor.com/viewtopic.php?t=4286

            This allows me to go to a site and run site_shutdown.ksh and when ready run site_startup.ksh and magically the site is back the way it was when the snapshot was taken to create site_startup.ksh.

            Now I no longer have the problem of forgetting to add sites to my HACMP scripts because the snapshot takes care of that.

            Now I no longer have any of the autostart problems that get in the way when manually addressing a production issue.

            However, watch out because alerts can get in the way if they are configured to cycle an interface when a Cloverleaf alert goes off.

            This is also why I have a method to toggle off alerts at both the site level and thread level.

            It was surprising to me that there wasn’t an ergonomic way to extract messages from the recovery database in chronological order by default, since that is almost always what is needed.

            There is a switch that can be used to extract them in chronological order, but it is easy to forget to specify it.

            If you type hcidbdump without any args it will list the switches, and here is the one it lists for keeping the messages in chronological order:

            Code:

                         -O time      = Order by ascending date
                                        i = inbound arrival time
                                        o = outbound time
                                        r = recovery time
                                        x = xlate start time

            Here is one example of listing out messages to my_file in chronological order from the recovery database going to an outbound thread called ob_rxtfc_lab:

            Code:

            hcidbdump -r -d ob_rxtfc_lab -O i my_file
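If several outbound threads have messages left behind, the same command could be repeated per thread with a small loop. This is only a sketch: it uses just the switches shown above (-r, -d, -O i), while the function name, the output directory argument and the .msgs suffix are assumptions for illustration.

```shell
# Hypothetical wrapper: list recovery db messages for each given outbound
# thread, ordered by inbound arrival time, one output file per thread.
dump_recovery_in_order() {
    outdir=$1          # directory that receives the listings
    shift
    for thread in "$@"; do
        # Same form as the example above: hcidbdump -r -d <thread> -O i <file>
        hcidbdump -r -d "$thread" -O i "$outdir/$thread.msgs"
    done
}

# Possible use (thread names other than ob_rxtfc_lab are placeholders):
#   dump_recovery_in_order /tmp/downtime ob_rxtfc_lab ob_other_thread
```

Keeping one file per thread makes it easier to resend each thread's messages in their original order.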

            Russ Ross
            RussRoss318@gmail.com

          • #76708
            Steve Pringle
            Participant

              Russ – thanks for the in-depth response, there’s a lot to consider here, and you’ve clearly done lots of work to get this functioning in production.  I think your approach makes sense, and the more threads you have on your engine the more critical it is to have some processes in use that promote automation.  

              Elisha – I like the idea of copying the NetConfig.  I can see us using this until we put a more automated process in place, such as Russ has.

              Clovertech is a great forum!
