Cloverleaf downtime with messages left in recovery db

This topic has 3 replies, 3 voices, and was last updated 13 years, 1 month ago by Steve Pringle.

Creator

Topic
June 13, 2012 at 10:30 pm #53145
Steve Pringle
Participant
When we perform a downtime for maintenance we occasionally have outbound messages left in a recovery db, due to a remote host being offline. We can save off the messages, delete them from the recovery db, then initialize the db, but trying to resend the messages in order can be tricky.

We can’t selectively start threads as most of our threads are on autostart, and if you start one thread (where all other threads are down) any other threads in that process that are set to autostart will start.

Does anyone have any workarounds or thoughts on this issue?

thanks,

Steve

Viewing 2 reply threads

Author

Replies
- June 14, 2012 at 3:51 am #76706
  Elisha Gould
  Participant
  Best bet is to copy your NetConfig and change the { AUTOSTART 1 } to { AUTOSTART 0 } for all of the threads.
  
  They will then only start with a pstart.
  
  Once you're done, copy the backup NetConfig back.
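
  The AUTOSTART edit described above could be scripted roughly as follows. This is only a sketch: it assumes the flag appears literally as `{ AUTOSTART 1 }` in the NetConfig text, as quoted in this reply, and the function name and sample text are made up for illustration.

  ```python
  import re

  # Matches the literal form quoted in the reply: { AUTOSTART 1 }
  # (whitespace inside the braces is allowed to vary; that is an assumption).
  AUTOSTART_ON = re.compile(r"\{\s*AUTOSTART\s+1\s*\}")


  def disable_autostart(netconfig_text):
      """Return (new_text, count) with every { AUTOSTART 1 } set to { AUTOSTART 0 }."""
      return AUTOSTART_ON.subn("{ AUTOSTART 0 }", netconfig_text)


  # Hypothetical fragment for illustration only; a real NetConfig holds much more.
  sample = "thread_a { AUTOSTART 1 }\nthread_b { AUTOSTART 0 }\n"
  patched, changed = disable_autostart(sample)
  print(changed)  # number of threads whose autostart was switched off
  ```

  In practice you would copy the NetConfig to a backup first (for example with `shutil.copy2`), write the patched text over the live file, and restore the backup after the messages have been resent.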
- June 14, 2012 at 2:23 pm #76707
  Russ Ross
  Participant
  Welcome to the club of those that have learned autostart sometimes gets in the way.
  
  I have been exactly where you are once upon a time asking the same questions.
  
  Here are a couple of the answers I came up with.
  
  – I got away from using autostart in our production environment.
  
  – I created a method for handling downtime that has been useful for many situations including the one you described.
  
  Here is a URL with more detail describing my method for handling downtime:
  
  https://usspvlclovertch2.infor.com/viewtopic.php?t=1621
  
  In conjunction with no autostart and HACMP, I created a method to dynamically create a snapshot script for each site that starts the site up the way it was at the time of the snapshot.
  
  Here is a URL that describes in more detail this method:
  
  https://usspvlclovertch2.infor.com/viewtopic.php?t=4286
  
  This allows me to go to a site and run site_shutdown.ksh and when ready run site_startup.ksh and magically the site is back the way it was when the snapshot was taken to create site_startup.ksh.
  
  Now I no longer have the problem of forgetting to add sites to my HACMP scripts because the snapshot takes care of that.
  
  Now I no longer have any of the autostart problems that get in the way when manually addressing a production issue.
  
  However, watch out because alerts can get in the way if they are configured to cycle an interface when a Cloverleaf alert goes off.
  
  This is also why I have a method to toggle off alerts at both the site level and thread level.
  
  It was surprising to me that there wasn’t an ergonomic way to extract messages from the recovery database in chronological order by default, since that is almost always what is needed.
  
  There is a switch that extracts them in chronological order, but it is easy to forget to specify it.
  
  If you type hcidbdump without any args it will list the switches, and here is the one that it lists for keeping the messages in chronological order:
  
  Code:
  -O time = Order by ascending date
     i = inbound arrival time
     o = outbound time
     r = recovery time
     x = xlate start time
  
  Here is one example of listing out messages to my_file in chronological order from the recovery database going to an outbound thread called ob_rxtfc_lab:
  
  Code: hcidbdump -r -d ob_rxtfc_lab -O i my_file
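
  Since the ordering switch is easy to leave off, one option is to wrap the command so that `-O` is always supplied. The sketch below only assembles the argument list from the example above; the function name and the validation are illustrative assumptions, and it does not run hcidbdump itself.

  ```python
  # Order codes as listed in the hcidbdump usage text quoted above.
  ORDER_KEYS = {
      "i": "inbound arrival time",
      "o": "outbound time",
      "r": "recovery time",
      "x": "xlate start time",
  }


  def build_dump_command(dest_thread, out_file, order="i"):
      """Build the hcidbdump argument list with the -O switch always present."""
      if order not in ORDER_KEYS:
          raise ValueError("unknown order key: " + repr(order))
      # Same switches and positions as the example command in this reply.
      return ["hcidbdump", "-r", "-d", dest_thread, "-O", order, out_file]


  print(" ".join(build_dump_command("ob_rxtfc_lab", "my_file")))
  # prints: hcidbdump -r -d ob_rxtfc_lab -O i my_file
  ```

  The resulting list could be handed to `subprocess.run` on the engine host, which keeps the chronological ordering from depending on someone remembering the switch.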
  
  Russ Ross
  RussRoss318@gmail.com
- June 14, 2012 at 10:08 pm #76708
  Steve Pringle
  Participant
  Russ – thanks for the in-depth response, there’s a lot to consider here, and you’ve clearly done lots of work to get this functioning in production. I think your approach makes sense, and the more threads you have on your engine the more critical it is to have some processes in use that promote automation.
  
  Elisha – I like the idea of copying the NetConfig. I can see us using this until we put a more automated process in place, such as Russ has.
  
  Clovertech is a great forum!

The forum ‘General’ is closed to new topics and replies.