HA scripts and how long they take


  • #48664
    Anonymous
    Participant

    We did a test failover on the engine last night. We are running AIX in an Active/Active configuration. We have a large site that takes several minutes to shut down. It all works; it just takes a long time.

    The scripts run in serial mode: they shut site1 down process by process, then site2 (process1, process2, process3), and so on.

    Has anyone changed this so that the scripts shut down or start up all processes in parallel?

    What did your script do about processes that did not respond quickly enough?

    • #59323
      David Caragay
      Participant

      We had the same problem: our fail-over shutdown scripts ran way too long. We now run them in parallel and it has made a big difference. No problems.
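
      As an illustration of the parallel approach with a per-process time limit, here is a minimal Python sketch. It is not the vendor's script or anyone's production code. The names `run_one` and `run_parallel` are made up for this example, and the stop command is assumed to be some site-specific wrapper you already have. Each command gets its own timeout, so one slow process is reported as "timeout" instead of holding up the rest.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(argv, timeout):
    """Run one command and report 'ok', 'failed', or 'timeout'."""
    try:
        # subprocess.run kills the direct child itself when the timeout expires.
        result = subprocess.run(argv, timeout=timeout,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return "timeout"
    except OSError:
        return "failed"  # for example, the command was not found
    return "ok" if result.returncode == 0 else "failed"


def run_parallel(commands, timeout, max_workers=8):
    """commands maps a label to an argv list; returns label -> status."""
    if not commands:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {label: pool.submit(run_one, argv, timeout)
                   for label, argv in commands.items()}
        return {label: fut.result() for label, fut in futures.items()}


if __name__ == "__main__":
    # Hypothetical wrapper script and site/process names, for illustration only.
    commands = {
        "site1/process1": ["./stop_process.sh", "site1", "process1"],
        "site2/process1": ["./stop_process.sh", "site2", "process1"],
    }
    for label, status in run_parallel(commands, timeout=60).items():
        print(label, status)
```

      Anything reported as "timeout" or "failed" can then be handled separately (retried, or force-killed), which is one possible answer to the question about processes that do not respond quickly enough. Note that the timeout only kills the command that was launched, not any children it may have started.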

    • #59324
      Todd Lundstedt
      Participant

      Cloverleaf has newly developed scripts that run site shutdown and startup in parallel.  However, I believe there are some version restrictions involved, because their shutdown simulates a fatal crash of the server (essentially, they rip the process info out from under the process).

      I haven’t been able to test them thoroughly yet, so I am not entirely convinced they work as expected.

    • #59325
      Richard Hart
      Participant

      We run our own HA scripts that run in parallel. The only issue we have had was on the test server, where we ran out of processes because our start scripts perform a ‘crash recovery’ on a site before starting it.

    • #59326
      Jonathan Hamilton
      Participant

      To help eliminate some of our HA timing issues we have begun using the new Quovadx HA 2.1 scripts.  Currently, we run an Active-Passive setup with 7 pairs of servers across our enterprise.

      When we first received the scripts and I reviewed their functionality, I was very concerned with the approach used, but after extensive testing in QA the scripts proved themselves.  As Todd mentioned, the new scripts work in parallel to remove each process’s control files (pid and cmd_port), then kill all HCI processes on the box.

      With the old HA scripts our servers took anywhere from 10 to 30 minutes to shut down, depending on the complexity of the environment.  Now the shutdown is complete in less than 3 minutes.  Startup typically took 10 to 30 minutes because sites were started serially and our servers are severely overloaded.  Now the startup script is complete in less than a minute.

      In our production environment we have performed at least one HA fail-over per pair with no issues related to the HA scripts.

      Environment:

       7 Production HA Pairs

       320 Sites

       ~10K Threads

       3.4-3.7 Million Messages/Day

    • #59327
      Mark Gathers
      Participant

      I read this post, and we have experienced the same problem taking the engine down.  The serial process takes too long.  We don’t have HA installed, but I’m wondering whether a faster method is available for taking the engine sites/processes down.  Any help would be appreciated.

      Regards,

      Mark Gathers

      Phone: 304.598.4000 x75615

      Email:  gathersm@wvuh.com

    • #59328
      Jonathan Hamilton
      Participant

      Technically, the HA scripts will work for a non-HA environment.

      For a really quick shutdown, remove the pid and cmd_port files of the running processes, then kill all of the hci-owned PIDs.  Sounds scary, but that is what the HA scripts do.  You should probably be on a newer release of Cloverleaf (>v5.3) to try this; I doubt the older versions will like it too much.
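
      The two steps described above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the HA scripts themselves: the helper names are invented for the example, the only things taken from the thread are the file names `pid` and `cmd_port`, and where those files live under a site root is assumed to be discoverable by a recursive search. How you collect the list of hci-owned PIDs is left to the caller. The removal step defaults to a dry run so you can review the list first.

```python
import os
import signal
from pathlib import Path

# File names taken from the post above; their location is an assumption.
CONTROL_FILE_NAMES = {"pid", "cmd_port"}


def find_control_files(site_root):
    """Return every file under site_root named 'pid' or 'cmd_port'."""
    root = Path(site_root)
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.name in CONTROL_FILE_NAMES)


def remove_control_files(site_root, dry_run=True):
    """List the control files and delete them unless dry_run is set."""
    found = find_control_files(site_root)
    if not dry_run:
        for path in found:
            path.unlink()
    return found


def kill_pids(pids):
    """Send SIGKILL to each PID; return the PIDs that were already gone."""
    missing = []
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            missing.append(pid)
    return missing
```

      Running `remove_control_files(root)` first with the default `dry_run=True` only returns the paths it would delete, which seems a sensible check before trying this on anything other than a test box.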
