High Availability

  • Creator
    Topic
  • #47645
    Jeff Thomas
    Participant

      We are running Cloverleaf on UNIX, and I

    Viewing 13 reply threads
    • Author
      Replies
      • #56338
        Dan Goodman
        Participant

          I am also interested, especially from the point of view of application level support for fast failover.

          We have a tested home-grown solution to rehost our production site onto a backup platform, without a reboot.

          We are also looking at HA/XD, but expect that app-level support is required for it to do more for us than we are doing for ourselves now.

          We have thoroughly tested our failover (rehosting) mechanism on our test site, and are planning to go live on a new p615 running AIX 5.2L and Platform 5.3 (zero rev?)

        • #56339
          Jesse Pangburn
          Participant

            I am not the one responsible for deploying it, but as I understand it, the guys deploying one of my applications are using Sun Cluster software for high availability. Apparently it is kind of difficult to install Cloverleaf on the cluster, though: you have to set it up under a different user than hci and modify all the Perl scripts to use your new user. I’m not sure why hci doesn’t work on the cluster, but according to them, it doesn’t.

            They say that if you can get Cloverleaf set up under a different user (/home/bob, for example), then the cluster can fail it over onto different nodes. The biggest problem seems to be fixing the Perl scripts, which have the hci user hardcoded.

            I can try to answer more questions on this subject, but I think I’ve explained most of what I know about it.

          • #56340
            Jeff Thomas
            Participant

              Thanks guys for your replies.  We are currently running with Tru64, and although I

            • #56341
              Dan Goodman
              Participant

                HA/XD is “shorthand” for HACMP/XD, which is the replacement for HA/GEO.

                “Back in the day”, IBM Global Services typically would come in and do all the requirements analysis, etc., then put up a canned two node config.

                Supposedly, with HA/XD, there is an “express” configuration option for the common two-node solution. (Think: no consulting fees for installation and setup.)

                But for us, it *could* be the case that we would eventually have three machines, with intermachine “health check” heartbeats.

                With the appropriate failover capability in the app, I would think three machines would be the minimum that could reasonably detect and lock out a failing node automatically.
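
                To illustrate why three is the minimum, here is a rough sketch of majority voting over heartbeat reports. It is not how HACMP/XD or any other cluster product decides; the function name `nodes_to_lock_out` and the shape of `reports` are assumptions made up for this example. A node is locked out only when a strict majority of the whole cluster reports that it cannot hear that node's heartbeat.

```python
def nodes_to_lock_out(cluster, reports):
    """Return the set of nodes that a strict majority cannot hear.

    cluster: list of all node names in the cluster.
    reports: dict mapping each reporting node to the set of peers whose
             heartbeat it currently hears. A node that is down simply
             has no entry. (Hypothetical data shape for illustration.)
    """
    quorum = len(cluster) // 2 + 1  # strict majority of the full cluster
    locked = set()
    for target in cluster:
        # Count the other nodes that reported in and do not hear the target.
        votes = sum(
            1
            for observer, heard in reports.items()
            if observer != target and target not in heard
        )
        if votes >= quorum:
            locked.add(target)
    return locked


# Three nodes, C is down: A and B agree, so C can be locked out.
three = nodes_to_lock_out(["A", "B", "C"], {"A": {"B"}, "B": {"A"}})

# Two nodes, B is down: A alone is not a majority, so nothing is decided.
two = nodes_to_lock_out(["A", "B"], {"A": set()})
```

                With two machines, the survivor cannot tell a dead peer from a broken heartbeat link, so no majority ever forms; with three, the two healthy nodes can outvote the failing one.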

                Here is a link to the IBM Announcement letter for HA/XD

                205085 IBM HACMP/XD expands its business continuity solution with improved, simpler geographic-distance data mirroring and disaster recovery (14.5KB)

                http://www.ibm.com/isource/cgi-bin/goto?it=usa_annred&on=205-085HA

              • #56342
                Jeff Thomas
                Participant

                  Gotcha.  Thanks

                  Jeff

                • #56343
                  David Harrison
                  Participant

                    I’m on Solaris 8 with Veritas Cluster Server QuickStart. This comprises two Sun Fire V210 servers with a shared disk array. Cloverleaf sits on the shared array.

                    I am using HCI cluster scripts to monitor, stop, start and clean the application.

                    It works a treat.

                    Dave
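
                    For anyone wondering what "monitor, stop, start and clean" scripts look like in general, here is a minimal sketch of the idea. It is not the HCI scripts and not the Veritas agent interface; the class `AppAgent`, its method names, and the commands passed to it are all hypothetical. The cluster software calls one action at a time and acts on the exit code.

```python
import subprocess


class AppAgent:
    """Hypothetical wrapper exposing the four actions a cluster manager calls."""

    def __init__(self, start_cmd, stop_cmd, status_cmd, kill_cmd,
                 run=subprocess.call):
        # Each command is an argument list; `run` returns its exit code.
        self.cmds = {
            "start": start_cmd,     # bring the application up on this node
            "stop": stop_cmd,       # orderly shutdown before a planned switch
            "monitor": status_cmd,  # exit code 0 is taken to mean "running"
            "clean": kill_cmd,      # forced cleanup after a failed stop
        }
        self.run = run

    def dispatch(self, action):
        """Run the command for one action and return its exit code."""
        if action not in self.cmds:
            raise ValueError("unknown action: %s" % action)
        return self.run(self.cmds[action])

    def is_online(self):
        """True when the monitor command exits with status 0."""
        return self.dispatch("monitor") == 0
```

                    Passing `run` in as a parameter keeps the sketch testable without a real application; in practice the four commands would be whatever site-specific scripts start, stop, check and forcibly clean up the engine.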

                  • #56344
                    Jeff Thomas
                    Participant

                      Thanks David, were these scripts you wrote yourself, or did HCI provide them?  I am willing to write them, but just wanted to make sure that I wasn

                    • #56345
                      David Harrison
                      Participant

                        The scripts are from HCI and were modified by our supplier. Send me your email address if you want to look at them.

                      • #56346
                        Rob Abbott
                        Keymaster

                          Sorry to tell you this, but the HA scripts we provide are an additional software offering, provided for an additional fee.

                          Rob Abbott
                          Cloverleaf Emeritus

                        • #56347
                          Jeff Thomas
                          Participant

                            Thanks, Rob. What is the collection called?  Is there any publicly available documentation?  What platforms are supported?

                            Thanks

                            Jeff

                          • #56348
                            Rob Abbott
                            Keymaster

                              The package is known as the High Availability scripts for Cloverleaf.  

                              We have implemented them on AIX (HACMP), Solaris (Veritas), HP-UX (MC/ServiceGuard), and Windows 2000 (Windows clustering).

                              We’re also working on Linux clustering but have not yet implemented it in a live environment.

                              They make Cloverleaf operate in an HA environment.  They rely on the particular HA solution to function.

                              We also provide full HA implementations for AIX/HACMP.

                              As I say, if you need more detailed information please contact your Customer Relations Manager (CRM).

                              Thanks!

                              Rob Abbott
                              Cloverleaf Emeritus

                            • #56349
                              ian.smith
                              Participant

                                We are running Cloverleaf 3.8.1 on AIX 5.1 and looking into implementing HACMP with the official Quovadx HA scripts.

                                Would those of you who are currently running this model of HA on AIX be willing to share:

                                A) The time it takes to fail over?

                                B) Whether or not your standard reboot time increased (if so, by how much)?

                                C) Did your test-to-production code migration become more complex (if so, how)?

                                Thanks in advance.

                                -Ian Smith

                              • #56350
                                Bob Schmid
                                Participant

                                  Running AIX 5.2 and QDX 5.3, with HACMP installed.

                                  We recently tested HA failover to our test box and had a lot of issues owing to recovery database corruption, which prevented the processes from starting on the failover box. QDX has not given us an answer as to how to get around recovery database corruption problems. We had to reinit the recovery database in order to get some of the processes back up.

                                  Also, many of the downstream connections (TCP/IP) needed to be rebooted owing to the abrupt disconnect they experienced.

                                  All in all, the processes attempted to come up within 10 minutes, and some did, but the overall mess took us about 2-3 hours to get back to a semi-normal state, and we lost data.

                                  So yeah, we had an environment to work with,

                                  but no, it ain't clean by any stretch of the imagination.

                                  The HACMP setup was done by Quovadx.

                                  Little to no documentation delivered other than the config files themselves.

                                  Bob
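
                                  On the abrupt-disconnect problem: one general way a downstream client can ride out a failover without a manual restart is to retry its connection with increasing delays. The sketch below is only an illustration of that idea, not something Cloverleaf or HACMP provides; the function name `connect_with_backoff` and its default values are assumptions.

```python
import socket
import time


def connect_with_backoff(host, port, attempts=5, base_delay=1.0,
                         max_delay=30.0,
                         connect=socket.create_connection,
                         sleep=time.sleep):
    """Try to connect, doubling the wait after each failure.

    Returns the connected socket, or re-raises the last OSError once
    `attempts` tries have failed. All defaults are illustrative.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect((host, port), timeout=5)
        except OSError:
            if attempt == attempts:
                raise  # give up and let the caller decide what to do
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the wait between tries
```

                                  This only covers re-establishing the TCP connection; any messages in flight at the moment of the disconnect still need their own recovery or resend logic.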

                                • #56351
                                  Rich Durkee
                                  Participant

                                    We were using HACMP, but when we bought our new server we did not install HACMP. Reasons:

                                    1. Expense of the software and having a second server that basically does nothing.

                                    2. It complicates the system design.

                                    3. Have to have someone who knows HACMP.

                                    4. Have to run regular tests – takes planning and downtime.

                                    5. Only failed over once in four years on the old system.

                                    Because the IBM pServers are so reliable these days, it seems that the risk of hardware failure is minimal. We made the system as fault tolerant as possible: dual power supplies, dual cooling fans, dual Ethernet adapters, dual Fibre Channel adapters to our SAN (we do not use internal disk in production), and so on. We have been on this configuration for 1.5 years and have not had a hardware failure.
