High Availability

This topic has 14 replies, 8 voices, and was last updated 19 years, 8 months ago by Rich Durkee.

April 14, 2005 at 3:41 pm #47645
Jeff Thomas
Participant
We are running Cloverleaf on UNIX, and I
- April 14, 2005 at 6:40 pm #56338
  Dan Goodman
  Participant
  I am also interested, especially from the point of view of application level support for fast failover.
  
  We have a tested home-grown solution to rehost our production site onto a backup platform, without a reboot.
  
  We are also looking at HA/XD, but expect that application-level support is required for it to do more for us than we are doing for ourselves now.
  
  We have thoroughly tested our failover (rehosting) mechanism on our test site, and are planning to go live on a new p615 running AIX 5.2L and Platform 5.3 (zero rev?)
- April 14, 2005 at 11:12 pm #56339
  Jesse Pangburn
  Participant
  I am not the one responsible for deploying it, but as I understand it, the guys deploying one of my applications are using Sun Cluster software for high availability. Apparently it is kind of difficult to install Cloverleaf on the cluster though: you have to set it up under a different user than hci and modify all the Perl scripts to use your new user. I’m not sure why hci doesn’t work on the cluster, but according to them, it doesn’t.
  
  They say if you can get Cloverleaf set up under a different user, /home/bob for example, then it can be handled by the cluster to fail over onto different nodes. The biggest problem seems to be fixing the Perl scripts, which have the hci user hardcoded.
  
  I can try to answer more questions on this subject, but I think I’ve explained most of what I know about it.
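
As an illustration of the hardcoded-user problem described above, here is a minimal sketch of how one might list the script lines that mention the hci user before editing them. The file suffixes and the directory you point it at are assumptions, not anything stated in the thread; it only reports matches and changes nothing.

```python
import re
from pathlib import Path


def find_hardcoded_user(root, user="hci", suffixes=(".pl", ".pm")):
    """Return (path, line_number, line) for each line that mentions `user`.

    `suffixes` is an assumed filter for Perl files; adjust it to match
    whatever the installation actually contains.
    """
    # \b keeps "hci" from matching inside longer words such as "hcitest".
    pattern = re.compile(r"\b" + re.escape(user) + r"\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Reviewing the list by hand before changing anything seems safer than a blind search-and-replace, since some matches may be comments or unrelated strings.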
- April 15, 2005 at 12:48 pm #56340
  Jeff Thomas
  Participant
  Thanks guys for your replies. We are currently running with Tru64, and although I
- April 15, 2005 at 1:52 pm #56341
  Dan Goodman
  Participant
  HA/XD is “shorthand” for HACMP/XD, which is the replacement for HA/GEO.
  
  “Back in the day”, IBM Global Services typically would come in and do all the requirements analysis, etc., then put up a canned two node config.
  
  Supposedly, with HA/XD, there is an “express” configuration option for the common two-node solution. (Think: no consulting fees for installation and setup.)
  
  But for us, it *could* be the case that we would eventually have three machines, with intermachine “health check” heartbeats.
  
  With the appropriate failover capability in the app, I would think three machines would be the minimum that could reasonably detect and lock out a failing node automatically.
  
  Here is a link to the IBM Announcement letter for HA/XD
  
  205085 IBM HACMP/XD expands its business continuity solution with improved, simpler geographic-distance data mirroring and disaster recovery (14.5KB)
  
  http://www.ibm.com/isource/cgi-bin/goto?it=usa_annred&on=205-085HA
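
The reasoning about three machines being the minimum can be sketched as a majority vote over heartbeat reports. This is only one way to read the argument, not how HACMP/XD works; the function name and the report format are hypothetical.

```python
def nodes_to_lock_out(reports):
    """Return the nodes that a strict majority of the cluster reports as down.

    `reports` maps each observer to {peer: heartbeat_seen}, for example
    {"a": {"b": True, "c": False}}. This format is an assumption made
    for the sketch.
    """
    nodes = set(reports)
    for seen in reports.values():
        nodes |= set(seen)
    majority = len(nodes) // 2 + 1  # strict majority of the whole cluster

    locked_out = set()
    for node in nodes:
        # A node's opinion about itself does not count.
        down_votes = sum(
            1
            for observer, seen in reports.items()
            if observer != node and seen.get(node) is False
        )
        if down_votes >= majority:
            locked_out.add(node)
    return locked_out
```

With two nodes, each side can only accuse the other, so neither accusation reaches a majority and nothing is locked out automatically. With three nodes, two healthy machines agreeing that the third is silent is enough to decide.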
- April 15, 2005 at 3:03 pm #56342
  Jeff Thomas
  Participant
  Gotcha. Thanks
  
  Jeff
- April 22, 2005 at 10:02 am #56343
  David Harrison
  Participant
  I’m on Solaris 8 with Veritas Cluster Server QuickStart. This comprises two Sun Fire V210 servers with a shared disk array. Cloverleaf sits on the shared array.
  
  I am using HCI cluster scripts to monitor, stop, start and clean the application.
  
  It works a treat.
  
  Dave
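
For readers unfamiliar with the monitor, stop, start and clean actions mentioned above, the following is a rough sketch of how a cluster framework might drive an application through those four entry points. It is not the HCI scripts; the class name, the command table, and the convention that exit code 0 means success are all assumptions.

```python
import subprocess


class AppAgent:
    """Dispatch the four cluster actions to caller-supplied commands."""

    ACTIONS = ("monitor", "start", "stop", "clean")

    def __init__(self, commands, runner=subprocess.call):
        # `commands` maps each action to an argument list (placeholder
        # commands chosen by the caller); `runner` returns an exit code.
        self.commands = commands
        self.runner = runner

    def run(self, action):
        if action not in self.ACTIONS:
            raise ValueError("unknown action: " + action)
        return self.runner(self.commands[action])

    def ensure_running(self):
        """Start the application unless monitor already reports it up."""
        if self.run("monitor") == 0:
            return "already running"
        self.run("clean")  # clear leftovers from a failed instance first
        return "started" if self.run("start") == 0 else "start failed"
```

The cluster software would call something like `ensure_running` on the node taking over, and `run("stop")` on a node being retired in a planned switchover.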
- April 22, 2005 at 12:36 pm #56344
  Jeff Thomas
  Participant
  Thanks David, were these scripts you wrote yourself, or did HCI provide them? I am willing to write them, but just wanted to make sure that I wasn
- April 22, 2005 at 12:46 pm #56345
  David Harrison
  Participant
  The scripts are from HCI and were modified by our supplier. Send me your email address if you want to look at them.
- April 22, 2005 at 2:54 pm #56346
  Rob Abbott
  Keymaster
  Sorry to tell you this, but the HA scripts we provide are an additional software offering, provided for an additional fee.
  
  Rob Abbott
  Cloverleaf Emeritus
- April 26, 2005 at 12:44 pm #56347
  Jeff Thomas
  Participant
  Thanks Rob, what is the collection called? Is there any publicly available documentation? What platforms are supported?
  
  Thanks
  
  Jeff
- April 26, 2005 at 2:53 pm #56348
  Rob Abbott
  Keymaster
  The package is known as the High Availability scripts for Cloverleaf.
  
  We have implemented them on AIX (HACMP), Solaris (Veritas), HP-UX (MC/ServiceGuard), and Windows 2000 (Windows clustering).
  
  We’re also working on Linux clustering but have not yet implemented it in a live environment.
  
  They make Cloverleaf operate in an HA environment. They rely on the particular HA solution to function.
  
  We also provide full HA implementations for AIX/HACMP.
  
  As I say, if you need more detailed information please contact your Customer Relations Manager (CRM).
  
  Thanks!
  
  Rob Abbott
  Cloverleaf Emeritus
- July 14, 2005 at 3:18 pm #56349
  ian.smith
  Participant
  We are running Cloverleaf 3.8.1 on AIX 5.1 and looking into implementing HACMP with the official Quovadx HA scripts.
  
  Would those of you who are currently running this model of HA on AIX be willing to share:
  
  A) The time it takes to failover?
  
  B) Whether or not your standard reboot time increased (if so by how much)?
  
  C) Did your test to production code migration become more complex (if so how)?
  
  Thanks in advance.
  
  -Ian Smith
- October 14, 2005 at 2:00 pm #56350
  Bob Schmid
  Participant
  Running AIX 5.2 and QDX 5.3, with HACMP installed.
  
  We recently tested HA failover to our test box and had a lot of issues owing to recovery database corruption, which prevented the processes from starting on the failover box. QDX has not given us an answer as to how to get around recovery database corruption problems. We had to reinit the recovery database in order to get some of the processes back up.
  
  Also, many of the downstream connections (TCP/IP) needed to be rebooted owing to the abrupt disconnect they experienced.
  
  All in all, the processes attempted to come up within 10 minutes, and some did, but the overall mess took us about 2-3 hours to get back to a semi-normal state, and we lost data.
  
  So yeah, we had an environment to work with,
  
  but no, it ain't clean by any stretch of the imagination.
  
  The HACMP setup was done by Quovadx.
  
  Little to no documentation delivered other than the config files themselves.
  
  Bob
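
The abrupt-disconnect problem described above is often softened on the client side by retrying the connection with a growing delay. Below is a minimal sketch of that idea, assuming a plain TCP client; it is not a Cloverleaf feature, and the function name and default values are hypothetical.

```python
import socket
import time


def connect_with_backoff(host, port, attempts=5, first_delay=1.0,
                         max_delay=30.0, sleep=time.sleep):
    """Try to open a TCP connection, doubling the wait after each failure.

    Returns the connected socket, or re-raises the last OSError once
    `attempts` tries have failed.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return socket.create_connection((host, port), timeout=5)
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay)  # wait before the next try
            delay = min(delay * 2, max_delay)
```

A peer that reconnects this way can ride out a failover lasting a few minutes without a manual restart, although it does nothing about messages lost in flight.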
- October 19, 2005 at 1:26 pm #56351
  Rich Durkee
  Participant
  We were using HACMP, but when we bought our new server we did not install HACMP. Reasons:
  
  1. Expense of the software and having a second server that basically does nothing.
  
  2. It complicates the system design.
  
  3. Have to have someone who knows HACMP.
  
  4. Have to run regular tests – takes planning and downtime.
  
  5. Only failed over once in four years on the old system.
  
  Because the IBM pServers are so reliable these days, it seems that the risk of hardware failure is minimal. We made the system as fault tolerant as possible: dual power supplies, dual cooling fans, dual ethernet adapters, dual fibre channel adapters to our SAN (we do not use internal disk in production), and so on. We have been on this configuration for 1.5 years and have not had a hardware failure.

The forum ‘General’ is closed to new topics and replies.