We were in a similar scenario. Unless you have a third machine or a second cluster (for example, a production cluster plus a separate Test server), the only way to test Cloverleaf on the new OS is to go without failover capability for a while. I don’t know what OS you are on; with AIX, the high availability function is very sensitive to changes and most likely will not work with two different versions of the OS.
You would disable the automated failover functionality and prepare for a fully manual failover in case the need arises: have all the commands ready, etc. (I’m not an expert, so that ‘etc.’ covers a lot of ground; ensuring that you still have the proper virtual IP address is critical.) Luckily your hardware, network adapters, and storage are not changing, which is a big help.
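As one small piece of that preparation, here is a rough Python sketch for checking whether the virtual IP is currently assigned to the server you are logged in to. It is not a Cloverleaf or cluster tool: the function name `holds_address` and the address `192.0.2.10` are placeholders I made up. It relies on the fact that binding a socket to an address normally fails unless that address is configured on the local host; some systems can be configured to allow non-local binds, in which case this check would not be reliable.

```python
import socket


def holds_address(ip: str) -> bool:
    """Return True if this host can bind a socket to `ip`,
    which normally means the address is assigned to a local interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, 0))  # port 0 lets the OS pick any free port
        return True
    except OSError:
        # Typically "cannot assign requested address" when the IP is not local
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    VIRTUAL_IP = "192.0.2.10"  # hypothetical placeholder; use your cluster's virtual IP
    if holds_address(VIRTUAL_IP):
        print(f"{VIRTUAL_IP} is assigned to this server")
    else:
        print(f"{VIRTUAL_IP} is NOT assigned to this server")
```

Running it on both servers after a manual move should show the virtual IP on exactly one of them.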
If you just have the cluster of two machines, you would run from the primary server while the secondary is upgraded. Then manually shut down Cloverleaf, ‘move it’ to the secondary server, and upgrade the primary server. Move Cloverleaf back to the primary server as time permits. When we fail over automatically, the overall downtime to the hospital is about five minutes. With a manual failover, your downtime could be triple that, so you want to get this part right. It may be worth ‘practicing’ beforehand: manually fail over, verify Cloverleaf, and fail back, just to be sure everything is good. Then you have less troubleshooting to do if you encounter problems during the OS change.
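To make the order of operations explicit, the sequence above can be written down as a simple checklist. This is only a sketch in Python: the function names and the host names `nodeA` and `nodeB` are made up, no real cluster or Cloverleaf commands are run, and the downtime figure just multiplies our five-minute experience by three.

```python
from typing import List


def rolling_upgrade_plan(primary: str, secondary: str) -> List[str]:
    """Return the upgrade steps in order, as plain text for a runbook."""
    return [
        f"Disable automated failover; keep Cloverleaf running on {primary}",
        f"Upgrade the OS on {secondary}",
        f"Manually shut down Cloverleaf on {primary} and bring it up on {secondary}",
        f"Verify Cloverleaf on {secondary}",
        f"Upgrade the OS on {primary}",
        f"Move Cloverleaf back to {primary} as time permits",
        "Reconfigure high availability and test failover in both directions",
    ]


def manual_downtime_estimate(automated_minutes: float = 5, factor: float = 3) -> float:
    """Rough planning number: manual failover could take `factor` times as long."""
    return automated_minutes * factor


if __name__ == "__main__":
    for number, step in enumerate(rolling_upgrade_plan("nodeA", "nodeB"), start=1):
        print(f"{number}. {step}")
    print(f"Plan for about {manual_downtime_estimate()} minutes of downtime per manual move")
```

Each manual move (to the secondary, then back to the primary) is its own downtime window, so budget for the estimate twice.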
Once both servers are on the new OS, you can reconfigure your high availability setup. If my memory serves me right, you can do this while Cloverleaf is running, but please confirm this.
I would seriously consider upgrading your high availability software, if it is separate software like it is on AIX, during this process. Depending on the OS level, you may be required to do this anyway.
Part of the project will be to verify the high availability failover by failing over and back, so prepare for downtime.
Peter Heggie
PeterHeggie@crouse.org