DR Configuration

This topic has 2 replies, 3 voices, and was last updated 18 years, 10 months ago by Russ Ross.

May 29, 2007 at 12:18 am #49304
Bill May
Participant
I am looking for someone to share their experiences with Cloverleaf and DR setup.

My main question is: how have people addressed the issue of keeping the DR box in sync with the production box?

Has anyone had any experience with an HA configuration physically split across separate sites?

Cheers,

Bill

- May 29, 2007 at 12:49 pm #61448
  Jeff Thomas
  Participant
  Our solution is more of a DR solution than an HA one. We have two servers each at two “sites” (currently different wings). Our data and software are on the SAN (EMC), which we replicate (SRDF) between the two sites. We also have a job that runs each night and scp’s pieces of our live sites to our test box. That way we have a fairly current setup on our test box, minus the current logs and recovery database… We also have a current copy of the production LUNs at the remote site, which we could manually reassign to the test box in case of an emergency…
  
  Not a perfect solution yet, and it doesn
- May 29, 2007 at 6:49 pm #61449
  Russ Ross
  Participant
  This is a very timely and interesting topic as we have been asking ourselves some of these questions.
  
  I hope to see other responses because what we have contemplated so far has just been amongst a tiny group in-house.
  
  When setting up our new Cloverleaf production HACMP servers 2 years ago using LPARs, I asked about doing wide area fail-over as a means of DR, but that was rejected.
  
  Now the topic of DR is becoming a political hot potato, but the reluctance to do wide area fail-over here still persists.
  
  Until recently, few of us have been able to take these conversations seriously.
  One thing we have learned already is that the DR solution has to be discussed with all departmental silos involved in order to come up with an enterprise DR solution.
  Even though we have learned this, we are still struggling to accomplish enterprise cooperation.
  Another obvious point that some involved tended to ignore until it hit them in the face is that the bandwidth to the DR location will need to be high enough to handle the load.
  The way it looks for us now is that each server deemed crucial enough to justify DR will get an LPAR on a machine in the DR location.
  
  So I will be getting an LPAR for Cloverleaf DR and SAN connections in the DR location but will not be using any wide area fail-over.
  
  My current thinking is to maintain a DR Cloverleaf server on my DR LPAR and keep it mostly in sync with the production Cloverleaf server by routing a copy of all production inbound messages to the DR server.
  
  All the outbound threads on the DR server could be in down time mode or have a tps_message_kill added.
  
  When a DR situation occurs, we would selectively modify the inbound listeners to receive messages from whichever senders are still working, if any, and modify the outbound senders on the DR Cloverleaf LPAR to stop killing messages or saving them to dump files and start sending them to whichever foreign systems are still working, if any.
  
  Since a real DR will likely trash many if not all of the sending systems, this DR approach could turn out to be as fast as a wide area HA solution, especially since a decision has been made not to use wide area HA on any servers that I know of in the hospital.
  
  I have observed a willingness to accept taking several days to recover from a DR situation in exchange for lower costs.
  It does seem to me at first glance that unless all or most of the enterprise systems are using wide area fail-over, Cloverleaf will be waiting day(s) to receive/make a DR connection from/to foreign systems anyway.
  
  Russ Ross
  RussRoss318@gmail.com
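
The warm-standby approach described in the last reply (copy every production inbound message to the DR server, keep the DR outbound threads from delivering, then selectively enable them in a DR event) can be modeled with a minimal sketch. This is plain Python, not Cloverleaf configuration or any Cloverleaf API; every class, function, and thread name below is hypothetical and exists only to illustrate the flow. The KILL mode merely stands in for the effect of a message-kill proc on an outbound thread, and HOLD stands in for saving messages to dump files.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class OutboundMode(Enum):
    KILL = "kill"  # discard the message (stand-in for a message-kill proc)
    HOLD = "hold"  # keep the message aside (stand-in for a dump file)
    SEND = "send"  # deliver to the foreign system


@dataclass
class DrOutboundThread:
    """Hypothetical model of one outbound thread on the DR server."""
    name: str
    mode: OutboundMode = OutboundMode.KILL
    sent: List[str] = field(default_factory=list)
    held: List[str] = field(default_factory=list)
    killed: int = 0

    def handle(self, message: str) -> None:
        if self.mode is OutboundMode.SEND:
            self.sent.append(message)
        elif self.mode is OutboundMode.HOLD:
            self.held.append(message)
        else:
            self.killed += 1  # processed by the DR engine, never delivered


@dataclass
class DrEngine:
    """Hypothetical model of the DR server: outbound threads keyed by name."""
    outbound: Dict[str, DrOutboundThread]

    def receive(self, message: str, destination: str) -> None:
        self.outbound[destination].handle(message)

    def activate(self, reachable: Iterable[str]) -> None:
        # In a DR event, enable only the threads whose foreign systems
        # are known to be reachable; the others keep their current mode.
        for name in reachable:
            self.outbound[name].mode = OutboundMode.SEND


def route_inbound(message: str, destination: str,
                  prod_log: List[Tuple[str, str]], dr: DrEngine) -> None:
    """Normal operation: production handles the message and DR gets a copy."""
    prod_log.append((destination, message))  # production delivery (simulated)
    dr.receive(message, destination)         # copy keeps the DR side exercised
```

During normal operation every message passes through `route_inbound`, so the DR side processes the same traffic while delivering nothing. In a DR event, `activate` would be called with only the destinations known to be reachable, which mirrors the selective enabling described above.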
