In addition to what they do at Robert’s place 😆, we also have a couple of layers of disaster preparedness. For our four main clinical interface engine machines, we have High Availability in place. Two of the machines fail over to each other, and the other two fail over to a separate, dedicated fail-to machine.
In the event of a datacenter-wide disaster, we have a Disaster Recovery datacenter located in another city/state. We have two machines in that datacenter which can run the workload of the two groups of machines mentioned above. Each night, cronjobs (we’re on AIX) run on each of the four main machines and tar up the production sites (not including run-time files, of course, like the exec subdirectories), along with various supporting files (a dump of the crontab to a file, the contents of /home/hci, $HCIROOT/contrib, $HCIROOT/java_uccs, and $HCIROOT/usercmds). These tar files are then scp’d down to the DR machines. A couple of hours later, separate cronjobs on the DR machines restore the tar files, creating a copy of our production machine configurations.
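A minimal sketch of the nightly tar-and-ship step, written in Python purely for illustration (the actual cron scripts are not shown in this post). The function names, the host name, and the destination directory are hypothetical, and it assumes run-time files can be recognized by living under a directory named exec:

```python
import os
import tarfile

# Assumption: run-time files live in directories named "exec"; the post gives
# the exec subdirectories as one example of what is left out of the backup.
EXCLUDED_DIR_NAMES = {"exec"}


def skip_runtime(member):
    """Drop an entry (return None) if any path component is an excluded name."""
    parts = member.name.split("/")
    if any(part in EXCLUDED_DIR_NAMES for part in parts):
        return None
    return member


def archive_site(site_dir, tar_path):
    """Tar up site_dir without run-time directories; return the archived names."""
    arcname = os.path.basename(site_dir.rstrip("/"))
    with tarfile.open(tar_path, "w") as tar:
        # Returning None from the filter for a directory also skips its contents.
        tar.add(site_dir, arcname=arcname, filter=skip_runtime)
    with tarfile.open(tar_path) as tar:
        return tar.getnames()


def scp_command(tar_path, dr_host, dest_dir):
    """Build (but do not run) the scp invocation that ships the archive to DR."""
    return ["scp", tar_path, f"{dr_host}:{dest_dir}/"]
```

The list returned by scp_command could be handed to subprocess.run from a cron-driven script. Note that the filter matches on name only, so a plain file that happens to be called exec would be skipped as well.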
From a network standpoint, our inbound interfaces connect to a separate DNS alias for the interface engine machines rather than to the machines directly. In the event of a disaster, this alias can be repointed from the main datacenter IP to the DR machines. This way, we don’t have to have hundreds of sending systems repoint to a different IP/DNS name.
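One way to confirm where the alias currently points, for example after a DR test cutover, is a small resolver check. This is only a sketch: the function name, the alias, and the addresses (taken from the reserved documentation ranges) are all hypothetical stand-ins, not real values from this setup:

```python
import socket


def alias_target(alias, main_ips, dr_ips, resolve=socket.gethostbyname):
    """Report whether the alias resolves to a main-site or a DR-site address."""
    ip = resolve(alias)
    if ip in main_ips:
        return "main"
    if ip in dr_ips:
        return "dr"
    return "unknown"


# Hypothetical addresses; substitute the real main and DR IPs.
MAIN_IPS = {"192.0.2.10"}
DR_IPS = {"198.51.100.10"}
```

Calling alias_target("engine-inbound.example.org", MAIN_IPS, DR_IPS) performs a real DNS lookup; the resolve parameter exists so the check can be exercised without touching DNS.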
We’ve had this configuration in place for a couple of years now, and thankfully have never had to use it. Well, other than for testing, and as an occasional source of files that inadvertently got deleted off the main machine. 😳
HTH,
TIM
Tim Pancost
Trinity Health