Using a SAN

  • #47590
    Mark Brown
    Participant

      I’ve asked here before whether Cloverleaf would work using a SAN instead of a local drive.  What I’m asking now, of anyone who has done this or has considered using a SAN: what were your pros and cons?

      My boss was interested in using a SAN to make an emergency switch over to our test engine should the production one go down for an extended period.  However, it’s been my experience that when the engine goes down for 5 or 6 hours, it’s because the database has been corrupted. That being the case, it really wouldn’t do much good to switch boxes if the database on the SAN is messed up.

      Or am I missing something?  Do any advantages outweigh disadvantages like the one I mentioned?

      • #56189
        Michael Hertel
        Participant

          We’re on AIX.

          Our experience running Cloverleaf on AIX is that it is very stable and can run indefinitely. I was very apprehensive about moving to our SAN because now we depend on hardware outside of the interface team’s control. And I was right.

          We have had to bring the box down on various occasions due to “SAN maintenance”.

          The upside to the SAN is that it is fast, fast, fast.

          And almost unlimited space if need be.

          We’ve never swapped and doubt we would. One reason being that you’d need to use a new license key, etc., etc.

          -mh

        • #56190
          Dan Goodman
          Participant

            We are AIX 5.1L, using SAN in a mirrored config (one local, one SAN drive).

            We *have* lost the SAN copy, including once unplanned, due to SAN maintenance. AIX resyncs the mirror in the background when the SAN is restored. This removes the concern that SAN MTTF (mean time to failure) and MTTR (mean time to recover) might be worse than local hardware’s.

            The database corruption point is a good one, although I still see some interest locally in using the shared SAN disk as a quick recovery method.

            What we have done instead is (1) acquire identical HW platforms (p615s, beefed up, incl. a total of 4 disks); (2) mirror rootvg locally; (3) mirror a separate appvg, one copy local, one on the SAN; (4) retain the 4th drive for future OS upgrades using alt_disk_install; (5) acquire dual HBAs for the primary platform and load an auto-failover driver (Hitachi HDLM).

            (We have license keys, tied to systemID already in place on both platforms.)

            In addition, we replicate our $CLROOT/production directory from our primary platform to our secondary platform nightly. This does not conflict with the secondary platform’s role as a development/test machine, as that work is done in $CLROOT/test.
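A minimal sketch of the nightly replication idea, not the script actually used here. It assumes the secondary platform's copy of the production directory is reachable as an ordinary path (for example over a network mount), which the post does not state. The function name `replicate_tree` and both paths are hypothetical. It copies only files that are missing or changed and never deletes anything on the target.

```python
import os
import shutil


def replicate_tree(src_root, dst_root):
    """Copy missing or changed files from src_root to dst_root.

    Returns the relative paths of the files that were copied.
    Files that exist only under dst_root are left alone.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            src_stat = os.stat(src)
            if os.path.exists(dst):
                dst_stat = os.stat(dst)
                same_size = dst_stat.st_size == src_stat.st_size
                not_older = int(dst_stat.st_mtime) >= int(src_stat.st_mtime)
                if same_size and not_older:
                    continue  # target copy looks current, skip it
            # copy2 preserves the modification time, so the next run can skip this file
            shutil.copy2(src, dst)
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

Run from a nightly scheduler, a call might look like `replicate_tree("/primary/production", "/mnt/secondary/production")`, where both paths are placeholders. Copying while the engine is writing to its database could capture an inconsistent state, so a real job would need to account for that.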

            In the future we may, *in addition to* (not in place of) our current config, add another SAN disk to our primary mirroring, with the idea of fast-failing it, and the app, over to the secondary machine.

            We expect that this will require additional software from QVDX as well, and proof-of-concept of the same.

            Remember that moving the SAN disk from one platform to another does nothing for messages backed up in memory-only queues, so you need to either store *all* messages to disk (correct me if I’m wrong), or have one slick retransmit capability with all your ancillaries, preferably automated, but at a minimum with automatic detection and removal of duplicated transactions.
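The duplicate detection mentioned above can be sketched as follows. This assumes the retransmitted traffic is pipe-delimited HL7 v2 and that the sender's message control ID (MSH-10) is a reliable duplicate key. Neither point is stated in the thread, and the function names are hypothetical.

```python
def control_id(message):
    """Return MSH-10 (message control ID) from a pipe-delimited HL7 v2 message."""
    msh = message.split("\r", 1)[0]
    if not msh.startswith("MSH|"):
        raise ValueError("message does not start with an MSH segment")
    fields = msh.split("|")
    if len(fields) < 10:
        raise ValueError("MSH segment has no control ID field")
    # MSH-1 is the field separator itself, so MSH-10 lands at index 9.
    return fields[9]


def drop_duplicates(messages):
    """Keep the first message seen for each control ID, in arrival order."""
    seen = set()
    unique = []
    for message in messages:
        cid = control_id(message)
        if cid in seen:
            continue  # already delivered once, drop the retransmitted copy
        seen.add(cid)
        unique.append(message)
    return unique
```

An in-memory `set` only covers a single run. A real failover would have to persist the seen IDs somewhere that survives the switch to the other box, which is the same problem as the memory-only queues.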

            Not sure what all of this would buy us, in that we can autofail our production (tcpip, SNA, hostnaming) from the primary to the secondary (all but application times) in under five minutes, without a reboot to either box.

            The actual runtime from command initiation is under fifteen seconds, except for the SNA, whose routing is controlled by our z/OS, which interfaces with SMS/Siemens. That part is more on the order of five minutes.

            Ancillaries with robust socket management seem to pick this up automatically, but there are always a few stragglers that need to bounce their connection…

            We like it.   8)  (Up 597 days since the last in-place OS and app upgrade: 3 SAN outages, one unplanned, all due to SAN maintenance; zero HW errors; zero SW outages at OS level.)

            Dan Goodman

          • #56191
            Mark Brown
            Participant

              I guess I should have mentioned that we’re running QDXi 5.3 Rev 1 on a Windows 2003 server.

              Am I understanding correctly that if the database is on the SAN and the database gets blown away, it could be restored from the SAN?

              I’m trying to get the sites on both our production and test/devl servers the same so we can swap the names and ip addresses and come up on the test box so the hospital doesn’t grind to a halt.  The last time that happened, all that high-level attention was a little uncomfortable.

              So, it sounds like moving to a SAN is a good thing.  How do you get Cloverleaf to look at the SAN instead of its local drive?  Do you have to re-install?

            • #56192
              Rob Abbott
              Keymaster

                On Windows I imagine you will have to re-install as the SAN will show up as a new drive.

                On Unix, you can use symbolic links to point to the new filesystem.

                Rob Abbott
                Cloverleaf Emeritus
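A rough sketch of the symbolic-link approach on Unix: move the existing directory onto the SAN-backed filesystem and leave a link at the old path so existing references keep resolving. The function name and any paths are hypothetical. Stopping the engine before the move is assumed, and nothing here is specific to Cloverleaf.

```python
import os
import shutil


def relocate_with_symlink(old_path, new_path):
    """Move old_path to new_path and leave a symlink at old_path."""
    if os.path.islink(old_path):
        raise ValueError("old_path is already a symbolic link")
    if os.path.exists(new_path):
        raise FileExistsError("new_path already exists")
    # shutil.move copies and then deletes when the target is on another filesystem
    shutil.move(old_path, new_path)
    # The old location now resolves to the relocated directory.
    os.symlink(new_path, old_path)
```

For example, `relocate_with_symlink("/opt/app/site", "/san/app/site")` with placeholder paths. The same result can be had from the shell with `mv` followed by `ln -s`.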

              • #56193
                Robert Milfajt
                Participant

                  We plan on using a SAN to run AIX 5.2 and QDXi 5.3 in a clustered environment.  The reason we chose SAN vs. direct attached storage is that it’s fiber connected, and that it will allow for a geographical clustering in two different data centers, with mirroring of data between two separate SAN boxes.  It also allowed us to put multiple AIX clusters and Windows boxes on the same disk storage subsystem.

                  Robert Milfajt
                  Northwestern Medicine
                  Chicago, IL

                • #56194
                  Wesley Combs
                  Participant

                    We have AIX 5.2 with CL 5.2 on an LPAR on a p650.  We have no local drives for Cloverleaf.  The SAN is mirrored to the other data center 30 miles away.  We are working on another LPAR in the other DC that we can use to bring up the engine in case of downtime.   I have a question for some of you. How do you handle the license key when bringing the app back up on another box?  Will QVDX issue a spare license key for a DR box?

                    Thanks

                  • #56195
                    Dan Goodman
                    Participant

                      I would also like to know the answer to the licensing question when mirroring to a remote server for recovery. We are in the preliminary planning stages for doing this.

                      8)

                  • The forum ‘Operating Systems’ is closed to new topics and replies.