Problems with ZFS file system


  • #51152
    Tim Wanner
    Participant

      I’m looking for other users who are running Cloverleaf 5.5 on the ZFS file system.

      We are running a Sun M5000 with 8 x 2.15 GHz processors and 32 GB of system memory.

      Once we turn on a certain number of interfaces, the system begins using so much CPU that all of the Cloverleaf I/O begins to slow.

      Messages process through the engine but cannot get out, inbound messages cannot get in, and messages queue in the engine.

      If we stop a few processes, that frees enough resources and the messages cross.

      A server this size should be able to handle the load.  We have ~550 interfaces and 425 processes running.

      Any feedback would be greatly appreciated.

      Tim

      • #68962
        Charlie Bursell
        Participant

          Is the file system mounted locally or remotely?  If remote, take a look at how the SAN is configured.  Remember, Cloverleaf is an I/O hog and must be granted enough cycles to keep it going.

          One other thing I find rather strange is the allocation of interfaces to processes.  You state you have about 550 interfaces but 425 processes?

          That is fewer than 2 threads per process.  Is there a valid reason for this?  Even with 8 CPUs, 425 processes is a lot!

          I certainly am not a salesman and never wish to be construed as such, but if you continue to have problems, you may want to talk to your CSR about a Site Survey.
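
The ratio mentioned above can be worked out directly. This is only a sketch of the arithmetic, assuming the approximate figures from the original post and assuming each interface runs as one thread; the variable names are illustrative.

```python
# Approximate figures quoted in the thread; not measured values.
interfaces = 550
processes = 425
cpus = 8

# Assumption: one interface corresponds to one thread.
threads_per_process = interfaces / processes
processes_per_cpu = processes / cpus

print(f"{threads_per_process:.2f} threads per process")
print(f"{processes_per_cpu:.1f} processes per CPU")
```

On these numbers, most processes would be hosting a single thread, which is the allocation being questioned.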

        • #68963
          Joe Halbrook
          Participant

            Hi Charlie.

            Your response regarding granting enough cycles triggered another question.  We’ve been running CL 5.6 Rev 2 on a virtual server (VMware + ESX hosts + EMC Clarion SAN) under Red Hat 5.0.

            Recently, we had an incident where the SAN array on which the Cloverleaf virtual server is stored experienced higher than normal write cache flushes.  Because the ESX host timed out waiting for a write confirmation from the SAN array, the ESX host hosting the virtual server sent SCSI abort commands to the SAN array.  In turn, Red Hat was unable to write to its local drive in a timely manner.  As a protective measure, Red Hat went into read-only mode, which of course brought down all the Cloverleaf processes.

            I was curious if you or any others have heard of / experienced this behavior in a virtual server / SAN environment.  Oddly, none of the other virtual servers (hosting a variety of applications) writing to the same SAN reacted in a similar manner during this incident.

            Thanks, in advance.
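
One way to notice the read-only condition described above is to look for the `ro` option in the mount table. The following is a minimal sketch, assuming a Linux host that exposes `/proc/mounts` in the usual "device mountpoint fstype options dump pass" layout. The function name `read_only_mounts` is hypothetical and is not part of Cloverleaf, Red Hat, or VMware tooling.

```python
def read_only_mounts(mounts_text):
    """Return mount points whose option list contains the 'ro' flag.

    mounts_text is the contents of a /proc/mounts style table.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # Exact match only, so 'errors=remount-ro' is not mistaken for 'ro'.
        if "ro" in options:
            result.append(mount_point)
    return result
```

To use it, pass in the text read from `/proc/mounts` and alert on any mount point that should be writable. Some pseudo file systems may legitimately be mounted read-only, so the result would need filtering for the site in question.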

          • #68964
            Charlie Bursell
            Participant

              Joe:

              I think we had problems similar to this with our Red Hat servers.  I am not sure what the resolution was but Goutham will probably know.  I’ll find out on Monday.

              Send me an e-mail and remind me.

            • #68965
              Joe Halbrook
              Participant

                Thank you, Charlie.

                Any assistance would be appreciated.  We encountered this incident again Friday night (8/29) for the second time in less than two weeks.

                Joe
