Few Messages Lost (misplaced?)


  • #48406
    bob robinson
    Participant

      We received messages from the Lab’s server, and the Lab server received an “ACK” for each one.  However, the messages weren’t in the SMAT files, the recovery database, or the receiving system.  The Quovadx server had been rebooted without first stopping the processes.  When I checked the Quovadx server after I came in, one of the threads was green; however, I had to stop and restart the process to get messages flowing.

      Any idea what might have happened, and what can I do to prevent this from happening again?  Thanks.  bob r.

      • #58553
        Michael Hertel
        Participant

          I see that nobody wanted to touch this one so I’ll give you my two cents.

          Rule #1 – Cloverleaf will never lose/misplace a message.

          Rule #2 – If you have doubts, refer to rule #1.

          Check the process log first. Since the system was rebooted, check the .old version. Usually, if something is going to blow up, you’ll find it there.

          The only time I’ve seen something like what you describe without it being a configuration error is when a disk drive went out from under me.

        • #58554
          Nathan Martin
          Participant

            Here’s one way I believe you can lose a message:  Under certain conditions, stray ACKs can pile up in the message stream, so that the sending system eventually reads an extra ACK as the response to a message that actually failed to be sent or stored properly.  This can happen when the systems ignore ACK message IDs, and it may also involve timeouts, resends, delete-duplicate routines, or connection restarts.
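To illustrate the point about ACK message IDs, here is a minimal Python sketch of the check a sender could make before treating a message as delivered. It assumes standard HL7 v2 pipe-delimited messages, where the ACK's MSA-2 should echo the sent message's MSH-10. The function names (`parse_segments`, `ack_matches`) are made up for this example; this is not Cloverleaf code or a Cloverleaf API.

```python
def parse_segments(message: str) -> dict:
    """Split an HL7 v2 message into segments, keyed by segment ID.

    Only the first occurrence of each segment ID is kept, which is
    enough for MSH and MSA.
    """
    segments = {}
    for line in message.replace("\n", "\r").split("\r"):
        if line:
            fields = line.split("|")
            segments.setdefault(fields[0], fields)
    return segments


def ack_matches(sent: str, ack: str) -> bool:
    """Return True only if the ACK is positive AND refers to this message."""
    msh = parse_segments(sent).get("MSH")
    msa = parse_segments(ack).get("MSA")
    if not msh or not msa or len(msh) < 10 or len(msa) < 3:
        return False
    # MSH-1 is the field separator itself, so MSH-10 sits at list index 9.
    sent_control_id = msh[9]
    ack_code, acked_control_id = msa[1], msa[2]
    return ack_code in ("AA", "CA") and acked_control_id == sent_control_id


if __name__ == "__main__":
    sent = r"MSH|^~\&|LAB|HOSP|ENGINE|HOSP|200601010101||ORU^R01|MSG002|P|2.3"
    stray = "\r".join([
        r"MSH|^~\&|ENGINE|HOSP|LAB|HOSP|200601010101||ACK|A1|P|2.3",
        "MSA|AA|MSG001",  # leftover ACK for an earlier message
    ])
    print(ack_matches(sent, stray))
```

A sender that only looks for "AA" would accept the stray ACK above and move on; comparing the control IDs lets it discard the stray reply and keep waiting or resend.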

            I also suspect problems with network connectivity.  Some systems seem to send one last message before realizing that the connection has been dropped (especially when VPNs are involved).

            Alternatively, you could have received and ACKed a message that was then lost through a “configuration error”, like Michael said.  Are you using the recovery database all the way through?  Even on newly created messages?  Is there any proc that could KILL a message when it should have been CONTINUED?

            It could be missing from SMAT because someone was viewing the SMAT files while the thread was starting up, or drive space could have been low enough that the threads automatically stopped saving.  Or, if the machine was power cycled, I imagine the SMAT file buffers could have been lost.  (I don’t know if SMAT files are synced on write or not.)
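For anyone unsure what "synced on write" means in practice, here is a generic Python sketch of the difference. It says nothing about how Cloverleaf actually writes SMAT files (that remains unknown here); `append_durably` is a hypothetical helper that shows the flush-then-fsync pattern a program needs if its records must survive a power cycle.

```python
import os


def append_durably(path: str, record: bytes) -> None:
    """Append a record and force it to disk before returning."""
    with open(path, "ab") as f:
        f.write(record)
        # flush() moves data from the program's buffer to the OS cache;
        # a crash of the program alone would no longer lose it.
        f.flush()
        # fsync() asks the OS to push its cached data to the storage
        # device; without it, a power cycle can still lose the record.
        os.fsync(f.fileno())
```

A writer that skips the `fsync` step is faster, but anything still sitting in buffers at the moment of an unclean reboot may never reach the file.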
