Inrecoverable socket error


  • #47750
    Janice Criscoe
    Participant

      I have a process that is receiving the following error message. Is anyone familiar with this type of error? Thanks.

      [icl :tcpi:ERR /0:anc_result_cmd] write failed: Broken pipe

      [cmd :cmd :INFO/0:anc_result_cmd] Since there are some error had occured while attempted to send an Ack back to client

      [cmd :cmd :INFO/0:anc_result_cmd] We try to process the command anyway but no further Ack will be send back to Client

      [cmd :cmd :INFO/0:anc_result_cmd] Received command: ‘anc_result_xlate xrel_post’

      [cmd :cmd :INFO/0:anc_result_xlate] Doing ‘xrel_post’ command with args ‘

      [cmd :cmd :INFO/0:anc_result_cmd] Inrecoverable socket error.  Closing connection.

      isACSII: TRUE

      • #56618
        Richard Hart
        Participant

          Janice.

          We have had similar errors and raised, through our vendor, a support call to Quovadx.

          In our case we did not lose data; it just slowed the communication by about 75%!

        • #56619
          Michael Hertel
          Participant

            Here’s an explanation from the archive:

            =========================

            Original Message

            From: Rob Abbott
            Sent: Thursday, June 17, 2004 1:52 PM
            To: Technical Issues
            Subject: [clovertech] Re: broken pipe

            Here’s the explanation:

            hcicmd connects to the command thread via a TCP/IP loopback connection.

            hcicmd does a “resend” or other expensive operation that keeps a thread busy for over 30 seconds.

            hcicmd waits for acknowledgement from the engine that the command has completed.

            hcicmd times out waiting for ack (30 seconds).

            Engine operation completes. Command thread gets control.

            Command thread attempts to send acknowledgement back on socket.

            hcicmd has gone away. O/S returns “broken pipe” on the socket due to the client disconnect.

            This is a non-fatal error. It’s simply that hcicmd has disconnected before the engine has had a chance to ACK. When the engine tries to ACK the error occurs.
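
The sequence above can be reproduced in miniature. The following is only a sketch and not Cloverleaf code: it assumes a Unix-like system, uses a local `socket.socketpair()` as a stand-in for the TCP/IP loopback connection, and shrinks the 30-second timeout to a fraction of a second. The name `run_command` and the result strings are made up for illustration.

```python
import socket
import threading
import time


def run_command(client_timeout, engine_busy):
    """Simulate one command; return (client_result, engine_result)."""
    # Stand-in for the loopback connection between hcicmd and the command thread.
    engine_sock, client_sock = socket.socketpair()
    results = {}

    def engine():
        with engine_sock:
            engine_sock.recv(1024)        # command arrives
            time.sleep(engine_busy)       # thread is busy with the operation
            try:
                engine_sock.sendall(b"ACK")
                results["engine"] = "ack sent"
            except (BrokenPipeError, ConnectionResetError):
                # The client already disconnected, so the write fails.
                results["engine"] = "write failed: Broken pipe"

    def client():
        with client_sock:
            client_sock.settimeout(client_timeout)
            client_sock.sendall(b"xrel_post")
            try:
                client_sock.recv(1024)    # wait for the ack
                results["client"] = "got ack"
            except socket.timeout:
                results["client"] = "timed out"
        # Leaving the with-block closes the socket: the client "goes away".

    threads = [threading.Thread(target=engine), threading.Thread(target=client)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results["client"], results["engine"]


if __name__ == "__main__":
    print(run_command(client_timeout=1.0, engine_busy=0.0))
    print(run_command(client_timeout=0.05, engine_busy=0.3))
```

In the second call the client gives up before the simulated engine finishes, so the engine's late acknowledgement fails even though the work itself completed, which is the same non-fatal pattern described above.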

            You often see this error when starting an engine with a lot of threads. The reason for this is when each thread starts, it sends a message to each engine process letting the engine know “I’m alive, if you have any pending messages for me, please release them” – the command is “xrel_post”.

            Since engines with a lot of threads may take a while to start, the “hcicmd xrel_post” processes will time out, and you’ll see a load of broken pipe errors once the engine fully starts and is able to ack all the xrel_post commands it’s receiving.

            I hope this helps clear things up. The bottom line is that these broken pipe errors are non-fatal and should not require an engine bounce or anything of the sort.
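
To see whether the broken pipe errors really do cluster on particular command threads (for example right after an engine start), one could tally them from the engine log. This is a rough sketch that assumes log lines shaped like the excerpt at the top of this thread, i.e. `[module:submodule:LEVEL/n:thread] message`; the function name `count_broken_pipes` and the regular expression are illustrative guesses at that layout, not a documented format.

```python
import re
from collections import Counter

# Assumed layout, based on the excerpt in this thread:
#   [icl :tcpi:ERR /0:anc_result_cmd] write failed: Broken pipe
LOG_LINE = re.compile(r"^\s*\[(\w+)\s*:(\w+)\s*:(\w+)\s*/(\d+):([^\]]+)\]\s*(.*)$")


def count_broken_pipes(lines):
    """Count 'Broken pipe' messages per thread name."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # blank line or something that is not an engine log line
        thread = match.group(5).strip()
        message = match.group(6)
        if "Broken pipe" in message:
            counts[thread] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[icl :tcpi:ERR /0:anc_result_cmd] write failed: Broken pipe",
        "[cmd :cmd :INFO/0:anc_result_cmd] Inrecoverable socket error.  Closing connection.",
    ]
    for thread, n in count_broken_pipes(sample).items():
        print(thread, n)
```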

            Regards — Rob

            ================

            Also:

            ================

            Date:  Thu, 17 Jun 2004 14:04:43 -0500

            Author:  Rob Abbott <Rob.Abbott@quovadx.com>

            Subject:  Re: broken pipe

            Body:  I neglected to mention that hcicmd is a Perl script. If you want (at your own risk 🙂) to change the 30-second timeout, look for “my $time=30;” at around line 254. Change 30 to whatever integer you wish. 0 (zero) means wait forever.
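
For anyone wrapping their own client around a socket, note that the "0 means wait forever" convention does not carry over to every language. The sketch below is Python, not the hcicmd source, and the helper name `to_socket_timeout` is invented for illustration; it maps such a setting onto Python's convention, where `None` blocks forever and `0` would instead mean non-blocking.

```python
import socket


def to_socket_timeout(configured_seconds):
    """Translate a '0 = wait forever' setting into a Python socket timeout."""
    if configured_seconds < 0:
        raise ValueError("timeout must be zero or a positive integer")
    # None means "block until data arrives"; any positive value is a deadline.
    return None if configured_seconds == 0 else float(configured_seconds)


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.settimeout(to_socket_timeout(30))
    print(a.gettimeout())
    a.settimeout(to_socket_timeout(0))
    print(a.gettimeout())
    a.close()
    b.close()
```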

            — Rob

          • #56620
            Kathy Zwilling
            Participant

              We started experiencing this same error in the last week and it has occurred 3 times now so I am anxious to find a way to “fix” it.   We are on 5.2 rev. 1.

              In our case, the data does stop processing completely, as if the connections in the affected site were frozen.  All connections are showing “up” and “green” but data is neither coming in nor going out.

              It seems to me that the monitor daemon has to be related to this because the site that is affected is “frozen” with no data processing and when I stop the Monitor Daemon the data starts to flow.  Note that it does not wait for me to start the monitor daemon back up.

              This has happened in 3 different sites in the past week so it is not the same site each time.

              Any ideas what might be happening?  The messages I am getting are the same as those listed in the email above.

              Should I implement a cron to cycle the monitor daemon daily to avoid this?

              Thanks for your help!

              Kathy Zwilling

            • #56621
              Richard Hart
              Participant

                Kathy.

                Using ‘cron’ to cycle the monitor daemon would probably help – we don’t use the monitor daemon, so we haven’t seen this issue.

                My post on this topic a few months back was related to communications between Cloverleaf and an application socket listener.  In our case, it happened after about 7 hours of sending the maximum message volume (for the receiving application) through, and it required a thread stop/start to get the communications back on track.  We were never given enough time to prove the issue, but the network guys indicated that the receiving application did not send the TCP ACK back that we were expecting.
