Panic caused threads shutdown – how to recover


  • Creator
    Topic
  • #49044
    Thao Tran
    Participant

      Hello,

      We were exporting some messages from EMR to another application. One message caused a panic and killed the threads and the site. We know which message it is. We cannot bring the site and threads back up because the engine will try to process the same message and shut down the site again. We would like to get rid of that message while retaining the other messages in the memory db. Is there a clean way to do this? Any help is greatly appreciated.

      Thank you

      281-204-1524

    Viewing 5 reply threads
    • Author
      Replies
      • #60540
        Jeremy Goslin
        Participant

          Here is how we would handle this at our facility:

          1) hcidbdump -r -d

          That will give you a list of all messages that are backed up for that particular thread.  Theoretically the message causing the error should be the first message in the queue, but that is not necessarily the case.  This command searches the recovery database based on the destination of the message.

          The numbers that scroll by will look something like this:

          07:21:52   0.0.94728084

          You only want the 94728084 which is the message number.

          3.  Now view the message to see if it is the one causing you problems.

          hcidbdump -r -m 94728084 -L -c

          4.  If that message is the one to delete then run this command:

          hcidbdump -r -m 94728084 -D save-dump

          The -D flag deletes the message, and save-dump at the end is the name of the file where the message is saved.

          5.  Now restart the process and all of the threads.  Check the list and see whether the messages are now crossing.

          Then we’d look into that particular message (now saved @ save-dump).  If later you decide to resend it you can do so by:

          hcicmd -p -c “ resend

          This has been our policy for QDX 3.4.1 and 5.4 rev 1 on AIX 5.3.
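
The view and delete steps above can be sketched as a small wrapper. This is only an illustration: the function names (`run`, `view_msg`, `delete_msg`) and the `DRY_RUN` variable are made up for this example, and the only hcidbdump flags used are the ones quoted in this post. By default it prints the commands instead of running them, so each one can be checked before anything is removed from the recovery database.

```shell
#!/bin/sh
# Sketch only. Prints the hcidbdump commands from steps 3 and 4 unless
# DRY_RUN=0 is set. All function names here are hypothetical.

DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 3: view a message in the recovery database by message number.
view_msg() {
    run hcidbdump -r -m "$1" -L -c
}

# Step 4: delete the message; the second argument is the file name
# where the message is saved (as described in the post).
delete_msg() {
    run hcidbdump -r -m "$1" -D "$2"
}

# Example with the message number from this post (dry run by default).
view_msg 94728084
delete_msg 94728084 save-dump
```

Run it once in dry-run mode, confirm the message number against the output of the view command, and only then set `DRY_RUN=0` in an environment where hcidbdump is on the path.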

        • #60541
          Michael Lacriola
          Participant

            The message that you are trying to reprocess for the thread will be at the top provided that you run the hcidbdump command correctly.

            hcidbdump -r -d -O o

            This will sort it by outbound arrival time. Those round things in the command are not zeros; they are the letter “oh.”

            Also, pay close attention to any of the messages that are state 14. We have experienced that if you bring down a thread or engine process that contains a state 14 message in recovery db, that engine process will not start.

          • #60542
            Russ Ross
            Participant

              I agree that state 14 messages are one of the first things I look at if an interface gets clogged up.

              Attached is a screenshot that illustrates how to display any state 14 messages and delete just the desired ones using the message ID.

              Russ Ross
              RussRoss318@gmail.com

            • #60543
              Michael Lacriola
              Participant

                Hi Russ!!

                Long time … a quick tip: you do not need to include the 0.0 portion of the message ID. Just the last sequence will get you through. Also, specifying the -s flag on the hcidbdump command will show state 14 messages more clearly.

                hcidbdump -r -s 14

                This will display the messages accordingly. You know the rest. You probably knew this too, but it may help the others.
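
Dropping the 0.0 portion of a message ID can be done mechanically. The sketch below assumes IDs of the form `0.0.94728084` and listing lines of the form `07:21:52   0.0.94728084`, as quoted earlier in this thread. Real hcidbdump output may carry more columns, so treat the awk part as a starting point. The helper names `msg_seq` and `list_seqs` are hypothetical.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the input format is
# assumed from the sample line quoted in this thread.

# Keep only the part after the last dot of a message ID.
# An ID with no dots is returned unchanged.
msg_seq() {
    printf '%s\n' "${1##*.}"
}

# Read listing lines on stdin and print the trailing sequence number
# of the last field on each non-empty line.
list_seqs() {
    awk 'NF > 0 { n = split($NF, parts, "."); print parts[n] }'
}

# Example with the sample values from this thread.
msg_seq 0.0.94728084
printf '07:21:52   0.0.94728084\n' | list_seqs
```

Both example calls print the bare message number, which is the value passed to `-m`.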

              • #60544
                Charlie Bursell
                Participant

                  Another quick tip 🙂

                  If you have to delete State 14 messages, you have something else wrong, especially if you have more than one in State 14.  Don’t simply remove the message.  Figure out why.  If there are multiple State 14 messages on the same connection, most likely recover_33 is not properly configured.

                • #60545
                  Thao Tran
                  Participant

                    Thank you all for your responses. They absolutely helped.

                    Thao
