Messages routed to wrong thread on different site


  • Creator
    Topic
  • #49655
    Bill Bertera
    Participant

    We encountered something very weird over the weekend with two of our Cloverleaf sites (5.2.1 on Solaris). Here’s a description of the interfaces involved and what happened:

    Site A has processes A1 & A2.

    Process A1 has thread 1 that sends messages to thread 2 in Process A2.

    Site B has processes B1 & B2.

    Process B1 has thread 3 that sends messages to thread 4 in Process B2.

    1. Saturday morning: Process A2 panicked and shut down – the log file shows no reason. A2 was not started back up until Monday morning.

    2. For the next 24-30 hours, about 80 messages queued up in the recovery DB of process A1, waiting for A2 to come back up.

    3. Sunday afternoon: thread 4 in Process B2 received those 80 messages. Those messages never went through Process B1 or thread 3. The messages were Xlated with the route in process A1, but went through the OB TPS of thread 4.

    4. Those 80 messages stayed in Site A’s recovery DB until Monday morning, when Process A2 was restarted and A1 was cycled.

    Basically, it looks like the messages went through A1’s Xlate, but were sent to the wrong OB Pre-TPS queue – of the interface in a completely different site.

    The log files do not have any “resend” commands, and there was no one working who would have resent with Smat, or dumped from RDB and sent to the other thread.

    The log file of Process B2 shows the messages being sent out, and their metadata looks like they came directly from Process A1. The source & destination threads are 1 & 2, NOT 3 or 4.

    Process B2 logs this error for each message, I assume because it’s trying to delete from the RDB a message that was never there:

    11/25/2007 17:46:36

    [dbi :dbi :ERR /0:23788_ob_23res] [0.0.27222925] dbiWriteLogMsg: mid doesn’t exist

    11/25/2007 17:46:36

    [dbi :dbi :ERR /0:23788_ob_23res] [0.0.27222925] dbiWriteLogMsg: mid doesn’t exist

    11/25/2007 17:46:36

    [dbi :dbi :WARN/0:23788_ob_23res] [0.0.27222925] Requested to delete non-existent mid
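For anyone chasing the same symptom, a small script can pull these entries out of a process log and count them per thread and message ID. This is only a sketch: the regular expression is inferred from the three log lines quoted above, not from any documented Cloverleaf log format, and the function name `missing_mid_counts` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the excerpt above (an assumption, not a documented format):
# [dbi :dbi :ERR /0:23788_ob_23res] [0.0.27222925] dbiWriteLogMsg: mid doesn't exist
LINE_RE = re.compile(
    r"\[\s*(?P<module>\w+)\s*:\s*(?P<sub>\w+)\s*:\s*(?P<level>[A-Z]+)\s*/\s*\d+\s*:"
    r"\s*(?P<thread>[^\]]+)\]\s*\[(?P<mid>[\d.]+)\]\s*(?P<text>.*)"
)


def missing_mid_counts(lines):
    """Count 'mid doesn't exist' / 'non-existent mid' entries per (thread, mid)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # timestamps and unrelated lines are skipped
        text = match.group("text").lower()
        # Match on a prefix so both straight and curly apostrophes are accepted.
        if "mid doesn" in text or "non-existent mid" in text:
            counts[(match.group("thread").strip(), match.group("mid"))] += 1
    return counts
```

Feeding it the lines of a process log (for example `missing_mid_counts(open(path, encoding="utf-8"))`, with `path` pointing at the log file) gives a quick list of which message IDs the engine tried to delete without ever having stored them.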

    Here’s where it gets interesting: the message ID is from the MID number wheel of Site B, but the OriginalMID is from the number wheel range of Site A.
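The MID comparison described above can be automated once each site's number wheel range is known. The sketch below assumes that the last dot-separated field of a MID (e.g. `27222925` in `0.0.27222925`) is the counter, and that each site owns a contiguous range; the ranges in the usage note are hypothetical placeholders, not the real ones from these sites.

```python
def site_for_mid(mid, ranges):
    """Return the site whose range contains the MID's counter, or None.

    Assumes the counter is the last dot-separated field of the MID.
    """
    counter = int(mid.split(".")[-1])
    for site, (low, high) in ranges.items():
        if low <= counter <= high:
            return site
    return None


def is_cross_site(mid, original_mid, ranges):
    """True when the MID and the OriginalMID fall in different sites' ranges."""
    current = site_for_mid(mid, ranges)
    origin = site_for_mid(original_mid, ranges)
    return current is not None and origin is not None and current != origin
```

With made-up ranges such as `{"siteA": (10_000_000, 19_999_999), "siteB": (20_000_000, 29_999_999)}`, a message whose MID is `0.0.27222925` and whose OriginalMID is `0.0.15000000` would be flagged as having crossed sites.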

    Has anyone ever seen anything like this? All signs point to the ICL thread as the culprit, but there’s no real way to diagnose that. Any other suggestions of where to look?

    EDIT: discovered the panic on Saturday was caused by a different thread in the process – unrelated to any of these routes or threads.

  • Author
    Replies
    • #62887
      Tom Rioux
      Participant

      This may be way too simple, but it sounds like you have the same port number on all the threads.  If that is the case and you have it set to localhost, then the scenario you mentioned sounds like something that can happen under those circumstances.   Can we get some more information about your set up?

      Thanks…

      Tom Rioux

    • #62888
      Bill Bertera
      Participant

      Thomas Rioux wrote:

      This may be way too simple, but it sounds like you have the same port number on all the threads.  If that is the case and you have it set to localhost, then the scenario you mentioned sounds like something that can happen under those circumstances.   Can we get some more information about your set up?

      Thanks…

      Tom Rioux

      It wouldn’t be a TCP confusion, that would have shown up in the smat files.

      Thread 1 sends to thread 2 through a route, and the same goes for threads 3 & 4.

    • #62889
      Terence Gucwa
      Participant

      Did anyone find an answer to this?  We’ve had the same problem – messages getting routed to outbound threads in the wrong site.  But this is in Cloverleaf 5.7.2, AIX 5.3.  The messages sit in the wrong recovery database because the outbound threads don’t exist in that site.

    • #62890
      Bob Richardson
      Participant

      Greetings,

      If I read your post correctly, you are running an old Cloverleaf version, 5.2.1?  I seem to recall a behavioral trait of the Integrator that if threads have names greater than 15 characters and the first 15 characters are the same, then the engine can route messages to the wrong thread or not at all – they get lost in a bit bucket.  There will be no process log entries to indicate that problem.

      Whether or not this remains true with the latest Integrator (we are running 5.8.6.0 on AIX 6.1 TL7SP4) is uncertain.  We just make sure that long thread names are unique for the first 15 characters.
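A check like this is easy to script. The sketch below groups thread names by their first 15 characters and reports any prefix shared by more than one name. It takes a plain list of names, since how you extract them from your own configuration is site-specific; the function name `prefix_collisions` is hypothetical, and the 15-character limit is taken from the recollection above rather than from vendor documentation.

```python
from collections import defaultdict


def prefix_collisions(thread_names, prefix_len=15):
    """Group thread names by their first `prefix_len` characters.

    Returns only the prefixes shared by two or more distinct names.
    """
    groups = defaultdict(set)
    for name in thread_names:
        groups[name[:prefix_len]].add(name)
    return {
        prefix: sorted(names)
        for prefix, names in groups.items()
        if len(names) > 1
    }
```

For example, `prefix_collisions(["adt_inbound_from_lab", "adt_inbound_from_rad", "orders_out"])` reports the first two names under the shared prefix `adt_inbound_fro`. A short name that exactly equals the first 15 characters of a longer one is also flagged, which errs on the cautious side.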

      Otherwise I would suggest logging a support case with Infor.

      But be prepared:  they may just tell you to upgrade first and then see if the problem continues.

    • #62891
      Terence Gucwa
      Participant

      Regarding my post, we’re on 5.7.  Yes, we have a couple of thread names longer than 15 chars, but they don’t seem to be involved in this issue, and they are unique within the first 15.  Still, thank you, Bob, for pointing that out; that’s really something we should get rid of.

      I have talked to Infor and yes, they would want us to upgrade before they would pursue this as a bug.  I’d be happy though, just to know how to avoid the problem, rather than having a fix.  It’s pretty rare, happens only in the wee hours of Sunday mornings, but when it happens, it’s a mess.

    • #62892
      Bob Richardson
      Participant

      Greetings,

      Is your equivalent of a Security department going TCP port probes?

      I seem to recall that the 5.7 engine can panic/crash when that happens.

      There may be a 5.7 patch (Revision 3) that fixes that problem (if memory serves me here).

      You could see if it is still available; review the Release notes about that problem and apply the patch.  Avoids a major upgrade for now.

      Good hunting!

    • #62893
      Bob Richardson
      Participant

      Errata:  read “doing” for “going”.  BobR
