Connections keep going off


  • Creator
    Topic
  • #50306
    David Teh
    Participant

    Hi folks,

    I have QDX 5.3 running on AIX 5.1.

    Currently doing testing with a new Philips CIS system, which uses Rhapsody as its interface engine. There is a firewall between that system and Cloverleaf.

    Cloverleaf has 2 OB threads configured as clients, and Philips is the ‘server’.

    Repeatedly, probably after a period of inactivity, we notice messages queuing up on the Cloverleaf threads while the threads still show ‘UP’. When either side refreshes the connection independently, we get FIN_WAIT in netstat. Things return to normal only after both sides are stopped and then started again once the ‘FIN_WAIT’ entry has gone from netstat.

    The Cloverleaf thread configuration is pretty standard throughout the engine. Could it be the firewall or is Philips missing something? Or is something amiss with the network?

    Extract from Process Log (EO enabled):

    [pd  :thrd:INFO/1:   adt_to_cis] OB-Data queue has 2 msgs

    [pd  :thrd:INFO/1:   adt_to_cis] OB-Data queue has NO work

    [pti :sche:DBUG/2:   adt_to_cis] Thread 7 has been enabled

    [pti :sche:INFO/1:   adt_to_cis] Thread has 1 ready events.

    [pti :even:DBUG/0:   adt_to_cis] Processing ACTIVE_TIMER () event 0x2215d688

    [pti :even:DBUG/1:   adt_to_cis] Calling cb 0x2009c674

    [msi :msi :DBUG/1:   adt_to_cis] msiExportStats: export for thread: adt_to_cis

    [pti :even:DBUG/0:   adt_to_cis] Unregistering ACTIVE_TIMER () event 0x2215d688 for tid 7

    [pti :even:DBUG/0:   adt_to_cis] evUnregister ACTIVE_TIMER event 0x2215D688 for tid 7

    [diag:leak:DBUG/0:   adt_to_cis] diag timeval alloc 0x212b6ab8

    [diag:leak:DBUG/0:   adt_to_cis] diag dqe alloc 0x21ccf6c8

    [pti :even:DBUG/0:   adt_to_cis] Registering ACTIVE_TIMER () event 0x2215d688 for tid 7

    [pti :even:DBUG/0:   adt_to_cis] Registering ACTIVE_TIMER event for tid 7

    [diag:leak:DBUG/0:   adt_to_cis] diag timeval free  0x212b6ab8

    [pti :sche:INFO/1:   adt_to_cis] Thread has 0 ready events left.

    [pd  :thrd:INFO/1:   adt_to_cis] OB-Data queue has 2 msgs

    [pd  :thrd:INFO/1:   adt_to_cis] OB-Data queue has NO work

    [pti :sche:INFO/2:     trashbin] Performing apply callback for thread 2

    [msi :msi :DBUG/1:     trashbin] msiExportStats: export for thread: trashbin

    [pti :sche:INFO/2:  pacs_r10063] Performing apply callback for thread 3

    [msi :msi :DBUG/1:  pacs_r10063] msiExportStats: export for thread: pacs_r10063

    [pti :sche:INFO/2:  pacs_r10062] Performing apply callback for thread 4

    [msi :msi :DBUG/1:  pacs_r10062] msiExportStats: export for thread: pacs_r10062

    [pti :sche:INFO/2:cern_path_cis] Performing apply callback for thread 5

    [msi :msi :DBUG/1:cern_path_cis] msiExportStats: export for thread: cern_path_cis

    [pti :sche:INFO/2:ish_to_trendcare] Performing apply callback for thread 6

    [msi :msi :DBUG/1:ish_to_trendcare] msiExportStats: export for thread: ish_to_trendcare

    [pti :sche:INFO/2:   adt_to_cis] Performing apply callback for thread 7

    [msi :msi :DBUG/1:   adt_to_cis] msiExportStats: export for thread: adt_to_cis

    [pti :sche:DBUG/2:   adt_to_cis] Thread 7 has been enabled

    [pti :sche:INFO/1:   adt_to_cis] Thread has 1 ready events.

    [pti :even:DBUG/0:   adt_to_cis] Processing TIMER () event 0x21ff7b58

    [pti :even:DBUG/1:   adt_to_cis] Calling cb 0x2009b750

    [pdl :open:INFO/0:   adt_to_cis] Driver attempting reopen

    [dbi :elog:DBUG/3:   adt_to_cis] [0.0.58411777] Looking for mid in error db

    [dbi :rlog:DBUG/3:   adt_to_cis] [0.0.58411777] Looking for mid in recovery db

    [msg :Mid :DBUG/3:   adt_to_cis] Assigned mid [0.0.58411777] to msg 30003228

    [msg :Msg :DBUG/0:   adt_to_cis] [0.0.58411777] MSG alloc 0x30003228

    [pdl :PDL :DBUG/2:   adt_to_cis] PDL changed states: old 6, new 7

    [pdl :PDL :DBUG/0:   adt_to_cis] Evaling:

           proc hci_pd.write { info } {

               global MsgId

               

               keylget info message MsgId

               keylset continuations ok            write.done

               keylset continuations error         write.error

               keylset continuations timeout      

                 hci_pd_send basic-msg

        ] $continuations

               }

               

               proc write.done {info} {}

               

               proc write.error {info} {

                   hci_pd_report_exception 1 "write failure"

                   hci_pd_set_result_code 1

               }

               

               proc write.timeout {info} {

                   global MsgId

                   msgmetaset $MsgId FLAGS {{proto_timeout 1}}

                   hci_pd_set_result_code 1

               }

               proc hci_pd.read {info} {

               

                   keylset continuations basic-msg             read.done

                   keylset continuations error         read.error

                   keylset continuations timeout      

                   hci_pd_receive $continuations

               }

               

               proc read.done {info} {

                   keylset accept text

          ]

                     keylset accept end [keylget info end]

                     hci_pd_accept $accept

                 }

                 proc read.error {info} {

                     

                     keylget info type type

                     switch -exact -- $type {

                         input-error {

                             hci_pd_report_exception 1 "device error (remote side probably shut down)"

                             hci_pd_ignore_input -all

                         }

                         no-match {

                             hci_pd_ignore_input 1

                             hci_pd_ignore_input -until xb

                         }

                         default {

                             hci_pd_report_exception 2 "unknown fail: $type"

                             hci_pd_ignore_input -all

                         }

                     }

                 }

                 

                 proc read.timeout {info} {

                     hci_pd_ignore_input 1

                     hci_pd_ignore_input -until xb

                 }

          [pdl :PDL :DBUG/2:   adt_to_cis] PDL changed states: old 7, new 4

          [pdl :PDL :DBUG/0:   adt_to_cis] Calling Tcl procedure: hci_pd.initialize

          [pdl :PDL :DBUG/0:   adt_to_cis] with args: {}

          [pdl :PDL :DBUG/0:   adt_to_cis] Tcl procedure hci_pd.initialize returns ''

          [pdl :PDL :INFO/0:   adt_to_cis] connected to 10.194.105.163 on port 10118

          [pdl :PDL :DBUG/0:   adt_to_cis] tcp-client: attempting connect to: 10.194.105.163:10118

          [pdl :PDL :INFO/0:   adt_to_cis] tcp-client: connect error (Operation now in progress)

          [pdl :PDL :DBUG/1:   adt_to_cis] PDL setting timeout in 0.10 seconds

          [diag:leak:DBUG/0:   adt_to_cis] diag ev alloc 0x21ffc548

          [diag:leak:DBUG/0:   adt_to_cis] diag dqe alloc 0x212b6ab8

          [pti :even:DBUG/0:   adt_to_cis] Registering TIMER () event 0x21ffc548 for tid 7

          [pti :even:DBUG/0:   adt_to_cis] Registering TIMER event for tid 7

          [pti :even:DBUG/0:   adt_to_cis] Unregistering TIMER () event 0x21ff7b58 for tid 7

          [pti :even:DBUG/0:   adt_to_cis] evUnregister TIMER event 0x21FF7B58 for tid 7

          [diag:leak:DBUG/0:   adt_to_cis] diag ev free  0x21ff7b58

          [pti :sche:INFO/1:   adt_to_cis] Thread has 0 ready events left.

          [pd  :thrd:INFO/1:   adt_to_cis] OB-Data queue has 2 msgs

          [pd  :thrd:INFO/1:   adt_to_cis] OB-Data queue has NO work

      Viewing 4 reply threads
      • Author
        Replies
        • #65572
          Jim Kosloskey
          Participant

          David,

          I would suspect the firewall is severing the pipe to the receiving system due to extensive idle time.

          Frequently, when firewalls do this, it is not done gracefully, so Cloverleaf(R) thinks the connection is still up.

          Perhaps you can convince the Firewall folks to relieve the idle constraint for just that connection or check Clovertech as some others have faced the same issue and have had other resolutions.

          email: jim.kosloskey@jim-kosloskey.com

        • #65573
          Gary Atkinson
          Participant

          I have the same issue for two of my threads, but they are over a VPN.  What I do is set a trigger on the outbound queue depth.  When the alert triggers I bounce the connection.  99.99% of the time that gets messages flowing again.  I gave up trying to get the network/vpn/firewall people to “fix” anything.  One thing I did before was to just keep sending the message, but then the vendor complained I was sending the same message too many times 🙄 I never could win 😕
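
The queue-depth trigger described above can be sketched roughly as follows. This is only an illustration of the decision logic, not Cloverleaf's alert configuration: the function name `should_bounce`, the threshold of 5 messages, and the requirement of 3 consecutive samples are all hypothetical values chosen for the example. Requiring several consecutive high samples is one way to avoid bouncing a thread during a normal short burst of traffic.

```python
def should_bounce(depth_samples, threshold=5, consecutive=3):
    """Decide whether an outbound thread looks stalled.

    depth_samples: outbound queue depths, oldest first (hypothetical input,
                   e.g. collected by a monitoring script once a minute).
    threshold:     depth at or above which a sample counts as "backed up"
                   (assumed value, tune per interface).
    consecutive:   how many most-recent samples must all be backed up
                   (assumed value).
    """
    if len(depth_samples) < consecutive:
        # Not enough history yet to make a decision.
        return False
    recent = depth_samples[-consecutive:]
    return all(depth >= threshold for depth in recent)


if __name__ == "__main__":
    samples = [0, 2, 6, 7, 9]
    if should_bounce(samples):
        # Placeholder action: the real stop/start of the thread would be
        # done by whatever mechanism the site already uses.
        print("queue is backed up: bounce the connection")
```

A depth that drops back below the threshold resets the decision, so a queue that is draining slowly is left alone.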

        • #65574
          Michael Hertel
          Participant

          See this thread:

          http://clovertech.infor.com/viewtopic.php?t=734

          You need to tweak AIX’s tcp_keepidle parameter.
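
As a rough illustration of the keepalive idea: on AIX, `tcp_keepidle` is a system-wide tunable set with the `no` command and is measured in half-seconds (the default of 14400 is 2 hours), so the goal is to make the first keepalive probe go out before the firewall's idle timeout expires. The sketch below is an assumption-laden example, not a tested recipe: the 3600-second firewall timeout, the 0.5 safety factor, and the helper names are hypothetical, and it assumes the engine's sockets have `SO_KEEPALIVE` enabled, which the tunable needs in order to have any effect.

```python
import socket


def keepidle_half_seconds(firewall_idle_timeout_s, safety_factor=0.5):
    """Return a tcp_keepidle value in AIX units (half-seconds).

    firewall_idle_timeout_s: idle timeout of the firewall in seconds
                             (hypothetical, ask the firewall team).
    safety_factor:           fraction of that timeout after which the first
                             keepalive probe should be sent (assumed 0.5).
    """
    idle_seconds = firewall_idle_timeout_s * safety_factor
    return int(idle_seconds * 2)  # seconds -> half-seconds


def make_keepalive_socket():
    """Create a TCP socket with keepalive probing switched on.

    The idle time before the first probe then comes from the operating
    system setting (tcp_keepidle on AIX).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


if __name__ == "__main__":
    value = keepidle_half_seconds(3600)
    # Print the AIX command an administrator could review and run as root.
    print("no -o tcp_keepidle=%d" % value)
```

Because the tunable applies to every keepalive-enabled connection on the host, it is worth checking the effect on other interfaces before changing it.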

        • #65575
          David Teh
          Participant

          Thanks for the replies…..will dig further……

        • #65576
          John Custer
          Participant

          We had the vendor disable the keepalives in the IPSEC configuration on their side of the interface on the VPN.
