firewall problems and workarounds


  • Creator
    Topic
  • #48105
    Anonymous
    Participant

    As more and more remote systems come into use, firewalls are being put in place that time out after one hour and prevent data from flowing. I am interested in your solutions and why you chose them: details of what you could or could not do, and why. How do you handle outbound ports as well as inbound ports, and how do your remote vendors handle the same thing? How does Cloverleaf (or the remote system) know that the firewall has timed out?

    We seem to have a firewall that simply stops letting data through but does not otherwise notify anyone of the fact (no TCP FIN). Has anyone else seen this? The more detailed your answers and questions, the more helpful this thread will be for everyone who reads it.

Viewing 16 reply threads
  • Author
    Replies
    • #57649
      Dennis Pfeifer
      Participant

      in working with HDX, their solution was to send a ‘HEARTBEAT’ every 10 minutes ..

      Guess you/we could do the same with a timer proc…

      Basically send a do nothing message ..

      Perhaps a timer proc with a ping? ..

      or .. just a plain cronjob with a ping ..

      Dennis
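      A minimal sketch of the heartbeat idea in Python, assuming the remote side has agreed to accept (and ignore or ack) a do-nothing message. The function name `start_heartbeat`, the payload, and the interval are illustrative assumptions, not HDX's or Cloverleaf's actual mechanism.

```python
import socket
import threading


def start_heartbeat(sock, payload, interval_s, stop_event):
    """Send `payload` on `sock` every `interval_s` seconds until stopped.

    Hypothetical helper: the payload is whatever "do nothing" message
    the receiving system has agreed to accept.
    """
    def run():
        # Event.wait returns False on timeout and True once the event is set.
        while not stop_event.wait(interval_s):
            try:
                sock.sendall(payload)
            except OSError:
                # The socket is dead; the caller has to reconnect.
                break

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

      For a 10-minute heartbeat this would be called as `start_heartbeat(sock, b"HEARTBEAT\r", 600, threading.Event())`. If the main thread also writes to the same socket, the sends should be serialized with a lock so a heartbeat cannot land in the middle of a real message.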

    • #57650
      Ryan Boone
      Participant

      On AIX —

      The default keepalive setting for AIX is 2 hours. We have a lot of remote connections and had similar issues until I lowered it to 15 minutes. Now we never have that problem (although remote connections go down for other reasons, obviously, so they aren’t hassle-free). The keepalive setting affects all of the Cloverleaf sockets.

      To determine current keepalive setting:

      $ no -a | grep tcp_keepidle

                  tcp_keepidle = 14400  (14400=2 hours in 1/2-second intervals)

      To change keep alive setting to 30 minutes (value is in half-second intervals):

      – Login as Root and type the following command:

      no -o tcp_keepidle=3600

      (1800 for 15 minutes)

      Also, add the command to the bottom of /etc/rc.net to auto-reset after reboot.

    • #57651
      Dennis Pfeifer
      Participant

      on Linux .. value is in seconds

      cat /proc/sys/net/ipv4/tcp_keepalive_time

      to change to 15 minutes

      echo 900 > /proc/sys/net/ipv4/tcp_keepalive_time

      This is not permanent; the value reverts at reboot.
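      A small Python sketch of the check that matters here: the keepalive idle time has to be shorter than the firewall's timeout, or the first probe arrives after the firewall has already dropped the session. The function names are made up for illustration, and the 3600-second figure is the one-hour firewall timeout described at the top of this thread.

```python
from pathlib import Path

KEEPALIVE_PATH = "/proc/sys/net/ipv4/tcp_keepalive_time"  # Linux only


def read_keepalive_seconds(path=KEEPALIVE_PATH):
    """Return the system-wide keepalive idle time in seconds."""
    return int(Path(path).read_text().strip())


def beats_firewall(keepalive_s, firewall_timeout_s=3600):
    """True if the first keepalive probe goes out before the firewall times out."""
    return keepalive_s < firewall_timeout_s
```

      With the Linux default of 7200 seconds, `beats_firewall(7200)` is False; after the change to 900 it is True.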

    • #57652
      Bill Bertera
      Participant

      Anyone know where to find keepalive setting on Solaris? thanks

    • #57653
      Anonymous
      Participant

      Looks like a possible solution on the AIX side. What about our myriad of vendors, however? They open sockets to us, and it would be nice if they also sent keepalives every 30 minutes. Does anyone know of any problem with the firewall and the TCP keepalives?

    • #57654
      Anonymous
      Participant

      I found this in connection with Windows-based systems. It looks like a standard TCP implementation includes a default system-wide keepalive at 2 hours, and opening a socket with keepalive can modify it for that socket only. And of course there are other parameters that cause the socket to close after several keepalive acks fail to arrive.

      I am in the process of setting this up between an AIX and a Windows system with a brain-dead firewall (60-minute global timeout, no notification) and will post the results.

      http://www.winguides.com/registry/display.php/891/

      and this from another website


      I-322 If your aborted sessions aren’t properly cleaned up or if your idle but live sessions are dropped inadvertently, you may need to adjust these two registry parameters.

      Hive: HKEY_LOCAL_MACHINE

      Key: System\CurrentControlSet\Services\Tcpip\Parameters

      Value Name: KeepAliveTime

      Data Type: REG_DWORD

      Value: 7,200,000  

      I-323 Hive: HKEY_LOCAL_MACHINE

       Key: System\CurrentControlSet\Services\Tcpip\Parameters

       Value Name: KeepAliveInterval

       Data Type: REG_DWORD

       Value: 1000  

      Both values are in milliseconds. The default value for KeepAliveTime is 7,200,000, or 2 hours, and the default for KeepAliveInterval is 1000, or 1 second. KeepAliveTime governs how often Windows NT sends a keep alive packet. A specific application can request that keep-alive packets be sent. If the target system is able, it responds with an acknowledgment. The KeepAliveInterval works with the KeepAliveTime and governs how often keep-alive packets are sent until an acknowledgment is received. If the target machine doesn’t respond and the number of retries exceeds the value of TCPMaxDataRetransmissions, the connection is terminated. Restart your machine for any changes to take effect.

      Looking at this, the implication is

      KeepAliveTime governs how often Windows NT sends a keep alive packet. Implication: every 2 hours the system sends a keepalive packet.

      A specific application can request that keep-alive packets be sent. If the target system is able, it responds with an acknowledgment. The KeepAliveInterval works with the KeepAliveTime and governs how often keep-alive packets are sent until an acknowledgment is received.

      Implication: if an app opens a socket with keepalive, a keepalive will be sent every second, and if there is no keepalive ack after 2 hours, … While not specifically stated, I am assuming the connection may be closed? I think the other parameters play a bigger part in closing the connection on keepalive timeouts.
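      One common reading of these parameters, offered here as an assumption rather than something this thread confirms: the connection must sit idle for KeepAliveTime before the first probe goes out, and KeepAliveInterval is the gap between repeat probes only while no ack comes back. Under that reading, a rough worst-case time to notice a dead peer can be worked out as below. The retry count of 5 is an assumed value for TCPMaxDataRetransmissions; check your own system.

```python
def dead_peer_detection_ms(keepalive_time_ms, keepalive_interval_ms, retries):
    """Rough time from the last traffic until the socket is errored out.

    Assumes: idle for keepalive_time_ms, then `retries` unanswered probes
    spaced keepalive_interval_ms apart. This is an approximation, not an
    exact model of any particular TCP stack.
    """
    return keepalive_time_ms + keepalive_interval_ms * retries


# Registry defaults quoted above, with an assumed retry count of 5.
detection_ms = dead_peer_detection_ms(7_200_000, 1000, 5)
```

      Under these assumptions almost all of the delay comes from KeepAliveTime itself, which is why lowering that one value is the change that matters against a 60-minute firewall timeout.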

    • #57655
      Anonymous
      Participant

      What about using multiserver?

      This is what I believe happens (in English). Please let me know if I have the wrong picture:

      If I

    • #57656
      Bill Bertera
      Participant

      we’ve got plans to try the multiserver approach. I’ll let you know how it goes.

    • #57657
      Anonymous
      Participant

      Some advocate a multiserver on Cloverleaf to work around the problem.

      Keep in mind that the problem (firewall) affects both inbound and outbound connections.

      Depending on how the CL and vendor connections are set up, the hung connections will eventually error out, but that time might be excessive because of the TCP no-ack algorithms.

      cl inbound

      Some advocate setting up CL as a multiserver, and that would definitely allow inbound connections when the sender eventually does something to try to establish a new connection. But how long is it going to take the sender to know there is a problem and bounce their side? There might be a lot of important messages that need to flow at that time that will be delayed. What might the other ramifications be (security concerns, etc.)?

      cl outbound

      We still have the problem of when TCP will do its no-ack processing and cause a socket error so that the thread attempts a reconnect. If we don’t do message timeouts, we are dependent on the TCP no-ack. If we do message timeouts and resends but don’t do a thread down/up in a reasonable amount of time to reestablish a new connection, we are still dependent on the TCP no-ack. If we do a thread down/up, it doesn’t establish a new connection unless the receiver is in multiserver mode. There is also an additional problem: if you shell out from a thread to a background process that will stop/start the thread, it will add to that thread’s process environment, eventually causing that process to panic when it runs out of env space.

      In these scenarios keep in mind that there may be multiple firewalls involved, 2 or more and all from different vendors. Bear in mind both inbound and outbound connections, and possible vendor requirements.


      Ideally, one should be able to configure the firewall(s) for no timeout on selected connections.

      They say they can’t do that.

      2nd choice would be TCP keepalives. Cloverleaf does not support opening a socket in that manner, and it is not known whether all vendor products would open their sockets with keepalives. So maybe the system-level TCP keepalive could be changed to 30 minutes instead of 2 hours; this would keep the connections connected. The vendors would have to either support opening the socket with keepalive or be willing to change their system-level keepalive to 30 minutes.

      All this is assuming that the firewalls will pass the keepalive packets. I say this because the timeout on the firewall is there to stop data, and TCP keepalives would not allow the firewall to do so, defeating the firewall. So what’s the purpose of this firewall option?

      3rd: application-level keepalive messages. Works well and meets the requirements of the firewall. Cloverleaf is easily set up to handle inbound, and scheduled resends can do the outbound. Requires vendor coding to support; it is unknown what it would require of each vendor.

      4th: multiserver. See the discussion at the top; it has its own set of problems.

      Comments are solicited on each method: opinions, advantages and drawbacks, for both Cloverleaf and any known vendors.

    • #57658
      Anonymous
      Participant

      To test opening a socket with keepalive, I modified hcitcptest to open a socket to the remote system (going through at least 2 firewalls) with SO_KEEPALIVE, sent a message and received an ack, waited 68 minutes with no messages being sent, then sent another message and received the ack.

      None of the firewalls timed out the connection.

      I am currently testing the same connection without SO_KEEPALIVE. I expect the system keepalive that will be sent at 2 hours to fail to get through the firewall, which should start the TCP no-ack sequence and error the socket, so I can determine how long that takes.

      For anyone who would like to try the same and report on your findings,

      make a copy of hcitcptest as tcptestkeepalive, and add the line (a single line)

      setsockopt(SOCKET, &SOL_SOCKET, &SO_KEEPALIVE, 1) || warn "setsockopt: $!";

      where shown below.

      Start it connecting to your remote as Cloverleaf would do,

      for example

           tcptestkeepalive -h 192.168.4.4 -p 8075 -t mlp

      and send a message like MSH||||||

      Leave your test program running.

      The message should be acked OK or rejected, unless the system you connected to is brain dead (which some are).

      Wait at least one hour and then some, and send the message again.

      If it is working you will receive the same ack message; otherwise nothing.

      #######################################
      # init_client - initialize and connect
      #               to host as client
      sub init_client {
         $them = $opt_h;
         # $remote and $port are assumed to be set elsewhere in hcitcptest
         $iaddr = inet_aton($remote);
         $paddr = sockaddr_in($port, $iaddr);
         $proto = getprotobyname('tcp');

         socket(SOCKET, AF_INET, SOCK_STREAM, $proto) || die "socket error: $!";

         # use keepalive: the handle must be the one passed to socket() above,
         # and the value 1 turns the option on (0 would turn it off)
         setsockopt(SOCKET, &SOL_SOCKET, &SO_KEEPALIVE, 1) || warn "setsockopt: $!";

         print STDOUT "connecting...\n\n";

         if (connect(SOCKET, $paddr)) {
              print STDOUT "Connected to host: $them, port: $port\n\n";
         } else {
              die "socket error: $!";
         }
      }
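      For comparison, a minimal Python sketch of the same per-socket technique. `enable_keepalive` is a made-up helper name. The per-socket timing options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are not available on every platform, so the sketch only sets the ones Python exposes on the machine it runs on; without them the socket falls back to the system-wide keepalive time.

```python
import socket


def enable_keepalive(sock, idle_s=None, interval_s=None, count=None):
    """Turn on SO_KEEPALIVE and set per-socket timings where the OS allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants exist on Linux and some other platforms; skip any
    # that this Python build does not define.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None and value is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


# Example: first probe after 15 idle minutes, well inside a 60-minute
# firewall timeout. The socket is created here but not connected.
sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM),
                        idle_s=900, interval_s=60, count=5)
```

      Setting the idle time per socket avoids changing the system-wide value, which affects every connection on the box.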

    • #57659
      Anonymous
      Participant

      Carlos said the following (* marks my inserted comments):


      This is what I believe happens (in English). Please let me know if I have the wrong picture:

      If I

    • #57660
      Kevin Scantlan
      Participant

      Does the keep_alive setting need to be set for only the client side or the server side or does it not matter?

    • #57661
      Ryan Boone
      Participant

      Once I lowered the keepalive on the engine box, it maintained all of the socket connections. This is very important for us because we connect to a lot of systems outside of our network (client offices, clinics, hospitals, etc). Beforehand, any systems that had a lower keepalive setting would maintain the connection, but those that did not would time out.

    • #57662
      Daniel Lee
      Participant

      Whenever we connect with a system outside of our network we set up a VPN tunnel for the connection.  Our network guy has some way that he can set up a keep alive on the firewall to keep this tunnel from timing out.  Since he set this up we haven’t had a problem with the tunnel timing out.

    • #57663
      Bill Bertera
      Participant

      Has anyone tried “Close after Write” for the firewall problem, when CLV is the client? Does it actually close after each message, or does it wait a certain amount of time in case others are pending, so it doesn’t bounce up and down for every message?

      thanks

    • #57664
      David Harrison
      Participant

      On Solaris, use ndd to inspect or set TCP settings.

      To inspect the keepalive:

      ndd -get /dev/tcp tcp_keepalive_interval

      The keepalive interval is in milliseconds, the default is 7200000 (2 hours), and it is system-wide.

      To set the keepalive:

      ndd -set /dev/tcp tcp_keepalive_interval nnnnnnn
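      Since each platform in this thread uses a different unit for the same setting, here is a small Python sketch that converts a target idle time in minutes to the native value. The units come from the posts above (AIX half-seconds, Linux seconds, Solaris and Windows milliseconds); the table and function names are illustrative.

```python
# Units per second for each system-wide setting mentioned in this thread.
UNITS_PER_SECOND = {
    "aix": 2,         # no -o tcp_keepidle: half-seconds
    "linux": 1,       # tcp_keepalive_time: seconds
    "solaris": 1000,  # tcp_keepalive_interval: milliseconds
    "windows": 1000,  # KeepAliveTime: milliseconds
}


def keepalive_value(platform, minutes):
    """Convert a keepalive idle time in minutes to the platform's native unit."""
    return int(minutes * 60 * UNITS_PER_SECOND[platform])
```

      For example, `keepalive_value("aix", 30)` gives 3600 and `keepalive_value("linux", 15)` gives 900, matching the commands posted earlier; a 30-minute Solaris setting would be `keepalive_value("solaris", 30)`, i.e. 1800000.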
  • #57665
    Nathan Martin
    Participant

    I’ll add my 2 cents.

    We noticed that our outbound connections (over VPN) have trouble re-connecting when “Wait for ACK Timeout” is set to “-1”.  But, those same connections just fix themselves when configured with a reasonable timeout value… No keepalive changes necessary.

    Of course, rather than just reconnect all the time, also treat the problem by applying the suggested keepalive fixes.
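    The effect of a finite ack timeout can be sketched in Python as follows. This is only an illustration of the idea, not how Cloverleaf implements “Wait for ACK Timeout”; the class name `AckingClient` and its methods are made up. With no timeout (the “-1” case) the read would block forever on a connection the firewall has silently dropped; with a timeout, the client notices, drops the socket, and reconnects.

```python
import socket


class AckingClient:
    """Persistent client that reconnects when an ack does not arrive in time."""

    def __init__(self, addr, ack_timeout_s):
        self.addr = addr
        self.ack_timeout_s = ack_timeout_s
        self.sock = None

    def _connect(self):
        # The timeout applies to the connect and to later send/recv calls.
        self.sock = socket.create_connection(self.addr,
                                             timeout=self.ack_timeout_s)

    def _drop(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None

    def send(self, message, max_attempts=2):
        """Send and wait for a reply; on timeout or error, reconnect and resend.

        Returns the reply bytes, or None if every attempt failed.
        """
        for _ in range(max_attempts):
            try:
                if self.sock is None:
                    self._connect()
                self.sock.sendall(message)
                reply = self.sock.recv(4096)  # times out if no ack arrives
                if reply:
                    return reply
                # An empty read means the peer closed the connection.
            except OSError:
                pass  # timeout, reset, or refused connection
            self._drop()
        return None
```

    One caveat with any resend-on-timeout scheme: if the original message was delivered and only the ack was lost, the receiver sees the message twice, so the receiving side needs to tolerate duplicates.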

    • The forum ‘Cloverleaf’ is closed to new topics and replies.
