firewall problems and workarounds

Clovertech Forums Read Only Archives Cloverleaf Cloverleaf firewall problems and workarounds –

  • Creator
    Topic
  • #48105
    Anonymous
    Participant

      As more and more remote systems are being used, firewalls are being put in place that time out after one hour and prevent data from flowing. I am interested in your solutions and why you chose them, with details of what you could or could not do and why: how you handle outbound ports as well as inbound ports, and how your remote vendors handle the same thing. How does Cloverleaf (or the remote system) know that the firewall has timed out?

      We seem to have a firewall that simply stops letting data through but otherwise does not notify anyone of the fact (no TCP FIN). Has anyone else seen this? The more detailed your answers and questions, the more help they will be for all who read this thread.

    Viewing 16 reply threads
    • Author
      Replies
      • #57649
        Dennis Pfeifer
        Participant

          in working with HDX, their solution was to send a ‘HEARTBEAT’ every 10 minutes ..

          Guess you/we could do the same with a timer proc…

          Basically send a do nothing message ..

          Perhaps a timer proc with a ping? ..

          or .. just a plain cronjob with a ping ..

          Dennis

        • #57650
          Ryan Boone
          Participant

            On AIX —

            The default keepalive setting for AIX is 2 hours. We have a lot of remote connections and had similar issues until I lowered it to 15 minutes. Now we never have that problem (although remote connections go down for other reasons, obviously, so they aren’t hassle-free). The keepalive setting affects all of the Cloverleaf sockets.

            To determine current keepalive setting:

            $ no -a | grep tcp_keepidle

                        tcp_keepidle = 14400  (14400=2 hours in 1/2-second intervals)

            To change keep alive setting to 30 minutes (value is in half-second intervals):

            – Log in as root and type the following command:

            no -o tcp_keepidle=3600

            (1800 for 15 minutes)

            Also, add the command to the bottom of /etc/rc.net to auto-reset after reboot.

          • #57651
            Dennis Pfeifer
            Participant

              on Linux .. value is in seconds

              cat /proc/sys/net/ipv4/tcp_keepalive_time

              to change to 15 minutes

              echo 900 > /proc/sys/net/ipv4/tcp_keepalive_time

              This is not permanent .. the value is lost at reboot.

            • #57652
              Bill Bertera
              Participant

                Anyone know where to find keepalive setting on Solaris? thanks

              • #57653
                Anonymous
                Participant

                  Looks like a possible solution on the AIX side. What about our myriad of vendors, however? They open sockets to us, and it would be nice if they also sent keepalives every 30 minutes. Anyone know of any problem with the firewall and the TCP keepalives?

                • #57654
                  Anonymous
                  Participant

                    I found this for Windows-based systems. It looks like a standard TCP implementation has a default keepalive of 2 hours, and opening a socket with keepalive changes the behaviour for that socket only. And of course there are other parameters that cause the socket to close after several keepalive probes go unacknowledged.

                    I am in the process of setting this up between an AIX and a Windows system with a brain-dead firewall (60-minute global timeout, no notification) and will post the results.

                    http://www.winguides.com/registry/display.php/891/

                    and this from another website


                    I-322 If your aborted sessions aren’t properly cleaned up or if your idle but live sessions are dropped inadvertently, you may need to adjust these two registry parameters.

                    Hive: HKEY_LOCAL_MACHINE

                    Key: System\CurrentControlSet\Services\Tcpip\Parameters

                    Value Name: KeepAliveTime

                    Data Type: REG_DWORD

                    Value: 7,200,000  

                    I-323 Hive: HKEY_LOCAL_MACHINE

                     Key: System\CurrentControlSet\Services\Tcpip\Parameters

                     Value Name: KeepAliveInterval

                     Data Type: REG_DWORD

                     Value: 1000  

                    Both values are in milliseconds. The default value for KeepAliveTime is 7,200,000, or 2 hours, and the default for KeepAliveInterval is 1000, or 1 second. KeepAliveTime governs how often Windows NT sends a keep alive packet. A specific application can request that keep-alive packets be sent. If the target system is able, it responds with an acknowledgment. The KeepAliveInterval works with the KeepAliveTime and governs how often keep-alive packets are sent until an acknowledgment is received. If the target machine doesn’t respond and the number of retries exceeds the value of TCPMaxDataRetransmissions, the connection is terminated. Restart your machine for any changes to take effect.

                    Looking at this, the implication is

                    KeepAliveTime governs how often Windows NT sends a keepalive packet – every 2 hours the system sends a keepalive packet

                    A specific application can request that keep-alive packets be sent. If the target system is able, it responds with an acknowledgment. The KeepAliveInterval works with the KeepAliveTime and governs how often keep-alive packets are sent until an acknowledgment is received.

                    implication – if an app opens a socket with keepalive, a keepalive will be sent every second, if no keepalive ack after 2 hours, ….  While not specifically stated, I am assuming the connection may be closed? I think the other parameters play a bigger part in closing the connection on keepalive timeouts
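As a rough check on the reading above, the sketch below adds up how long an idle, unanswered connection would last under the sequence the quoted article describes: nothing is sent for KeepAliveTime, then a probe goes out and is repeated every KeepAliveInterval until the retry limit is reached. The function name is made up for illustration, and the retry count of 5 is an assumed value, since the article only says the limit comes from TCPMaxDataRetransmissions.

```python
def keepalive_detection_ms(keepalive_time_ms, keepalive_interval_ms, max_retransmissions):
    """Approximate idle time (ms) before an unanswered connection is dropped.

    Assumes the first probe is sent after keepalive_time_ms of idleness and
    is then repeated every keepalive_interval_ms, max_retransmissions times.
    """
    return keepalive_time_ms + keepalive_interval_ms * max_retransmissions


ASSUMED_RETRIES = 5  # assumption for illustration, not taken from the article

# Defaults quoted in the article: 7,200,000 ms idle, 1000 ms between probes.
default_ms = keepalive_detection_ms(7_200_000, 1000, ASSUMED_RETRIES)
print("defaults: dropped after about", default_ms / 1000, "seconds idle")

# KeepAliveTime lowered to 30 minutes so the first probe beats a 60-minute firewall timeout.
lowered_ms = keepalive_detection_ms(30 * 60 * 1000, 1000, ASSUMED_RETRIES)
print("lowered:  dropped after about", lowered_ms / 1000, "seconds idle")
```

On this reading, KeepAliveTime decides when the first probe is sent, and the interval and retry count only decide how quickly a dead connection is given up on afterwards.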

                  • #57655
                    Anonymous
                    Participant

                      What about using multiserver?

                      This is what I believe it happens (in English) Please let me know if I have the wrong picture:

                      If I

                    • #57656
                      Bill Bertera
                      Participant

                        we’ve got plans to try the multiserver approach. I’ll let you know how it goes.

                      • #57657
                        Anonymous
                        Participant

                          Some advocate a multiserver on cloverleaf to work around the problem.

                          Keep in mind that the problem (firewall) affects both inbound and outbound connections.

                          Depending on how CL and vendor connections are set up, the hung connections will eventually error out, but that time might be excessive due to the TCP no-ack algorithms.

                          cl inbound

                          Some advocate setting up CL as multiserver, and that would definitely allow inbound connections when the sender eventually does something to try to establish a new connection. But how long is it going to take the sender to know there is a problem and bounce their side? There might be a lot of important messages that need to flow at that time that will be delayed. What might be other ramifications, security concerns, etc.?

                          cl outbound

                          If we don’t do message timeouts, we still have the problem of when TCP will do its no-ack and cause a socket error so that the thread attempts a reconnect. And if we do message timeouts and resends but don’t do a thread down/up in a reasonable amount of time to re-establish a new connection, we are still dependent on the TCP no-ack. If we do a thread down/up, it doesn’t establish a new connection unless the receiver is in multiserver mode. There is an additional problem: if you shell out from a thread to a background process that will stop/start the thread, it will add to that thread’s process environment, eventually causing that process to panic when it runs out of env space.

                          In these scenarios keep in mind that there may be multiple firewalls involved, 2 or more and all by different vendors. Bear in mind both inbound and outbound connections, and possible vendor requirements.


                          Ideally, one should be able to configure a firewall(s) for no timeout on selected connections.

                          They say they can’t do that.

                          2nd choice would be TCP keepalives. Cloverleaf does not support opening a socket in that manner, and it is not known whether all vendor products would open their sockets with keepalives. So maybe the system-level TCP keepalive could be changed to 30 minutes instead of 2 hours; this would keep the connections connected. The vendors would have to either support opening the socket with keepalive or be willing to change the system-level keepalive to 30 minutes.

                          All this assumes that the firewalls will pass the keepalive packets. I say this because the firewall timeout is meant to stop data on idle connections, and TCP keepalives would never let the firewall do so, defeating that timeout. So what’s the purpose of this firewall option?

                          3rd: application-level keepalive messages. Works well and meets the requirements of the firewall. Cloverleaf is easily set up to handle inbound; scheduled resends can do the outbound. Requires vendor coding to support. Unknown what it would require of each vendor.

                          4th multiserver – see discussion at top – has its own set of problems

                          Comments are solicited on each method, with the advantages/drawbacks of each for both Cloverleaf and any known vendors.
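A minimal sketch of the application-level keepalive from the 3rd option, assuming MLLP framing (start byte 0x0B, end bytes 0x1C 0x0D) and a do-nothing message. The class name, the function names, and the heartbeat payload are hypothetical; the 10-minute interval is the one mentioned earlier in the thread, and any real message content would have to be agreed with the vendor.

```python
import time

MLLP_START = b"\x0b"
MLLP_END = b"\x1c\x0d"


def mllp_frame(message: bytes) -> bytes:
    """Wrap a message in MLLP start/end bytes."""
    return MLLP_START + message + MLLP_END


class HeartbeatTimer:
    """Tracks idle time on a connection and says when a heartbeat is due."""

    def __init__(self, interval_seconds, clock=time.monotonic):
        self.interval = interval_seconds
        self.clock = clock
        self.last_sent = clock()

    def note_traffic(self):
        """Call after any message is sent; resets the idle timer."""
        self.last_sent = self.clock()

    def heartbeat_due(self):
        """True once the connection has been idle for a full interval."""
        return self.clock() - self.last_sent >= self.interval


# Hypothetical do-nothing message; the receiver must know to discard it.
HEARTBEAT = mllp_frame(b"MSH|^~\\&|HEARTBEAT")


def maybe_send_heartbeat(sock, timer):
    """Send the heartbeat on an open socket if the line has been idle too long."""
    if timer.heartbeat_due():
        sock.sendall(HEARTBEAT)
        timer.note_traffic()
        return True
    return False
```

A sender would create `HeartbeatTimer(600)` when the connection opens, call `note_traffic()` after each real message, and call `maybe_send_heartbeat()` from a periodic timer. The interval needs to be comfortably shorter than the firewall's idle timeout.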

                        • #57658
                          Anonymous
                          Participant

                            To test opening a socket with keepalive, I modified hcitcptest to open a socket to the remote system (going through at least 2 firewalls) with SO_KEEPALIVE,

                            sent a message and received an ack, waited 68 minutes with no messages being sent, then sent another message and received the ack.

                            None of the firewalls timed out the connection.

                            I am currently testing the same connection without SO_KEEPALIVE. I expect the system keepalive that will be sent after 2 hours to fail to get through the firewall and to start the TCP no-ack sequence that errors the socket, so I can determine how long that takes.

                            For anyone who would like to try the same and report on your findings,

                            make a copy of hcitcptest to tcptestkeepalive, and add the line (a single line)

                            setsockopt(SOCKET, SOL_SOCKET, SO_KEEPALIVE, 1) || warn "setsockopt: $!";

                            where shown below.

                            start it connecting to your remote as cloverleaf would do

                            for example

                                 tcptestkeepalive -h 192.168.4.4 -p 8075 -t mlp

                            and send a message like MSH||||||

                            leave your test program running.

                            It should be acked OK or rejected, unless the system you connected to is brain dead (which some are).

                            Wait at least one hour and then some, and send the message again.

                            If it is working you will receive the same ack message, otherwise nothing.

                            #######################################
                            # init_client - initialize and connect
                            #               to host as client
                            sub init_client {
                               $them = $opt_h;
                               # $remote and $port are assumed to be set elsewhere in hcitcptest
                               $iaddr = inet_aton($remote);
                               $paddr = sockaddr_in($port, $iaddr);
                               $proto = getprotobyname('tcp');
                               socket(SOCKET, AF_INET, SOCK_STREAM, $proto) || die "socket error: $!";
                               # use keepalive: enable it on the socket just created (1 = on)
                               setsockopt(SOCKET, SOL_SOCKET, SO_KEEPALIVE, 1) || warn "setsockopt: $!";
                               print STDOUT "connecting...\n\n";
                               if (connect(SOCKET, $paddr)) {
                                    print STDOUT "Connected to host: $them, port: $port\n\n";
                               } else {
                                    die "socket error: $!";
                               }
                            }
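For comparison, here is roughly the same idea in Python, offered as a sketch rather than a replacement for hcitcptest. It turns on SO_KEEPALIVE for one socket and, where the platform exposes per-socket timers (Linux does), sets them so the system-wide value does not have to change. The function name and the timer values are illustrative assumptions.

```python
import socket


def make_keepalive_socket(idle_seconds=900, interval_seconds=60, probes=5):
    """Create a TCP socket with keepalive enabled, ready to connect."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Any non-zero value turns keepalive on for this socket only.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timers are not available on every platform, so set them only if present.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


sock = make_keepalive_socket()
print("keepalive on:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```

To repeat the test described above, the returned socket would be connected with something like `sock.connect(("192.168.4.4", 8075))` and left idle for more than an hour between messages.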

                          • #57659
                            Anonymous
                            Participant

                              Carlos said the below – * are my inserted comments


                              This is what I believe it happens (in English) Please let me know if I have the wrong picture:

                              If I

                            • #57660
                              Kevin Scantlan
                              Participant

                                Does the keep_alive setting need to be set for only the client side or the server side or does it not matter?

                              • #57661
                                Ryan Boone
                                Participant

                                  Once I lowered the keepalive on the engine box, it maintained all of the socket connections. This is very important for us because we connect to a lot of systems outside of our network (client offices, clinics, hospitals, etc). Beforehand, any systems that had a lower keepalive setting would maintain the connection, but those that did not would time out.

                                • #57662
                                  Daniel Lee
                                  Participant

                                    Whenever we connect with a system outside of our network we set up a VPN tunnel for the connection.  Our network guy has some way that he can set up a keep alive on the firewall to keep this tunnel from timing out.  Since he set this up we haven’t had a problem with the tunnel timing out.

                                  • #57663
                                    Bill Bertera
                                    Participant

                                      Has anyone tried “Close after Write” for the firewall problem, when CLV is the client? Does it actually close after each message, or does it wait a certain amount of time in case others are pending, so it doesn’t bounce up & down for every message?

                                      thanks

                                    • #57664
                                      David Harrison
                                      Participant

                                        On Solaris, use ndd to inspect or set tcp settings.

                                        To inspect the keepalive:

                                        ndd -get /dev/tcp tcp_keepalive_interval

                                        The keepalive interval is in milliseconds, the default is 7200000 (2 hours), and it is system wide.

                                        To set the keepalive:

                                        ndd -set /dev/tcp tcp_keepalive_interval nnnnnnn
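Since each platform in this thread uses a different unit, a small sketch like the following may help avoid slips when picking a value. The table only restates the units given in the posts above (AIX half-seconds, Linux seconds, Solaris and Windows milliseconds); the dictionary and function names are made up for illustration.

```python
# Units per second for each system-wide keepalive setting mentioned in this thread.
UNITS_PER_SECOND = {
    "aix tcp_keepidle": 2,                   # half-seconds
    "linux tcp_keepalive_time": 1,           # seconds
    "solaris tcp_keepalive_interval": 1000,  # milliseconds
    "windows KeepAliveTime": 1000,           # milliseconds
}


def keepalive_value(setting, minutes):
    """Return the number to use for the given setting and idle time in minutes."""
    return minutes * 60 * UNITS_PER_SECOND[setting]


for setting in UNITS_PER_SECOND:
    fifteen = keepalive_value(setting, 15)
    two_hours = keepalive_value(setting, 120)
    print(f"{setting:32s} 15 min = {fifteen:>8}   2 h = {two_hours:>8}")
```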
                                  • #57665
                                    Nathan Martin
                                    Participant

                                      I’ll add my 2 cents.

                                      We noticed that our outbound connections (over VPN) have trouble re-connecting when “Wait for ACK Timeout” is set to “-1”.  But, those same connections just fix themselves when configured with a reasonable timeout value… No keepalive changes necessary.

                                      Of course, rather than just reconnect all the time, also treat the problem by applying the suggested keepalive fixes.
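The behaviour described here can be sketched outside Cloverleaf: with a finite wait for the ACK, a silently dropped connection shows up as a timeout, and the sender can close and reconnect instead of waiting on TCP retransmission. This is only an illustration with plain sockets. The function name and the 30-second default are assumptions, and it says nothing about how Cloverleaf implements “Wait for ACK Timeout” internally.

```python
import socket


def send_and_wait_for_ack(sock, payload, timeout_seconds=30.0):
    """Send payload and return the reply bytes, or None if no ACK arrives in time."""
    sock.settimeout(timeout_seconds)
    try:
        sock.sendall(payload)
        reply = sock.recv(4096)
    except socket.timeout:
        return None  # nothing came back in time: treat the connection as dead
    return reply or None  # an empty read means the peer closed the socket


# Demo with a local socket pair standing in for the engine and the remote system.
engine, remote = socket.socketpair()

# The remote stays silent, as if a firewall had dropped the session.
print(send_and_wait_for_ack(engine, b"MSH||||||", timeout_seconds=0.2))

# This time a reply is already waiting when the engine sends.
remote.sendall(b"ACK")
print(send_and_wait_for_ack(engine, b"MSH||||||", timeout_seconds=0.2))

engine.close()
remote.close()
```

When the function returns None, the caller would close the socket and open a new connection. With no timeout at all (the “-1” case), the `recv` call would simply block until TCP itself gives up.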
