firewall problems and workarounds
- This topic has 17 replies, 8 voices, and was last updated 18 years, 4 months ago by Nathan Martin.
October 25, 2005 at 8:05 pm #48105 Anonymous (Participant)
As more and more remote systems are being used, firewalls are being put in place that time out connections after one hour and prevent data from flowing. I am interested in your solutions and why you chose them, with details of what you could or could not do and why: how you handle outbound ports as well as inbound ports, and how your remote vendors handle the same thing. How does Cloverleaf (or the remote system) know that the firewall has timed out? We seem to have a firewall that simply stops letting data through but otherwise does not notify anyone of the fact (no TCP FIN). Has anyone else seen this? The more detailed your answers and questions, the more help it will be for all who read this thread.
-
October 25, 2005 at 9:47 pm #57649 Dennis Pfeifer (Participant)
In working with HDX, their solution was to send a 'HEARTBEAT' every 10 minutes. Guess you/we could do the same with a timer proc.
Basically send a do-nothing message.
Perhaps a timer proc with a ping?
Or just a plain cron job with a ping.
Dennis
-
October 25, 2005 at 10:29 pm #57650 Ryan Boone (Participant)
On AIX: the default keepalive setting is 2 hours. We have a lot of remote connections and had similar issues until I lowered it to 15 minutes. Now we never have that problem (although remote connections go down for other reasons, obviously, so they aren't hassle-free). The keepalive setting affects all of the Cloverleaf sockets.
To determine the current keepalive setting:
$ no -a | grep tcp_keepidle
tcp_keepidle = 14400 (14400=2 hours in 1/2-second intervals)
To change the keepalive setting to 30 minutes (the value is in half-second intervals):
- Log in as root and type the following command:
no -o tcp_keepidle=3600
(use 1800 for 15 minutes)
Also, add the command to the bottom of /etc/rc.net so the setting is reapplied after a reboot.
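Because AIX counts tcp_keepidle in half-seconds, it is easy to be off by a factor of two. A minimal sketch of the conversion, assuming a POSIX shell; the function name minutes_to_keepidle is made up for illustration and is not an AIX command:

```shell
# Convert minutes to the half-second units AIX uses for tcp_keepidle.
# minutes_to_keepidle is a hypothetical helper name, not part of AIX.
minutes_to_keepidle() {
    echo $(( $1 * 60 * 2 ))   # minutes -> seconds -> half-seconds
}

minutes_to_keepidle 15    # prints 1800
minutes_to_keepidle 30    # prints 3600
minutes_to_keepidle 120   # prints 14400, the 2-hour default
```

As root, the result could be passed straight to the command above, for example no -o tcp_keepidle=$(minutes_to_keepidle 15).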
-
October 26, 2005 at 12:39 am #57651 Dennis Pfeifer (Participant)
On Linux the value is in seconds:
cat /proc/sys/net/ipv4/tcp_keepalive_time
To change it to 15 minutes:
echo 900 > /proc/sys/net/ipv4/tcp_keepalive_time
This is not permanent; the value is lost at the next reboot.
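One common way to make the Linux setting stick is through sysctl. The fragment below is a sketch, assuming a distribution that reads /etc/sysctl.conf at boot; the commands that need root are left as comments so nothing is changed by accident.

```shell
# Read the current value in seconds (the same number the cat above shows).
sysctl -n net.ipv4.tcp_keepalive_time 2>/dev/null || true

# As root: set 15 minutes on the running system (same effect as the echo above).
#   sysctl -w net.ipv4.tcp_keepalive_time=900
# As root: make it permanent by adding this line to /etc/sysctl.conf,
# then reload the file with "sysctl -p" or reboot.
#   net.ipv4.tcp_keepalive_time = 900
```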
-
October 26, 2005 at 1:02 pm #57652 Bill Bertera (Participant)
Anyone know where to find the keepalive setting on Solaris? Thanks.
-
October 27, 2005 at 3:42 pm #57653 Anonymous (Participant)
Looks like a possible solution on the AIX side. What about our myriad vendors, however? They open sockets to us, and it would be nice if they also sent keepalives every 30 minutes. Anyone know of any problem with the firewall and the TCP keepalives?
-
October 27, 2005 at 7:14 pm #57654 Anonymous (Participant)
I found this associated with Windows-based systems. It looks like a standard implementation of TCP includes a default keepalive at 2 hours, with a socket opened with keepalive modifying it for that socket only. And of course there are other parameters that will cause the socket to close after several failures of the keepalive ack. I am in the process of setting this up between an AIX and a Windows system with a brain-dead firewall (60-minute global timeout, no notification) and will post the results.
http://www.winguides.com/registry/display.php/891/ and this from another website:
I-322 If your aborted sessions aren’t properly cleaned up or if your idle but live sessions are dropped inadvertently, you may need to adjust these two registry parameters.
Hive: HKEY_LOCAL_MACHINE
Key: System\CurrentControlSet\Services\Tcpip\Parameters
Value Name: KeepAliveTime
Data Type: REG_DWORD
Value: 7,200,000
I-323
Hive: HKEY_LOCAL_MACHINE
Key: System\CurrentControlSet\Services\Tcpip\Parameters
Value Name: KeepAliveInterval
Data Type: REG_DWORD
Value: 1000
Both values are in milliseconds. The default value for KeepAliveTime is 7,200,000, or 2 hours, and the default for KeepAliveInterval is 1000, or 1 second. KeepAliveTime governs how often Windows NT sends a keep alive packet. A specific application can request that keep-alive packets be sent. If the target system is able, it responds with an acknowledgment. The KeepAliveInterval works with the KeepAliveTime and governs how often keep-alive packets are sent until an acknowledgment is received. If the target machine doesn’t respond and the number of retries exceeds the value of TCPMaxDataRetransmissions, the connection is terminated. Restart your machine for any changes to take effect.
Looking at this, the implication is:
KeepAliveTime governs how long a connection sits idle before Windows sends a keepalive packet (2 hours by default), and an application has to request keepalives on its socket for them to be sent. If the target system is able, it responds with an acknowledgment. KeepAliveInterval works with KeepAliveTime and governs how often the keepalive is repeated until an acknowledgment is received.
So if an app opens a socket with keepalive, a keepalive is sent after 2 hours of idle time, and if it is not acked it is resent every second. I am assuming the connection is then closed once the retries exceed TCPMaxDataRetransmissions. I think the other parameters play a bigger part in closing the connection on keepalive timeouts.
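To put numbers on the registry description quoted above, here is a rough sketch of the arithmetic in POSIX shell. It assumes the connection is dropped after KeepAliveTime of idle time plus one KeepAliveInterval per unanswered retry. The retry count of 5 is an assumed value for TCPMaxDataRetransmissions (the quoted text does not give one), and both function names are made up for illustration.

```shell
# Rough time (ms) from the last traffic until a dead peer is given up on:
# idle time before the first probe, plus one interval per unanswered retry.
dead_peer_detect_ms() {
    keepalive_time=$1; keepalive_interval=$2; retries=$3
    echo $(( keepalive_time + keepalive_interval * retries ))
}

# Does the first probe go out before the firewall's idle timeout expires?
probe_beats_firewall() {
    keepalive_time=$1; firewall_timeout=$2
    if [ "$keepalive_time" -lt "$firewall_timeout" ]; then echo yes; else echo no; fi
}

dead_peer_detect_ms 7200000 1000 5      # prints 7205000 (defaults, assuming 5 retries)
probe_beats_firewall 7200000 3600000    # prints no  (2-hour default vs 60-minute firewall)
probe_beats_firewall 1800000 3600000    # prints yes (30-minute keepalive vs 60-minute firewall)
```

With the defaults, the first probe would not be sent until an hour after a 60-minute firewall has already dropped the session, which is why KeepAliveTime is the value that matters here.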
-
October 28, 2005 at 1:31 pm #57655 Anonymous (Participant)
What about using multiserver? This is what I believe happens (in English). Please let me know if I have the wrong picture:
If I
-
October 28, 2005 at 1:36 pm #57656 Bill Bertera (Participant)
We’ve got plans to try the multiserver approach. I’ll let you know how it goes.
-
October 28, 2005 at 4:28 pm #57657 Anonymous (Participant)
Some advocate a multiserver on Cloverleaf to work around the problem. Keep in mind that the problem (the firewall) affects both inbound and outbound connections.
Depending on how the Cloverleaf and vendor connections are set up, the hung connections will eventually error out, but that time might be excessive due to the TCP no-ack algorithms.
Cloverleaf inbound
Some advocate setting up Cloverleaf as a multiserver, and that would definitely allow inbound connections when the sender eventually does something to try to establish a new connection. But how long is it going to take the sender to know there is a problem and bounce their side? There might be a lot of important messages that need to flow at that time that will be delayed. What might be the other ramifications, security concerns, etc.?
Cloverleaf outbound
If we don't do message timeouts, we still have the problem of when TCP will hit its no-ack timeout and cause a socket error so that the thread attempts a reconnect. If we do use message timeouts and resends, but don't do a thread down/up in a reasonable amount of time to reestablish a new connection, we are still dependent on the TCP no-ack timeout. If we do a thread down/up, it doesn't establish a new connection unless the receiver is in multiserver mode. There is also an additional problem: if you shell out from a thread to a background process that will stop/start the thread, it adds to that thread's process environment, eventually causing the process to panic when it runs out of environment space.
In these scenarios keep in mind that there may be multiple firewalls involved, 2 or more and all by different vendors. Bear in mind both inbound and outbound connections, and possible vendor requirements.
Ideally, one should be able to configure the firewall(s) for no timeout on selected connections.
They say they can't do that.
2nd choice would be TCP keepalives. Cloverleaf does not support opening a socket in that manner, and it is not known whether all vendor products would open their sockets with keepalives. So maybe the system-level TCP keepalive could be changed to 30 minutes instead of 2 hours. This would keep the connections connected. The vendors would have to either support opening the socket with keepalive or be willing to change the system-level keepalive to 30 minutes.
All this assumes that the firewalls will pass the keepalive packets. I say this because the point of the timeout on the firewall is to stop data, and TCP keepalives would not allow the firewall to do so, defeating it. So what's the purpose of this firewall option?
3rd: application-level keepalive messages. Works well and meets the requirements of the firewall. Cloverleaf is easily set up to handle inbound, and scheduled resends can do the outbound. Requires vendor coding to support; unknown what it would require of each vendor.
4th: multiserver. See the discussion at top; it has its own set of problems.
Comments are solicited on each method, with the advantages and drawbacks of each for both Cloverleaf and any known vendors.
-
October 28, 2005 at 6:18 pm #57658 Anonymous (Participant)
To test opening a socket with keepalive, I modified hcitcptest to open a socket to the remote system (going through at least 2 firewalls) with SO_KEEPALIVE, sent a message, and received an ack. I then waited 68 minutes with no messages being sent, sent another message, and received the ack.
None of the firewalls timed out the connection.
I am currently testing the same connection without SO_KEEPALIVE. I expect the system keepalive that will be sent in 2 hours to fail to get through the firewall, which should start the TCP no-ack sequence and error the socket, so I can determine how long that takes.
For anyone who would like to try the same and report your findings:
Make a copy of hcitcptest named tcptestkeepalive, and add the line (a single line)
setsockopt(SOCKET, &SOL_SOCKET, &SO_KEEPALIVE, 1) || warn "setsockopt: $!";
where shown below.
Start it connecting to your remote as Cloverleaf would, for example
tcptestkeepalive -h 192.168.4.4 -p 8075 -t mlp
and send a message like MSH||||||
Leave your test program running.
The message should be acked OK or rejected, unless the system you connected to is brain dead (which some are).
Wait at least an hour and then some, and send the message again.
If it is working you will receive the same ack message; otherwise nothing.
#######################################
# init_client - initialize and connect
# to host as client
sub init_client {
    $them = $opt_h;
    $iaddr = inet_aton($remote);
    $paddr = sockaddr_in($port, $iaddr);
    $proto = getprotobyname('tcp');
    socket(SOCKET, AF_INET, SOCK_STREAM, $proto) || die "socket error: $!";
    # use keepalive: set it on the handle just created, with a value of 1
    # to switch it on (Perl treats an undef value as 0, which leaves it off)
    setsockopt(SOCKET, &SOL_SOCKET, &SO_KEEPALIVE, 1) || warn "setsockopt: $!";
    print STDOUT "connecting...\n\n";
    if (connect(SOCKET, $paddr)) {
        print STDOUT "Connected to host: $them, port: $port\n\n";
    } else {
        die "socket error: $!";
    }
}
-
October 28, 2005 at 6:42 pm #57659 Anonymous (Participant)
Carlos said the below; * are my inserted comments
This is what I believe happens (in English). Please let me know if I have the wrong picture:
If I
-
January 11, 2006 at 8:10 pm #57660 Kevin Scantlan (Participant)
Does the keepalive setting need to be set on only the client side or the server side, or does it not matter?
-
January 11, 2006 at 9:34 pm #57661 Ryan Boone (Participant)
Once I lowered the keepalive on the engine box, it maintained all of the socket connections. This is very important for us because we connect to a lot of systems outside of our network (client offices, clinics, hospitals, etc). Beforehand, any systems that had a lower keepalive setting would maintain the connection, but those that did not would time out.
-
January 12, 2006 at 6:01 pm #57662 Daniel Lee (Participant)
Whenever we connect with a system outside of our network we set up a VPN tunnel for the connection. Our network guy has some way that he can set up a keepalive on the firewall to keep this tunnel from timing out. Since he set this up we haven’t had a problem with the tunnel timing out.
-
May 30, 2006 at 6:00 pm #57663 Bill Bertera (Participant)
Has anyone tried “Close after Write” for the firewall problem, when CLV is the client? Does it actually close after each message, or does it wait a certain amount of time in case others are pending, so it doesn’t bounce up and down for every message? Thanks.
-
May 31, 2006 at 11:06 am #57664 David Harrison (Participant)
On Solaris, use ndd to inspect or set TCP settings. To inspect the keepalive:
- ndd -get /dev/tcp tcp_keepalive_interval
The keepalive interval is in milliseconds, the default is 7200000 (2 hours), and it is system-wide.
To set the keepalive:
- ndd -set /dev/tcp tcp_keepalive_interval nnnnnnn
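For a concrete value to put in place of nnnnnnn, the millisecond figure can be worked out as below. This is only a sketch in POSIX shell; minutes_to_ms is a made-up helper name, and the ndd command is printed rather than run.

```shell
# Convert minutes to the milliseconds Solaris expects for tcp_keepalive_interval.
# minutes_to_ms is a hypothetical helper, not a Solaris command.
minutes_to_ms() {
    echo $(( $1 * 60 * 1000 ))
}

# Print (not run) the ndd command for a 15-minute keepalive.
echo "ndd -set /dev/tcp tcp_keepalive_interval $(minutes_to_ms 15)"
# prints: ndd -set /dev/tcp tcp_keepalive_interval 900000
```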
-
May 31, 2006 at 8:51 pm #57665 Nathan Martin (Participant)
I’ll add my 2 cents. We noticed that our outbound connections (over VPN) have trouble re-connecting when “Wait for ACK Timeout” is set to “-1”. But, those same connections just fix themselves when configured with a reasonable timeout value… No keepalive changes necessary.
Of course, rather than just reconnect all the time, also treat the problem by applying the suggested keepalive fixes.