Forum Replies Created
Some firewalls preserve existing VPN connections during a failover; others do not. In your case, it seems the VPN connection was dropped when the failover occurred. That leaves the two HL7 servers in a bad state: each thinks the connection is still up, and without the VPN there is no way to close it cleanly. When either end sends data, there is no reply, because the firewall only creates a new VPN tunnel when a connection is in the opening state. Both sides have to be manually stopped and restarted. Cloverleaf has no way to know that the VPN has gone away, so it cannot know it needs to do anything, and there is nothing it could do anyway.
Your company should get a firewall that migrates connections during a failover. Failing that, you could try to create some sort of monitor script that cycles the thread, and all the remote sites would have to do the same monitoring.
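A rough sketch of what such a monitor script might look like, in Python. Everything specific here is an assumption: `get_idle_seconds` is a hypothetical callable that you would have to write to report how long the thread has gone without a successful message, `cycle_cmd` is whatever command your site uses to stop and restart the thread, and the thresholds are placeholders. Idle time is only a heuristic, since a quiet interface can look the same as a dead one.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def should_cycle(idle_seconds: float, max_idle: float,
                 last_cycle: Optional[float], now: float,
                 cooldown: float) -> bool:
    """Return True when the thread looks stalled and was not cycled recently."""
    if idle_seconds < max_idle:
        return False
    # The cooldown keeps the monitor from restarting the thread in a tight loop.
    if last_cycle is not None and now - last_cycle < cooldown:
        return False
    return True


def monitor(get_idle_seconds: Callable[[], float],
            cycle_cmd: Sequence[str],
            max_idle: float = 900.0,
            cooldown: float = 600.0,
            poll: float = 60.0,
            max_polls: Optional[int] = None,
            run=subprocess.run,
            sleep=time.sleep,
            clock=time.time) -> int:
    """Poll the idle time and run cycle_cmd when the thread looks stalled.

    get_idle_seconds and cycle_cmd are site-specific and hypothetical here.
    Returns the number of times the thread was cycled.
    """
    last_cycle: Optional[float] = None
    cycles = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        now = clock()
        if should_cycle(get_idle_seconds(), max_idle, last_cycle, now, cooldown):
            run(list(cycle_cmd), check=False)
            last_cycle = now
            cycles += 1
        polls += 1
        sleep(poll)
    return cycles
```

The `run`, `sleep`, and `clock` parameters exist only so the loop can be exercised without touching a real system. The remote end would need an equivalent check, because cycling one side alone does not clear the stale connection on the other.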
You need to involve your network, firewall, and VPN people to help you with this problem.
At my site we have 5 or 10 VPN connections, some inbound and some outbound. When the firewall/VPN people want to do an upgrade, we shut down our side, which forces the remote sites to start trying to reconnect. Once the VPN is back, we restart all our threads. Both inbound and outbound interfaces then start new connections through the firewall/VPN and all is good.
I haven’t used Aviatrix, but this page might be useful:
Steve Herber
University of Washington
September 16, 2019 at 1:46 pm in reply to: Collect IP and Port (date and time) of a Connection #112459
I did not mention it in my earlier note, but we collect the up/down changes from the alert system.
I think this data would be most valuable as a new alert: one on each connection and another on each disconnection, to handle the situation where a server has multiple connections.
Steve Herber
University of Washington
September 12, 2019 at 3:58 pm in reply to: Collect IP and Port (date and time) of a Connection #112398
We have multiple alerts on all of our threads, and we track each change to opening, up, or down state and include the date and time. We keep the logs for 90 days. Another script, which we call thread history, goes through the logs and pulls out the information about the particular thread we specify on the command line.
In your case I would expand the script to do an lsof or netstat and also log the network information for the thread.
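As a sketch of the netstat idea, the Python below runs `netstat -an`, keeps the TCP rows that involve a given port, and prefixes each with a timestamp, thread name, and state. It is not our actual script. The thread name, the state string, and the assumption that you already know which port belongs to which thread are all placeholders, and the output layout is made up for illustration.

```python
import subprocess
from datetime import datetime
from typing import List, Optional


def connections_for_port(netstat_output: str, port: int) -> List[str]:
    """Return TCP rows whose local or foreign address ends in the given port."""
    # Linux prints addr:port, while AIX and BSD-style netstat print addr.port.
    suffixes = (f":{port}", f".{port}")
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows look like: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].lower().startswith("tcp"):
            continue
        if fields[3].endswith(suffixes) or fields[4].endswith(suffixes):
            matches.append(" ".join(fields))
    return matches


def log_lines(thread: str, state: str, matches: List[str],
              now: Optional[datetime] = None) -> List[str]:
    """Format one log line per matching connection (layout is illustrative)."""
    stamp = (now or datetime.now()).strftime("%m/%d/%Y %H:%M:%S")
    if not matches:
        return [f"{stamp} {thread} {state} no matching connections"]
    return [f"{stamp} {thread} {state} {m}" for m in matches]


def snapshot(thread: str, state: str, port: int) -> List[str]:
    """Run netstat once and return log lines for the thread's port."""
    out = subprocess.run(["netstat", "-an"], capture_output=True,
                         text=True, check=False).stdout
    return log_lines(thread, state, connections_for_port(out, port))
```

Calling `snapshot` from the alert action on each state change, and appending the result to the existing log, would keep the IP and port next to the date and time that are already recorded.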
Steve Herber
1 prod server with about 400 threads.
Steve Herber
University of Washington
We upgraded to
hciversion version: 6.2.2.0P, built: Thu May 17 2018
last week. We have had the signal 11 error in two different sites, one yesterday, and another today. Both occurred while in our resend logic.
I wonder if this problem is solved by the 6.2.3.0 version I just discovered?
[prod:prod:INFO/1:to_scca_raw_siu:11/14/2018 12:40:33] Got signal 11, stacktrace: 10065dac 300080a8 10033788 100339ec 10048158 10048400 100474e8 10045414 1053684c 10058090 d052ad08 00000000
[prod:prod:INFO/0:to_scca_raw_siu:11/14/2018 12:40:33] Stack backtrace have been written to crashInfo file
[pti :sign:WARN/0:to_scca_raw_siu:11/14/2018 12:40:33] Thread 2 ( to_scca_raw_siu ) received signal 11
[pti :sign:WARN/0:to_scca_raw_siu:11/14/2018 12:40:33] PC = 0x10065dac
PANIC: “0”
PANIC: Calling “pti” for thread p_raw_01_cmd
Scheduler State
Thread  Events  State          Priority  Runnable  PT Msgs
0       0       SCHED_IDLE     0         0         0,0,0
1       0       SCHED_IDLE     0         0         0,0,0
2       0       SCHED_RUNNING  0         0         1,0,0
3       0       SCHED_IDLE     0         0         0,0,0
4       0       SCHED_IDLE     0         0         0,0,0
5       0       SCHED_IDLE     0         0         0,0,0
6       0       SCHED_IDLE     0         0         0,0,0
Steve Herber
University of Washington
I have considered adding a nightly cron job on my production system that uses rsync to back up the whole production site onto a test server. On the test server I could then use any version control system I wanted, and I would have a place to analyze the system without being on, or impacting, the production system.
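A minimal sketch of what the nightly job could run, written in Python around the `rsync` command. I have not deployed this. The site directory, test host, and destination path are hypothetical arguments you would fill in, and `--delete` is left off by default so that a mistake on production does not immediately remove files from the copy.

```python
import subprocess
from typing import List


def build_rsync_command(site_dir: str, test_host: str, dest_dir: str,
                        delete: bool = False) -> List[str]:
    """Build an rsync command that mirrors site_dir to test_host:dest_dir."""
    # A trailing slash on the source copies the directory contents,
    # not the directory itself, into dest_dir.
    src = site_dir.rstrip("/") + "/"
    cmd = ["rsync", "-a"]  # -a preserves permissions, times, and symlinks
    if delete:
        cmd.append("--delete")  # also remove files that no longer exist on prod
    cmd += [src, f"{test_host}:{dest_dir}"]
    return cmd


def run_backup(site_dir: str, test_host: str, dest_dir: str,
               delete: bool = False) -> int:
    """Run the rsync and return its exit code so cron can report failures."""
    cmd = build_rsync_command(site_dir, test_host, dest_dir, delete)
    return subprocess.run(cmd, check=False).returncode
```

Scheduled from cron on the production server, this pushes one copy per night. Committing the copy to version control on the test server after each run would then give a daily history of the site.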
Steve Herber
University of Washington