Proto Errors in 5.8
Example:
Proto Err:
Error Msg: FTP operation failed: bind failed with errno 67: Address already in use
We have a few threads that intermittently go into an opening status on a proto error. We see this more in 5.8 than we ever did in 5.6. It appears to be happening at only one or two sites.
I think this may be a local binding address issue. We have had our server address configured in the local binding address since 5.6, where it worked with few or no issues. Most of the threads we are seeing this on go over VPN and are configured as FTP and TCP/IP connections. Adding an extra alert to stop and restart the thread has helped in some cases. We are running 5.8 on AIX with HA; hardware and OS are not an issue here.
I am posting this to see if anyone has seen the same problem in 5.8 over 5.6.
This could be a networking and timing issue, where the vendor is closing or freeing up the port. But most of these interfaces have been up for a while without any issues. Not sure if it
Dave,
We have a test 5.8 site we just got running and we’re seeing the “FTP operation failed: bind failed with errno 67: Address already in use” error when we use the fileset-ftp protocol.
Did you ever determine what was causing this error?
thanks,
Steve
Are you sure you have the host address in the correct place?
Under FTP Options there is a field labeled:
Host Name or IP Address
That field lets you bind to a specific network when there are multiple networks, as when using HA. It is normally left blank.
Under FTP Host Info you see
HOST:
That is where the address of your FTP server goes
Just a thought
The host address is correct. The thread gets the data, which results in the bind errno 67 message, and the message is left in a pending state. If I stop and start the thread, the file is ftp’ed successfully.
This configuration works fine under 5.3 (we’re upgrading to 5.8).
thanks,
Steve
Strange error message, Steve. It does not look like a cURL error.
The primary difference from 5.3 is that the FTP protocol now uses cURL.
Have you contacted Support?
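Since the 5.8 FTP protocol uses cURL, one way to check the local-binding theory from earlier in the thread outside the engine might be to run the curl command-line tool with and without its --interface option, which asks curl to use a given interface name, IP address, or host name for the local side of the connection. This is only a minimal sketch: the URL and bind address are placeholders, and it is an assumption that the Local Binding Address setting behaves like --interface and that the curl tool is installed on the box.

```sh
#!/bin/sh
# Sketch: build curl command lines for an FTP directory listing,
# with and without a local bind address.
# The URL and address used below are placeholders, not values from this thread.

# Print the curl command line for one attempt.
# $1 = FTP URL, $2 = local address to bind to (empty = let the OS choose)
build_ftp_test() {
    url=$1 local_addr=$2
    cmd="curl --verbose --list-only --connect-timeout 15"
    if [ -n "$local_addr" ]; then
        cmd="$cmd --interface $local_addr"
    fi
    echo "$cmd $url"
}

# Show both command lines; run them by hand (credentials via --user or ~/.netrc).
build_ftp_test "ftp://ftp.example.com/" ""
build_ftp_test "ftp://ftp.example.com/" "cluster.example.com"
```

If the listing works without --interface and fails with it, that would point at the local bind rather than at the remote FTP server.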
I have not contacted support, but will do so.
thanks,
Steve
Steve:
Did you upgrade your OS when going from Cloverleaf 5.6 to Cloverleaf 5.8?
If yes, did you upgrade the OS in place or lay down the OS from scratch on another server?
The reason I ask is that I’ve been keeping track of concerns about going from Cloverleaf 5.3 to Cloverleaf 5.8 so that we do not get hurt by known issues.
One of those concerns is that when upgrading AIX 5.3 in place to, say, AIX 6.1, the upgrade is known to get confused and reset some system settings to undesired values, which causes problems.
I put our system admins on notice to watch out for it. They already knew about it, but apparently it has burned those less aware.
Here is a URL to look at if interested:
https://www.ibm.com/developerworks/mydeveloperworks/blogs/cgaix/entry/aix_6_1_migration_iostat_and_maxuproc_change_to_their_defaults?lang=en_us
Russ Ross
RussRoss318@gmail.com
Found the problem.
A little background: We’re upgrading from Cloverleaf 5.3 to 5.8, moving to a new AIX 6.1 HA cluster (Russ – we did not upgrade AIX, we started with 6.1, but thanks for your post). Because it’s a cluster we were advised to set LOCAL_IP in NetConfig (aka Local Binding Address in the GUI) to the cluster name for all outbound threads. LOCAL_IP was set to our cluster name on the FTP thread. When I set LOCAL_IP to null I was able to FTP files successfully.
Charlie – I found this in the process log:
[fset:wrte:ERR /0: dDaysurgrb:10/12/2011 11:46:05] Error while trying to write 09302011_QQQQQQ_HELEN_08161954_EH1071119941.rtf.
Detailed error:bind failed with errno 67: Address already in use
Curl errCode:7 Curl error: Couldn’t connect to server
thanks to all for your help.
–Steve
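For anyone wanting to see which threads are hitting this, a rough sketch for tallying the bind errors per thread from a process log follows. It assumes every entry looks like the excerpt above, with the thread name as the fourth colon-separated field of the bracketed header line and the "bind failed with errno 67" detail on a following line; other log layouts would need a different parse. The function name is made up for illustration.

```sh
#!/bin/sh
# Sketch: count "bind failed with errno 67" errors per thread in a process log.
# Assumes header lines like:
#   [fset:wrte:ERR /0: threadname:10/12/2011 11:46:05] Error while trying to write ...
# $1 = path to the process log
count_bind_errors() {
    awk -F: '
    # Header line: remember the thread name (4th colon-separated field, trimmed)
    /^\[/ { thread = $4; gsub(/^ +| +$/, "", thread) }
    # Detail line: charge the error to the most recent thread seen
    /bind failed with errno 67/ { n[thread]++ }
    END { for (t in n) print n[t], t }
    ' "$1" | sort -rn
}

# Usage (path is a placeholder):
#   count_bind_errors /path/to/process.log
```

The output is one line per thread, count first, busiest thread on top.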
Steve:
I’m always happy to learn of another Cloverleaf shop with an environment similar to ours so we can leverage each other’s experiences.
We have an HA cluster running on AIX 5.3 right now and will also be going to AIX 6.1 when we can, but we do use the service address for interfaces and FTP.
This took a bit of work to figure out because the service address does not fail over as transparently as I wish it did, as you obviously know by now.
Also, we are running ether channel, which the way we implemented it is really NIC load balancing with one or more NICs behaving as one.
So when we fail over, I currently use the NIC alias command (ifconfig ... alias) to make the service address appear before the persistent address, which causes the service address to behave the way the persistent address does by default.
Here is some of my code in my HA start up script to help give you a tangible example if you are interested:
#-----------------------------------------------------------------------------------------
# Set the hostname to the service hostname so that the failover is transparent
# Note: no longer necessary to set hostname to service name so commented out on 11/7/2009
#       but did not delete in case future HACMP ever requires this be done again
#-----------------------------------------------------------------------------------------
### hostname $HACMP_SERVICE_HOSTNAME
#--------------------------------------------------------------------------
# get netstat snapshot before doing any ether channel alias reconfiguration
#--------------------------------------------------------------------------
# start a fresh audit file, then record routing table and interface state
rm -f "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "----------------------------------------------------------------" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "netstat -rn before doing any ether channel alias reconfiguration (`date`)" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "----------------------------------------------------------------" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
netstat -rn >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "----------------------------------------------------------------" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "netstat -in before doing any ether channel alias reconfiguration (`date`)" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "----------------------------------------------------------------" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
netstat -in >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
#-------------------------------------------------------------------------------------------
# remove the persistent address ether channel alias from en2 to assure all outbound traffic
# will use the service address ether channel alias on en2 as the default address
#
# Note:
# - this is so that all inbound and outbound traffic will failover transparently
# - the hostname is the name of the persistent address
#   because it is no longer necessary to set the hostname to the name of the service address
#-------------------------------------------------------------------------------------------
ifconfig en2 $HACMP_ORIG_HOSTNAME -alias
#-----------------------------------------------------------------------------------------------------------
# add back the ether channel alias for the persistent address on en2
# now that the ether channel alias for the service address is the default address for all outbound traffic
# (the interface name and netmask are specific to our site)
#-----------------------------------------------------------------------------------------------------------
ifconfig en2 $HACMP_ORIG_HOSTNAME netmask 255.255.255.128 alias
#--------------------------------------------------------------------------
# get netstat snapshot after doing any ether channel alias reconfiguration
#--------------------------------------------------------------------------
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "================================================================" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "netstat -rn after doing any ether channel alias reconfiguration (`date`)" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "================================================================" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
netstat -rn >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "================================================================" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "netstat -in after doing any ether channel alias reconfiguration (`date`)" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "================================================================" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
netstat -in >> "$SCRIPTDIR/os.start__netstat-audit.txt"
echo "" >> "$SCRIPTDIR/os.start__netstat-audit.txt"
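The before and after snapshot blocks in the script above repeat the same seven lines four times, so they could be folded into one helper. The following is only a sketch of that idea, not part of the original startup script; the function name audit_snapshot is made up, and it uses one ruler style for both the before and after sections.

```sh
#!/bin/sh
# Hypothetical helper: append one labelled netstat snapshot to an audit file.
# $1 = netstat flags (e.g. -rn), $2 = "before" or "after", $3 = audit file path
audit_snapshot() {
    flags=$1 when=$2 file=$3
    {
        echo ""
        echo "----------------------------------------------------------------"
        echo "netstat $flags $when doing any ether channel alias reconfiguration ($(date))"
        echo "----------------------------------------------------------------"
        echo ""
        netstat "$flags"
        echo ""
    } >> "$file"
}

# Usage, mirroring the original script:
#   audit_snapshot -rn before "$SCRIPTDIR/os.start__netstat-audit.txt"
#   audit_snapshot -in before "$SCRIPTDIR/os.start__netstat-audit.txt"
#   ... ifconfig alias changes ...
#   audit_snapshot -rn after  "$SCRIPTDIR/os.start__netstat-audit.txt"
#   audit_snapshot -in after  "$SCRIPTDIR/os.start__netstat-audit.txt"
```

Grouping the echo and netstat calls in braces means the audit file is opened once per snapshot rather than once per line.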
We are on AIX 5.3 now with ether channel, but in older versions of AIX, before we used ether channel, I detached the NICs and added them back in the desired order to accomplish this.
Also, a nice feature of the ether channel aliasing: my hostname command now shows the physical machine name. In the old days before ether channel aliasing I would change the hostname to the service name on HA startup, which confused the admins, and once in a while they powered off the wrong physical box. OUCH!
I did become aware of a NetConfig setting that lets a Cloverleaf TCP/IP real-time interface specify which IP it presents as its source if that needs to be different from the default IP for outbound traffic.
However, that would require changing just about every interface every time we upgrade, because our upgrades are done as a scratch install on a new LPAR, like you might be doing.
I really wanted a fail-over that was as transparent as I could figure out how to make it.
Russ Ross
RussRoss318@gmail.com
Here is a previous post of mine that has some additional detail on the same topic, if you are interested enough to want some additional clarity.
https://usspvlclovertch2.infor.com/viewtopic.php?p=22642#22642
Russ Ross
RussRoss318@gmail.com
Hi Russ – thanks for the info. I’m meeting with our AIX admin today and will bring this up.
You’re correct in that we are doing a scratch install to the new cluster. Our solution was to write a script that changed all occurrences of “LOCAL_IP {}” (the default) in all our NetConfigs to “LOCAL_IP clustername”. Then we ran into problems with linked threads and fileset-ftp threads. The fix, as noted, is to set LOCAL_IP back to null/empty. We’re testing all the other thread protocols to see if they work.
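A rough sketch of what reversing that change might look like follows. It is not the script we actually ran. It assumes the setting appears in NetConfig literally as "LOCAL_IP clustername", as quoted above, and it reverts every occurrence in the file, so limiting it to particular threads would need something that understands the NetConfig structure. The function name and the cluster name are placeholders.

```sh
#!/bin/sh
# Sketch: print a NetConfig with "LOCAL_IP <name>" turned back into "LOCAL_IP {}".
# $1 = cluster name that was written into LOCAL_IP, $2 = path to a NetConfig
revert_local_ip() {
    awk -v old="LOCAL_IP $1" -v new="LOCAL_IP {}" '
    {
        line = $0; out = ""
        # Literal (non-regex) search so dots in a host name are not wildcards
        while ((i = index(line, old)) > 0) {
            rest = substr(line, i + length(old))
            if (rest ~ /^[A-Za-z0-9._-]/) {
                # A longer name that merely starts with the cluster name: keep it
                out = out substr(line, 1, i - 1 + length(old))
            } else {
                out = out substr(line, 1, i - 1) new
            }
            line = rest
        }
        print out line
    }' "$2"
}

# Usage (writes to stdout so the result can be reviewed before replacing anything):
#   revert_local_ip clustername NetConfig > NetConfig.new
#   diff NetConfig NetConfig.new
```

Writing to a new file and diffing first keeps the original NetConfig untouched until the change has been checked.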