FTP process leaving connections in FIN_WAIT_2 status


  • #53631
    John Stafford
    Participant

      I have a pretty unique issue that I am hoping to get some help with.

      Background

      We have two Cloverleaf environments (5.5 rev 1), production and development, both running on AIX 5.3. We also have two Cerner environments, production and development, both running on AIX 5.3.

      We have a number of FTP processes that get files from our Cerner server and put them on foreign systems. These processes were configured before I assumed this position, and have not presented any issues.

      For this reason, when I was tasked with creating a new FTP job from Cerner, I copied one of the existing FTP threads and modified the information for the new directory and files. In our development system, this worked flawlessly.

      However, when I performed the same process in production, it brought the server down after several days of seemingly normal operation. The server administrator found that the Cloverleaf box had more than 30,000 sockets open, all sitting in CLOSE_WAIT status on his side. Killing the thread closed all of the sockets, but while the thread is running it opens another port every 60 seconds.

      On the Cloverleaf box, the connections sit in FIN_WAIT_2 status, which times out after around 10 minutes. The CLOSE_WAIT connections on the other box do not time out; they sit there until the FTP connection is closed. Throughout all of this, the files are successfully transferred.
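For anyone chasing a similar buildup, here is a minimal Python sketch that tallies TCP connections by state from saved `netstat -an` output. It assumes a Unix-style layout (six columns, state last); the function name, the sample addresses, and the `peer` filter are illustrative and not taken from the systems described above.

```python
from collections import Counter


def count_tcp_states(netstat_output, peer=None):
    """Count TCP connections per state in `netstat -an` style text.

    `peer` is an optional substring (for example a remote address) that a
    line must contain to be counted.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows start with "tcp", "tcp4" or "tcp6"; the state is the last column.
        if len(fields) < 6 or not fields[0].lower().startswith("tcp"):
            continue
        if peer is not None and peer not in line:
            continue
        counts[fields[-1]] += 1
    return counts


# Made-up sample in the dotted address.port style; real output will differ.
SAMPLE = """\
Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)
tcp4       0      0  10.0.0.5.51234     10.0.0.9.21        FIN_WAIT_2
tcp4       0      0  10.0.0.5.51240     10.0.0.9.21        FIN_WAIT_2
tcp4       0      0  10.0.0.5.22        10.0.0.7.40112     ESTABLISHED
"""

print(count_tcp_states(SAMPLE, peer="10.0.0.9.21"))
```

Running the same tally a few minutes apart on both boxes should show whether the FIN_WAIT_2 or CLOSE_WAIT count keeps climbing.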

      Troubleshooting

      Workarounds that did not work:

      Copied the working thread in test and modified it to connect to prod.

      Created a new thread from scratch.

      Removed the Directory Parse and route tcl procs.

      Tried modifying the settings on the thread: changed the style (single, eof, etc.), the read interval, and so on.

      Observations

      There is one thing that stands out to me when this thread is running. Whenever it grabs a file, it goes to an Up status, opening a new port. After around 30 seconds (the scan interval), it goes back to Opening. Around 30 seconds later (when there is another file), it goes to Up again.

      The thread that I copied, which is connected to the development Cerner box, stays Up. I think that this is the crux of the issue that we’re experiencing. Can anyone think of a reason that an inbound FTP thread (one that gets the file) would be closing and reopening the connection? Given that we have other, similar FTP jobs that connect to the server without any issue and stay Up, there appears to be something amiss with this thread.

      I appreciate any insight that anyone can provide into this issue.

      TL;DR version – New FTP job opens a new port every time it reads a file and leaves them in a FIN_WAIT_2 status. Thread works in test, but not in prod. Other FTP jobs to the same server work fine in prod. FTP thread goes up when it grabs a file, but reverts back to Opening after the scan interval. Tried a number of methods to change how the thread is configured, to no avail.

      • #78352
        Keith McLeod
        Participant

          Try adding -a as an option to the nlst command.

          nlst -a

        • #78353
          John Stafford
          Participant

            Keith,

            Remarkably, this seems to have worked! Might I ask why this would be necessary when connecting to our Production domain, but not in our Development domain? I do not know what the -a flag does.

            Thanks for your help!

          • #78354
            Keith McLeod
            Participant

              It has to do with . and .. and possibly how the server you are connecting to is configured. I don’t recall the specifics, but if you don’t see . and .., I believe you see the problem you experienced. You can verify by FTPing to each server and running the command to see what you get. I had the issue with an HP-UX configuration.
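The check suggested above could be scripted roughly as follows with Python's standard `ftplib`. This is only a sketch: the helper names are made up, and the host, credentials, and directory passed to `check_server` are placeholders you would supply yourself.

```python
import ftplib


def list_names(ftp, *args):
    """Run NLST with the given arguments; return (names, error_text)."""
    try:
        return ftp.nlst(*args), None
    except ftplib.error_perm as exc:
        # Some servers answer NLST on an empty directory with a 5xx reply.
        return [], str(exc)


def compare_nlst(ftp):
    """Return the results of plain NLST and NLST -a for the current directory."""
    return {"nlst": list_names(ftp), "nlst -a": list_names(ftp, "-a")}


def check_server(host, user, password, directory):
    """Log in, change to `directory`, and compare the two listings.

    All four arguments are placeholders; nothing here is specific to
    Cloverleaf or Cerner.
    """
    with ftplib.FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(directory)
        return compare_nlst(ftp)
```

Calling `check_server` once against each server while the directory is empty should show whether plain `nlst` errors out on one and not the other.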

            • #78355
              Chris Williams
              Participant

                nlst throws an error if it doesn’t find any files in the directory. Adding the -a causes nlst to include . and .., thereby keeping the directory from appearing to be empty. However, the output will include those two items.
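If whatever processes the listing should not see those two extra items, they can be filtered out before use. A minimal Python sketch of that filtering, assuming the listing is already available as a list of names (the function name and sample file name are hypothetical, and this is not the Cloverleaf directory parse proc):

```python
import posixpath


def drop_dot_entries(names):
    """Remove the "." and ".." entries that NLST -a adds to a listing."""
    return [
        name
        for name in names
        # Compare the last path component, since some servers prefix the directory.
        if posixpath.basename(name.rstrip("/")) not in (".", "..")
    ]


print(drop_dot_entries([".", "..", "adt_001.dat"]))
```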

              • #78356
                John Stafford
                Participant

                  Thanks for the updates.

                  Last night, I remembered that the previous interface specialist mentioned that there needs to be a dummy file in the directory, though he never mentioned why. I suppose that this would be it.

                  Would the directory parse tcl produce the behavior that I was experiencing, if there is nothing in the directory?

                  In the interest of completeness, putting a dummy file in the directory also corrected the behavior.
