connect to process error

  • #48724
    Kevin Scantlan
    Participant

      Over the last couple of weeks we have started getting this type of error message on both our production and test sites.  We traced it to hcicmd, whose output contains this error message, and we can reproduce the error on demand.  We also get the same type of error when we use hcisitectl to bring down the monitor daemon, and with hciprocstatus when we take down a process.  We bounced the monitor daemon and also bounced a process for this example, but to no avail.  It’s not just this process that is getting the error.

      Example:

      Unable to contact process ‘test2_ps16’ on port 56732

      Error was: A remote host refused an attempted connect operation.

      Try to connect with its-goofy

      Response:

      pstop issued for thread ‘gecard_ci_16’

      We are not aware of any changes that we’ve made in prod and test that we can point to in the last 2 weeks.

      Thanks.

    Viewing 9 reply threads
      • #59498
        Anonymous
        Participant

          Kevin,

          Assuming you are running on Unix, try the following command.

          netstat -an | grep 56732

          This would show, on your side, whether anything is attempting to connect to this port, listening on it, or already connected.

          Once you have that cleared up, you may need to work with your network folks to find out why it is not able to establish a connection.

          I would also try an enable_all config at the process level and bounce the threads. A detailed log would be easier to trace through.

          Hope this helps.

          Reggie
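
A minimal sketch of the netstat check suggested above, for anyone who wants to script it rather than read the output by eye. It assumes the AIX-style `netstat -an` layout (port appended to the address after the final dot, as in `*.56732`); Linux prints `address:port` instead and would need a different split. The function name `port_states` and the `SAMPLE` text are made up for illustration and are not output from the poster's system.

```python
# Illustrative input only, written in the AIX `netstat -an` layout.
SAMPLE = """\
tcp4       0      0  *.56732                *.*                    LISTEN
tcp4       0      0  192.0.2.10.57814       192.0.2.10.56732       TIME_WAIT
udp4       0      0  *.56732                *.*
"""


def port_states(netstat_output, port):
    """Return (local, remote, state) tuples for TCP lines that involve `port`.

    Assumes columns: proto recv-q send-q local remote state, with the port
    written after the last dot of each address.
    """
    suffix = "." + str(port)
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP lines, and anything without a state column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix) or remote.endswith(suffix):
            matches.append((local, remote, state))
    return matches
```

A LISTEN entry with no ESTABLISHED entries means the process is accepting connections but nobody has reached it, which is worth knowing before involving the network team.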

        • #59499
          Kevin Scantlan
          Participant

            I turned “enable all” on for the process shown below, but did not see anything in the process log that stood out.  I did the netstat -an command as suggested and you can see the output below.  Anyone see anything that suggests a problem?

            *************************************

            [test2]/hci/qdx5.2/integrator/test2/exec/processes/test2_ps16>cat cmd_port

            50867

            [test2]/hci/qdx5.2/integrator/test2/exec/processes/test2_ps16>ep 50867       <

            tcp4       0      0  *.50867                *.*                    LISTEN

            [test2]/hci/qdx5.2/integrator/test2/exec/processes/test2_ps16>ci_16 pstop’   <

            Unable to contact process ‘test2_ps16’ on port 50867

            Error was: A remote host refused an attempted connect operation.

            Try to connect with its-goofy

            Response:

            pstop issued for thread ‘gecard_ci_16’

            [test2]/hci/qdx5.2/integrator/test2/exec/processes/test2_ps16>netstat -an | g>

            tcp4       0      0  *.50867                *.*                    LISTEN

            tcp4       0      0  161.130.112.91.57814   161.130.112.91.50867   TIME_WAIT

            [test2]/hci/qdx5.2/integrator/test2/exec/processes/test2_ps16>netstat -an | g>

            tcp4       0      0  *.50867                *.*                    LISTEN

            ***********************************
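
The output above shows a listener on `*.50867` while the client is still refused, so one way to narrow things down is to attempt the TCP connection yourself against a specific address. The sketch below is a generic connect probe, not anything Cloverleaf ships; the name `probe` and the 3-second timeout are assumptions chosen for illustration.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connect; return 'open', 'refused', 'timeout', or 'error: ...'."""
    try:
        # create_connection resolves `host` with the system resolver first.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Anything else: unreachable network, failed name lookup, and so on.
        return "error: %s" % exc
```

For example, comparing `probe("127.0.0.1", 50867)` with `probe("localhost", 50867)` on the affected host would show whether the literal address and the name behave differently.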

          • #59500
            Jim Kosloskey
            Participant

              Kevin,

              This is just a wild guess but could this port actually be in use by an application (even a Cloverleaf(R) inbound or outbound thread)?

              Jim Kosloskey

              email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

            • #59501
              Kevin Scantlan
              Participant

                I can’t imagine another application having control of the port that’s in the cmd_port file.  Additionally, it seems to happen with every process, and also with the monitor daemon when I run hcisitectl.  So it happens with hcicmd, hcienginerun, hcienginestop, and hcisitectl, to name a few.  And it’s happening on both our production and test machines.

              • #59502
                Kevin Scantlan
                Participant

                  We came upon the solution while dealing with something we thought was unrelated.  Here’s the deal:

                  The hcicmd command has a “-h” parameter for the host, which we never use.  If it is left off, the host defaults to LOCALHOST.  We have LOCALHOST set up in our hosts file pointing to 127.0.0.1, as we should.  However, our sys admin tells us that AIX by default looks at DNS first to resolve names, and only then at the hosts file.  This was not a problem until someone on the network side added a DNS entry for localhost.xxx.yyy (xxx.yyy being our domain), so hcicmd was trying to connect to that entry’s IP address instead of 127.0.0.1.

                  Our solution has been to have AIX first look at the hosts file, then the DNS for name resolution.  Our sys admin is looking into that.  Hopefully it will not require a reboot of the server.
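
A quick way to check for this condition is to ask the system resolver what `localhost` resolves to and flag anything outside the loopback range. The sketch below is a minimal illustration: the function names `resolve`, `non_loopback`, and `check` are made up, and it assumes that Python's `socket.getaddrinfo` follows the same name-resolution order as hcicmd on the host, which I have not verified.

```python
import ipaddress
import socket


def resolve(name):
    """Return the sorted, de-duplicated addresses the system resolver gives for `name`."""
    infos = socket.getaddrinfo(name, None)
    return sorted({info[4][0] for info in infos})


def non_loopback(addresses):
    """Return the addresses that fall outside 127.0.0.0/8 and ::1."""
    suspects = []
    for addr in addresses:
        bare = addr.split("%", 1)[0]  # drop any IPv6 zone id such as %lo0
        if not ipaddress.ip_address(bare).is_loopback:
            suspects.append(addr)
    return suspects


def check(name="localhost"):
    """Print what `name` resolves to and warn about non-loopback results."""
    addrs = resolve(name)
    print(name, "resolves to", addrs)
    for addr in non_loopback(addrs):
        print("WARNING:", name, "resolved to non-loopback address", addr)
```

Calling `check()` on the affected server should print only loopback addresses once the lookup order is fixed. On AIX the order is typically set by the `hosts` line in `/etc/netsvc.conf` or by the `NSORDER` environment variable; confirm the right setting for your release with your sys admin.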

                • #59503
                  Anonymous
                  Participant

                    Kevin,

                    As per log “A remote host refused an attempted connect operation”.

                    On the destination system, it can connect to one port only.

                    That means your prod and test interfaces to the destination system must use unique ports.

                    I think that’s where the problem is.

                    It looks like you posted the contents of the test log. The cmd_port file will not give you the exact port the thread is attempting to connect to.

                    Look in the netconfig and navigate to the destination thread’s properties.

                    Look at the port number.

                    Assume the destination port is 5556 and the host ip address is 111.33.44.55

                    then type the following:

                    netstat -an | grep 5556

                    You would see an entry like the following:

                    tcp4       0      0 155.33.333.50162     111.33.44.55.5556       ESTABLISHED

                    If the connection is not established, you would see LISTEN.

                    Your log does not give much information.

                    Once you set enable_all, you need to bounce the threads and processes.

                    When the panic occurs again, look in the log or upload the log to this forum so that I can take a look at it and give you an idea of the cause. Both prod and test logs are needed to see why the panic occurred.

                    Thanks

                    Reggie

                  • #59504
                    Peter Heggie
                    Participant

                      We just encountered the same problem. Multiple commands sent to processes returned this error:

                      Unable to contact process ‘xyz’ on port 12345.

                      This was on two different physical servers.

                      It also affected a Tcl SMTP email function, which showed this error:

                      error reading “sock49”: connection timed out

                      Our DNS administrator found an entry for localhost, removed it, and flushed the cache. A few minutes later, all was well.

                      Peter Heggie

                    • #59505
                      Donna Bailey
                      Participant

                        We had the same error a couple of weeks ago. A process had a problem with a Tcl proc, and when I stopped and started threads in that process, 2 or 3 threads showed this error.  I ended up finding the process with ps -ef | grep threadname and killing it (I believe I had to use -9 too). I don’t know if this will help you or not.

                        Donna

                        Donna Bailey
                        Tele: 315-729-3805
                        dbailey@microstar.health
                        Micro Star Inc.

                      • #59506
                        Bob Richardson
                        Participant

                          Greetings,

                          We are running AIX 6.1 TL 7 (Unix, of course).

                          The Cloverleaf software uses the ephemeral port range as defined by your system admin; the default is 32K to 64K. Avoid using these ports for any interfaces developed in Cloverleaf.

                          Hcicmd uses this range to get its port for communication with process threads.  Cloverleaf multi-connect server threads (interfaces) also use the ephemeral range.

                          Also, as a rule we avoid ports below 8K, as some system services use these ports for various utilities and services.  Check your /etc/services file if you are curious.

                          We are running the 5.8.5.0 Integrator and plan to apply Revision 6, which fixes some issues with hcicmd and server port problems.

                          Hope this helps to narrow down your problem.
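
The port guidance in this reply can be sketched as a small classifier. The exact bounds are assumptions: 32768 to 65535 is one reading of "32K to 64K", and 8192 is one reading of "8K". The real ephemeral range is whatever your system admin has configured, so treat the constants and the name `classify_port` as placeholders.

```python
EPHEMERAL_LOW = 32768    # assumed lower bound of the ephemeral range ("32K")
EPHEMERAL_HIGH = 65535   # assumed upper bound of the ephemeral range ("64K")
SERVICE_CEILING = 8192   # assumed cutoff for the "below 8K" rule of thumb


def classify_port(port, low=EPHEMERAL_LOW, high=EPHEMERAL_HIGH,
                  ceiling=SERVICE_CEILING):
    """Classify a TCP port as 'system', 'ephemeral', or 'ok' for interface use."""
    if not 0 < port <= 65535:
        raise ValueError("not a valid TCP port: %r" % (port,))
    if port < ceiling:
        return "system"      # may collide with entries in /etc/services
    if low <= port <= high:
        return "ephemeral"   # range used for command ports such as cmd_port
    return "ok"
```

Under these assumed bounds, the command ports quoted earlier in the thread (56732 and 50867) classify as `ephemeral`, which matches the statement that hcicmd takes its ports from that range.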

                        • #59507
                          Peter Heggie
                          Participant

                            Thank you. The localhost definition in the network DNS was the problem; removing it and flushing the cache immediately solved it. None of our ports are below 8K or above 32K. And yes, when we had the problem, we had to use the command-line stop, which (eventually) used the sig kill.

                            Peter Heggie
