hcimsiutil commands (any options) hang on certain sites


  • Creator
    Topic
  • #53730
    Michael Heldt
    Participant

      cloverleaf: 5.4.1

      aix: 5.3

      Starting July 1, 2013, we noticed that our cycle-save and thread-reset scripts did not finish as expected, in BOTH TEST AND PROD.  On further investigation, the hang appears to be in the hcimsiutil command that the scripts run to save off and cycle the thread stats.  Our team made no changes, and according to the SAs and network folks, nothing else was changed either.  We have also rebooted the LPAR, but the issue continues.  Test and prod are on completely separate physical hosts.

      We have noted that since hcimsiutil gets called from common scripts (e.g. hcienginestop), those scripts sometimes take more than 7 minutes to return to the command prompt.  Yet in a separate terminal you can see that the process itself is down.  So the script appears to be waiting on the return, not necessarily on the command that stops the process.

      To add to the fun, another oddity is that it appears to happen only on certain sites in both test and prod, and the affected sites have similar names (phar3dev and phar3pro in test and production respectively).

      We have a ticket open with Infor, but we are interested in whether others are seeing something similar.

      Besides rebooting the LPAR, we have also run hcimsiutil -R and hcidbinit -AC on all sites!

    • Author
      Replies
      • #78743
        Michael Heldt
        Participant

          1 – In one terminal session, I issued the following command (date is included to see how long it takes to return to a command prompt):

          $ date;hcienginestop -p phar_oshkosh;date

          Wed Jul 3 11:58:11 CDT 2013

          Trying hcicmd…

          Response:

          Process shutdown initiated

          Process ‘phar_oshkosh’ is not running

          Wed Jul 3 12:06:06 CDT 2013

          2 – In a separate terminal, you can see that the process phar_oshkosh has stopped with a normal exit, but the original terminal session that issued the command has NOT returned. The original terminal does report “Process ‘phar_oshkosh’ is not running” within ~12 seconds of issuing the command, but it still has not returned to the command prompt!

          $ date;hciprocstatus;date

          Wed Jul 3 11:58:40 CDT 2013

          Process          State    Message

          phar_hartford    running  Started at Wed Jul 3 11:56:48 2013
          phar_oshkosh     dead     Normal exit at Wed Jul 3 11:58:23 2013
          phar_lakeland    running  Started at Wed Jul 3 08:48:49 2013
          phar_burlington  running  Started at Wed Jul 3 08:48:39 2013
          phar_grafton     running  Started at Wed Jul 3 08:48:33 2013
          phar_aph         running  Started at Wed Jul 3 08:49:19 2013
          phar_manitowoc   running  Started at Wed Jul 3 08:48:29 2013
          phar_kenosha     running  Started at Wed Jul 3 08:48:20 2013

          Wed Jul 3 11:58:40 CDT 2013
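
The date;command;date pattern used in both steps can be wrapped in a small helper so the delay is reported as a number. This is only a sketch: the function name timed is made up for illustration, and it assumes the local date command supports +%s (seconds since the epoch), which is common but not guaranteed on every platform.

```shell
# Hypothetical helper: run a command and report how long it took to return.
# Assumption: `date +%s` is supported by the local date command.
timed() {
    start=$(date +%s)
    "$@"                      # run the command exactly as given
    status=$?
    end=$(date +%s)
    # Report on stderr so the command's own stdout is left untouched.
    echo "elapsed: $((end - start))s (exit status $status)" >&2
    return $status
}
```

For example, timed hcienginestop -p phar_oshkosh would print the elapsed seconds once the prompt finally comes back.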

        • #78744
          Michael Heldt
          Participant

            …and as usual it boiled down to a network + /etc/resolv.conf issue.

            Apparently nslookup gets executed somewhere in hcimsiutil.  The host names in our netconfig that were simply aliases defined in /etc/hosts were the ones “hanging”.

            A DNS server listed in resolv.conf had been removed from the network (not just had its DNS service stopped), so nothing answered at that address at all and nslookup would hang waiting for a response.  This completely explained why some sites had issues and others did not, across multiple environments.

            Thanks gotham for the help!
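
For anyone chasing a similar hang, one way to spot a dead resolver is to query each nameserver from resolv.conf directly with a short time limit. The following is a rough sketch, not what was actually run here: the function names are made up, and it assumes dig is installed (its +time and +tries options bound how long each query waits). If dig is missing, every server will be reported as not responding.

```shell
# Hypothetical helpers for checking the resolvers listed in resolv.conf.

# Print the nameserver addresses listed in a resolv.conf-style file.
list_nameservers() {
    awk '$1 == "nameserver" && $2 != "" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Query one nameserver with a bounded wait so a dead address cannot hang us.
# Assumption: dig is available; it exits non-zero when no reply is received.
probe_nameserver() {
    server=$1
    name=${2:-localhost}
    if dig +time=2 +tries=1 "@$server" "$name" >/dev/null 2>&1; then
        echo "$server ok"
    else
        echo "$server NO RESPONSE"
    fi
}

# Probe every nameserver in the given file (default /etc/resolv.conf).
check_resolvers() {
    for ns in $(list_nameservers "$1"); do
        probe_nameserver "$ns" "$2"
    done
}
```

Running check_resolvers /etc/resolv.conf with one of the affected host names as the second argument should flag any listed server that no longer answers.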
