Jeff Shields

Forum Replies Created

Viewing 5 replies – 1 through 5 (of 5 total)
  • in reply to: Alerts: repeat until problem is solved? #56453
    Jeff Shields
    Participant

      Thanks for the replies.

      Yes, I figured out in the end that if I change the modification time of the default.alrt file by touching it, the alerts are retriggered, which is what I need. However, I would like to wait 2 minutes before touching the file; otherwise a flood of alerts would be generated.

      I’m just trying to figure out a way to wait 2 minutes without blocking the Monitor Daemon (sleep 120 is no good as we are unable to run hciconnstatus while it’s sleeping).

      Jeff
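One possible way to get the 2-minute delay without blocking the caller is to run the sleep and the touch together in a background subshell, so the calling script returns straight away. This is only a sketch: the function name delayed_touch is made up for illustration, the file path is passed in rather than assumed, and it presumes the alert action is an ordinary shell script that is allowed to leave a background job running.

```shell
#!/bin/sh
# Hypothetical helper (not part of any product): touch a file after a delay
# without making the caller wait.
#   $1 = delay in seconds (120 for the 2 minutes mentioned in the post)
#   $2 = file whose modification time should be updated
delayed_touch() {
    delay="$1"
    file="$2"
    # The subshell sleeps and then touches the file. The trailing & puts it
    # in the background, so this function returns immediately.
    # Redirecting stdin/stdout/stderr keeps the background job from holding
    # the caller's streams open, which could otherwise still make a parent
    # that reads the script's output wait for the full delay.
    ( sleep "$delay" && touch "$file" ) </dev/null >/dev/null 2>&1 &
}

# Example call (commented out; the path is a placeholder):
# delayed_touch 120 /path/to/default.alrt
```

Because the wait happens in a separate process, other commands can run during the 2 minutes. If the action fires repeatedly, several delayed touches could pile up, so some kind of lock or marker file may be worth adding.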

      in reply to: Alerts failing when engine panics #56096
      Jeff Shields
      Participant

        Thanks for all the responses. I went for the process status alert in the end; it always triggers, even if the process panics.

        in reply to: Time configuration – Alerts #56167
        Jeff Shields
        Participant

          You should put 8-17 in the hours slot.

          It’s only the hour portion of the time that’s needed here.

          in reply to: Alerts failing when engine panics #56092
          Jeff Shields
          Participant

            We have alerts of type ‘pstat’, triggered if a thread’s protocol status is not ‘up’ for 60 seconds, and also of types ‘ipque’ and ‘opque’, triggered if the queue depth is more than zero for 15 minutes.

            These seem to work very well, except if a process panics; all its threads go down but the alert is not triggered. Messages also build up in the recovery database destined for the affected threads, but no alerts are triggered for this either.

            in reply to: Can’t lock semaphore #55962
            Jeff Shields
            Participant

              We’ve had this problem many times. We’ve found that the only way to get rid of it is to:

              – Stop all processes and site daemons

              – Copy the $HCISITEDIR/exec folder to something else (e.g. $HCISITEDIR/exec1)

              – Delete the $HCISITEDIR/exec folder

              – Copy $HCISITEDIR/exec1 back to $HCISITEDIR/exec

              – Restart site daemons and processes

              Don’t ask me why it works. I’ve no idea; it’s just something I tried one day. Maybe deleting the exec directory gets rid of the broken semaphore links?

              Also, check your system semaphore parameters in /etc/system to make sure they are set to the recommended levels.
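The copy, delete and copy-back steps in the middle of that list could be scripted roughly as follows. This is a sketch only: recreate_dir is a made-up name, and it assumes all processes and site daemons have already been stopped by whatever means you normally use (those commands are deliberately left out). The backup name simply appends "1" to the directory, as in the exec1 example.

```shell
#!/bin/sh
# Hypothetical helper: copy a directory aside, delete the original, then
# copy the backup back under the original name.
# Assumes everything using the directory has already been stopped.
#   $1 = directory to recreate, e.g. "$HCISITEDIR/exec"
recreate_dir() {
    dir="$1"
    backup="${dir}1"    # exec -> exec1, matching the example in the post

    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    if [ -e "$backup" ]; then
        echo "backup already exists: $backup" >&2
        return 1
    fi

    cp -Rp "$dir" "$backup" || return 1   # copy, preserving modes and times
    rm -rf "$dir" || return 1             # delete the original directory
    cp -Rp "$backup" "$dir"               # copy the backup back
}

# Example call (commented out):
# recreate_dir "$HCISITEDIR/exec"
```

The function refuses to run if the backup name is already taken, so an earlier exec1 is not silently overwritten. The backup is left in place afterwards; remove it by hand once the site is confirmed healthy.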
