So we’re finally digging hard into alerts, and we have some questions. We’re on 2022.09, running on RHEL 8.10.
First, global vs. local alerts. It seems there are a few ‘global’ alerts that can be set. Most of these are around system resources (disk space, CPU, etc.). It also seems that you can alert on processes globally as well. Is this accurate? When selecting the ‘source’ for process status, you get a list of ALL processes. Does this actually function globally?
With that said, I have questions about a few variables and how they interact. When calling a Tcl script, according to the documentation and the sample script (which makes NO sense), there is a ‘return’ value. Is this returned into %A? I.e., if I run a Tcl script that does some additional checking and returns a value, does that value get put into %A? It’s really unclear what happens there.
%P and %T vs. %p and %t. According to the documentation, %P and %T return a LIST of triggering variables (comma-separated, because reasons). However, for %p and %t, the documentation says: ‘If T3 is “dead”, then the alert is triggering again. %T is replaced with T2 and T3. %t is one of these two threads, if %t is replaced with T3, and %p is replaced with P3.’
So if two threads trigger, %T will contain thread2,thread3, but the documentation says %t is ‘one of these two threads’. How does it know which one? Does the alert trigger multiple times, once for each individual thread?
The last one is more of a confirmation: %R, the repetition value. Say a thread is down and I pass in %R. It will increment until the thread is back up. Once the thread is up and the alert later goes off again, does the counter reset?