Hi Keith – If you’re attending the User Conference in Dallas next month, I’ll be doing a presentation on how we handle monitoring and alerting at Christiana Care Health System.
We developed relatively straightforward ksh scripts and Tcl procs that read thread-specific parameters from tables. Those parameters configure scheduled downtime windows, the maximum acceptable inactivity duration for each inbound feed, the maximum acceptable queue depth for each outbound feed, selective recycling of stuck threads, and selective manual enabling/disabling of each thread.
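To give a rough idea of what a table-driven check like this can look like, here is a minimal sketch in Python. It is not our actual ksh/Tcl code. The thread names, the field names, and the rule that an over-limit outbound queue counts as "stuck" are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import List, Optional, Tuple


@dataclass
class ThreadParams:
    """One row of a hypothetical per-thread parameter table."""
    enabled: bool = True                               # manual on/off switch
    downtime: Tuple[Tuple[time, time], ...] = ()       # (start, end) windows
    max_idle_minutes: Optional[int] = None             # inbound inactivity limit
    max_queue_depth: Optional[int] = None              # outbound queue limit
    recycle_when_stuck: bool = False                   # recycle instead of only alerting


def in_downtime(params: ThreadParams, now: datetime) -> bool:
    """Return True if 'now' falls inside any scheduled downtime window."""
    t = now.time()
    for start, end in params.downtime:
        if start <= end:
            if start <= t < end:
                return True
        elif t >= start or t < end:   # window wraps past midnight
            return True
    return False


def check_thread(name: str, params: ThreadParams, now: datetime,
                 idle_minutes: int, queue_depth: int) -> List[str]:
    """Compare one thread's current status against its table row."""
    if not params.enabled or in_downtime(params, now):
        return []                     # monitoring suppressed for this thread
    actions = []
    if params.max_idle_minutes is not None and idle_minutes > params.max_idle_minutes:
        actions.append(f"ALERT {name}: idle {idle_minutes} min "
                       f"(limit {params.max_idle_minutes})")
    if params.max_queue_depth is not None and queue_depth > params.max_queue_depth:
        actions.append(f"ALERT {name}: queue depth {queue_depth} "
                       f"(limit {params.max_queue_depth})")
        if params.recycle_when_stuck:  # assumption: over-limit queue means stuck
            actions.append(f"RECYCLE {name}")
    return actions


# Hypothetical table contents; real values would come from a lookup table.
TABLE = {
    "adt_in": ThreadParams(downtime=((time(23, 0), time(1, 0)),),
                           max_idle_minutes=30),
    "lab_out": ThreadParams(max_queue_depth=100, recycle_when_stuck=True),
    "old_feed": ThreadParams(enabled=False, max_idle_minutes=5),
}
```

A scheduled job would loop over the table, collect each thread's current idle time and queue depth, and call `check_thread` for each row. Keeping the limits in a table means a threshold or downtime window can be changed for one thread without touching the script.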