Question about deleting outbound threads and their routes.

  • #54510
    Sean Farley
    Participant

      I have a question about deleting threads, and specifically, the routes defined for those threads.

      If you delete the outbound thread, Cloverleaf is supposed to delete the routes as well.  However, I deleted two outbound threads in November, and about a month later our engine crashed!  After several hours of troubleshooting, we found that our source thread was still trying to send to the two outbound threads that I had deleted.  All I did was delete the outbound threads and click “OK” when Cloverleaf informed me that it would delete the routes associated with the threads.  I then bounced the process those threads were part of and verified in the network monitor that they were actually gone.

      While the engine was crashing, the error logs kept showing attempts to send messages to the two threads I had deleted a month earlier.  So basically, every message that hit the source thread after I deleted the two threads was still being routed to them (the route was defined to send EVERYTHING).  And this was coming from an ADT feed, so the volume was very high.  I ended up having to rebuild those threads so I could search the recovery database.  Sure enough, there were over a million messages set to go out to those threads.  When the process was bounced, the attempt to route all of those messages was just CRUSHING the CPU on the server, and eventually the process would crash.  We ended up having to move all of the threads in that process to a new process.  When we did that, everything came back up as expected, so it’s like Cloverleaf didn’t fully delete those routes, and the original process still had some sort of configuration pointing at the deleted threads.  I got all of those messages deleted out of the recovery database, restarted our engine, and everything was back to normal.

      I was just wondering if deleting the outbound thread is the recommended way to delete the route as well.  For now, we go into the source thread and delete the routes completely before removing the threads to prevent this from happening again.  Has anybody else ever seen this?  Is this a possible Cloverleaf 5.8 bug?

      We run a Fedora box with Cloverleaf 5.8.  We have 1 site with about 150 threads.  We also (for the most part) do everything with TCL procs, we don’t use many translations.

      Thanks.

      • #81773
        Steve Williams
        Participant

          Sean,

          I have never encountered the problem you’ve experienced, and I’ve been doing exactly what you describe for deleting OB threads and their routes since version 3.52. Having said that, you may have a bug in your specific platform version, but I’d like to ask a simple question about your scenario.

          When this problem occurred, was your ADT source thread in a different process from the deleted destination threads? If it was, then bouncing the destination process would remove the OB threads. But, if the source process was not also bounced at the same time, it would continue to place routed messages into a queue status that could never be served. The resulting problem would then fill up the recovery database with orphaned messages.
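The cross-process point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thread names, process names, and the plain dict/list layout are hypothetical and are not read from a real NetConfig. The idea is simply that deleting an outbound thread touches both the process that hosted it and every process holding a source thread that routes to it.

```python
from typing import Dict, List, Set, Tuple


def processes_to_bounce(
    thread_process: Dict[str, str],
    routes: List[Tuple[str, str]],
    deleted_threads: Set[str],
) -> Set[str]:
    """Return every process that should be restarted after the deletion.

    thread_process maps thread name -> process name (hypothetical layout).
    routes is a list of (source thread, destination thread) pairs.
    """
    affected: Set[str] = set()

    # The process that hosted each deleted outbound thread.
    for thread in deleted_threads:
        if thread in thread_process:
            affected.add(thread_process[thread])

    # Every process whose source thread still routes to a deleted thread.
    for source, dest in routes:
        if dest in deleted_threads and source in thread_process:
            affected.add(thread_process[source])

    return affected


# Hypothetical site: the ADT source lives in a different process
# from the two outbound threads being deleted.
thread_process = {
    "adt_in": "proc_adt",
    "lab_out": "proc_out",
    "rad_out": "proc_out",
    "billing_out": "proc_adt",
}
routes = [
    ("adt_in", "lab_out"),
    ("adt_in", "rad_out"),
    ("adt_in", "billing_out"),
]

print(sorted(processes_to_bounce(thread_process, routes, {"lab_out", "rad_out"})))
# ['proc_adt', 'proc_out']
```

In this made-up layout, bouncing only proc_out would leave proc_adt running with the old routes still loaded, which matches the failure described above.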

        • #81774
          Bob Richardson
          Participant

            Greetings,

            We are running 5.8.7 on an AIX Unix server (6.1 TL9). When deleting routes or threads in a process, you need to coordinate a cycling (and refresh) of the memory regions used by the site’s monitorD program (daemon).

            Sequence:  

            (1) Make the NetConfig edits to remove or add routes/threads.

            (2) Shut down the process.

            (3) Apply the edits.

            (4) Start the process.

            (5) Once the process is up, cycle the monitorD daemon. This refreshes its memory regions.
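For anyone wanting to put the sequence above into a checklist or wrapper script, here is a minimal Python sketch of the ordering only. It deliberately does not call any Cloverleaf command: the `execute` callback, the function names, and the process name are hypothetical placeholders, and whatever you actually run for each step is up to your site.

```python
from typing import Callable, List


def build_change_plan(process: str) -> List[str]:
    """Return the five steps, in order, for one process."""
    return [
        "make the NetConfig edits (remove or add routes/threads)",
        f"shut down process {process}",
        "apply the NetConfig edits",
        f"start process {process}",
        "cycle the monitorD daemon once the process is up",
    ]


def run_plan(steps: List[str], execute: Callable[[str], bool]) -> List[str]:
    """Run steps in order and stop at the first one that fails."""
    completed: List[str] = []
    for number, step in enumerate(steps, start=1):
        if not execute(step):
            raise RuntimeError(f"step {number} failed: {step}")
        completed.append(step)
    return completed


def dry_run(step: str) -> bool:
    # Hypothetical executor: only prints the step and reports success.
    print(f"would do: {step}")
    return True


plan = build_change_plan("proc_out")
run_plan(plan, dry_run)
```

Stopping at the first failed step matters here because cycling monitorD before the process is back up would defeat the purpose of step 5.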

            Hope this proves helpful.

          • #81775
            Sean Farley
            Participant

              Looking back at the configurations, the source thread was, in fact, in a different process than the outbound thread.  It is possible that I only bounced the one process and not the other, leaving the configuration in the source process applied. This would explain how the messages got built up in the recovery database.

              I will also look into adding step 5 in Bob’s process to our documentation as that isn’t something we normally do, but it makes perfect sense to me.

              Thank you for the excellent input!

            • #81776
              Russ Ross
              Participant

                For any of the following:

                – create a thread

                – rename a thread

                – modify routes

                I do the steps recommended by Bob Richardson, plus I shut down the site and run our clear_db.ksh script, which runs hcidbinit along with other actions.

                Since we have small, granular sites along with our homegrown site versioning methodology, this is doable for us even for an active production site, because the new site version copy is completely idle and not used until we switch the site symbolic link from the current site version to the new site version.

                I’m not comfortable activating changes that modify the Cloverleaf DB structure unless the site is completely idle with no messages in the Cloverleaf DB. I also initialize both the Cloverleaf DB and the site shared memory region after making the changes to the idle site.

                Doing this has kept us out of the trouble we encountered in our earlier days, when we did run into problems with messages accumulating in the Cloverleaf DB with nowhere to go.
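The "only activate on an idle, empty site" rule can be written down as a simple pre-check. The Python below is a sketch under assumptions: the inputs (a set of running process names and a per-thread count of queued messages) are hypothetical and would have to come from whatever status and database reports your site already produces.

```python
from typing import Dict, List, Set


def activation_blockers(
    running_processes: Set[str],
    pending_by_thread: Dict[str, int],
) -> List[str]:
    """List the reasons a structural change should not be activated yet."""
    blockers: List[str] = []

    # The site must be completely idle: no process still running.
    for process in sorted(running_processes):
        blockers.append(f"process still running: {process}")

    # The database must hold no messages for any thread.
    for thread, count in sorted(pending_by_thread.items()):
        if count > 0:
            blockers.append(f"{count} message(s) still queued for thread {thread}")

    return blockers


# Hypothetical idle site: nothing running, nothing queued.
print(activation_blockers(set(), {"adt_in": 0}))
# []

# Hypothetical busy site: one process up and three messages queued.
print(activation_blockers({"proc_adt"}, {"lab_out": 3}))
# ['process still running: proc_adt', '3 message(s) still queued for thread lab_out']
```

An empty list means the site meets both conditions; anything else is a reason to wait before initializing the database and shared memory.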

                Since we have a best practice of minimizing cross-process routing, I probably haven’t run into a situation of deleting routes that cross processes, but I thought that insight was interesting.

                I will keep in mind the enlightened thought shared by Steve Williams when removing routes from the inbound thread if it has routes that cross processes. Thanks, Steve.

                Russ Ross
                RussRoss318@gmail.com
