errMsg in tclMail package


  • #51631
    Greg Eriksen
    Participant

      When we first went up on Cloverleaf, our Quovadx consultant set us up w/ a package called “tclMail” (created or modified by Charlie Bursell?) for sending out alert emails (from our AIX engine).  We’ve installed 5.7 Rev 2 on both our test and prod servers (haven’t really begun migration yet) and I am seeing the following error when sending emails from the upgrade environments:

      Code:

      Mon Mar 15 16:06:10 EDT 2010 Sending alert email to xxxxxx@mmc.org
      can't read "errMsg": no such variable
         while executing
      "set InVal $errMsg"
         ("after" script)

      To complicate matters, concurrently our network folks are implementing a new SMTP server, so I’m not sure whether the error is due to that or the 5.7 upgrade.  Our 5.4 release is pointed to the old email server, while 5.7 is pointed to the new one.

      The funny thing is, although the error above is showing in the Netmonitor daemon log, the emails still seem to be successfully sent.

      The section of code in tclMail.pkg that seems to be related to the error messages is:

      Code:

         ###################################################################
         # listn Get data from port
         #
         # Usage:
         # listn
         #
         # Notes:
         # reads data from open Mail Server using established socket
         #
         # Returns:
         # What was read on port
         #
         ###################################################################
         proc listn {} {
             global InVal
             variable SOCK ;# Open Socket
             variable DEBUG

             set InVal "" ;# Just in case

             # Timeout error message
             set errMsg "***** Timeout During Read *****"

             # Set initial timeout to 10 seconds
             # Set subsequent to 1 second
             set timeout 10000

             set ID [after $timeout {set InVal $errMsg}]
             fileevent $SOCK r [list Reader $SOCK]
             vwait InVal
             catch {after abort}
             catch {after cancel $ID}

             if {$DEBUG} {
                 puts "RESPONSE: $InVal"
             }

             return $InVal
         }

      The way I’m understanding the code is that the ‘after’ command pushes out the “script” {set InVal $errMsg} to an “event handler” to be executed in 10 seconds.  In the meantime, if the fileevent and vwait commands complete within that time, then the “catch {after cancel $ID}” shuts down the execution of the “script” before it happens.

      I think what is now happening in 5.7 is that the new email server is timing out for some reason, causing the script to execute the {set InVal $errMsg}, where before it never did.  So I’m going to speak to the network folks to see what they can do about that.

      But the other issue is this error with the $errMsg variable not being defined.  I think this might be a bug in tclMail (at least this version).  The man page for the ‘after’ command says that, “The command will be executed at global level (outside the context of any Tcl procedure).”  So maybe the script doesn’t error on the variable InVal, because that has been made global at the top of the proc, but it can’t find errMsg because it hasn’t been made global?
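      To illustrate the scoping rule quoted from the man page, here is a minimal sketch that reproduces the behaviour outside of tclMail. The proc names demoLocal and demoGlobal, the bgerror collector, and the 200 ms guard timer are made up for this illustration; only the after/vwait pattern is taken from listn.

```tcl
# Collect background errors in a list instead of printing them.
set ::bgErrors {}
proc bgerror {msg} { lappend ::bgErrors $msg }

# Same pattern as listn: errMsg is a local variable of the proc.
proc demoLocal {} {
    global InVal
    set InVal ""
    set errMsg "timeout"
    # Braces defer substitution, so $errMsg is looked up when the timer
    # fires, at global level, where no variable of that name exists.
    after 10 {set InVal $errMsg}
    # Guard timer (illustration only) so that vwait always returns here.
    set guard [after 200 {set InVal "guard fired"}]
    vwait InVal
    after cancel $guard
    return $InVal
}

# Same pattern, but errMsg is declared global as well.
proc demoGlobal {} {
    global InVal errMsg
    set InVal ""
    set errMsg "timeout"
    after 10 {set InVal $errMsg}
    vwait InVal
    return $InVal
}

set localResult  [demoLocal]
set globalResult [demoGlobal]
puts "local errMsg:  $localResult (background errors: $::bgErrors)"
puts "global errMsg: $globalResult"
```

      In the first case the timer script fails and never sets InVal, so vwait simply keeps waiting for some other event to set it. If the same thing happens inside listn, that would be consistent with the emails still going out even though the error shows up in the log.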

      Wondering if anyone else has seen this error?  Should I try to fix the tclMail package (by inserting “global errMsg”), or is there another revision available that addresses this bug?  I’m not that familiar with working with tcl packages, or how to recreate the package index.

      • #71034
        Charlie Bursell
        Participant

          Strange!  After all these years this never showed up.  Perhaps you are the first to have the after script time out 🙂

          Try changing

              set ID [after $timeout {set InVal $errMsg}]

          To

              set ID [after $timeout "set InVal $errMsg"]

          Perhaps the braces are disabling the variable, but I don’t really think so, since the error message stated “while executing set InVal $errMsg”.

          It is worth a try though.  As you can see, the variable errMsg *IS* set:

               set errMsg "***** Timeout During Read *****"

          Let me know how it goes
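          A note on the quoting variant: with double quotes the message is substituted immediately, and because it contains spaces the deferred set would receive too many arguments. A common Tcl idiom for this situation is to build the script with list, which also substitutes immediately but keeps the message as a single word. The sketch below uses uplevel #0 to mimic the global-level evaluation that after performs; the variable names quoted, listed, and so on are illustrative only.

```tcl
set errMsg "***** Timeout During Read *****"

# Double quotes: the script text becomes
#   set InVal ***** Timeout During Read *****
set quoted "set InVal $errMsg"

# list: the message stays one properly quoted word of the script.
set listed [list set InVal $errMsg]

# Evaluate both at global level, the way a timer script would be.
set quotedFailed [catch {uplevel #0 $quoted} quotedErr]
set listedFailed [catch {uplevel #0 $listed} listedErr]

puts "quoted: failed=$quotedFailed ($quotedErr)"
puts "listed: failed=$listedFailed InVal=$::InVal"
```

          Under that assumption, the line in listn could be written as set ID [after $timeout [list set InVal $errMsg]], which does not depend on errMsg being global.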

        • #71035
          Bob Richardson
          Participant

            Greetings,

            This error intrigued me and I found this link on the web:

            http://www.astro.princeton.edu/~rhl/Tcl-Tk_docs/tcl/after.n.html

            with some details on the after script option.  The line of interest is:

            “The command will be executed at global level (outside the context of any Tcl procedure)”  which suggests that the variable “errMsg” needs to be declared as a global.

            Ok… never used this package but may in the future.

            Back to my hole in the ground… this is still Winter in Minnesota.

            — BobR

          • #71036
            Charlie Bursell
            Participant

              If what Bob is saying is true, and who can doubt Bob 😀, then try these changes:

              Change

              # Timeout error message

                set errMsg "***** Timeout During Read *****"

              To

              # Timeout error message

                set ::errMsg "***** Timeout During Read *****"

              Change

              set ID [after $timeout {set InVal $errMsg}]

              To

              set ID [after $timeout {set InVal $::errMsg}]

              This puts the variable errMsg in the global scope.

              My problem is I haven’t been able to make the after fail

            • #71037
              Greg Eriksen
              Participant

                Thanks, Charlie, but now I’m having the same problem as you – I can’t get the timeouts to happen anymore!

                Right before I tried changing the braces to quotes (from your first reply), the timeouts seemed to stop occurring.  I changed back to braces, set the repeat on the alert to happen every 3 minutes and let it run the rest of the day, and still no timeouts.  Before I left for the day, I got the idea that maybe the frequent emails were keeping something “awake” on the SMTP relay server and thus preventing the timeouts, so I slowed down the repeat to 61 minutes and let it run overnight, and still no timeouts.  If I can’t duplicate the problem, then I can’t figure out which coding change would solve it.

                I did contact the admin for the SMTP server yesterday, so maybe he changed something that eliminated the timeouts.  He did hint that he had been working with McAfee on some performance issues.  So maybe the timeouts were just a transitory issue that will never happen again.

                I am going to make the code modifications you suggested in your most recent reply, and leave it at that.  Hopefully that will fix the “no such variable” error if a timeout ever occurs again.  And if not, at least the timeouts and the variable error didn’t seem to be preventing the emails from being successfully sent.

              • #71038
                Jennifer Hardesty
                Participant

                  Greetings!

                  I work with Greg, who is currently not here, but we are revisiting this issue because it has become increasingly problematic.

                  To follow up on Greg’s last post, Greg did make those changes, but it did not fix the “no such variable” error.  

                  Also, the timeouts often prevent an email/alert from being successfully sent.  I have been unable to determine why an email is sometimes sent despite the timeout and other times is not.

                  Interestingly, we did a test where we dropped a batch of files to be picked up and generate a batch of emails/alerts.  Every time there is more than one file, almost all of them result in timeout errors, with perhaps one send that succeeds without error.  Yet when we drop only one file in the folder to be picked up, it is always sent successfully without an error.

                  At first, someone suggested that perhaps the SMTP server connection was being overloaded by so many requests at once, but by that logic, I thought that the first request in the batch should always be successful, but it never is.  Plus, we are getting the error on every alert.

                  Of course, the AIX/UNIX command mail appears to work fine.

                  I’ve been trying to find some documentation on tclMail but I can’t seem to find anything on it.  Does anyone have any?

                  Does anyone have any other suggestions on how to fix these timeout issues?

                • #71039
                  Charlie Bursell
                  Participant

                    The only documentation you will find for tclMail is in the proc itself.  But that is not bad.

                    There is a DEBUG flag in the proc you can turn on that will provide more insight.

                    It would be hard to do more without the specifics of exactly what you are doing.  I don’t know if this problem can be handled under the auspices of Support or not, but I would start with a Support call.

                  • #71040
                    Scott Folley
                    Participant

                      I do not have the tclMail package but I took the source that was originally posted and modified it to what you see below and that took care of the complaint about errMsg for me.

                      What you might want to do is try loading the package on your development box and dropping the timeout to 1 as I did below.  This forced the timeout to occur because I gave it no time to work.  This allowed me to recreate the issue and play with the code until it worked properly.  If you want to post the full package I will play around with that too if you like.

                      Hope that helps.

                      Code:


                      proc listn {} {
                        global InVal
                        variable SOCK      ;# Open Socket
                        variable DEBUG
                        set DEBUG 1
                        set InVal ""      ;# Just in case

                        # Timeout error message
                        set ::errMsg "***** Timeout During Read *****"

                        # Set initial timeout to 10 seconds
                        # Set subsequent to 1 second
                        set timeout 1

                        set ID [after $timeout {set InVal $::errMsg}]
                        #fileevent $SOCK r [list Reader $SOCK]
                        vwait InVal
                        catch {after abort}
                        catch {after cancel $ID}

                        if {$DEBUG} {
                            puts "RESPONSE: $InVal"
                        }

                        return $InVal
                      }

                      listn

                    • #71041
                      Jennifer Hardesty
                      Participant

                        So here’s a curious question.  Since we have both 5.4 and 5.7 running on the same server and we have two versions of tclMail/alertTrigger tclproc running on that same server — one for each version — wouldn’t they both be competing for the same smtp port/socket?  And if so, wouldn’t that cause one of them to repeatedly timeout when attempting to connect to that port/socket?

                      • #71042
                        David Barr
                        Participant

                          Jennifer Hardesty wrote:

                          wouldn’t they both be competing for the same smtp port/socket?

                        • #71043
                          Jennifer Hardesty
                          Participant

                            O.K. Well, further drilling into this issue…I have managed to find the place where it always fails when it fails.

                            Code:

                            Wed May 12 16:45:40 EDT 2010 Sending alert email to interdo@mmc.org
                            Connecting to mail server: XXXXXX.mmc.org on port 25…
                            Connected to XXXXXX.mmc.org 220 XXXXXXXX.mmc.org EWSVA/SMTP Ready.
                            SOCK: sock33
                            Send HELO command…DOMAIN: mmc.org…
                            COMMAND: HELO mmc.org
                            RESPONSE: 250 Requested mail action okay, completed.
                            Initiate Message Transfer…USER: CLOVERLEAF@mmc.org
                            COMMAND: MAIL FROM: CLOVERLEAF@mmc.org
                            RESPONSE: 250 Requested mail action okay, completed.
                            Send RCPT command…To: XXXX@mmc.org
                            COMMAND: RCPT TO: XXXX@mmc.org
                            RESPONSE: ***** Timeout During Read *****
                            error sending email

                            It always fails on the fileevent call inside listn when called by talkn for the Send RCPT/Recipient Identification process.  It never makes it inside the Reader proc on that one.

                            What could be causing this sort of breakdown in functionality?

                          • #71044
                            David Barr
                            Participant

                              Did you try increasing the value of the “timeout” variable?

                            • #71045
                              Scott Folley
                              Participant

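                                It may seem odd, but SMTP is really just a telnet session on port 25 where you use particular commands to interact with the SMTP server.  Below is a transcript of an interactive session that I did with our SMTP server.  What I had to type in is the telnet command itself plus the HELO, MAIL FROM, RCPT TO, DATA, message text (through the line containing only “.”), and QUIT lines; everything else is output: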

                                integrator$ telnet 192.168.1.100 25
                                Trying 192.168.1.100...
                                Connected to 192.168.1.100.
                                Escape character is '^]'.
                                220 server.domain.com ESMTP Sendmail 8.13.8/8.13.8; Wed, 12 May 2010 17:36:07 -0500
                                HELO 192.168.1.100
                                250 server.domain.com Hello cloverleaf [192.168.1.101], pleased to meet you
                                MAIL FROM: sfolley@domain.com
                                250 2.1.0 sfolley@domain.com... Sender ok
                                RCPT TO: sfolley@domain.com
                                250 2.1.5 sfolley@domain.com... Recipient ok
                                DATA
                                354 Enter mail, end with "." on a line by itself
                                SUBJECT: This is a test
                                This is a test
                                .
                                250 2.0.0 o4CMa7CL026247 Message accepted for delivery
                                QUIT
                                221 2.0.0 server.domain.com closing connection
                                Connection closed by foreign host.

                                Your best bet would be to try something like this so that you can determine why it is taking so long to find the recipient’s address because that appears to be the case.
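                                The same check can be scripted. The sketch below times each step of a short SMTP dialogue so the slow reply stands out. It is not part of tclMail: timedCommand, readReply, and the stand-in server (which answers every command with 250 and delays its RCPT reply by 300 ms) are hypothetical helpers, and the example.org addresses are placeholders. It also assumes single-line replies. To try it against a real server, drop the stand-in and open the socket to your own mail host on port 25.

```tcl
# Reader: store one reply line, or note that the peer closed the connection.
proc readReply {sock} {
    if {[gets $sock line] >= 0} {
        set ::smtpReply $line
    } elseif {[eof $sock]} {
        fileevent $sock readable {}
        set ::smtpReply "***** Connection Closed *****"
    }
}

# Send cmd (or only wait for the greeting when cmd is empty) and return
# a two-element list: elapsed milliseconds and the reply line.
proc timedCommand {sock cmd {timeoutMs 10000}} {
    set ::smtpReply ""
    set start [clock clicks -milliseconds]
    if {$cmd ne ""} {
        puts $sock $cmd
    }
    # list builds the timer script now, so no variable lookup happens later.
    set timer [after $timeoutMs \
        [list set ::smtpReply "***** Timeout During Read *****"]]
    fileevent $sock readable [list readReply $sock]
    vwait ::smtpReply
    after cancel $timer
    fileevent $sock readable {}
    return [list [expr {[clock clicks -milliseconds] - $start}] $::smtpReply]
}

# Stand-in server so the sketch runs without a real mail host.
proc fakeAccept {chan addr port} {
    fconfigure $chan -translation crlf -buffering line
    puts $chan "220 stand-in SMTP ready"
    fileevent $chan readable [list fakeCommand $chan]
}
proc fakeCommand {chan} {
    if {[gets $chan line] < 0} {
        if {[eof $chan]} { close $chan }
        return
    }
    # Simulate a slow recipient lookup.
    set delay [expr {[string match "RCPT*" $line] ? 300 : 0}]
    after $delay [list puts $chan "250 OK"]
}

set server [socket -server fakeAccept -myaddr 127.0.0.1 0]
set port [lindex [fconfigure $server -sockname] 2]

set sock [socket 127.0.0.1 $port]
fconfigure $sock -translation crlf -buffering line -blocking 0

set timings {}
set replies {}
foreach cmd {"" "HELO example.org" "MAIL FROM:<a@example.org>" "RCPT TO:<b@example.org>"} {
    foreach {ms reply} [timedCommand $sock $cmd 2000] break
    lappend timings $ms
    lappend replies $reply
    puts [format "%6d ms  %-28s -> %s" $ms $cmd $reply]
}
close $sock
close $server
```

                                If the RCPT step is consistently the slow one against the real server as well, that points at the recipient lookup on the mail server rather than at the Tcl code.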
