Weird Thing

This topic has 14 replies, 7 voices, and was last updated 17 years, 5 months ago by John Mercogliano.

Creator

Topic
March 5, 2008 at 12:59 pm #49878
Deanna Norman
Participant
Hi,

Just wondering if anybody knows what this is…

We have a thread listening on TCP/IP that receives messages and translates them to another thread, which then pushes them to a third thread that writes to a directory. We receive about 300 messages a day.

The issue is that whenever we restart the Network Monitor for the above process, it seems to queue up all the messages that were sent in the past, and they go through each thread again. Thus we get duplicates.

Does anybody know what can cause this?

Thanks,

Viewing 13 reply threads

Author

Replies
- March 5, 2008 at 3:16 pm #63973
  Mark McDaid
  Participant
  I would check the recovery database while the thread is running to see if messages are still sitting there after they are sent, maybe in state 14. Disclaimer: I’m fairly new to Cloverleaf, but I would think the messages must still be in the recovery database if they are sent again when the thread/process is restarted. Hope that helps.
- March 5, 2008 at 4:28 pm #63974
  Deanna Norman
  Participant
  Yeah that’s what I was thinking.. but both recovery and error DB are empty.
- March 5, 2008 at 5:05 pm #63975
  Deanna Norman
  Participant
  Actually they are still there.. I changed my search options and all are there.. since Feb 19th! Now I have to figure out why they are staying there even though they successfully get through.
- March 5, 2008 at 5:08 pm #63976
  Mark McDaid
  Participant
  Are there any tcl procs that perform any processing on the messages? You might look there for a cause. Are any recover_33 procs used on that thread? I’m just throwing out ideas of things you might want to search through to find the culprit. Good luck.
- March 5, 2008 at 5:46 pm #63977
  Deanna Norman
  Participant
  This is my Tcl proc that moves the msg from thread 2 to thread 3… I’m thinking that this is what makes a copy of the msg and stores it in the recovery DB… Do you think that is my problem?
  
  Code:

      ######################################################################
      # Name:         tps_transfer_msg
      # Purpose:
      # UPoC type:    tps
      # Args:         tps keyedlist containing the following keys:
      #               MODE    run mode ("start", "run" or "time")
      #               MSGID   message handle
      #               ARGS    user-supplied arguments:
      #
      #
      # Returns: tps disposition list:
      #
      #

      proc tps_transfer_msg { args } {
          keylget args MODE mode              ;# Fetch mode

          set dispList {}                     ;# Nothing to return

          switch -exact -- $mode {
              start {
                  # Perform special init functions
                  # N.B.: there may or may not be a MSGID key in args
              }

              run {
                  # 'run' mode always has a MSGID; fetch and process it
                  keylget args MSGID mh
                  set overmh [msgcreate -meta {USERECOVERDB true} [msgget $mh]]
                  lappend dispList "OVER $overmh"
              }

              time {
                  # Timer-based processing
                  # N.B.: there may or may not be a MSGID key in args
              }

              shutdown {
                  # Doing some clean-up work
              }
          }

          return $dispList
      }
- March 5, 2008 at 5:59 pm #63978
  Mark McDaid
  Participant
  Not sure, but I do notice that a copy of the original message is made, and that copy is given a disposition of OVER to send it back the other direction. However, the original message is not given a disposition in the proc. I’m pretty sure this results in a memory leak, and that if the original message is not needed, you would need to give it a disposition of KILL. I’m not sure from just that small section of code, though, why the original message was copied. Like I said, I’m fairly new to Cloverleaf, just took the Level 2 class last month, so take what I say with a grain of salt.
- March 5, 2008 at 6:02 pm #63979
  Deanna Norman
  Participant
  That’s gotta be it.. I’m creating a copy and sending it over to the next thread, but the original stays.
- March 6, 2008 at 3:53 pm #63980
  Russ Ross
  Participant
  I was just talking with co-worker Jim Kosloskey yesterday and he mentioned doing an interface using OVER. That reminded me I had a problem similar to yours with an inherited interface that used OVER, so OVER can cause this behavior.

  I’m not saying that is your specific problem, but it is possible.

  Sounds like you have a handle on that possibility.

  There are some things I want everyone to be aware of to help stay out of other confusing database madness.

  Whenever we do any of the following here, we require stopping all processes in the site, making sure the database is empty, and shutting the site down (stopping the lock manager) so everything is idle:
  
  – create a new thread
  
  – delete an existing thread
  
  – rename an existing thread
  
  Russ Ross
  RussRoss318@gmail.com
- March 6, 2008 at 3:57 pm #63981
  Mark McDaid
  Participant
  Thanks for those tips, Russ. I’m getting ready to implement a new thread on our production site and that is definitely good info to know.
- March 6, 2008 at 8:23 pm #63982
  Todd Lundstedt
  Participant
  Shutting down the site to create a thread? Wow! That’s a bit overkill, don’t ya think? You must have one process per site, or something like that. There’s no way on earth we could do that with our setup (15 processes, 100+ threads).

  We regularly add and delete (seldom change) threads while only stopping the process. Now, if we’ve got some crazy IPC stuff going on, we take a little extra care. But mostly, we make our NetConfig changes, stop the process, save the changes, start the process.
- March 6, 2008 at 8:58 pm #63983
  Russ Ross
  Participant
  Yes, we have opted for creating many smaller sites as opposed to the few consolidated sites we had when I first came to MD Anderson Cancer Center a decade ago.

  The word “opted” might be misleading. Actually, we were more or less forced into many smaller sites to better utilize our limited resources, to be able to have down time with less impact, and to get much more seamless upgrades.

  I just ran our site/thread counting script and we currently have

  68 production sites with 506 threads altogether (average 7 – 8 threads per site)
  
  and
  
  130 test sites with 784 threads altogether (average 6 threads per site)
  
  Personally I find creating many smaller sites has been one of the best improvements we have done and I could never go back to many threads in larger sites.
  
  Literally, Cloverleaf was imploding when we had many threads in larger consolidated sites.
  
  I would like to thank co-worker Jim Kosloskey for helping us to see the light about creating many smaller sites.
  
  Some people would argue against it and say it is a personal preference, but at some point an opinion becomes a fact with enough experience and this is how I feel about numerous smaller sites.
  
  Russ Ross
  RussRoss318@gmail.com
- March 7, 2008 at 12:57 pm #63984
  Michael Hertel
  Participant
  One advantage to many sites is that each has its own lock manager/recovery database.

  Therefore, if you have a huge transaction volume, the lock manager does not become the bottleneck.
  
  We’ve gone the route of throwing bigger hardware and SAN drives at the problem. So we stick with the few sites concept. It makes daily support much easier for us.
- March 7, 2008 at 1:56 pm #63985
  Steve Carter
  Participant
  Utilizing more sites with fewer threads may work OK in a smaller environment. However, the configuration of an environment must take into account many different variables. What works well in some shops could be a disaster in others.
  
  We are currently running 4 Cloverleaf boxes:
  
  Development – 121 sites – 1021 threads
  
  Testing (QA) – 172 sites – 3762 threads
  
  Production (1) – 132 sites – 2832 threads
  
  Production (2) – 8 sites – 27 threads
  
  The QA and Production (1) environments continue to grow every day.
  
  As you can see, trying to run with an average of 10 threads per site would create a ridiculous number of sites. The overhead from the monitor daemons alone (without any monitoring) would negate any advantage that this setup ‘might’ create.
  
  Our environments are well monitored and relatively easy to support. Based on our needs, this suits us best.
  
  I don’t disagree that your setup is what works best in your case, but I do disagree that ‘an opinion becomes a fact’.
  
  I’ve spent the past 10 years watching our environment grow from 1 server with 2 sites to what it is today. I can tell you that the way our servers are architected is what works best for us.
  
  Steve
- March 7, 2008 at 6:12 pm #63986
  John Mercogliano
  Participant
  One thing I noticed in your tps is that you are not killing or continuing the message handle in $mh, so that message will stay in your recovery database.
  
  John
  
  John Mercogliano
  Sentara Healthcare
  Hampton Roads, VA
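
Pulling Mark’s and John’s suggestions together, a minimal sketch of the corrected proc might look like the following. It assumes the original message is no longer needed once its copy has been handed off with OVER, so the original handle gets an explicit KILL disposition; only the run branch differs from the proc posted above. This has not been run against a live site, so treat it as a starting point rather than a verified fix.

```tcl
proc tps_transfer_msg { args } {
    keylget args MODE mode              ;# Fetch mode
    set dispList {}                     ;# Nothing to return by default

    switch -exact -- $mode {
        run {
            # 'run' mode always has a MSGID; fetch and process it
            keylget args MSGID mh

            # Same copy as in the posted proc, sent back the other way
            set overmh [msgcreate -meta {USERECOVERDB true} [msgget $mh]]
            lappend dispList "OVER $overmh"

            # Assumption: the original is not needed after the copy is made.
            # Giving it a disposition lets the engine release it instead of
            # leaving it behind in the recovery database.
            lappend dispList "KILL $mh"
        }
        start - time - shutdown {
            # Nothing to do in these modes
        }
    }

    return $dispList
}
```

If the copy is not actually required, returning "OVER $mh" for the original handle might be a simpler alternative; which one fits depends on why the copy was made in the first place. Either way, messages already stuck in the recovery database from earlier runs would still need to be cleaned out separately.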
