Log control


  • #121729
    Jason Russell
    Participant

      So as we’re implementing Cloverleaf, one thing I’ve noticed it doesn’t really do well is log control. Most of the controls are based on file size and number of files, and there are no time-based controls. What we’re trying to do is keep logs for x number of days, then purge them. I’ve been whipping up a fairly straightforward TCL script that should do this, but I’ve noticed that some commands are available “inside” the engine instance and some are only available “outside” of it. Here’s the current state of the code (it’s not complete, so bear with me; there’s still quite a lot to flesh out, and probably a lot that could be made to work easier/better):

      As a note, I have “Log History” enabled in each individual site, so it creates the “LogHistory” folder in the process folder, and dates them when the process restarts. I’m also wanting to add a DAILY cycle to this, then remove the older files.

      [code]# TCL script to force all processes to cycle logs, then delete logs older than a set number of days.

      set clearDays 10
      set hciRoot ${::env(HCIROOT)}
      set servIniPort [open ${hciRoot}/server/server.ini]
      set servIniList [split [read $servIniPort] \n]
      #puts $servIniList
      close $servIniPort
      set environs [lsearch -regexp -inline $servIniList "^environs="]
      #puts $environs
      set siteList [split [lindex [split $environs =] 1] ";"]
      #puts $siteList
      foreach sitePath $siteList {
          set site [lindex [split $sitePath /] end]
          exec [${hciRoot}/bin/hcisetenv -site $site]
          netconfig load ${sitePath}/NetConfig
          set procList [netconfig get process list]
          #puts "Process List: $procList"
          #foreach process $procList {
          #     puts "Will run hcicmd -p $process -c \". output_cycle\""
          #     #hcicmd -p $process -c ". output_cycle"
          #}
          #exec [find ${sitePath}/exec/processes -maxdepth 3 -mtime $clearDays -type f -name *.log]
          #exec [find ${sitePath}/exec/processes -maxdepth 3 -mtime $clearDays -type f -name *.err]
      }[/code]

      Essentially I’m trying to do the following:

      • Pull the server.ini to get a list of sites
      • Iterate over the list of sites
        • Get the processes from each site
        • Force each process to cycle its output log
      • Remove the extra files

      So all of this is to say, I find it odd that you can do certain things inside a TCL/engine instance and certain things only outside of it. I would write this in bash, but I don’t want to keep calling a TCL script to do other things. Some of the things I’ve noticed you can NOT do “inside” of a TCL script: setsite (or calling hcisetenv directly). The main thing I’ve noticed you can’t do “outside” of a TCL script is the really fun one I found: netconfig. I’m using ‘netconfig get process list’ and will probably start using other commands from it, as it’s become a pretty useful tool for getting data out of the NetConfig file itself.

      So all of that to say, is there an easier way to do this? I’d rather avoid having to do a scheduler event inside of each site, I’d rather have this globally. I can manually move the files for each site, but I would like to keep using the tools as much as possible, as it writes the header out when you cycle the output. I could cycle the output in a bash script, but then I lose my easy access to the ‘netconfig’ extension (and calling a TCL script is slow and I don’t want to do that over and over). Am I missing something or am I going to have to just write around certain aspects? (Losing the log cycle would be my choice).

      • #121732
        David Barr
        Participant

          If you want to change sites within a TCL script, you need to do something like this:

          eval [exec $::env(CL_INSTALL_DIR)/integrator/sbin/hcisetenv -site tcl $site]
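
The reason the eval matters: a child process cannot change the environment of whatever launched it, so a tool like hcisetenv can only print the settings and leave it to the caller to evaluate them. Here is a rough sketch of that idea in plain shell. `emit_site_env`, the `DEMO_*` variables and the `/opt/demo` path are made up for illustration and are not what hcisetenv actually prints.

```shell
#!/bin/sh
# Hypothetical stand-in for a tool that prints environment-setting commands
# instead of applying them (it cannot modify its caller's environment).
emit_site_env() {
    # $1 = site name; /opt/demo is a placeholder path
    printf 'DEMO_SITE=%s; export DEMO_SITE\n' "$1"
    printf 'DEMO_SITEDIR=/opt/demo/%s; export DEMO_SITEDIR\n' "$1"
}

DEMO_SITE=none

# Calling the tool without eval only produces text; nothing changes here.
emit_site_env test_site >/dev/null
echo "after plain call: $DEMO_SITE"

# Evaluating the output applies the settings to the current shell.
eval "$(emit_site_env test_site)"
echo "after eval:       $DEMO_SITE ($DEMO_SITEDIR)"
```

The TCL one-liner above follows the same pattern, with TCL's eval applying whatever hcisetenv prints.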

           

        • #121733
          David Barr
          Participant

            I think your exec command may work, but you’re missing the “eval”.

          • #121734
            Jason Russell
            Participant

              # TCL script to force all processes to cycle logs, then delete logs older than a set number of days.

              set clearDays 10
              set hciRoot ${::env(HCIROOT)}
              set servIniPort [open ${hciRoot}/server/server.ini]
              set servIniList [split [read $servIniPort] \n]
              close $servIniPort
              set environs [lsearch -regexp -inline $servIniList "environs="]
              #puts $environs
              set siteList [split [lindex [split $environs =] 1] ";"]
              #puts $siteList
              foreach sitePath $siteList {
                  set site [lindex [split $sitePath /] end]
                  puts "Site: $site"
                  set setsiteRes [catch {eval [exec ${hciRoot}/sbin/hcisetenv -site tcl ${site}]} err]
                  if { $setsiteRes } {
                      puts "Error in setting site:"
                      puts $err
                  }
                  set showroot [exec showroot]
                  puts $showroot
                  netconfig load ${sitePath}/NetConfig
                  set procList [netconfig get process list]
                  if { $procList == "" } { continue }
                  #puts "Site: $site"
                  #puts "Process List: $procList"
                  #foreach process $procList {
                  # puts "Will run hcicmd -p $process -c \". output_cycle\""
                  # set cycleRes [catch {eval [exec hcicmd -p $process -c ". output_cycle"]} err]
                  # if { $cycleRes } {
                  # puts $err
                  # continue
                  # }
                  #}
                  #exec [find ${sitePath}/exec/processes -maxdepth 3 -mtime $clearDays -type f -name *.log]
                  #exec [find ${sitePath}/exec/processes -maxdepth 3 -mtime $clearDays -type f -name *.err]
                  #
              }

               

              So interestingly enough, it all calls correctly, but it doesn’t seem to actually change the site. It looks like I’m headed toward just moving and deleting the files manually.

            • #121735
              David Barr
              Participant

                If you’re doing an output cycle, wouldn’t you want to be removing *.log.old and *.err.old? I haven’t tested this, so this is just a guess.

              • #121736
                Jason Russell
                Participant

                  No. If you have log history turned on, it creates a folder in the process’ directory. When the engine cycles (whether you restart the process or force it via command), it takes the .log and .err and datetime stamps them, and moves them to the subfolder.

                  It won’t let me post a screenshot (too lazy to save and upload), but the log history options let you remove old logs once there are x number of them, or once they reach x KB in size. You can also cycle automatically by size, meaning once the log gets so large it is cycled out automatically. None of these options are really what we want. We want time-based cycling. We could probably do this via the scheduler, but again that’s something we don’t want to set up multiple times; we’re looking for a more global approach.
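
For the "keep x days, then purge" part, the usual building block is find with -mtime. One detail worth noting against the commented-out find lines in the TCL script earlier in the thread: `-mtime 10` matches only files whose age in whole days is exactly 10, while `-mtime +10` matches files older than that, and the `*.log` pattern should be quoted so the shell does not expand it first. A small sketch in a throwaway directory follows; `keep_days` and the file names are made up for the demo.

```shell
#!/bin/sh
# Demo in a scratch directory; nothing here touches real Cloverleaf paths.
demo_dir=$(mktemp -d)
touch -t 200001010000 "$demo_dir/old_proc.log"   # mtime forced to 1 Jan 2000
touch "$demo_dir/new_proc.log"                   # mtime = now

keep_days=10   # assumed retention period

# +N selects files whose age in whole days is greater than N.
old_files=$(find "$demo_dir" -type f -name '*.log' -mtime +"$keep_days")
# -N selects files whose age in whole days is less than N.
recent_files=$(find "$demo_dir" -type f -name '*.log' -mtime -"$keep_days")

echo "older than $keep_days days: $old_files"
echo "newer than $keep_days days: $recent_files"

rm -rf "$demo_dir"
```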

                • #121737
                  Jay Hammond
                  Participant

                    We’ve been using the attached ksh (we’re on AIX 7.2) script to remove process logs older than 30 days for a while now. It doesn’t use any TCL scripts. We call the command from a cron job and write that output to a file in the hci user’s home directory. I use a similar script to clean up monitor daemon logs older than 30 days as well.

                    We have Log History set to keep up to 150 files and the folder size up to 1 GB.  We also have the process configuration set to cycle logs after they reach 10 MB.

                    I’ve attached the script because I can’t figure out how to format the text so that it doesn’t look worse than it does in the script. It’s over-commented, but I forget how some commands work and like to have somewhere to remind myself.

                    I’ve commented out the rm command (which will delete files it’s presented) and the echo command just below it will return a list of the files that would be deleted.

                    Be careful with this if you aren’t sure what it is doing.
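
The "list first, delete only when you are sure" idea can also be driven by a switch instead of commenting lines in and out. A minimal sketch, assuming a made-up `DRY_RUN` variable and `purge_old_files` function (neither is part of the script below):

```shell
#!/bin/sh
# purge_old_files DIR DAYS
# Lists files under DIR older than DAYS days. Deletes them only when
# DRY_RUN is set to 0; any other value (or unset) means report only.
purge_old_files() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        # Print each file name, then remove that file.
        find "$1" -type f -mtime +"$2" -print -exec rm -f {} \;
    else
        find "$1" -type f -mtime +"$2" -print
    fi
}

# Demo against a scratch directory with one artificially old file.
scratch=$(mktemp -d)
touch -t 200001010000 "$scratch/stale.log"

DRY_RUN=1
purge_old_files "$scratch" 30    # lists stale.log, leaves it in place
DRY_RUN=0
purge_old_files "$scratch" 30    # lists stale.log and removes it

rm -rf "$scratch"
```

Defaulting to report-only means a forgotten setting errs on the side of keeping files.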

                    EDIT: Well, it wouldn’t let me upload the file – I assume because it has no extension. So, here’s the ugly pasted code:

                     

                    #!/usr/bin/ksh
                    #########################################################################################################
                    #########################################################################################################
                    ## Name:        delete_process_loghistory_all_sites_to_logfile
                    ##
                    ## Purpose:     Delete LogHistory files, in each process directory, in all sites,
                    ## that are older than 30 days. BE CAREFUL WITH THIS!!!!
                    ##
                    ## Usage: delete_process_loghistory_all_sites_to_logfile
                    ##
                    ## Author:      Jay Hammond
                    ##
                    ## Date:        06/02/2022
                    ##
                    #########################################################################################################
                    #########################################################################################################
                    echo
                    SCRIPT=$(basename $0)
                    script_start_time=$(date +"%B %e, %Y %T %Z %p")
                    echo "=======================  $SCRIPT begin time:  $script_start_time ======================="
                    # Set fonts for Help text.
                    # NORM=$(tput sgr0)   #   Normal
                    # BOLD=$(tput bold)   #   Bold
                    # REV=$(tput smso)    #   Reverse fore and background colors
                    # UND=$(tput smul)    #   Underline
                    sites=$(ls $HCIROOT/*/NetConfig | awk -F\/ '{print $5}' | grep -v siteProto | grep -v templates | grep -v -E "^$")
                    # Variablize the current site so we don't have to hardcode it
                    # current_site=$(echo $HCISITE)
                    # Let's setsite to the current site and print out the name
                    # setsite $current_site
                    # echo "==========${REV}$site${NORM}=========="
                    # echo ""
                    for site in $sites;
                    do
                        setsite $site
                        echo
                        echo "SITE:  $site <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<"
                        echo
                        # Variablize the list of processes so that we can iterate through it
                        process_list=$(grep "process " $HCISITEDIR/NetConfig | awk '{print $2}' | sort -d -f)
                        # For each process in the list of this site's processes
                        for process in $process_list;
                        do
                            # Set and variablize the loghistory_folder location
                            loghistory_folder=$HCISITEDIR/exec/processes/$process/LogHistory
                            # -d = True, if the specified file exists and is a directory.
                            # If there's a loghistory_folder
                            if [ -d "${loghistory_folder}" ];
                            then
                                # Change Directory to the current process's loghistory_folder
                                # (done only after confirming the folder exists)
                                cd "$loghistory_folder"
                                # -z = True, if length of the specified string is 0.
                                # If the loghistory_folder is not empty
                                if [ ! -z "$(ls -A $loghistory_folder)" ];
                                then
                                    echo "PROCESS:  $process"
                                    echo "Searching for LogHistory files in $loghistory_folder…"
                                    # Variablize the folder contents ('cause I'm lazy) to take action
                                    # based on what's returned.
                                    files_found=$(find $HCISITEDIR/exec/processes/$process/LogHistory -type f)
                                    echo
                                    # -n = True, if the length of the specified string is nonzero.
                                    # If we don't find any files in the folder
                                    if [ ! -n "${files_found}" ];
                                    then
                                        echo ">>>>>>>>>>>>>>Search returned no LogHistory files for the $process process in the $site site.<<<<<<<<<<<<<<"
                                        echo
                                    else
                                        # Variablize the lists of files to keep and to delete so that we can echo them out pretty.
                                        # In the current (.) directory, without recursing into any other folders that may be in the
                                        #   current directory (\( -name . -o -prune \)), find files with an extension containing err
                                        #   or log (-regextype extended -iregex ".*err|.*log"), that are older than 30 days (-mtime +30)
                                        files_to_keep=$(find . \( -name . -o -prune \) -regextype extended -iregex ".*err|.*log" -mtime -30)
                                        files_to_delete=$(find . \( -name . -o -prune \) -regextype extended -iregex ".*err|.*log" -mtime +30)
                                        # -z = True, if length of the specified string is 0.
                                        # If the files_to_delete variable is not empty
                                        if [ ! -z "${files_to_delete}" ];
                                        then
                                            echo "The following files are older than 30 days and are being deleted:"
                                            echo "${files_to_delete}"
                                            echo
                                            for file in $files_to_delete;
                                            do
                                                # Echo each file name
                                                echo ">>>>>>> $file is being deleted now:"
                                                echo
                                                # Remove files
                                                #   -e = Displays a message after each file is deleted.
                                                #   -f = Does not prompt before removing a write-protected file. Does not display an error message or return
                                                #           error status if a specified file does not exist.
                                                # UNCOMMENT THE 'rm -ef $file' LINE TO ACTIVATE DELETING FILES IF YOU SO DESIRE AND ARE SURE YOU WANT THEM REMOVED
                                                #   FOREVER. OTHERWISE, THE ECHO BELOW IT WILL JUST RETURN THE FILES THAT WOULD BE DELETED.
                                                # rm -ef $file
                                                echo $file
                                                echo
                                            done
                                        else
                                            echo "No files older than 30 days were found to delete for the $process process."
                                        fi
                                        # -z = True, if length of the specified string is 0.
                                        # If the files_to_keep variable is not empty
                                        if [ ! -z "${files_to_keep}" ];
                                        then
                                            echo
                                            echo "These files to remain in the folder:"
                                            echo
                                            # For each file in the list of files_to_keep
                                            for file in $files_to_keep;
                                            do
                                                # Echo each file name
                                                echo $file
                                            done
                                        fi
                                        echo
                                    fi
                                else
                                    echo "PROCESS:  $process has a LogHistory folder, but it's empty."
                                    echo
                                fi
                            else
                                echo "PROCESS:  $process does not have a LogHistory folder."
                                echo
                            fi
                        done
                    done
                    echo

                     

                     

                    • This reply was modified 2 days, 22 hours ago by Jay Hammond.
                    • #121739
                      Jay Hammond
                      Participant

                        Script attached. Remove the file extension if using in a *nix environment.

                    • #121741
                      Jason Russell
                      Participant

                        I think the point of using TCL vs BASH was an easy way to get active running logs. It’d be easy enough in *sh to get to the integrator root, look down our site names (We have a naming convention that needs to be followed), but the intention of this script is to look at actively running processes and grab those. I’m copying that script into vim to look over it. I’m fairly proficient in KSH/BASH (KSH is our go-to, looking to move away into something more modern, but it definitely works). First blush looks pretty straightforward.

                      • #121742
                        Jason Russell
                        Participant

                          Jay, thanks for the script. I’m probably going to co-opt it and make some changes (I work with absolute paths and don’t cd around, so things don’t get changed and I don’t end up in the wrong directory; I got bitten by that once). The only odd thing I noticed is that there seems to be a mix of tabs and spaces in the code (multiple people working on it, potentially?). Not sure if it was in the source or if something weird happened when you uploaded it.

                          Looks like grepping the processes out of NetConfig is the most convenient way. I’m still debating whether I want to use the server.ini to get a list of the active sites, or whether I want to do all of them. Otherwise, that is a straightforward script, thank you. I’ll post my update to it if you’d like when I’m done.
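
One possible tightening of the grep approach: `grep "process "` matches any line containing that text, so a line that merely mentions a process elsewhere in NetConfig would be picked up too. A sketch that only accepts lines whose first word is exactly `process` is below. The sample file is invented and only shaped like what the grep relies on (keyword, then name); the real NetConfig layout may differ.

```shell
#!/bin/sh
# list_processes FILE
# Prints the second field of every line whose first field is exactly "process".
list_processes() {
    awk '$1 == "process" { print $2 }' "$1" | sort
}

# Made-up sample; not a real NetConfig.
sample=$(mktemp)
cat > "$sample" <<'EOF'
process proc_b {
    { SETTING 1 }
}
process proc_a {
}
protocol some_thread {
    { NOTE this line mentions process proc_x }
}
EOF

# Prints proc_a and proc_b; the NOTE line is skipped because its first field is "{".
list_processes "$sample"
rm -f "$sample"
```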

                        • #121743
                          Jay Hammond
                          Participant

                            Sure, Jason. I’d like to see it when you get yours updated.

                            I’m not sure what happened with the tabs/spacing in the code. I tend to reuse code from other scripts so I could have copied some sections that used tab characters instead of spaces.

                          • #121744
                            Jason Russell
                            Participant

                              So I think the primary difference between our scripts is that yours is meant to be run manually, while mine is automatic. It writes to a log file so I get a record each day of what happened. I may update it so that the log files are also kept for a period of 14 days for posterity, but that is neither here nor there. I’ve removed a lot of the comments, as this code doesn’t really need many; it’s pretty straightforward.

                              log="/sitedata1/tmp/clLogClean.log"
                              logDays=14

                              SCRIPT=$(basename $0)
                              script_start_time=$(date +"%B %e, %Y %T %Z %p")

                              echo "======================= $SCRIPT begin time: $script_start_time =======================" > ${log}

                              # Pull site list from server.ini. Script assumes no spaces in site names.
                              siteList=$(grep "environs" ${HCIROOT}/server/server.ini | awk -F "=" '{print $2}' | sed 's/;/ /g')

                              for siteDir in $siteList;
                              do
                                  site=${siteDir##*/}

                                  setsite $site

                                  echo "SITE: $site <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<" >> ${log}

                                  process_list=$(grep "process " $HCISITEDIR/NetConfig | awk '{print $2}' | sort -d -f)

                                  for process in $process_list;
                                  do

                                      loghistory_folder=$HCISITEDIR/exec/processes/$process/LogHistory
                                      echo "PROCESS: $process" >> ${log}
                                      echo "Cycling logs for ${process}…" >> ${log}
                                      hcicmd -p $process -c ". output_cycle" >> ${log}

                                      if [ -d "${loghistory_folder}" ];
                                      then

                                          echo "Removing files older than ${logDays} days:" >> ${log}
                                          find ${loghistory_folder} -type f -mtime +${logDays} -print -delete >> ${log}

                                      else

                                          echo "PROCESS: $process does not have a LogHistory folder." >> ${log}

                                      fi
                                      echo >> ${log}

                                  done
                                  echo >> ${log}

                              done

                              echo >> ${log}

                              I’ll get a proper file uploaded momentarily.

                            • #121752
                              Jason Russell
                              Participant

                                The script works great, now I have to figure out how to get it to pull the environment variables in cron so it will actually run automatically.
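
On the cron side: cron starts jobs with a nearly empty environment and does not read the login profile, so variables like HCIROOT are unset unless the job loads them itself. One common approach is a small wrapper that sources an environment file before running the real script. A sketch of the effect, using scratch files as stand-ins (`hci_env.sh`, `cleanup.sh` and the `/opt/demo` path are made up; `env -i` approximates cron's bare environment):

```shell
#!/bin/sh
# Scratch files standing in for the hci user's profile and the cleanup script.
work=$(mktemp -d)
cat > "$work/hci_env.sh" <<'EOF'
HCIROOT=/opt/demo/cloverleaf   # placeholder path
export HCIROOT
EOF
cat > "$work/cleanup.sh" <<'EOF'
echo "HCIROOT=${HCIROOT:-<unset>}"
EOF

# Roughly what cron does: run the job with an almost empty environment.
bare=$(env -i /bin/sh "$work/cleanup.sh")

# Wrapper approach: source the environment file first, then run the job.
wrapped=$(env -i /bin/sh -c '. "$1"; /bin/sh "$2"' wrapper "$work/hci_env.sh" "$work/cleanup.sh")

echo "without profile: $bare"
echo "with profile:    $wrapped"
rm -rf "$work"
```

In practice the crontab entry would point at the wrapper, and the wrapper would source whatever profile the hci login normally uses before calling the cleanup script.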
