Convert large text file to smaller individual files


  • Creator
    Topic
  • #50425
    Henry Bauer
    Participant

      I have several large text files that I would like to split into smaller individual text files for processing. Each line ends with a newline (NL), and a carriage return (CR) marks each point where I would like to break the file. What is the best way of accomplishing this? Thanks for any input.

    Viewing 9 reply threads
    • Author
      Replies
      • #66061
        Anonymous
        Participant

          There are many ways to do this. I think it boils down to what you are most comfortable doing.

          You could create a script at, say, the Unix level to break it out.

          Or you could create an inbound Tcl TPS proc to break it out.

          The options are all over the place.

          Personally, I would do it in an inbound TPS proc. But I’m not sure if you are doing any translation or if this is just a raw feed kind of thing.

        • #66062
          Henry Bauer
          Participant

            This would be a raw feed. I would prefer Tcl; I am just unsure of exactly how to do it.

          • #66063
            Michael Hertel
            Participant

              One way I would do it:

              1) Read message.

              2) msgset $mh ""

              3) Split the message on \x0d

              4) Use a foreach loop to create a new message for each item in the list

              5) Set the metadata in each new message for a new filename

              6) lappend a CONTINUE disposition for each new message to dispList

              7) lappend a KILL disposition for the original message to dispList

              Hope this helps,

              -mh
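The splitting part of the steps above (steps 3 and 4) can be sketched in plain Tcl, outside the engine. This is a minimal sketch, not the poster's code: the proc name `split_records` and the sample data are made up for illustration, and it assumes each record ends with a CR (\x0d, the same character as \r) while the lines inside a record end with a newline. Creating the messages and setting the metadata still has to be done with the Cloverleaf commands inside the TPS proc.

```tcl
# Hypothetical helper: split a raw buffer into records on CR (\r == \x0d).
# Assumption: lines inside a record end with \n, records end with \r.
proc split_records {data} {
    set records {}
    foreach chunk [split $data \r] {
        # Remove the newlines left over on either side of the CR delimiter.
        set chunk [string trim $chunk \n]
        # Skip the empty piece that follows a trailing CR.
        if {$chunk ne ""} {
            lappend records $chunk
        }
    }
    return $records
}

# Made-up sample: two records of two lines each.
set sample "a1\na2\n\r\nb1\nb2\n\r"
foreach rec [split_records $sample] {
    puts "record lines: [split $rec \n]"
}
```

Each element returned by `split_records` would become the body of one new message in step 4.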

            • #66064
              Henry Bauer
              Participant

                I have the logic splitting the file apart. But when I try to continue the first message it won’t process the other files. I can see the code splitting the file apart but when I add the continue it stops. Any thoughts would be greatly appreciated.

                foreach segment $segmentList {
                    set str [llength $segment]
                    if {$str > 2} {
                        set block [lappend block $segment]
                    }
                    if {$str < 1} {
                        msgset $fmh [join $block \n]
                        set dvr_ctl_str "{FILESET {{OBFILE $file_name}}}"
                        msgmetaset $fmh DRIVERCTL $dvr_ctl_str

                        return "{CONTINUE $fmh} {KILL $mh}"
                        set fmh  ""
                        set block ""
                        set ctr_name "${HciConnName}"
                        set ctr_val  [format %02d [CtrNextValue $ctr_name]]
                        set name [fmtclock [getclock] %d%H%M]
                        set dot .
                        set file_name "IB$name$dot$ctr_val"
                    }
                }

              • #66065
                Michael Hertel
                Participant

                  Could you post the entire proc?

                  When you:

                  return "{CONTINUE $fmh} {KILL $mh}"

                  That stops the proc completely and gives control back to the engine at that point.

                  You probably want to lappend to dispList and return $dispList after all messages are created.
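The difference between returning inside the loop and accumulating a list can be seen in plain Tcl. This is only an illustration: the proc names are made up, and the strings m0 to m3 stand in for real message handles.

```tcl
# Returning inside the loop: the proc ends on the first iteration,
# so only the first item is ever handled.
proc first_only {items} {
    foreach item $items {
        return [list [list CONTINUE $item]]
    }
    return {}
}

# Accumulating: every item gets a disposition, the original is killed,
# and there is a single return after the loop has finished.
proc all_items {items original} {
    set dispList {}
    foreach item $items {
        lappend dispList [list CONTINUE $item]
    }
    lappend dispList [list KILL $original]
    return $dispList
}

puts [first_only {m1 m2 m3}]     ;# {CONTINUE m1}
puts [all_items {m1 m2 m3} m0]   ;# {CONTINUE m1} {CONTINUE m2} {CONTINUE m3} {KILL m0}
```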

                • #66066
                  Henry Bauer
                  Participant

                    Here you go.

                    ######################################################################
                    # Name: streamline_file
                    # Purpose:
                    # UPoC type: tps
                    # Args: tps keyedlist containing the following keys:
                    #       MODE    run mode ("start", "run" or "time")
                    #       MSGID   message handle
                    #       ARGS    user-supplied arguments:
                    #
                    #
                    # Returns: tps disposition list:
                    #
                    #
                    proc streamline_file { args } {
                        keylget args MODE mode               ;# Fetch mode
                        global HciConnName cntFile outName fmtLen
                        set dispList {} ;# Nothing to return
                        switch -exact -- $mode {
                            start {
                                # Perform special init functions
                                # N.B.: there may or may not be a MSGID key in args
                                # Initialize the counter with the name of the
                                # thread that called CtrInitCounter.  The counter
                                # is initialized to 1, has a max value of 99 and
                                # takes the default rollover value of 1.  See the
                                # Cloverleaf TCL reference on Counter Commands for
                                # more information.
                                if {[catch [CtrInitCounter "${HciConnName}" file 1 99] cerr]} {
                                    echo "Could not initialize counter"
                                }
                            }

                            run {
                                # 'run' mode always has a MSGID; fetch and process it
                                keylget args MSGID mh
                                set msg [msgget $mh] ;# Get the message data
                                set fmh ""
                                set segmentList [split $msg \n] ;# Split the message into a valid list
                                set ctr_name "${HciConnName}"
                                set ctr_val  [format %02d [CtrNextValue $ctr_name]]
                                set name [fmtclock [getclock] %d%H%M]
                                set dot .
                                set file_name "IB$name$dot$ctr_val"
                                set block ""
                                set str ""

                                # The fileset driver's configuration in NetConfig
                                # may be overridden by setting the equivalent key
                                # to the desired value in the message meta data
                                # field DRIVERCTL.
                                #
                                # Here the name of the outbound field is changed
                                # from what is configured in NetConfig to the
                                # value supplied by the user by setting the user
                                # argument NAME.
                                foreach segment $segmentList {
                                    set str [llength $segment]
                                    if {$str > 2} {
                                        set block [lappend block $segment]
                                    }
                                    if {$str < 1} {
                                        msgset $fmh [join $block \n]
                                        set dvr_ctl_str "{FILESET {{OBFILE $file_name}}}"
                                        msgmetaset $fmh DRIVERCTL $dvr_ctl_str

                                        return "{CONTINUE $fmh} {KILL $mh}"
                                        set fmh  ""
                                        set block ""
                                        set ctr_name "${HciConnName}"
                                        set ctr_val  [format %02d [CtrNextValue $ctr_name]]
                                        set name [fmtclock [getclock] %d%H%M]
                                        set dot .
                                        set file_name "IB$name$dot$ctr_val"
                                    }
                                }
                            }

                            time {
                                # Timer-based processing
                                # N.B.: there may or may not be a MSGID key in args
                            }

                            shutdown {
                                # Doing some clean-up work
                            }
                        }
                        return $dispList
                    }
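The file-name and DRIVERCTL construction used in the proc above can be tried out in plain Tcl. This is a sketch under assumptions: `make_file_name` and `make_driverctl` are made-up helper names, the core `clock format` command is swapped in for the TclX `fmtclock`/`getclock` pair used in the proc, and the counter value is passed in as a plain integer rather than fetched with `CtrNextValue`.

```tcl
# Hypothetical helper: build a name like IB<ddHHMM>.<nn>
# from a timestamp (seconds) and an integer counter value.
proc make_file_name {seconds counter} {
    set stamp [clock format $seconds -format %d%H%M]
    return [format "IB%s.%02d" $stamp $counter]
}

# Hypothetical helper: build the DRIVERCTL keyed list with [list],
# so the nesting stays correct even if the file name contains
# spaces or braces.
proc make_driverctl {fileName} {
    return [list [list FILESET [list [list OBFILE $fileName]]]]
}

set name [make_file_name [clock seconds] 7]
puts $name
puts [make_driverctl $name]
```

For a name without special characters, `make_driverctl` produces the same string as the hand-written "{FILESET {{OBFILE $file_name}}}" form.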

                  • #66067
                    Henry Bauer
                    Participant

                      I also get these errors:

                      message0 message0 message0

                      [0:TEST] 'message0' (returned by 'streamline_file ') does not match { }

                      [0:TEST] 'message0' (returned by 'streamline_file ') does not match { }

                      [0:TEST] 'message0' (returned by 'streamline_file ') does not match { }

                      I do not know how to fix that.

                    • #66068
                      Michael Hertel
                      Participant

                        You need to create additional messages with either msgcopy or msgcreate.

                        So do this:

                          keylget args MSGID mh
                          set msg [msgget $mh] ;# Get the message data
                          msgset $mh {}

                        ….

                          set fmh [msgcopy $mh]
                          msgset $fmh [join $block \n]
                          set dvr_ctl_str "{FILESET {{OBFILE $file_name}}}"
                          msgmetaset $fmh DRIVERCTL $dvr_ctl_str
                          lappend dispList "CONTINUE $fmh"
                          lappend dispList "KILL $mh"
                          return $dispList

                        }
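One detail worth checking when building dispList with lappend: each element should itself be a two-element list (disposition, then handle). The plain-Tcl sketch below shows how an extra pair of braces inside the quoted string changes the shape of the element. The handle is a placeholder string, and the assumption that the engine wants exactly this two-element shape is based on the usual `lappend dispList "CONTINUE $mh"` idiom rather than on anything stated in this thread.

```tcl
set handle message1   ;# placeholder for a real message handle

# Braces inside the quoted string become part of the element itself,
# so the element is a one-item list wrapping the real pair.
set nested {}
lappend nested "{CONTINUE $handle}"

# Building the element with [list] gives a plain two-item pair.
set flat {}
lappend flat [list CONTINUE $handle]

puts [llength [lindex $nested 0]]   ;# 1
puts [llength [lindex $flat 0]]     ;# 2
puts $flat                          ;# {CONTINUE message1}
```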

                      • #66069
                        Henry Bauer
                        Participant

                          That looks closer. I have added the code you recommended, and now it is throwing a "bad msgId" error.

                        • #66070
                          Michael Hertel
                          Participant

                            Could you post the error or call me? 206-515-5987
