<0xa0> appears in document


  • #50925
    Matthew Hill
    Participant

    I am processing Transcription documents from a vendor. When some of the documents, not all, are processed the last line may contain a special character seen as <0xa0>. The last line may have other text besides the special character. I’m trying to use regsub to replace the character with a NULL with no luck. It seems as if regsub passes over the special character like it is not even there. Does anyone have any suggestions on how to search and replace this special character?

    I have tried using the following in a copy command, with a counter to find the last line to interrogate:

    set res $xlateInVals

    regsub -all {[^[:print:][:punct:][:blank:],%,$,^.*,+,=,~]} $res {} res

    set xlateOutVals

Viewing 6 reply threads
  • Author
    Replies
    • #68028
      Jim Kosloskey
      Participant

      Matthew,

      First thing is xlateInVals is a list. Use some sort of list notation to get the first element of xlateInVals (if there is only one element).

      Also, I don’t think a0 hex is any of the normally defined special characters, so you might need to include \xa0 in your regsub.

      email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.
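
The two suggestions above can be sketched together as follows. This is only a minimal illustration, assuming the field holds the single character 0xA0 (not a multi-byte encoding of it); the proc name stripA0Regsub and the sample value are hypothetical and not part of Cloverleaf.

```tcl
# Hypothetical helper; the proc name is an assumption, not a Cloverleaf API.
proc stripA0Regsub {inVals} {
    # xlateInVals is a list, so take its first element rather than the list itself.
    set res [lindex $inVals 0]
    # Inside braces, \xa0 reaches the regex engine, which reads it as character 0xA0.
    regsub -all {\xa0} $res {} res
    return $res
}

# Stand-in for xlateInVals: a one-element list whose value ends in 0xA0.
set xlateInVals [list "End of report\xa0"]
set xlateOutVals [list [stripA0Regsub $xlateInVals]]
```

Naming the character explicitly avoids relying on how a character class such as [:print:] treats 0xA0.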

    • #68029
      Sergey Sevastyanov
      Participant

      Matthew

      Quote:

      set res $xlateInVals

      regsub -all {[^[:print:][:punct:][:blank:],%,$,^.*,+,=,~]} $res {} res

      set xlateOutVals

      1. xlateInVals is a list, so your 1st statement should look more like:

      set res [lindex $xlateInVals 0]

      2. Your regular expression is not correct

      3. I’m not sure I understand why you put the result between "><", but I assume that's what you need

      It would be good if you could include an example of your document
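
As a small illustration of point 2, here is how a bracket expression that starts with ^ behaves in regsub. Inside the brackets, commas and characters such as . * + are literal members of the set, not separators or operators. The sample string and variable names are made up for the example.

```tcl
# A leading ^ inside [...] negates the set: the pattern matches every
# character that is NOT listed, and regsub -all deletes each match.
set sample "abc-123"
regsub -all {[^a-z]} $sample {} keptLetters   ;# keeps only a-z: abc
regsub -all {[0-9]} $sample {} droppedDigits  ;# removes only the digits: abc-
```

To delete one specific character, list that character in a non-negated pattern instead of listing everything that should be kept.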

    • #68030
      Charlie Bursell
      Participant

      Your regexp says that anything that is *NOT* one of those characters is to be changed to an empty string. That would not leave you much.

      I don’t know if this is Hex A0 or literal you are seeing but you don’t even need regsub.

      set inp [lindex $xlateInVals 0]

      set xlateOutVals

        ]

        or if literal

        set xlateOutVals

          ]
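
The distinction between a real hex A0 character and the literal text <0xa0> could be handled without regsub along these lines. This is a rough sketch using string map, not a reconstruction of the code lost from the post above; both proc names are hypothetical.

```tcl
# Case 1: the data contains the single character 0xA0.
proc stripHexA0 {s} {
    # [list \xa0 ""] builds the map "character 0xA0 -> empty string".
    return [string map [list \xa0 ""] $s]
}

# Case 2: the data contains the six literal characters <0xa0>.
proc stripLiteralA0 {s} {
    return [string map [list "<0xa0>" ""] $s]
}

set hexCase     [stripHexA0 "Signed\xa0"]
set literalCase [stripLiteralA0 "Signed<0xa0>"]
```

string map does plain substring replacement, so nothing in the map needs regex escaping.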

        1. #68031
          Matthew Hill
          Participant

          Charlie Bursell wrote:

          Your regexp says that anything that is *NOT* one of those characters is to be changed to an empty string.

        2. #68032
          Michael Hertel
          Participant

          Can you do an hcihd on the message to find out what the hex representation of the character is?

          Also can you ask the vendor if they have a coding error?

          Maybe they meant instead of .
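
hcihd is the Cloverleaf tool for this. If it is more convenient to check from inside a Tcl proc, a rough equivalent is to print the code of each character in the value. The sketch below is an assumption-laden illustration; hexDump is a hypothetical name, not a Cloverleaf or Tcl built-in.

```tcl
# Return the hex code of every character in s, separated by spaces.
proc hexDump {s} {
    set out {}
    foreach ch [split $s ""] {
        scan $ch %c code              ;# character -> integer code
        lappend out [format %02x $code]
    }
    return [join $out " "]
}

set dump [hexDump "A\xa0"]   ;# 41 a0
```

A trailing a0 in the dump indicates the single character, while 3c 30 78 61 30 3e indicates the literal text <0xa0>.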

        3. #68033
          Charlie Bursell
          Participant

          I don’t see why you cannot use string map

          set res [string map {\xa0 ""} [lindex $xlateInVals 0]]

        4. #68034
          Ray Barnes
          Participant

          Agree re: string map, just used a similar approach successfully:

          set xlateOutVals

            ]]
