workaround for Unicode and the clickText

Post by sp27 » Sun Jun 19, 2011 6:29 am

Any user of LC 4.6.1 who needs to use "the clickText" in a double-byte (Unicode) field will discover that it just doesn't work. I am posting here a workaround that came out of a day-long discussion among jacque, Bernd, Mark S., Malte, Richmond and myself.

Code:

on mouseUp
   --this is attached to field  "TextToClick" that contains bilingual (Russian+English) text;
   --this field has its lockText set to true;
   --the purpose of this exercise is to retrieve and display in another field the word that the user has clicked;
   --NOTE:  the mouseChunk and the mouseText are useless in a Unicode field;
   --equally useless is the select command when used with these expressions, as in "select  the mouseChunk";
   
   local locStart, locEnd
   local locEntireText
   local locEscapeCounter
   
   if  the mouseCharChunk is empty then
      set the unicodeText of field "ClickedWord" to uniEncode("You clicked an empty space.", "UTF8")
      exit mouseUp
   end if
   
   put word 2 of the mouseCharChunk into locStart
   put word 4 of the mouseCharChunk into locEnd
   
   --a strategy based on "the number of words in char 1 to locEnd" bombs when the text before locEnd contains the upper case Russian  R (1056);
   --this is probably because the first byte in the two-byte representation of 1056 evaluates to 32, and LC takes it for a word delimiter;
   
   --relying on the accuracy of the values returned by the mouseCharChunk is dubious because these are the positions of bytes, not characters:
   --one byte for each Roman character and two bytes for each non-Roman character; this kills a couple of other strategies
   
   --the strategy below is based on "the selection" and does not depend on the accuracy of the mouseCharChunk values: the correct chunk is selected anyway
   set useUnicode to true
   put  the unicodeText of field "TextToClick" into locEntireText --this is UTF16
   put uniDecode(locEntireText, "UTF8") into locEntireText --this is UTF8
   
   --look for a word boundary to the left of the click
   repeat until (locStart < 1)
      if byteToNum(byte locStart of locEntireText) is among the items of "9,10,32" then --tab, linefeed, space
         add 1 to locStart
         exit repeat
      end if
      subtract 1 from locStart
   end repeat
   
   --look for a word boundary to the right of the click
   repeat until (locEnd >= length(locEntireText))
      if byteToNum(byte locEnd of locEntireText) is among the items of "9,10,32" then --tab, linefeed, space
         subtract 1 from locEnd
         exit repeat
      end if
      add 1 to locEnd
   end repeat
   
   select char locStart to locEnd of field "TextToClick" 
   set the unicodeText of field "ClickedWord" to the unicodeText of the selection
   
   pass mouseUp
end mouseUp
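
The comment about the upper-case Russian R (1056) can be checked with simple arithmetic. Here is a minimal sketch, assuming the two bytes are stored low byte first, as the comment's "first byte" suggests. The names lowByteOf, highByteOf and showByteSplit are hypothetical helpers made up for this illustration, not LiveCode built-ins:

```livecode
-- Illustration only: split a 16-bit character code into its two bytes.
-- lowByteOf and highByteOf are hypothetical helper names.
function lowByteOf pCharCode
   return pCharCode mod 256
end lowByteOf

function highByteOf pCharCode
   return pCharCode div 256
end highByteOf

on showByteSplit
   local tCharCode
   put 1056 into tCharCode --upper-case Russian R
   --1056 = 4 * 256 + 32, so the low byte has the same value as an ASCII space
   put "low byte:" && lowByteOf(tCharCode) & return & \
         "high byte:" && highByteOf(tCharCode)
end showByteSplit
```

Calling showByteSplit from the message box should report a low byte of 32 (the space character) and a high byte of 4, which fits the observation that counting words breaks at that character.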

Hope this may help someone some day,

sp27
