Unicode Field routines

jcollett · Post by **jcollett** » Tue Sep 07, 2010 8:35 pm

Thank you Mark. But the character 上 is NOT a line feed. It means "above", or "last" (as in last week), and heaps of other things. It is the unicodeText conversion which declares it to be a line feed. Dare I suggest that "unicodeText()" has a bug?
It means that if I write "We started this conversation last week.", the unicodeText will render it as "We started this conversation
week."
Revolution is great, but I have neither the time nor the inclination to bother with niggly little details like this. JC

Mark · Post by **Mark** » Tue Sep 07, 2010 8:43 pm

Hi JC,

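When you say "it is not a linefeed" it means that you haven't fully understood what I wrote. You are completely right, it isn't a linefeed. However, it consists of two bytes (like all unicode in RunRev fields). On a little-endian (Intel) machine, the first byte is the same as a linefeed (0x0A) and the second byte is 0x4E, whereas a real unicode linefeed is 0x0A followed by a NULL. Together, the two bytes represent the 上 character (U+4E0A), so anything that looks at the data one byte at a time sees a linefeed in the middle of it. There are certainly bugs in RunRev's unicodeText, but this is NOT one of them.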

Best regards,

Mark
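
To make the byte layout concrete, here is a small sketch in Python (not LiveCode). It assumes the field data is UTF-16 in little-endian byte order, as it would be on an Intel machine. It shows that 上 (U+4E0A) shares its first byte with a linefeed, and that splitting the raw data on single bytes cuts the sentence at that character.

```python
# Assumption: the text is stored as UTF-16, little-endian (2 bytes per character here).
text = "We started this conversation 上 week."
data = text.encode("utf-16-le")

shang = "上".encode("utf-16-le")      # bytes 0x0A 0x4E
linefeed = "\n".encode("utf-16-le")   # bytes 0x0A 0x00

# Both begin with the byte 0x0A, but only the real linefeed ends in NULL.
first_bytes_match = shang[0] == linefeed[0] == 0x0A
second_bytes_differ = shang[1] != linefeed[1]

# Splitting on the single byte 0x0A wrongly cuts the text in two.
naive_pieces = data.split(b"\n")

# Splitting the decoded text on a real linefeed leaves it in one piece.
correct_pieces = data.decode("utf-16-le").split("\n")

print(len(naive_pieces), len(correct_pieces))  # prints: 2 1
```

The same reasoning applies to any character whose UTF-16 code unit contains the byte 0x0A, not only 上.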

jcollett · Post by **jcollett** » Tue Sep 07, 2010 8:44 pm

Thank you Mark. But the character 上 is NOT a line feed. It means "above", or "last" (as in last week), and heaps of other things. It is the unicodeText conversion which wrongly declares it to be a line feed. Dare I suggest that "unicodeText()" has a bug?
It means that if I write "We started this conversation last week.", the unicodeText will render it as "We started this conversation
week."
Revolution is great, but I have neither the time nor the inclination to bother with niggly little details like this. JC

Mark · Post by **Mark** » Tue Sep 07, 2010 9:11 pm

JC,

Why are you posting the same message twice?!

Mark

jcollett · Post by **jcollett** » Tue Sep 07, 2010 11:29 pm

Mark, I have tried my best with your suggestion, but Field 2 remains empty. JohnC

Mark · Post by **Mark** » Tue Sep 07, 2010 11:41 pm

Hi John,

Maybe you should post (a simple version of) your script.

Best,

Mark

jcollett · Post by **jcollett** » Wed Sep 08, 2010 12:59 am

Hello Mark. In the meantime I have been back to UnicodeFldRoutines.rev, the one in Japanese with the problem character in red. I think the writer may not have quite finished it, because the line of commentary which says "I think you set line 4 into the field" (where line 5 was expected) was wrong; line 5 had appeared. So I have borrowed a user function from that stack, which uses UnicodeLineOffset(tdata,i) to set a startchar and an endchar for each line. I got it to work on a 10-line dialogue, each line of which went into a field of its own, including one line containing the problem "above" character. This relies on the lines being numbered, and looks for 1, 2, 3 etc. I will look for a less intrusive way of handling that.
(Sorry about the double-sending of messages. I am not familiar enough with message sending protocols, and will try not to do it again.) JC

ttbo · Post by **ttbo** » Sat Jan 28, 2012 9:34 pm

I am also having a problem processing Japanese or Chinese text that contains the character 上. I need to be able to count lines in a text field and lines that contain this character are always split at the point where the character falls. It appears as a single line in the field but LiveCode processes the line as two lines. So if I want a routine that highlights text line by line, it messes up as soon as it hits the character 上. Is there a workaround for this? Thanks.

Mark · Post by **Mark** » Sun Jan 29, 2012 1:09 am

Hi,

This function returns the correct number of lines of the unicodeText of a field, even if 上 is in the field. Make sure to use the unicodeText: ucNumOfLines(the unicodeText of fld x) instead of the text of the field or the field itself.

Code:

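function ucNumOfLines theData
     -- theData must be the unicodeText of a field, not its plain text
     if theData is empty then return 0
     put 0 into myCounter
     -- unicode linefeed as two bytes (assumes little-endian byte order)
     put lf & NULL into myUCReturn
     repeat with x = 1 to number of bytes of theData step 2
          if byte x to x+1 of theData is myUCReturn then add 1 to myCounter
     end repeat
     -- a last line without a trailing linefeed still counts as a line
     if byte -2 to -1 of theData is not myUCReturn then add 1 to myCounter
     return myCounter
end ucNumOfLines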

This function might be a little slow. You could try using a repeat forever with offset instead of counting from x to the number of bytes.

Kind regards,

Mark
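
The faster variant suggested above (searching with an offset instead of testing every byte pair) could look roughly like this. This is a sketch in Python rather than LiveCode, the name `uc_num_of_lines` is made up for illustration, and it assumes little-endian UTF-16 data. It treats a final line without a trailing linefeed as a line, and it ignores matches that start on an odd byte offset, because those straddle two characters and are not real linefeeds.

```python
def uc_num_of_lines(data: bytes) -> int:
    """Count lines in UTF-16LE data by searching for the 2-byte linefeed."""
    if not data:
        return 0
    uc_return = b"\n\x00"  # UTF-16LE linefeed: 0x0A 0x00
    count = 0
    pos = data.find(uc_return)
    while pos != -1:
        if pos % 2 == 0:
            # Aligned on a character boundary: a real linefeed.
            count += 1
            pos = data.find(uc_return, pos + 2)
        else:
            # Misaligned match straddling two characters: ignore it.
            pos = data.find(uc_return, pos + 1)
    # A last line without a trailing linefeed still counts as a line.
    if not data.endswith(uc_return):
        count += 1
    return count


sample = "first line\nlast 上 week\nthird".encode("utf-16-le")
print(uc_num_of_lines(sample))  # prints: 3
```

The alignment check matters because a search over raw bytes can find 0x0A 0x00 spanning the end of one character and the start of the next.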

LiveCode Forums.
