Losing some UTF8 chars

Posted: Mon Nov 04, 2013 9:38 pm
by atout66
Hi to all,

In the script below I load a URL and keep its HTML source (that's what I want!), but before pasting it into the field "field_01", I want to prefix each line of the source with its line number.

So, up to the line:

Code: Select all

set the unicodeText of fld "field_01" to uniEncode (IT,"UTF8")
everything goes well, and the source code is preserved.

Here is the script:

Code: Select all

on mouseUp
   get URL "http://fr.wiktionary.org/wiki/accompagner"
   set the text of field "field_01" to null 
   wait 2 seconds  -- to see the change
   put IT into field "field_01"
   set the unicodeText of fld "field_01" to uniEncode (IT,"UTF8") -- Everything OK until here
   repeat with tLineNum = 1 to the number of lines of field "field_01"
      put line tLineNum of field "field_01" into tLaLigne
      set the Text of line tLineNum of field "field_01" to tLineNum&&":"&&tLaLigne
      --set the unicodeText of line tLineNum of field "field_01" to tLineNum&&":"&&tLaLigne
      -- this syntax shows Chinese characters and LC becomes very, very slow!
   end repeat
end mouseUp
Unfortunately, I must be making a mistake in the repeat loop, because some of the output looks as if UTF8 characters were lost.

For example, at line 78 I read:
title="prononciation API">/a.k??.pa.?e/
but when I check the source of this URL in my browser, it should be
title="prononciation API">/a.kɔ̃.pa.ɲe/
Any idea where my error is?

BTW, this handler is very slow with 409 lines of HTML.

Thanks in advance, Jean-Paul.

Re: Losing some UTF8 chars

Posted: Tue Nov 05, 2013 5:23 am
by Simon
Hi Jean-Paul,
You are very close to the answer you want, all the right bits are there. :)

One thing you should not be doing is going through the lines of a field in your repeat loop because that is VERY slow compared to sticking the information into a variable first, doing your processing, then populating the field with the results. Really!

So, put the URL into a variable.
Go through each line of the variable in your repeat loop, adding the line number and appending the result to another variable.
Then put the completed variable into the field...
You already have the correct code for populating the field, as shown on the 6th line of the script you posted.
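
The steps above can be sketched roughly as follows. This is only a minimal sketch, assuming the page is served as UTF-8 and that the same uniEncode / unicodeText calls from the original script are available; the variable names tSource, tNumbered, tLine and tLineNum are made up for illustration. Doing the numbering in a variable also avoids reading lines back out of the field as plain text, which is probably where the non-Latin characters turned into "?".

```livecode
on mouseUp
   -- Fetch the page source into a variable (still raw UTF-8 at this point)
   put URL "http://fr.wiktionary.org/wiki/accompagner" into tSource
   put empty into tNumbered
   put 0 into tLineNum
   -- Number the lines in a variable; the field is not touched inside the loop
   repeat for each line tLine in tSource
      add 1 to tLineNum
      put tLineNum && ":" && tLine & return after tNumbered
   end repeat
   delete the last char of tNumbered -- drop the trailing return
   -- Convert from UTF-8 once, at the very end, and populate the field
   set the unicodeText of field "field_01" to uniEncode(tNumbered, "UTF8")
end mouseUp
```

The prefix added to each line is plain ASCII, so the numbered text should still be valid UTF-8 when it reaches uniEncode. "repeat for each line" is also generally much faster than "repeat with" plus "line x of ...", which should help with the 409 lines.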

If you are still stuck just ask.

Simon

Re: Losing some UTF8 chars

Posted: Fri Nov 08, 2013 7:59 am
by atout66
Thanks Simon, I'll follow your advice ;-)