The characters from an html file lose their accents

Mag
VIP Livecode Opensource Backer
Posts: 802
Joined: Fri Nov 16, 2012 10:51 pm

Post by Mag » Wed Feb 27, 2013 10:32 pm

Hi, all,

I'm getting text from an html file using this code:

set the htmlText of the templateField to thePageContent
put the text of the templateField into textVar

Everything works, except that accented characters lose their accents. For example, instead of "è" it displays "√®"...

Does anyone have a better solution than this one? :D

replace "√†" with "à" in fixedText
replace "√®" with "è" in fixedText
replace "√π" with "ù" in fixedText
... and so on...

Simon
VIP Livecode Opensource Backer
Posts: 3901
Joined: Sat Mar 24, 2007 2:54 am
Location: Palo Alto

Post by Simon » Wed Feb 27, 2013 10:40 pm

Hi Mag,
Is the "è" encoded as an HTML entity (such as &egrave;) or as a literal è in the html file?

Simon
I used to be a newbie but then I learned how to spell teh correctly and now I'm a noob!

Mag
Post by Mag » Thu Feb 28, 2013 12:25 am

Hi Simon. One of the HTML pages I use for testing contains this line:

<li><a href="/jobs/it/">Opportunità di lavoro</a></li>
It comes from this page: http://www.apple.it

Simon
Post by Simon » Thu Feb 28, 2013 1:41 am

Hi Mag,
Sorry, I can't seem to get it to work either. :( I tried different encodings, but I keep getting a garbled character instead of à
from the line: <li><a href="/jobs/it/">Opportunità di lavoro</a></li> at http://www.apple.com/it/

Simon

Mag
Post by Mag » Thu Feb 28, 2013 1:49 am

Thank you, Simon. The page source contains this text, but I don't know what it means...

Code:

 type="text/javascript" charset="utf-8"
I also don't know whether it affects the accented characters in some way...

Simon
Post by Simon » Thu Feb 28, 2013 1:52 am

Yes, I tried UTF8 but it made even more of a mess.

Simon

snm
VIP Livecode Opensource Backer
Posts: 253
Joined: Fri Dec 09, 2011 11:17 am
Location: Warszawa / Poland

Post by snm » Thu Feb 28, 2013 7:12 am

Try this:

Code:

set the unicodeText of fld "field" to uniEncode (theFieldContent, "UTF8")
You should then get the correct text in the field.

Marek
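
Editor's note: the "√®" symptom is consistent with UTF-8 bytes being read as Mac OS Roman, the legacy single-byte Mac encoding. That cause is an assumption, but it reproduces all three of Mag's examples. Below is a minimal sketch in Python, used here only because the byte-level steps are easy to check; `garble` and `repair` are hypothetical helper names, not LiveCode functions.

```python
def garble(text):
    """Encode text as UTF-8, then misread those bytes as Mac OS Roman."""
    return text.encode("utf-8").decode("mac_roman")


def repair(garbled):
    """Undo garble(): recover the original bytes, then decode them as UTF-8."""
    return garbled.encode("mac_roman").decode("utf-8")


if __name__ == "__main__":
    # Show what each accented character turns into when misread.
    for ch in "àèù":
        print(ch, "->", garble(ch))
    # Reverse the damage on a whole string.
    print(repair("Opportunit√† di lavoro"))
```

If that assumption holds, the uniEncode(..., "UTF8") call above likely works for the same reason: the page bytes are interpreted as UTF-8 rather than as the platform's native encoding.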

Mag
Post by Mag » Thu Feb 28, 2013 1:11 pm

It works fine! :D

Thank you so much Marek and Simon

snm
Post by snm » Thu Feb 28, 2013 2:06 pm

You are always welcome, just ask. Next time you can help somebody.

Marek

jaguayo
Posts: 10
Joined: Sun Jun 14, 2009 8:12 pm

Post by jaguayo » Sun Dec 29, 2013 1:02 pm

Hello Mag:

In your post you replaced "√†" with "à", "√®" with "è", and so on.
Where can I find the codes for "á", "é", "í", "ó" and "ú"?

Thanks.

Joseba
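
Editor's note: the two-character codes do not come from a published table; they can be computed. The sketch below assumes the garbling comes from UTF-8 bytes displayed as Mac OS Roman, which matches the pairs in Mag's post but is not confirmed for every setup. It is written in Python purely for illustration, and `mojibake_table` is a hypothetical name.

```python
def mojibake_table(chars):
    """Map each character to the text shown when its UTF-8 bytes are read as Mac OS Roman."""
    return {ch: ch.encode("utf-8").decode("mac_roman") for ch in chars}


if __name__ == "__main__":
    # Print replace statements in the style used in the first post.
    for ch, garbled in mojibake_table("áéíóú").items():
        print('replace "{}" with "{}" in fixedText'.format(garbled, ch))
```

A replace list built this way only covers the characters you remember to include, so decoding the page as UTF-8 in the first place is usually the more robust fix.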

richmond62
Livecode Opensource Backer
Posts: 9287
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Post by richmond62 » Tue Dec 31, 2013 12:31 am

Quick quote from the Livecode documentation:

"Special characters (whose ASCII value is greater than 127) are encoded as HTML entities. LiveCode recognizes the following named entities:

&Aacute;
&aacute;
&Acirc;
&acirc;
&acute;
&AElig;
&aelig;
&Agrave;
&agrave;
&Aring;
&aring;
&Atilde;
&atilde;
&Auml;
&auml;
&brvbar;
&Ccedil;
&ccedil;
&cedil;
&cent;
&copy;
&curren;
&deg;
&divide;
&Eacute;
&eacute;
&Ecirc;
&ecirc;
&Egrave;
&egrave;
&ETH;
&eth;
&Euml;
&euml;
&frac12;
&frac14;
&frac34;
&gt;
&Iacute;
&iacute;
&Icirc;
&icirc;
&iexcl;
&Igrave;
&igrave;
&iquest;
&Iuml;
&iuml;
&laquo;
&lt;
&macr;
&micro;
&middot;
&nbsp;
&not;
&Ntilde;
&ntilde;
&Oacute;
&oacute;
&Ocirc;
&ocirc;
&Ograve;
&ograve;
&ordf;
&ordm;
&Oslash;
&oslash;
&Otilde;
&otilde;
&Ouml;
&ouml;
&para;
&plusmn;
&pound;
&raquo;
&reg;
&sect;
&shy;
&sup1;
&sup2;
&sup3;
&szlig;
&THORN;
&thorn;
&times;
&Uacute;
&uacute;
&Ucirc;
&ucirc;
&Ugrave;
&ugrave;
&uml;
&Uuml;
&uuml;
&Yacute;
&yacute;
&yen;
&yuml;

Unicode characters whose numeric value is greater than 255 are encoded as "bignum" entities, with a leading ampersand and trailing semicolon. For example, the Japanese character whose numeric value is 12387 is encoded as "&#12387;"."

This is why this stack doesn't do a very good job [attached].
Attachment: HTMLer.rev.zip (HTML import, 7.44 KiB)
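
Editor's note: the named and numeric entities quoted from the documentation follow the standard HTML scheme, so their meaning can be checked outside LiveCode. The sketch below uses Python's standard `html.unescape` as a stand-in; it illustrates the entity scheme only and is not LiveCode's htmlText importer. `decode_entities` is a hypothetical wrapper name.

```python
import html


def decode_entities(text):
    """Replace named (&agrave;) and numeric (&#12387;) HTML entities with their characters."""
    return html.unescape(text)


if __name__ == "__main__":
    # A named entity from the list above.
    print(decode_entities("Opportunit&agrave; di lavoro"))
    # The numeric "bignum" example from the quoted documentation.
    print(decode_entities("&#12387;"))
```

Entities are plain ASCII, so text encoded this way survives regardless of which character encoding the file is read with, unlike literal accented characters.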

jaguayo
Posts: 10
Joined: Sun Jun 14, 2009 8:12 pm

Post by jaguayo » Tue Dec 31, 2013 11:07 am

Thanks, richmond62!
Your stack solved the problem.
Thank you very much.

Regards,

Joseba
