character encoding - never ending story
Posted: Sun Feb 26, 2017 9:56 pm
by UKMC
Hi all,
I have a MariaDB (10.0.29) database with character set latin1 and collation latin1_german1_ci.
When I save the text "(ÄÖÜäöüßÉ)" from LiveCode to the DB without any special handling, "(????????)" is stored in the database.
After searching the forum, I introduced the following function in my livecode script:
put unidecode(uniencode("(ÄÖÜäöüßÉ)"),"utf8") into utf8text
and sent this to the database.
Result: characters are stored correctly.
But when retrieving the data again, I get "(Ã„Ã–ÃœÃ¤Ã¶Ã¼ÃŸÃ‰)"
Now I do not know how to translate them back to the original in livecode. I tried
put unidecode(uniencode("(ÄÖÜäöüßÉ)"),"utf8") into utf8text (does not work)
mactoiso("(ÄÖÜäöüßÉ)") results in "(ÄÖÜäöüßÉ)"
The following solution works, but it seems to me very inelegant:
function Umlautdecodierung eingabe
   -- Ä
   replace "Ã„" with "Ä" in eingabe
   -- Ö
   replace "Ã–" with "Ö" in eingabe
   -- Ü
   replace "Ãœ" with "Ü" in eingabe
   -- ä
   replace "Ã¤" with "ä" in eingabe
   -- ö
   replace "Ã¶" with "ö" in eingabe
   -- ü
   replace "Ã¼" with "ü" in eingabe
   -- ß
   replace "ÃŸ" with "ß" in eingabe
   -- É
   replace "Ã‰" with "É" in eingabe
   return eingabe
end Umlautdecodierung
I hope one of you has a better idea for solving this problem.
By the way: my favourite solution would be not to convert at all, as this makes scripting much more complicated. Perhaps you know the magic configuration for my MariaDB.
Best regards
Ulrich
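The "Ãœ" in the replace function above is the typical result of UTF-8 bytes being read as a single-byte Western charset. The following is only a sketch in Python (not LiveCode), under the assumption that the stored bytes are UTF-8 and the reading side treats them as CP1252. It reproduces the garbling and then reverses it in one step instead of one replace per character.

```python
# Sketch: reproduce and reverse the garbling, assuming the stored bytes are
# UTF-8 and the reading side interprets them as CP1252 (Windows Latin-1).
original = "(ÄÖÜäöüßÉ)"

# What goes wrong: UTF-8 bytes decoded with the wrong charset.
garbled = original.encode("utf-8").decode("cp1252")

# The repair: recover the raw bytes, then decode them as UTF-8.
repaired = garbled.encode("cp1252").decode("utf-8")

print(garbled)
print(repaired)
```

If that assumption holds, decoding the retrieved data as UTF-8 on the LiveCode side plays the role of the last step, so a per-character replace table should not be needed.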
Re: character encoding - never ending story
Posted: Mon Feb 27, 2017 12:43 pm
by AndyP
I think you will also need to set the db to default UTF8.
See here:
https://mariadb.com/kb/en/mariadb/setti ... ollations/
Re: character encoding - never ending story
Posted: Mon Feb 27, 2017 8:42 pm
by jacque
The uniEncode and uniDecode functions are deprecated and shouldn't be used with LC 7 or above, though they do still work. But the new functions are much easier to work with and I recommend them. See textEncode() and textDecode() in the dictionary. They do all the work for you, as long as you know the correct character set you're working with (and you do.)
Or you can follow AndyP's suggestion, and set the database to use UTF8.
Re: character encoding - never ending story
Posted: Thu Mar 09, 2017 1:12 pm
by MaxV
result is standard ASCII: %C4%D6%DC%E4%F6%FC%DF%C9
Code: Select all
put urldecode("%C4%D6%DC%E4%F6%FC%DF%C9")
result is your chars: ÄÖÜäöüßÉ
Re: character encoding - never ending story - mySQL
Posted: Thu May 18, 2017 8:55 pm
by Hans-Helmut
I am lost at the moment and am asking for help. I have to use UTF-8 encoded international characters, mainly Russian, German and English together.
Everything looks fine when I browse the database on the server with phpMyAdmin, but in LiveCode it does not work, whatever settings I try.
Code: Select all
#Simplified code snippet without error checking:
on mouseUp
global gConnectionID
put "SELECT name FROM party" into tSQL
put revDataFromQuery(tab, cr, gConnectionID, tSQL) into tList
put textDecode ( tList , "UTF-8") into field "data"
end mouseUp
The Russian character string is "аловуе" - (alowue)
Selecting this record results in "??????"
Settings
LiveCode: 8.1.4 (rc 2)
OS: Windows 2000, 64bit, latest update
Server: Localhost via UNIX socket
Server type: MySQL
Server version: 5.6.33-log - MySQL Community Server (GPL)
Protocol version: 10
User: b@localhost
Server charset: UTF-8 Unicode (utf8)
Server connection collation: utf8_general_ci
User language: English
Database: b_address
Table: party
Column: name
Column collation: utf8_general_ci
Re: character encoding - never ending story
Posted: Fri May 19, 2017 7:33 am
by Hans-Helmut
I am still stuck with Russian text in MySQL and LC...
For now, I cannot go through PHP or server-side scripting. I need to use the direct connection as described.
All settings in the MySQL database are for UTF-8.
Russian characters are visible on the server side. But they do not render on the client side.
Executing through LiveCode using textDecode()
1. Special Latin-1 characters are not shown. A "Müller" will become "Mller". // Why? Wrong.
2. Any Russian character will not render at all: "Димитрий" will become "?????????" // Why? Wrong.
Executing without textDecode()
1. Special Latin-1 characters are shown. A "Müller" is still "Müller" with "u-Umlaut".
2. Any Russian character will not render: "Димитрий" will become "?????????"
Is this a bug in LiveCode?
Is there still something wrong in the server-side settings?
I really need this. I cannot use PHP for it, and a LiveCode server installation is not permitted.
Thanks for any help.
Re: character encoding - never ending story
Posted: Fri May 19, 2017 10:13 am
by bangkok
Hans-Helmut wrote:
All settings in the MySQL database are for UTF-8.
Russian characters are visible on the server side. But they do not render on the client side.
Before doing your SELECT, try executing this query:
Code: Select all
revExecuteSQL gConnectionID, "SET NAMES 'utf8'"
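Why SET NAMES might matter here can be sketched in Python (not LiveCode). The sketch assumes that the connection charset is still latin1 even though the tables are utf8, so the server converts every value to Latin-1 before sending it. The helper name as_sent_over_latin1 is made up for this illustration.

```python
# Assumption: the connection charset is latin1, so the server converts each
# value to Latin-1 before sending it to the client.
def as_sent_over_latin1(text: str) -> bytes:
    # Characters with no Latin-1 equivalent are replaced by "?".
    return text.encode("latin-1", errors="replace")

russian = as_sent_over_latin1("Димитрий")
german = as_sent_over_latin1("Müller")

print(russian)  # only question marks survive; the Cyrillic is already lost
print(german)   # the umlaut survives as the single byte 0xFC

# Decoding these Latin-1 bytes as UTF-8: 0xFC is not valid UTF-8,
# so a lenient decoder drops it.
print(german.decode("utf-8", errors="ignore"))
```

Under that assumption both reported symptoms fit: the Cyrillic text has already been turned into question marks before it reaches LiveCode, so no decoding on the client can bring it back, and "Müller" arrives as Latin-1, so decoding it as UTF-8 loses the "ü". How LiveCode itself treats invalid UTF-8 bytes is not confirmed here; the lenient decoding above is only an analogy.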
Re: character encoding - never ending story
Posted: Fri May 19, 2017 5:27 pm
by jacque
As mentioned above, you need to use textDecode() to translate the incoming text to a format LC can use. Most databases use UTF8 so I think it's safe to assume that.
Code: Select all
put textDecode(data, "UTF8") into tString
Edit: I just saw you are already using textDecode, so ignore the above.
Re: character encoding - never ending story
Posted: Sat May 20, 2017 8:22 pm
by MaxV
Use the urlencode and urldecode functions: all chars are translated to standard ASCII and the data is stored safely in the database, then urldecode brings it back in your charset.
The urlencode and urldecode functions are the best way to preserve data. See
http://livecode.wikia.com/wiki/URLEncode
Examples:
put urlencode("Müller")
=
M%FCller
put urlencode(textEncode("Димитрий","UTF8"))
=
%D0%94%D0%B8%D0%BC%D0%B8%D1%82%D1%80%D0%B8%D0%B9
put urldecode("M%FCller")
=
Müller
put textdecode(urldecode("%D0%94%D0%B8%D0%BC%D0%B8%D1%82%D1%80%D0%B8%D0%B9"),"UTF8")
=
Димитрий
As you can see, the urlencode function always uses just plain ASCII, which is compatible with any charset, so the data is compatible with any database in the world!!!
Why did I add textencode/textdecode for the Russian chars? Because my PC is UTF16, but URLencode/urldecode works only with UTF8 chars. Livecode always works with PC encoding, in my case UTF16, so I needed to add textencode for chars like Russian that have different hexadecimal values from UTF8 on my PC.
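The percent-encoded strings in the examples above depend on which bytes are being encoded. As a cross-check, here is a sketch using Python's urllib.parse (not LiveCode's urlencode). The choice of Latin-1 for "Müller" is an assumption based on the %FC in the output shown.

```python
from urllib.parse import quote, unquote

# The same text gives different percent-encodings for different byte encodings.
latin1_form = quote("Müller", encoding="latin-1")  # one byte for "ü"
utf8_form = quote("Müller", encoding="utf-8")      # two bytes for "ü"

# Cyrillic has no Latin-1 representation, so it is encoded as UTF-8 here.
russian_form = quote("Димитрий", encoding="utf-8")

print(latin1_form)
print(utf8_form)
print(russian_form)

# Decoding must use the same encoding that was used when encoding.
print(unquote(latin1_form, encoding="latin-1"))
print(unquote(russian_form, encoding="utf-8"))
```

The percent-encoded string itself is plain ASCII, but it is only reversible if the reader knows which encoding produced the bytes. In this sketch that is the role the extra textEncode/textDecode step plays for the Russian text.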
Re: character encoding - never ending story
Posted: Sat May 20, 2017 10:46 pm
by jacque
MaxV wrote:
my PC is UTF16, but URLencode/urldecode works only with UTF8 chars.
Actually, textEncode/textDecode work with nine different encodings:
"ASCII"
"UTF-16"
"UTF-16BE"
"UTF-16LE"
"UTF-32"
"UTF-32BE"
"UTF-32LE"
"UTF-8"
"CP1252"
MaxV wrote:
Livecode always works with PC encoding
When importing or opening files, LC uses the machine's native encoding, which will vary depending on the OS.