character encoding - never ending story

UKMC
Posts: 50
Joined: Sun Jan 08, 2017 12:01 pm

character encoding - never ending story

Post by UKMC » Sun Feb 26, 2017 9:56 pm

Hi all,

I have a MariaDB (10.0.29) database with character set latin1 and collation latin1_german1_ci.

When I save the text "(ÄÖÜäöüßÉ)" from LiveCode to the DB without any special handling, the database stores "(????????)".

After searching the forum, I added the following line to my LiveCode script:
put unidecode(uniencode("(ÄÖÜäöüßÉ)"),"utf8") into utf8text
and sent this to the database.

Result: characters are stored correctly.

But when retrieving the data again, I get "(Ã„Ã–ÃœÃ¤Ã¶Ã¼ÃŸÃ‰)".
Now I do not know how to translate them back to the original in LiveCode. I tried
put unidecode(uniencode("(Ã„Ã–ÃœÃ¤Ã¶Ã¼ÃŸÃ‰)"),"utf8") into utf8text (does not work)
mactoiso("(ÄÖÜäöüßÉ)") results in "(ÄÖÜäöüßÉ)"

The following solution works, but it seems very inelegant to me:

function Umlautdecodierung eingabe
   -- Ä
   replace "Ã„" with "Ä" in eingabe
   -- Ö
   replace "Ã–" with "Ö" in eingabe
   -- Ü
   replace "Ãœ" with "Ü" in eingabe
   -- ä
   replace "Ã¤" with "ä" in eingabe
   -- ö
   replace "Ã¶" with "ö" in eingabe
   -- ü
   replace "Ã¼" with "ü" in eingabe
   -- ß
   replace "ÃŸ" with "ß" in eingabe
   -- É
   replace "Ã‰" with "É" in eingabe

   return eingabe
end Umlautdecodierung


I hope one of you has a better idea for solving this problem.

By the way: my favourite solution would be to have no conversion at all, as it makes scripting much more complicated. Perhaps you know the magic configuration for my MariaDB.

Best regards


Ulrich

AndyP
Posts: 614
Joined: Wed Aug 27, 2008 12:57 pm
Location: Seeheim, Germany (ex UK)

Re: character encoding - never ending story

Post by AndyP » Mon Feb 27, 2017 12:43 pm

I think you will also need to set the db to default UTF8.

See here: https://mariadb.com/kb/en/mariadb/setti ... ollations/
Andy Piddock
https://livecode1001.blogspot.com Built with LiveCode
https://github.com/AndyPiddock/TinyIDE Mini IDE alternative
https://github.com/AndyPiddock/Seth Editor color theming
http://livecodeshare.runrev.com/stack/897/ LiveCode-Multi-Search

jacque
VIP Livecode Opensource Backer
Posts: 7210
Joined: Sat Apr 08, 2006 8:31 pm
Location: Minneapolis MN

Re: character encoding - never ending story

Post by jacque » Mon Feb 27, 2017 8:42 pm

The uniEncode and uniDecode functions are deprecated and shouldn't be used with LC 7 or above, though they do still work. But the new functions are much easier to work with and I recommend them. See textEncode() and textDecode() in the dictionary. They do all the work for you, as long as you know the correct character set you're working with (and you do.)

Or you can follow AndyP's suggestion, and set the database to use UTF8.
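
For illustration, a minimal sketch of that round trip with textEncode() and textDecode(). It assumes LiveCode 7 or later, an open connection ID in a global gConnectionID, and a connection and column that really use UTF-8. The table "notes", its column "body", and the handler names saveNote and loadNotes are made up for the example and do not come from the posts above.

```livecode
-- Sketch only: gConnectionID, the table "notes" and its column "body" are assumed names.
command saveNote pText
   global gConnectionID
   local tBytes
   -- Encode LiveCode's internal text to UTF-8 bytes before it leaves the app.
   put textEncode(pText, "UTF-8") into tBytes
   -- ":1" is a placeholder; the "*b" prefix marks the variable as binary data.
   revExecuteSQL gConnectionID, "INSERT INTO notes (body) VALUES (:1)", "*btBytes"
end saveNote

function loadNotes
   global gConnectionID
   local tRaw
   put revDataFromQuery(tab, cr, gConnectionID, "SELECT body FROM notes") into tRaw
   -- Decode the incoming UTF-8 bytes back into text LiveCode can display.
   return textDecode(tRaw, "UTF-8")
end loadNotes
```

With this pattern the conversion happens in exactly two places, on the way out and on the way in, so no per-character replacements should be needed.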
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com

MaxV
Posts: 1579
Joined: Tue May 28, 2013 2:20 pm
Location: Italy

Re: character encoding - never ending story

Post by MaxV » Thu Mar 09, 2017 1:12 pm

Code: Select all

put urlencode("ÄÖÜäöüßÉ")
result is standard ASCII: %C4%D6%DC%E4%F6%FC%DF%C9

Code: Select all

put urldecode("%C4%D6%DC%E4%F6%FC%DF%C9")
result is your chars: ÄÖÜäöüßÉ

:D
Livecode Wiki: http://livecode.wikia.com
My blog: https://livecode-blogger.blogspot.com
To post code use this: http://tinyurl.com/ogp6d5w

Hans-Helmut
Posts: 57
Joined: Sat Jan 14, 2017 6:44 pm

Re: character encoding - never ending story - mySQL

Post by Hans-Helmut » Thu May 18, 2017 8:55 pm

I am lost at the moment and am asking for help. I have to use international characters encoded as UTF-8, mainly Russian, German and English together.
Everything looks fine on the server when I browse the database with phpMyAdmin, but in LiveCode it does not work, no matter which settings I try.

Code: Select all

# Simplified code snippet without error checking:
on mouseUp
   global gConnectionID
   put "SELECT name FROM party" into tSQL
   put revDataFromQuery(tab, cr, gConnectionID, tSQL) into tList
   put textDecode(tList, "UTF-8") into field "data"
end mouseUp

The Russian character string is "аловуе" (alowue).
Selecting this record results in "??????"

Settings
LiveCode: 8.1.4 (rc 2)
OS: Windows 2000, 64bit, latest update

Server: Localhost via UNIX socket
Server type: MySQL
Server version: 5.6.33-log - MySQL Community Server (GPL)
Protocol version: 10
User: b@localhost
Server charset: UTF-8 Unicode (utf8)

Server connection collation: utf8_general_ci
User language: English
Database: b_address
Table: party
Column: name
Column collation: utf8_general_ci

Hans-Helmut
Posts: 57
Joined: Sat Jan 14, 2017 6:44 pm

Re: character encoding - never ending story

Post by Hans-Helmut » Fri May 19, 2017 7:33 am

I am still stuck with Russian text in MySQL and LC...

For now, I cannot go through PHP or server-side scripting. I need to use the direct connection as described.

All settings in the MySQL database are for UTF-8.

Russian characters are visible on the server side. But they do not render on the client side.

Executing through LiveCode using textDecode()

1. Special Latin-1 characters are not shown. A "Müller" will become "Mller". // Why? Wrong.
2. Any Russian character will not render at all: "Димитрий" will become "?????????" // Why? Wrong.

Executing without textDecode()

1. Special Latin-1 characters are shown. A "Müller" is still "Müller" with "u-Umlaut".
2. Any Russian character will not render: "Димитрий" will become "?????????"

Is this a bug in LiveCode?
Is there still something wrong on the server-side settings?

I really need this. I cannot use PHP for it, and a LiveCode Server installation is not permitted.

Thanks for any help.

bangkok
VIP Livecode Opensource Backer
Posts: 937
Joined: Fri Aug 15, 2008 7:15 am

Re: character encoding - never ending story

Post by bangkok » Fri May 19, 2017 10:13 am

Hans-Helmut wrote: All settings in the MySQL database are for UTF-8.

Russian characters are visible on the server side. But they do not render on the client side.
Before you run your SELECT, try executing this query:

Code: Select all

   revExecuteSQL gConnectionID, "SET NAMES 'utf8'"
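
Combined with the mouseUp handler posted earlier in the thread, the suggestion might look like the sketch below. It assumes gConnectionID is an already open MySQL connection and that the result arrives as UTF-8 bytes once SET NAMES has run. Whether this resolves the "??????" output is not confirmed in the thread.

```livecode
on mouseUp
   global gConnectionID
   local tSQL, tList

   -- Ask the server to exchange text as UTF-8 on this connection.
   revExecuteSQL gConnectionID, "SET NAMES 'utf8'"

   put "SELECT name FROM party" into tSQL
   put revDataFromQuery(tab, cr, gConnectionID, tSQL) into tList

   -- Assumption: the data now arrives as UTF-8 bytes, so decode before display.
   put textDecode(tList, "UTF-8") into field "data"
end mouseUp
```

The point of the ordering is that SET NAMES affects the connection, so it has to run before the SELECT whose result is decoded.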

jacque
VIP Livecode Opensource Backer
Posts: 7210
Joined: Sat Apr 08, 2006 8:31 pm
Location: Minneapolis MN

Re: character encoding - never ending story

Post by jacque » Fri May 19, 2017 5:27 pm

As mentioned above, you need to use textDecode() to translate the incoming text to a format LC can use. Most databases use UTF8 so I think it's safe to assume that.

Code: Select all

put textDecode(data, "UTF8") into tString
Edit: I just saw you are already using textDecode, so ignore the above.

MaxV
Posts: 1579
Joined: Tue May 28, 2013 2:20 pm
Location: Italy

Re: character encoding - never ending story

Post by MaxV » Sat May 20, 2017 8:22 pm

Use the urlencode and urldecode functions: all characters are translated to standard ASCII, so the data is stored safely in the database, and urldecode brings it back into your charset.
The urlencode and urldecode functions are the best way to preserve data. See http://livecode.wikia.com/wiki/URLEncode

Examples:

put urlencode("Müller")
=
M%FCller

put urlencode(textEncode("Димитрий","UTF8"))
=
%D0%94%D0%B8%D0%BC%D0%B8%D1%82%D1%80%D0%B8%D0%B9

put urldecode("M%FCller")
=
Müller

put textdecode(urldecode("%D0%94%D0%B8%D0%BC%D0%B8%D1%82%D1%80%D0%B8%D0%B9"),"UTF8")
=
Димитрий

As you can see, the urlencode function always uses just plain ASCII, which is compatible with any charset, so the data is compatible with any database in the world!!! :D
Why did I add textencode/textdecode for the Russian chars? Because my PC is UTF16, but URLencode/urldecode works only with UTF8 chars. Livecode always works with PC encoding, in my case UTF16, so I needed to add textencode for chars like Russian, which have different hexadecimal values from UTF8 on my PC.

jacque
VIP Livecode Opensource Backer
Posts: 7210
Joined: Sat Apr 08, 2006 8:31 pm
Location: Minneapolis MN

Re: character encoding - never ending story

Post by jacque » Sat May 20, 2017 10:46 pm

MaxV wrote: my PC is UTF16, but URLencode/urldecode works only with UTF8 chars.
Actually, textEncode/textDecode work with nine different encodings:

"ASCII"
"UTF-16"
"UTF-16BE"
"UTF-16LE"
"UTF-32"
"UTF-32BE"
"UTF-32LE"
"UTF-8"
"CP1252"
MaxV wrote: Livecode always works with PC encoding
When importing or opening files, LC uses the machine native encoding which will vary depending on the OS.
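
Why the encoding name matters can be seen by encoding the same character under two of the names listed above. This is a small sketch, assuming LiveCode 7 or later; the handler name showEncodedSizes is only illustrative.

```livecode
on showEncodedSizes
   local tChar, tCp1252, tUtf8
   put "ü" into tChar
   put textEncode(tChar, "CP1252") into tCp1252
   put textEncode(tChar, "UTF-8") into tUtf8
   -- Same character, different byte sequences:
   -- CP1252 stores it in a single byte, UTF-8 needs two.
   put (the number of bytes in tCp1252) && (the number of bytes in tUtf8)
end showEncodedSizes
```

Decoding with a different encoding than the one used for encoding is what produces garbled or missing characters.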