turning on Unicode for arrayEncode/arrayDecode?

Discussion about LiveCode Global Jam events and activities

DarScott
Posts: 227
Joined: Fri Jul 28, 2006 12:23 am
Location: Albuquerque
Contact:

turning on Unicode for arrayEncode/arrayDecode?

Post by DarScott » Fri May 23, 2014 8:55 pm

Is there a trick to using the full roundtrip arrayEncode and arrayDecode for Unicode? It is not working for me.

Also, I'm having some problems with combine involving Unicode.
Last edited by DarScott on Fri May 23, 2014 9:08 pm, edited 1 time in total.

Re: turning on Unicode for arrayEncode/arrayDecode

Post by DarScott » Fri May 23, 2014 9:08 pm

Ah, combine works for Unicode the first time after APPLY, but not after that. That makes testing the Unicode roundtrip in arrayEncode/arrayDecode more interesting.

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by DarScott » Fri May 23, 2014 9:38 pm

Yikes! My mistake. I thought this was fixed this time around.

The combine problem is still a bug, though.

capellan
Posts: 654
Joined: Wed Aug 15, 2007 11:09 pm

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by capellan » Fri May 23, 2014 9:42 pm

Hi Dar,

Do you have a stack or a screenshot that
displays this bug?

Thanks in advance!

Al

livecodeali
Livecode Staff Member
Posts: 192
Joined: Thu Apr 18, 2013 2:48 pm

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by livecodeali » Fri May 23, 2014 10:09 pm

Hi! From 7.0 dp 4 onwards, arrayEncode takes an optional extra parameter, which is essentially the stackFileVersion. We originally had this defaulting to the 7.0 format, but it caused some problems, so the default is now the legacy arrayEncode format. Unfortunately we forgot to document it for the release.

You can preserve Unicode by using arrayEncode(tArray, "7.0").

Ali

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by DarScott » Fri May 23, 2014 10:39 pm

Yay! It works!

I'm not able to reproduce the combine problem, now.

I got a ? for a Unicode char in a key at times. Now, I don't see it. It might be one of those elusive things.

FourthWorld
VIP Livecode Opensource Backer
Posts: 9802
Joined: Sat Apr 08, 2006 7:05 am
Location: Los Angeles
Contact:

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by FourthWorld » Sat May 24, 2014 10:36 pm

I like the flexibility, but dislike the syntax. It means that all new use of arrays going forward will be encumbered with an extra argument just to account for the arcana of version histories. Ugh.

Any chance we can just change the array flag in the file, from the 0x05 it is now to maybe 0x06 for Unicode-aware versions?
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by DarScott » Sat May 24, 2014 10:50 pm

Hi, Richard!

Well, the use is mitigated. I only needed to use it with arrayEncode(). I didn't even try it with arrayDecode(); it just worked.

I am confused about the array code comment.

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by FourthWorld » Sat May 24, 2014 11:09 pm

Ah, right, WRITING arrays requires the arg, while READING is automatic, probably doing something like my suggestion.

The array file format uses 0x05 as a flag for elements that are arrays, with other op codes for other data types (text, integers, etc.). Because the array file is itself an array, the first byte of the file is 0x05.
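The "reading is automatic" idea can be pictured as a dispatch on that leading byte. The following is only a sketch in Python (not LiveCode, and not the engine's code). The 0x05 value comes from the description above; the 0x06 value is purely the hypothetical flag proposed earlier in this thread, and `detect_array_format` is an invented name.

```python
LEGACY_ARRAY_TAG = 0x05   # from the post: flag for array elements, so also the first byte of an array file
UNICODE_ARRAY_TAG = 0x06  # hypothetical: the value suggested in this thread, not a confirmed engine constant


def detect_array_format(data: bytes) -> str:
    """Guess which encoding produced `data` by looking only at its first byte."""
    if not data:
        raise ValueError("empty data")
    tag = data[0]
    if tag == LEGACY_ARRAY_TAG:
        return "legacy"
    if tag == UNICODE_ARRAY_TAG:
        return "unicode"
    raise ValueError(f"unrecognised leading byte 0x{tag:02x}")
```

With a scheme like this, only the writer has to choose a format; the reader can work it out from the data itself.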

The writing is a sticky issue: on the one hand I can see the need to maintain the older non-Unicode array format for interoperability with older engine versions. But I dislike having to have all future generations saddled with the historical knowledge about older formats.

Ideally Unicode should just work.

I appreciate the problem, and the solution, but I still don't like it. I don't have an alternative in mind, though, so I guess we'll just accept it and move on.

Still, rather than a version number could the second argument be something with some mnemonic value, perhaps "Unicode"?

I'd hate to see folks using v14 having to think back to which version of LC they needed to start using a version number for storing Unicode values in array files.

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by DarScott » Sat May 24, 2014 11:20 pm

Well, this is the problem we all have with creating files that last longer than the current version of the file format.

We still need a way to create old compatible files as well as the new, say, Unicode files.

The question is then, what is the default? What should happen if I make a new standalone with 7, but don't make any changes to the stack? It should still be fully compatible with the old files, right? Well, whether right or not, I suspect that is the thinking.

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by DarScott » Sat May 24, 2014 11:54 pm

By the way, the new 7.0 arrayEncode() makes smaller binary strings and they compress well. The arrayDecode is still slow, though.

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by FourthWorld » Wed May 28, 2014 2:45 pm

FWIW I just flagged the new argument as a bug, hopefully prompting consideration of at least the two options presented there for dealing with the format change:
http://quality.runrev.com/show_bug.cgi?id=12547

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by livecodeali » Wed May 28, 2014 5:16 pm

Hi all,

I have commented on the bug report, though it probably would have been better to do so here. So I'll effectively just copy & paste :-)

It is not necessary to remember in which version this was added; arrayEncode will preserve Unicode for any stackFileVersion >= 7.0. The only other solution available, I think, to enable Unicode to "just work" in this case is to traverse the entire array prior to the encode and check whether it can be encoded losslessly in the legacy format; if not, it is encoded in the new format.

The main disadvantage of this route of course is the extra time expense of the array traversal.
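To make the traversal idea concrete, here is a rough sketch in Python (not LiveCode, and not how the engine does it). It assumes nested dicts stand in for arrays and that `cp1252` stands in for the platform's native single-byte encoding; `LEGACY_CODEC`, `fits_legacy`, `array_fits_legacy` and `choose_format` are all made-up names.

```python
LEGACY_CODEC = "cp1252"  # assumption: stand-in for the platform's native single-byte encoding


def fits_legacy(text: str) -> bool:
    """Return True if every character of text can be represented in the legacy codec."""
    try:
        text.encode(LEGACY_CODEC)
        return True
    except UnicodeEncodeError:
        return False


def array_fits_legacy(array: dict) -> bool:
    """Walk every key and value, descending into nested arrays."""
    for key, value in array.items():
        if not fits_legacy(str(key)):
            return False
        if isinstance(value, dict):
            if not array_fits_legacy(value):
                return False
        elif isinstance(value, str) and not fits_legacy(value):
            return False
    return True


def choose_format(array: dict) -> str:
    """Pick the legacy format when it is lossless, otherwise the 7.0 format."""
    return "legacy" if array_fits_legacy(array) else "7.0"
```

The walk stops at the first key or value that does not fit, so the extra cost is highest for arrays that turn out to be fully legacy-compatible, since those have to be visited in full before the encode even starts.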

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by livecodeali » Wed May 28, 2014 5:18 pm

Perhaps that could be the default behaviour, with the other parameter being an optional one to force the new or legacy version.

Re: turning on Unicode for arrayEncode/arrayDecode?

Post by DarScott » Wed May 28, 2014 6:37 pm

I'm not seeing any issues with the default being an adaptive encoding.

A consideration is the wording and size of the paragraph describing this in the dictionary. It might be handy to have a name for this case that can be used as the second parameter and that is the default. Otherwise, the dictionary has to somehow say the default is not one of the options.

I wonder if it would be handy to allow a LiveCode version to be used as the encoding version and that is rounded down to a known encoding version. I don't have any strong opinion on this.
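The rounding-down idea could look something like the sketch below, written in Python rather than LiveCode. It assumes only two encodings exist, the legacy one and the 7.0 one mentioned in this thread; `encoding_version_for` is a hypothetical helper, not an engine function.

```python
def encoding_version_for(livecode_version: str) -> str:
    """Round a LiveCode version string down to a known array-encoding version.

    Assumption: anything at or above 7.0 maps to the "7.0" encoding,
    and anything older maps to the legacy encoding.
    """
    parts = livecode_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return "7.0" if (major, minor) >= (7, 0) else "legacy"
```

Under that assumption, a script could pass whatever engine version it was written for, such as "7.1.2", and still get the 7.0 encoding without anyone having to remember when the format changed.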
