turning on Unicode for arrayEncode/arrayDecode?
Is there a trick to using the full roundtrip arrayEncode and arrayDecode for Unicode? It is not working for me.
Also, I'm having some problems with combine involving Unicode.
Last edited by DarScott on Fri May 23, 2014 9:08 pm, edited 1 time in total.
Re: turning on Unicode for arrayEncode/arrayDecode
Ah, combine works for Unicode the first time after APPLY, but not after that. That makes testing the Unicode roundtrip in arrayEncode/arrayDecode more interesting.
Re: turning on Unicode for arrayEncode/arrayDecode?
Yikes! My mistake. I thought this was fixed this time around.
The combine problem is still a bug, though.
Re: turning on Unicode for arrayEncode/arrayDecode?
Hi Dar,
Do you have a stack or a screenshot that
displays this bug?
Thanks in advance!
Al
-
- Livecode Staff Member
- Posts: 192
- Joined: Thu Apr 18, 2013 2:48 pm
Re: turning on Unicode for arrayEncode/arrayDecode?
Hi! From 7.0 dp 4 onwards, arrayEncode takes an optional extra parameter, which is essentially the stackfileversion. We originally had this defaulting to the 7.0 format, but it caused some problems, so the default is now the legacy arrayEncode format. Unfortunately we forgot to document it for the release.
You can preserve Unicode by using arrayEncode(tArray, "7.0").
Ali
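For anyone landing here later, here is a minimal sketch of the round trip described above. It assumes LiveCode 7.0 dp 4 or later. The handler and variable names are only illustrative, and numToCodepoint is used so the script itself stays plain ASCII.

```livecode
on mouseUp
   local tArray, tEncoded, tDecoded
   
   -- Illustrative data: a value containing two non-ASCII characters
   put numToCodepoint(0x4E16) & numToCodepoint(0x754C) into tArray["greeting"]
   
   -- Ask for the 7.0 format so Unicode survives the encode
   put arrayEncode(tArray, "7.0") into tEncoded
   
   -- No version argument is passed when decoding
   put arrayDecode(tEncoded) into tDecoded
   
   -- Compare exactly, not case-insensitively
   set the caseSensitive to true
   if tDecoded["greeting"] is tArray["greeting"] then
      answer "Round trip preserved the text."
   else
      answer "Round trip changed the text."
   end if
end mouseUp
```

Dropping the second argument from arrayEncode in the same handler is a quick way to see the legacy behaviour for comparison.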
Re: turning on Unicode for arrayEncode/arrayDecode?
Yay! It works!
I'm not able to reproduce the combine problem, now.
I got a ? for a Unicode char in a key at times. Now, I don't see it. It might be one of those elusive things.
-
- VIP Livecode Opensource Backer
- Posts: 9857
- Joined: Sat Apr 08, 2006 7:05 am
- Location: Los Angeles
- Contact:
Re: turning on Unicode for arrayEncode/arrayDecode?
I like the flexibility, but dislike the syntax. It means that all new use of arrays going forward will be encumbered with an extra argument just to account for the arcana of version histories. Ugh.
Any chance we can just change the array flag in the file, from the 0x05 it is now to maybe 0x06 for Unicode-aware versions?
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
Re: turning on Unicode for arrayEncode/arrayDecode?
Hi, Richard!
Well, the use is mitigated. I only needed to use it with arrayEncode(). I didn't even try it with arrayDecode(), it just worked.
I am confused about the array code comment.
Re: turning on Unicode for arrayEncode/arrayDecode?
Ah, right, WRITING arrays requires the arg, while READING is automatic, probably doing something like my suggestion.
The array file format uses 0x05 as a flag for elements that are arrays, with other op codes for other data types (text, integers, etc.). Because the array file is itself an array, the first byte of the file is 0x05.
The writing is a sticky issue: on the one hand I can see the need to maintain the older non-Unicode array format for interoperability with older engine versions. But I dislike having to have all future generations saddled with the historical knowledge about older formats.
Ideally Unicode should just work.
I appreciate the problem, and the solution, but I still don't like it. I don't have an alternative in mind, though, so I guess we'll just accept it and move on.
Still, rather than a version number could the second argument be something with some mnemonic value, perhaps "Unicode"?
I'd hate to see folks using v14 having to think back to which version of LC they needed to start using a version number for storing Unicode values in array files.
Richard Gaskin
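A quick way to look at the flag byte mentioned above is to read the first byte of the encoded data. This is only a sketch: the function name is made up for illustration, and what the 7.0 format puts in that byte is not stated in this thread, so it is worth checking in your own build rather than assuming.

```livecode
-- Hypothetical helper: returns the numeric value of the first byte
-- of the encoded form of pArray. pVersion is optional; when empty,
-- arrayEncode is called with its default (legacy) format.
function firstByteOfEncodedArray pArray, pVersion
   local tEncoded
   if pVersion is empty then
      put arrayEncode(pArray) into tEncoded
   else
      put arrayEncode(pArray, pVersion) into tEncoded
   end if
   return byteToNum(byte 1 of tEncoded)
end firstByteOfEncodedArray
```

Going by the description above, calling it without a version should report 5 for the legacy format. Calling it with "7.0" shows whether the newer format uses a different flag.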
Re: turning on Unicode for arrayEncode/arrayDecode?
Well, this is the problem we all have with creating files that last longer than the current version of the file format.
We still need a way to create old compatible files as well as the new, say, Unicode files.
The question is then, what is the default? What should happen if I make a new standalone with 7, but don't make any changes to the stack? It should still be fully compatible with the old files, right? Well, whether right or not, I suspect that is the thinking.
Re: turning on Unicode for arrayEncode/arrayDecode?
By the way, the new 7.0 arrayEncode() makes smaller binary strings and they compress well. The arrayDecode is still slow, though.
Re: turning on Unicode for arrayEncode/arrayDecode?
FWIW I just flagged the new argument as a bug, hopefully prompting consideration of at least the two options presented there for dealing with the format change:
http://quality.runrev.com/show_bug.cgi?id=12547
Richard Gaskin
Re: turning on Unicode for arrayEncode/arrayDecode?
Hi all,
I have commented on the bug report, though it probably would have been better to do so here. So I'll effectively just copy & paste:
It is not necessary to remember in which version this was added; arrayEncode will preserve Unicode for any stackfileversion >= 7.0. The only other solution available, I think, to enable Unicode to "just work" in this case is to traverse the entire array prior to the encode and check whether it can be encoded losslessly in the legacy format; if not, it is encoded in the new format.
The main disadvantage of this route of course is the extra time expense of the array traversal.
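To make the traversal idea concrete, here is a rough script-level sketch of such an adaptive encode. It is not how the engine does or would do it. All three function names are hypothetical, and "lossless in the legacy format" is approximated by checking that each key and value survives a round trip through textEncode/textDecode with the "native" encoding, which is an assumption on my part.

```livecode
-- Hypothetical: true if pText is unchanged after a native-encoding round trip
function isNativeText pText
   set the caseSensitive to true
   return textDecode(textEncode(pText, "native"), "native") is pText
end isNativeText

-- Hypothetical: walks the array recursively, checking every key and value
function fitsLegacyFormat pArray
   local tKey
   repeat for each key tKey in pArray
      if not isNativeText(tKey) then
         return false
      end if
      if pArray[tKey] is an array then
         if not fitsLegacyFormat(pArray[tKey]) then
            return false
         end if
      else if not isNativeText(pArray[tKey]) then
         return false
      end if
   end repeat
   return true
end fitsLegacyFormat

-- Hypothetical: picks the legacy format when nothing would be lost
function adaptiveArrayEncode pArray
   if fitsLegacyFormat(pArray) then
      return arrayEncode(pArray)
   else
      return arrayEncode(pArray, "7.0")
   end if
end adaptiveArrayEncode
```

The full walk over every key and value before encoding is where the extra time cost mentioned above comes from.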
Re: turning on Unicode for arrayEncode/arrayDecode?
Perhaps that could be the default behaviour, with the other parameter being an optional one to force the new or legacy version.
Re: turning on Unicode for arrayEncode/arrayDecode?
I'm not seeing any issues with the default being an adaptive encoding.
A consideration is the wording and size of the paragraph describing this in the dictionary. It might be handy to have a name for this case that can be used as the second parameter and that is the default. Otherwise, the dictionary has to somehow say the default is not one of the options.
I wonder if it would be handy to allow a LiveCode version to be used as the encoding version and that is rounded down to a known encoding version. I don't have any strong opinion on this.