turning on Unicode for arrayEncode/arrayDecode?

DarScott · Post by **DarScott** » Fri May 23, 2014 8:55 pm

Is there a trick to using the full roundtrip arrayEncode and arrayDecode for Unicode? It is not working for me.

Also, I'm having some problems with combine involving Unicode.

DarScott · Post by **DarScott** » Fri May 23, 2014 9:08 pm

Ah, combine works for Unicode the first time after APPLY, but not after that. That makes testing the Unicode roundtrip in arrayEncode/arrayDecode more interesting.

DarScott · Post by **DarScott** » Fri May 23, 2014 9:38 pm

Yikes! My mistake. I thought this was fixed this time around.

The combine problem is still a bug, though.

capellan · Post by **capellan** » Fri May 23, 2014 9:42 pm

Hi Dar,

Do you have a stack or a screenshot that
displays this bug?

Thanks in advance!

Al

livecodeali · Post by **livecodeali** » Fri May 23, 2014 10:09 pm

Hi! From 7.0 dp 4 onwards, arrayEncode takes an optional extra parameter, which is essentially the stackfileversion. We originally had this defaulting to the 7.0 format, but it caused some problems, so the default is now the legacy arrayEncode format. Unfortunately we forgot to document it for the release.

You can preserve Unicode by using arrayEncode(tArray, "7.0").

Ali
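For anyone wanting to try this, here is a minimal round-trip sketch of the call above. It assumes LiveCode 7.0 dp 4 or later; the handler, the variable names and the test character (U+4E2D, chosen because it is outside the native character sets) are only illustrative.

```livecode
on mouseUp
   local tArray, tEncoded, tDecoded

   -- Build a value containing a character the native encoding cannot hold.
   -- 20013 is the decimal codepoint of U+4E2D.
   put "Test: " & numToCodepoint(20013) into tArray["label"]

   -- Passing "7.0" as the second parameter asks for the Unicode-preserving format.
   put arrayEncode(tArray, "7.0") into tEncoded

   -- No version parameter is passed when decoding.
   put arrayDecode(tEncoded) into tDecoded

   -- Compare exactly rather than with the default case-insensitive rules.
   set the caseSensitive to true
   answer "Round trip preserved the value:" && (tDecoded["label"] is tArray["label"])
end mouseUp
```

Dropping the second parameter from the arrayEncode call should give the legacy behaviour described above, which makes for an easy side-by-side comparison.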

DarScott · Post by **DarScott** » Fri May 23, 2014 10:39 pm

Yay! It works!

I'm not able to reproduce the combine problem, now.

I got a ? for a Unicode char in a key at times. Now, I don't see it. It might be one of those elusive things.

FourthWorld · Post by **FourthWorld** » Sat May 24, 2014 10:36 pm

I like the flexibility, but dislike the syntax. It means that all new use of arrays going forward will be encumbered with an extra argument just to account for the arcana of version histories. Ugh.

Any chance we can just change the array flag in the file, from the 0x05 it is now to maybe 0x06 for Unicode-aware versions?

DarScott · Post by **DarScott** » Sat May 24, 2014 10:50 pm

Hi, Richard!

Well, the use is mitigated. I only needed to use it with arrayEncode(). I didn't even try passing it to arrayDecode(); decoding just worked.

I am confused about the array code comment.

FourthWorld · Post by **FourthWorld** » Sat May 24, 2014 11:09 pm

Ah, right, WRITING arrays requires the arg, while READING is automatic, probably doing something like my suggestion.

The array file format uses 0x05 as a flag for elements that are arrays, with other op codes for other data types (text, integers, etc.). Because the array file is itself an array, the first byte of the file is 0x05.

The writing is a sticky issue: on the one hand I can see the need to maintain the older non-Unicode array format for interoperability with older engine versions. But I dislike having to have all future generations saddled with the historical knowledge about older formats.

Ideally Unicode should just work.

I appreciate the problem, and the solution, but I still don't like it. I don't have an alternative in mind, though, so I guess we'll just accept it and move on.

Still, rather than a version number could the second argument be something with some mnemonic value, perhaps "Unicode"?

I'd hate to see folks using v14 having to think back to which version of LC they needed to start using a version number for storing Unicode values in array files.

DarScott · Post by **DarScott** » Sat May 24, 2014 11:20 pm

Well, this is the problem we all have with creating files that last longer than the current version of the file format.

We still need a way to create old compatible files as well as the new, say, Unicode files.

The question is then, what is the default? What should happen if I make a new standalone with 7, but don't make any changes to the stack? It should still be fully compatible with the old files, right? Well, whether right or not, I suspect that is the thinking.

DarScott · Post by **DarScott** » Sat May 24, 2014 11:54 pm

By the way, the new 7.0 arrayEncode() makes smaller binary strings and they compress well. The arrayDecode is still slow, though.

FourthWorld · Post by **FourthWorld** » Wed May 28, 2014 2:45 pm

FWIW I just flagged the new argument as a bug, hopefully prompting consideration of at least the two options presented there for dealing with the format change:
http://quality.runrev.com/show_bug.cgi?id=12547

livecodeali · Post by **livecodeali** » Wed May 28, 2014 5:16 pm

Hi all,

I have commented on the bug report, though it probably would have been better to do so here. So I'll effectively just copy & paste:

It is not necessary to remember in which version this was added; arrayEncode will preserve Unicode for any stackfileversion >= 7.0. The only other solution available, I think, to enable Unicode to "just work" in this case is to traverse the entire array prior to the encode and check whether it can be encoded losslessly in the legacy format; if not, it is encoded in the new format.

The main disadvantage of this route of course is the extra time expense of the array traversal.

livecodeali · Post by **livecodeali** » Wed May 28, 2014 5:18 pm

Perhaps that could be the default behaviour, with the other parameter being an optional one to force the new or legacy version.
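To make the adaptive idea concrete, here is a rough script-level sketch of it. This is not how the engine does or would do it. The three handler names are hypothetical, and the check assumes that text which cannot be represented natively comes back changed after a textEncode/textDecode round trip through the "native" encoding.

```livecode
-- Hypothetical helper: true if pText is unchanged by a round trip
-- through the native (legacy) text encoding.
function textSurvivesNative pText
   set the caseSensitive to true
   return textDecode(textEncode(pText, "native"), "native") is pText
end textSurvivesNative

-- Hypothetical helper: walks every key and value, including nested arrays.
function isNativeSafe pArray
   local tKey
   repeat for each key tKey in pArray
      if not textSurvivesNative(tKey) then return false
      if pArray[tKey] is an array then
         if not isNativeSafe(pArray[tKey]) then return false
      else if not textSurvivesNative(pArray[tKey]) then
         return false
      end if
   end repeat
   return true
end isNativeSafe

-- Use the legacy format when it is lossless, otherwise the 7.0 format.
function adaptiveArrayEncode pArray
   if isNativeSafe(pArray) then
      return arrayEncode(pArray)
   else
      return arrayEncode(pArray, "7.0")
   end if
end adaptiveArrayEncode
```

The traversal in isNativeSafe is the extra time cost mentioned above: every key and value is visited once before anything is encoded.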

DarScott · Post by **DarScott** » Wed May 28, 2014 6:37 pm

I'm not seeing any issues with the default being an adaptive encoding.

A consideration is the wording and size of the paragraph describing this in the dictionary. It might be handy to have a name for this case that can be used as the second parameter and that is the default. Otherwise, the dictionary has to somehow say the default is not one of the options.

I wonder if it would be handy to allow a LiveCode version to be used as the encoding version and that is rounded down to a known encoding version. I don't have any strong opinion on this.
