Save data using URL command mangles bit values
Posted: Sat May 21, 2022 8:40 am
by Simon Knight
Hi,
I have some code that attempts to create a valid XMP sidecar file. These files store metadata about images and other media files. An XMP file is a form of XML, so it is mostly text. However, there is an exception in the first line, which includes three high-value characters.
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
The three characters sit between the two single quotes following begin=. This header is common to all XMP files, so my code stores the header in a variable. I have confirmed that the three characters are in the stream of bytes and that they are EF BB BF. My code uses the URL command to write the final file, and this strips the characters, replacing them with a question mark.
Code: Select all
put tXMPhead & cr & tXMPKeywords & tXMPTail into tXMPData
put tXMPdata into URL ("binfile:" & pRawFileDetailsA[tKey]["XMPfilePath"])
I have confirmed that tXMPhead contains the characters, but when written to file they get replaced.
I have also tried the following code:
Code: Select all
put pRawFileDetailsA[tKey]["XMPfilePath"] into tFilePath
Open file tFilePath for binary write
write tXMPdata to file tFilePath
close file tFilePath
This also fails.
The results of the LiveCode file operations:
[Screenshot: character decode of LC-created file]
Hex values:
[Screenshot: hex decode of LC-created file]
The screenshots below show portions of the variable that is being saved to the file, viewed in a hex editor. I placed a breakpoint in my code, copied the contents of the variable into a text file, and then opened the text file in the hex editor.
[Screenshot: character decode of variable]
best wishes
Simon
Re: Save data using URL command mangles bit values
Posted: Sat May 21, 2022 8:41 am
by Simon Knight
Here is the hex value of the variable described above (the forum would not allow me to post a fourth image).
[Screenshot: hex of variable]
Re: Save data using URL command mangles bit values
Posted: Sat May 21, 2022 9:15 am
by LCMark
How is tXMPhead being constructed? It looks like you are mixing text and binary data. Since you are constructing a binary file, you need to make sure that all parts are binary to stop the engine applying default conversions from text to binary (in particular 'invalid' chars will map to ?).
If you do:
Code: Select all
put "<?xpacket begin='" & numToByte(0xEF) & numToByte(0xBB) & numToByte(0xBF) & "' id='W5M0MpCehiHzreSzNTczkc9d'?>" into tXMPhead
I suspect the problem will go away.
Re: Save data using URL command mangles bit values
Posted: Sat May 21, 2022 9:58 am
by Simon Knight
Yes it works - thanks.
What I find confusing is why, or how, using numToByte stops the engine from stripping these values. I can understand that the engine looks at a stream of bytes and checks whether they are within the normal printable ASCII range, meaning that it sees the strange non-character values and strips them out. But I don't understand how numToByte works. Is it adding a special control byte to tell the engine to pass the values on into the file? Obviously I have no need to know how it works, but it would be good to have some greater understanding.
Simon
Re: Save data using URL command mangles bit values
Posted: Sat May 21, 2022 1:20 pm
by LCMark
So I can't say for sure what was going on in your case as I don't know what your original code was doing (I've tried to reproduce the effect you described and cannot).
Prior to LiveCode 7 there was no difference between binary data and text - they could be the same because text was only ever single-byte values interpreted relative to the native text encoding (e.g. MacRoman on macOS and Windows-1252 on Windows). When we moved to 7, text became (from script's point of view) a sequence of characters (relative to Unicode), so the 1-1 mapping between text and binary data (via the native encoding) no longer existed. However, we still needed to keep compatibility with existing code.
Internally the engine has a separate datatype for binary data (data) vs text (string). Byte operations generate data, text operations generate strings, and operations which make sense on both will convert from data to string unless all operands are data.
Converting from data to string is done via the native text encoding (as it always implicitly was - even though the engine had to do nothing to achieve this prior to 7) - similarly, when you use a string in a context expecting data, the engine will convert to the native text encoding.
So if you have a string and put it into a binfile, the first thing the engine does is convert the string to binary data by mapping each character to the matching character in the native text encoding - if the character cannot be represented there, then it is replaced with ?.
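A minimal sketch of the conversion described above, assuming LiveCode 7 or later. The handler name demoNativeConversion and the temporary file name are made up for illustration. It writes one value built with numToByte and one containing a Unicode-only character (U+FEFF is used here as an example), then reads each file back as hex. If the explanation holds, the first should come back with EF BB BF intact, because each byte maps to a native character and back to the same byte, and the second should show 3F (the code for ?) where the character was.

```livecode
on demoNativeConversion
   local tPath, tValue, tBytes, tHex, tReport
   -- hypothetical scratch file in the temporary folder
   put specialFolderPath("temporary") & "/native-demo.bin" into tPath

   -- Case 1: bytes made with numToByte, joined to string literals
   put "[" & numToByte(0xEF) & numToByte(0xBB) & numToByte(0xBF) & "]" into tValue
   put tValue into URL ("binfile:" & tPath)
   put URL ("binfile:" & tPath) into tBytes
   get binaryDecode("H*", tBytes, tHex)
   put "numToByte route: " & tHex & cr after tReport

   -- Case 2: a Unicode character with no slot in the native encoding
   put "[" & numToCodepoint(0xFEFF) & "]" into tValue
   put tValue into URL ("binfile:" & tPath)
   put URL ("binfile:" & tPath) into tBytes
   get binaryDecode("H*", tBytes, tHex)
   put "codepoint route: " & tHex after tReport

   delete file tPath
   put tReport -- show both hex dumps in the message box
end demoNativeConversion
```

This also suggests an answer to the numToByte question: no control byte is added. The bytes simply survive the round trip through the native encoding, whereas a character outside that encoding does not.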
In this case, it looks to me like you actually managed to have U+FFFD as a character in your tXMPhead variable (rather than the three characters you expected) - this isn't present in the native encoding so maps to ?.
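One way to check what a variable actually holds is to list its non-ASCII codepoints. The sketch below assumes LiveCode 7 or later; the function name listNonAsciiCodepoints is hypothetical.

```livecode
function listNonAsciiCodepoints pText
   local tCodepoint, tNum, tList
   repeat for each codepoint tCodepoint in pText
      put codepointToNum(tCodepoint) into tNum
      -- report only what lies outside plain ASCII
      if tNum > 127 then
         put "U+" & baseConvert(tNum, 10, 16) & cr after tList
      end if
   end repeat
   return tList
end listNonAsciiCodepoints
```

Calling `put listNonAsciiCodepoints(tXMPhead)` at a breakpoint would show a single entry if the header holds one Unicode character, or three entries if it holds three separate characters.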
If you can share your original (not working) code I can probably work out precisely what was going on in your case though.
Re: Save data using URL command mangles bit values
Posted: Sat May 21, 2022 1:53 pm
by Simon Knight
Mark,
Thanks for taking the time to write your response and to offer to look at my stack.
I have created an archive file that includes an image file, which means that it is too large to fit in the forum. If all is well, it should be available on this link
https://www.dropbox.com/s/pjdi0l9gys162 ... k.zip?dl=0
The handler of interest is in the stack and named "UpdateCreateXMPSidecars"
This reads a custom property of the stack which holds the boilerplate "text" copied from a valid XMP file.
I have disabled all but the button of interest. Pressing the button "Add Filename -tags- to xmp as keywords" will prompt for a folder of images and add the -tagwords- in each filename to an XMP sidecar file.
I hope that all makes some sense.
Simon
Attachment: Stacks.zip (24.2 KiB) - stack files minus the example DNG image file; main stack is named SideCarSync...
Re: Save data using URL command mangles bit values
Posted: Sat May 21, 2022 2:19 pm
by LCMark
So `the XMLHead` custom property of your main stack contains a string which contains U+FFFD (essentially the 'unknown character' unicode character) at the place you were expecting three bytes. So why that is there, rather than what you expected, depends on how you set that custom property.
It's worth pointing out that the 3-byte sequence which you were expecting is actually the UTF-8 encoded version of U+FFFD - so an alternative way (and the morally correct way!) to do this is just to textEncode as UTF-8 before putting into the binfile URL.
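A sketch of the textEncode route, assuming LiveCode 7 or later. The handler name writeXMPSidecar and its parameters are hypothetical. One thing to check first: EF BB BF is the UTF-8 encoding of U+FEFF (the byte order mark), whereas U+FFFD encodes as EF BF BD. For that reason the sketch builds the header with numToCodepoint(0xFEFF) rather than reusing whatever character is stored in the custom property.

```livecode
on writeXMPSidecar pFilePath, pKeywordsXML, pTailXML
   local tHead, tData
   -- U+FEFF becomes the bytes EF BB BF once encoded as UTF-8
   put "<?xpacket begin='" & numToCodepoint(0xFEFF) & "' id='W5M0MpCehiHzreSzNTczkc9d'?>" into tHead

   -- Encode the whole document once, so that what reaches the binfile
   -- URL is already binary data and no native-encoding conversion applies
   put textEncode(tHead & cr & pKeywordsXML & pTailXML, "UTF-8") into tData
   put tData into URL ("binfile:" & pFilePath)

   if the result is not empty then
      answer "Could not write" && pFilePath & ":" && the result
   end if
end writeXMPSidecar
```

Encoding everything in one step also means any non-ASCII keywords end up as valid UTF-8 instead of being mapped through the native encoding.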