proprietary file format, why not and why....


sphere
Posts: 1056
Joined: Sat Sep 27, 2014 10:32 am
Location: Earth, Except when i Jump

proprietary file format, why not and why....

Post by sphere » Mon Oct 31, 2016 7:24 pm

Hi,

I know that opening and saving Word or Excel formats has been asked about before.

They are proprietary file formats.

Maybe someone can explain how LibreOffice and OpenOffice are able to read and write them?
They are free programs. Are they paying to be able to do that?

Is it possible for LC to handle LibreOffice formats? Could that be a feature?

Just some questions that came to mind, because we all know that saving an Excel sheet to CSV and then importing it into the datagrid is pure hell :twisted:

Can anyone shed some light on this?

Thanks!

FourthWorld
VIP Livecode Opensource Backer
Posts: 7839
Joined: Sat Apr 08, 2006 7:05 am
Location: Los Angeles

Re: proprietary file format, why not and why....

Post by FourthWorld » Mon Oct 31, 2016 7:45 pm

See "File formats and metadata" here:
https://en.m.wikipedia.org/wiki/Microsoft_Office
Richard Gaskin
Community volunteer LiveCode Community Liaison

LiveCode development, training, and consulting services: Fourth World Systems: http://FourthWorld.com
LiveCode User Group on Facebook : http://FaceBook.com/groups/LiveCodeUsers/

andrewferguson
VIP Livecode Opensource Backer
Posts: 184
Joined: Wed Apr 10, 2013 5:09 pm

Post by andrewferguson » Mon Oct 31, 2016 11:06 pm

It should be possible to read Word files using LiveCode, if that is what you want to do. WordLib is a third-party add-on for LiveCode by Curry Kenworthy. Website: http://livecodeaddons.com/. The add-ons you may be interested in are WordLib and SpreadLib.

Disclaimer: I have never used these add-ons myself; I just found them by searching online. You will need a commercial license of LiveCode for them to work.

richmond62
Livecode Opensource Backer
Posts: 4713
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Post by richmond62 » Tue Nov 01, 2016 7:25 pm

sphere wrote:
then import it to the datagrid is pure hell
Possibly importing it into a Table Field is not quite so hellish.

However, I obviously don't understand how to use the itemDelimiter properly with a Table Field:
spreadSH.png
Snapshot of spread sheet created in LibreOffice
blah.png
Messy result owing to my silly code
messin aboot.livecode.zip
Stack to modify
(744 Bytes) Downloaded 146 times

Post by richmond62 » Tue Nov 01, 2016 7:26 pm

spreadSH.csv.zip
CSV exported from Libre Office
(168 Bytes) Downloaded 141 times

Post by FourthWorld » Tue Nov 01, 2016 7:46 pm

richmond62 wrote:
spreadSH.csv.zip
Yes, spreadsheets can import and export various forms of CSV, but the OP is looking for guidance on how to work with the native formats that Microsoft Excel and LibreOffice Calc use.

As described in the link I provided earlier, the modern forms of both formats are based on Zip archives. You can use LC's revZip external to open them and find the file containing the contents. I believe both Microsoft Office and LibreOffice use XML for the format of most files in the Zip archive.

In LibreOffice it seems a spreadsheet's content is stored in content.xml. I haven't used Microsoft Office in so many years I have no idea if they follow the same naming convention, but I'll bet they've documented it at MSDN.

I've found that in Linux (and possibly also Mac and Windows) the easiest way to poke around in a Microsoft Office or LibreOffice file is to just change the name to add ".zip" to it (e.g. "mySpreadsheet.ods" becomes "mySpreadsheet.ods.zip"), and then right-click on it and choose "Extract Here". As with any other Zip file, that'll produce a folder with the contents of the Zip archive.
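
The same poking around can be done from a script, without renaming anything. What follows is only a minimal sketch, written in Python with the standard `zipfile` module rather than LiveCode's revZip; the helper names `list_office_members` and `read_member` are made up for illustration, and the demo archive is a tiny stand-in built in memory, not a valid spreadsheet.

```python
import io
import zipfile


def list_office_members(data: bytes) -> list[str]:
    """Return the member names of a Zip-based office document."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


def read_member(data: bytes, name: str) -> str:
    """Return one member of the archive decoded as UTF-8 text."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.read(name).decode("utf-8")


def make_demo_archive() -> bytes:
    """Build a stand-in archive in memory (hypothetical contents, not a real .ods)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.spreadsheet")
        archive.writestr("content.xml", "<doc>hello</doc>")
    return buffer.getvalue()


demo = make_demo_archive()
print(list_office_members(demo))
print(read_member(demo, "content.xml"))
```

For a real file you would pass in something like `open("mySpreadsheet.ods", "rb").read()` instead of the demo bytes; a genuine archive will list more members than the two shown here.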

Post by richmond62 » Tue Nov 01, 2016 7:50 pm

contents.png
Well, now there's an opportunity for buckets of fun
we all might find hard to resist . . .

The data one is looking for seems to reside in the 'content.xml' file,
but one still has to strip out all the "other" code.
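
As a rough illustration of that stripping step, here is a sketch in Python (not LiveCode's revXML) that walks an ODS-style `content.xml` and keeps only the cell text. The function name `rows_from_content_xml` and the sample document are invented for this example; the two namespace URIs are the OpenDocument table and text namespaces. It deliberately ignores repeated-column attributes, merged cells, and typed values, so treat it as a starting point rather than a full reader.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespaces used for tables and paragraphs.
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def rows_from_content_xml(xml_text: str) -> list[list[str]]:
    """Extract the cell text of every table row, ignoring all other markup."""
    root = ET.fromstring(xml_text)
    row_tag = "{%s}table-row" % NS["table"]
    rows = []
    for row in root.iter(row_tag):
        cells = []
        for cell in row.findall("table:table-cell", NS):
            # A cell with several lines holds several <text:p> elements.
            paragraphs = cell.findall("text:p", NS)
            cells.append("\n".join("".join(p.itertext()) for p in paragraphs))
        rows.append(cells)
    return rows


# Hypothetical, heavily trimmed content.xml for demonstration only.
SAMPLE = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <office:body><office:spreadsheet><table:table table:name="Sheet1">
    <table:table-row>
      <table:table-cell><text:p>Name</text:p></table:table-cell>
      <table:table-cell><text:p>Note</text:p></table:table-cell>
    </table:table-row>
    <table:table-row>
      <table:table-cell><text:p>Ann</text:p></table:table-cell>
      <table:table-cell><text:p>line one</text:p><text:p>line two</text:p></table:table-cell>
    </table:table-row>
  </table:table></office:spreadsheet></office:body>
</office:document-content>"""

for row in rows_from_content_xml(SAMPLE):
    print(row)
```

Note that the cell with two lines arrives as two paragraph elements inside one cell, so nothing has to be guessed from delimiters or quoting.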

Post by sphere » Tue Nov 01, 2016 8:58 pm

Guys, (and Dolls),

I'm not looking for guidance on how to import CSV. I know how.
I even used this parser, https://github.com/macMikey/csvToText, which works great, but not for everything. After that I also did some extra filtering.
If a cell in Excel is set to have multiple lines in it (I don't know what that is called in English), then that's a problem.
Even CSV from MS Office is quite different from LibreOffice's.

And I know that xlsx and docx files have a few more files inside them.

So before I can import an Excel file I have to check every line and cell to make sure there are no multiple lines in it; yes, you can turn that off in a menu, I know.
Then I save it as CSV and import it. It works, but never 100% correctly.

So I just wanted to know why those free office suites can do it, while LC has a hard time getting it decently into a datagrid or table field.
Is there no way to implement it like LibreOffice does?

There was also this story, CSV has to die...

Post by richmond62 » Tue Nov 01, 2016 9:17 pm

http://www.fourthworld.com/embassy/arti ... t-die.html

Richard Gaskin wearing a different hat.

"The problem with CSV is that the comma is very commonly used in data, making it a uniquely stupid choice as a delimiter."

That makes perfect sense.

HOWEVER: which character should we choose?

Personally I rather like "^", but somebody will object.

Post by FourthWorld » Tue Nov 01, 2016 9:45 pm

richmond62 wrote:
http://www.fourthworld.com/embassy/arti ... t-die.html

Richard Gaskin wearing a different hat.

"The problem with CSV is that the comma is very commonly used in data, making it a uniquely stupid choice as a delimiter."

That makes perfect sense.

HOWEVER: which character should we choose?
Ideally something not commonly found in content. The article you linked to suggests the delimiters FileMaker Pro uses, so at least you'd be in good company with one of the few companies that's made any attempt at all to deliver consistent export for several years.

Post by richmond62 » Wed Nov 02, 2016 12:54 pm

There is a way to export a CSV file from LibreOffice Calc where you
can choose what the item delimiter is:

https://ask.libreoffice.org/en/question ... ed-output/

Here (just to be bloody-minded) is a caret-delimited text file ["^"]:
spreadSH^.csv.zip
(170 Bytes) Downloaded 130 times
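
For anyone who wants to see what reading such a file looks like, here is a small sketch in Python using the standard `csv` module with the delimiter switched to "^". The function name `parse_delimited` and the sample text are invented for illustration; the contents of the attached file are not reproduced here.

```python
import csv
import io


def parse_delimited(text: str, delimiter: str = "^") -> list[list[str]]:
    """Split delimited text into rows of fields using the given delimiter."""
    # newline="" leaves line endings untouched so the csv module handles them.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# Hypothetical sample: the comma in the second row is ordinary data.
caret_sample = "Name^City\nAnn^Sofia, Bulgaria\n"

for fields in parse_delimited(caret_sample):
    print(fields)
```

Because the comma is no longer the delimiter, "Sofia, Bulgaria" stays in one field without any quoting. A caret inside the data would of course bring the original problem back.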

Post by FourthWorld » Wed Nov 02, 2016 2:44 pm

sphere wrote:
I even used this parser, https://github.com/macMikey/csvToText, which works great, but not for everything. After that I also did some extra filtering.
If a cell in Excel is set to have multiple lines in it (I don't know what that is called in English), then that's a problem.
Indeed it would be. The Tweedly algo there was designed to handle not only in-data commas, but also in-data returns. It works well with the sample data sets provided in my "CSV Must Die" article.

If you've found data sets of unaltered exports from Excel or LibreOffice which aren't parsed correctly by the Tweedly algo, please share them so we can refine the code to handle them.
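
To make the in-data comma and in-data return cases concrete, here is a tiny test input of the kind that trips up naive line-by-line splitting. This sketch uses Python's standard `csv` module as a stand-in parser; it is not the Tweedly algorithm and not LiveCode code, and the sample data and the name `parse_quoted_csv` are invented for illustration.

```python
import csv
import io


def parse_quoted_csv(text: str) -> list[list[str]]:
    """Parse CSV text whose quoted fields may contain commas and returns."""
    # newline="" stops Python from translating line endings before parsing.
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical export: row 1 has a return inside a cell, row 2 has a comma.
tricky_sample = 'id,comment\r\n1,"first line\nsecond line"\r\n2,"Smith, John"\r\n'

for record in parse_quoted_csv(tricky_sample):
    print(record)

# Splitting on line endings first would wrongly report four lines, not three records.
print(len(tricky_sample.splitlines()), "lines vs", len(parse_quoted_csv(tricky_sample)), "records")
```

A data set that a parser gets wrong can often be boiled down to a few anonymised rows like these, which may be easier to share than the original file.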

Post by richmond62 » Wed Nov 02, 2016 7:40 pm

Part of the riddle would seem to be how to get Livecode to "see" inside
an .ODS archive.

Post by FourthWorld » Wed Nov 02, 2016 8:02 pm

richmond62 wrote:Part of the riddle would seem to be how to get Livecode to "see" inside
an .ODS archive.
Already covered above: modern office formats are Zip files, so you can poke around in them and read/write contents with the revZip external. For those internal files that are in XML format the revXML commands will come in handy.

Post by sphere » Thu Nov 03, 2016 11:13 pm

Richard,

I have to see if I can share one, or just a part of it.
It's from my job, and I'm most probably not allowed to send it to anyone.
I wrote a stack to import it into the datagrid after converting it to CSV and even TXT, to keep track of some things, and then export it again to a CSV file. I had the "Tweedly algo" do some magic and then did some extra "filtering", and even then some cells went wrong.

I will see; I'll let you know if I can share some data.
