regex problem: convert CSV to comma delimited

kaveh1000
Livecode Opensource Backer
Posts: 539
Joined: Sun Dec 18, 2011 7:23 pm

Post by kaveh1000 » Thu Jul 24, 2014 8:01 am

I am reading in a file that is standard CSV, e.g.

option_with_value,option,mm,120,200," one, two, three",174,2,0

I need to convert the commas to tabs, but not of course the commas inside quotation marks. I was trying to write a loop to change any commas inside quotations to, say, ••comma••, then convert other commas to tabs, and convert ••comma•• to a normal comma again.

Here is my replacetext command:

put replacetext (thetext, """(.*),(.*)""", """\1•••comma•••\2""") into thetext

but I get:

compilation error at line 742 (Function: separator is not a ',') near "(.*),(.*)", char 47

Not sure if I am not escaping " correctly, or I can't use "," inside search string. Any help appreciated.
Kaveh
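
A minimal sketch of a non-regex variant of the "protect the quoted commas" idea described above, assuming quoted fields contain no doubled ("") quotes and no line breaks. Splitting the line on the quote character means every second chunk lies inside quotes, so only the other chunks have their commas converted. The handler name csvLineToTab is hypothetical and not from the thread, and this version drops the enclosing quotes from the output.

```livecode
-- Hypothetical helper, not from the thread: convert one CSV line to tab-delimited.
-- Assumes no doubled ("") quotes inside fields and no line breaks inside fields.
function csvLineToTab pLine
   local tResult, tInsideQuotes, tChunk, tPiece
   put false into tInsideQuotes
   set the itemDelimiter to quote
   repeat for each item tChunk in pLine
      -- work on a copy so the loop variable itself is never modified
      put tChunk into tPiece
      if not tInsideQuotes then
         -- outside quotes: these commas are field separators
         replace comma with tab in tPiece
      end if
      put tPiece after tResult
      -- each quote character crossed flips the inside/outside state
      put not tInsideQuotes into tInsideQuotes
   end repeat
   return tResult
end csvLineToTab
```

Called as csvLineToTab(line 1 of field "Ftest"), this should leave the commas in " one, two, three" untouched while turning the separating commas into tabs.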

Thierry
VIP Livecode Opensource Backer
Posts: 875
Joined: Wed Nov 22, 2006 3:42 pm

Post by Thierry » Thu Jul 24, 2014 8:13 am

kaveh1000 wrote: put replacetext (thetext, """(.*),(.*)""", """\1•••comma•••\2""") into thetext
AFAIK,
you can't use back references in the replace part of replacetext()

But this is not related to the error you mentioned.

Regards,

Thierry
!
SUNNY-TDZ.COM doesn't belong to me since 2021.
To contact me, use the Private messages. Merci.
!

Post by kaveh1000 » Thu Jul 24, 2014 8:25 am

Thanks Thierry. By back reference you mean \1, \2?
Kaveh

Post by Thierry » Thu Jul 24, 2014 8:27 am

kaveh1000 wrote:Thanks Thierry. By back reference you mean \1, \2?
Yep.

Post by Thierry » Thu Jul 24, 2014 9:19 am

kaveh1000 wrote:I am reading in a file that is standard CSV, e.g.
option_with_value,option,mm,120,200," one, two, three",174,2,0
I need to convert the commas to tabs,
but not of course the commas inside quotation marks.
Ok, here is a quick attempt (half tested):

This should work for a one line text entry:

on mouseUp
   put line 1 of field "Ftest" into T
   # T: option_with_value,option,mm,120,200," one, two, three",174,2,0
   # Regex: \,(?=([^"]*"[^"]*")*[^"]*$)
   put  replacetext( T,  the Regex of me , " - " )
end mouseUp
Message box: option_with_value - option - mm - 120 - 200 - " one, two, three" - 174 - 2 - 0

Or, if you want to process *all* of your CSV file:

on mouseUp
   put  field "Ftest" into T
   # Regex2:  (?ms)\,(?=([^"]*"[^"]*")*[^"]*$)
   put  replacetext( T,  the Regex2 of me , " - " )
end mouseUp
HTH, and you can check "look ahead" in regex documentation to understand the trick...
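
To make the look-ahead trick concrete: the (?=...) group consumes nothing; it only asserts that the rest of the text after the comma contains an even number of quote characters, which is the case exactly when the comma sits outside a quoted field. The sketch below spells out that same test in plain script for a single line. The name commaIsOutsideQuotes is hypothetical and not part of the handlers above, and it assumes fields contain no line breaks.

```livecode
-- Hypothetical illustration of what the look-ahead asserts; not from the thread.
-- Returns true when the comma at char pPos of pLine lies outside quotes,
-- i.e. when an even number of quote characters follows it on the line.
function commaIsOutsideQuotes pLine, pPos
   local tRest, tCount, tChar
   put char (pPos + 1) to -1 of pLine into tRest
   put 0 into tCount
   repeat for each char tChar in tRest
      if tChar is quote then add 1 to tCount
   end repeat
   return ((tCount mod 2) = 0)
end commaIsOutsideQuotes
```

For the sample line, the comma after 200 is followed by two quotes (even, so it is a separator), while the comma after "one" is followed by a single quote (odd, so it is left alone).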

Regards,

Thierry

Post by kaveh1000 » Thu Jul 24, 2014 9:52 am

I am grateful for this, Thierry...

I did try look-ahead, but my regexp skills were not good enough. This seems to work and I promise to read the code carefully later and fully understand what it is doing. ;-)

Thanks again..
Kaveh

Post by Thierry » Thu Jul 24, 2014 9:58 am

kaveh1000 wrote:I am grateful for this, Thierry...
You're welcome!
kaveh1000 wrote:This seems to work and I promise to read the code carefully later and fully understand what it is doing. ;-)
Thanks again..
Great!
I'll come to London one of these days with a regex quiz.. :)

Regards,

Thierry

Post by Thierry » Thu Jul 24, 2014 10:08 am

kaveh1000 wrote:... This seems to work...
Just forgot to tell you something..

I put the regex in a custom property so as NOT to fight
with the "devil-escaping-quotes" problem in LiveCode, which was probably the
reason for the error you mentioned in your first post.
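
LiveCode has no escape sequence for a double quote inside a string literal, so a doubled "" is read as an empty string followed by more tokens, which is consistent with the compilation error in the first post. As an alternative to a custom property, a pattern that contains quotes can be assembled in script with the built-in quote constant. A minimal sketch, assuming the same pattern as above and replacing with tab rather than " - "; the name csvCommasToTabs is hypothetical:

```livecode
-- Hypothetical helper, not from the thread.
-- Assumes fields contain no doubled ("") quotes and no line breaks.
function csvCommasToTabs pLine
   local tPattern
   -- Builds  ,(?=([^"]*"[^"]*")*[^"]*$)  with the quote constant
   -- standing in for every literal " in the pattern.
   put "," & "(?=([^" & quote & "]*" & quote & "[^" & quote & "]*" & quote & ")*[^" & quote & "]*$)" into tPattern
   return replaceText(pLine, tPattern, tab)
end csvCommasToTabs
```

The custom property route keeps the pattern readable in one piece; building it in script keeps everything in the handler, at the cost of a noisier expression.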

Regards,

Thierry

Post by kaveh1000 » Thu Jul 24, 2014 10:29 am

Yup, I got that, Thierry, and a great trick. I love custom properties of LiveCode. So clever. :-)
Kaveh

FourthWorld
VIP Livecode Opensource Backer
Posts: 10044
Joined: Sat Apr 08, 2006 7:05 am

Post by FourthWorld » Thu Jul 24, 2014 3:15 pm

If you have the option of using any other format you'll be much happier. CSV is notoriously ill-conceived, responsible for the loss of millions of hours every year.

But if you have no choice, Alex Tweedly came up with a good way to handle the insanity that is CSV, which I tweaked for this article on why CSV must die:

http://www.fourthworld.com/embassy/arti ... t-die.html
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

Post by kaveh1000 » Thu Jul 24, 2014 3:25 pm

Thanks Richard

Unfortunately no one ever defined CSV properly and I don't like its association with Excel. But it is so simple and human readable. I am looking at alternatives like YAML.

Incidentally there has just been a conference on CSV:

http://csvconf.com/

I will certainly look at your article. I am interested. :-)

(I bought your hypercard book when it first came out!)
Kaveh

[-hh]
VIP Livecode Opensource Backer
Posts: 2262
Joined: Thu Feb 28, 2013 11:52 pm

Post by [-hh] » Fri Jul 25, 2014 1:57 am

..........
Last edited by [-hh] on Wed Aug 13, 2014 3:48 pm, edited 1 time in total.
shiftLock happens

Post by kaveh1000 » Fri Jul 25, 2014 5:00 am

Indeed that is my question and my aim, i.e. regexp is not mandatory. And your solution is a great new approach. I have learned several things already from it. :-)
Kaveh

Post by FourthWorld » Fri Jul 25, 2014 8:15 pm

kaveh1000 wrote:Unfortunately no one ever defined CSV properly and I don't like its association with Excel. But it is so simple and human readable.
Delimited data can be a great thing, provided the delimiters used aren't as commonly part of the data as are commas.

Tab-delimited is nearly optimal - see the notes near the end of the article for the super-simple format so many millions already use.

Even pipe-delimited is better than CSV.

Heck, tossing all your data out the window and hitting yourself with a hammer is better than CSV. ;)
kaveh1000 wrote:I am looking at alternatives like YAML.
YAML's great for human-readability, but all formats involve trade-offs and YAML's is with processing time. If human-writability is a key concern it can be worth the time to write a parser, but JSON or even tab-delimited may do quite well depending on your needs.
kaveh1000 wrote:Incidentally there has just been a conference on CSV:
http://csvconf.com/
Thanks for the link. I wish they had one in the States. Delimited data is so great for so many things that it makes XML quite obviously bloated and overused for many projects that depend on it. Just any delimiter other than a comma, please. ;)
kaveh1000 wrote:(I bought your hypercard book when it first came out!)
I wish I could say "Thanks", but I'm afraid you must be thinking of one of the more accomplished contributors to these forums. My only printed XTalk work was the SuperTalk Language Guide, and I've not written a HyperCard book.
Richard Gaskin

Post by kaveh1000 » Fri Jul 25, 2014 8:42 pm

Hi Richard

I thought I must be missing something but it is now so obvious that comma is the worst possible delimiter, and quotes can get you in a mess too. I actually had a tab delimited "CSV" file but I wanted it to be more "standard". Should have just kept it the way it was! Might go back.

And sorry for mixing you up. My memory is associating you with HyperCard for ever... ;-)
Kaveh